[jira] [Updated] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection

2020-09-30 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko updated HIVE-24211:
--
Component/s: Transactions

> Replace Snapshot invalidate logic with WriteSet check for txn conflict 
> detection
> 
>
> Key: HIVE-24211
> URL: https://issues.apache.org/jira/browse/HIVE-24211
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>
> *Issue with concurrent writes on a partitioned table:*
> Concurrent writes to different partitions should execute in parallel without 
> issues. They acquire a shared lock at the table level and an exclusive write 
> lock at the partition level (hive.txn.xlock.write=true).
> However, there is a problem with the Snapshot validation. It compares the 
> valid writeIds seen by the current transaction, recorded before locking, with 
> the actual list of writeIds. The issue is that a writeId in the Snapshot 
> carries no partition information, so concurrent writes to different 
> partitions are seen as writes to the same non-partitioned table, causing the 
> Snapshot to become obsolete.
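
A minimal HiveQL sketch of the scenario described above (hypothetical
transactional table `tab`, partitioned by `p`), with each insert running in its
own session:

{code:sql}
-- both sessions take a SHARED lock on tab and an EXCLUSIVE write lock on their
-- own partition (hive.txn.xlock.write=true), so they are expected to run in parallel
-- session 1:
insert into tab partition (p='a') values (1);
-- session 2, concurrently:
insert into tab partition (p='b') values (2);
-- each writer allocates a new writeId for tab; because the snapshot's writeIds
-- carry no partition information, whichever session acquires its locks second
-- may see an "outdated" snapshot and recompile, even though there is no real conflict
{code}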



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection

2020-09-30 Thread Denys Kuzmenko (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Denys Kuzmenko reassigned HIVE-24211:
-

Assignee: Denys Kuzmenko

> Replace Snapshot invalidate logic with WriteSet check for txn conflict 
> detection
> 
>
> Key: HIVE-24211
> URL: https://issues.apache.org/jira/browse/HIVE-24211
> Project: Hive
>  Issue Type: Bug
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>
> *Issue with concurrent writes on a partitioned table:*
> Concurrent writes to different partitions should execute in parallel without 
> issues. They acquire a shared lock at the table level and an exclusive write 
> lock at the partition level (hive.txn.xlock.write=true).
> However, there is a problem with the Snapshot validation. It compares the 
> valid writeIds seen by the current transaction, recorded before locking, with 
> the actual list of writeIds. The issue is that a writeId in the Snapshot 
> carries no partition information, so concurrent writes to different 
> partitions are seen as writes to the same non-partitioned table, causing the 
> Snapshot to become obsolete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24088) Converting a decimal to a percentage in Hive exceeds the reserved number of digits

2020-09-30 Thread bianqi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204581#comment-17204581
 ] 

bianqi commented on HIVE-24088:
---

Please do not ask questions in Chinese

> Converting a decimal to a percentage in Hive exceeds the reserved number of digits
> --
>
> Key: HIVE-24088
> URL: https://issues.apache.org/jira/browse/HIVE-24088
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
> Environment: hive 
>Reporter: niaoshu
>Priority: Blocker
> Attachments: image-2020-08-28-19-10-45-892.png
>
>
> concat(round((class_xiaozhuan_xubao_stu_cnt / cast(class_xiaozhuan_stu_cnt as 
> double)),4) * 100,'%')
> !image-2020-08-28-19-10-45-892.png!
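
For reference, the extra digits most likely come from the final multiplication
being done in double precision, which cannot represent most decimal fractions
exactly (e.g. 0.57 * 100 can evaluate to something like 56.99999999999999). A
sketch that keeps a fixed number of digits, reusing the reported column names,
is to round or cast after converting to a percentage:

{code:sql}
-- round after multiplying by 100
concat(round(class_xiaozhuan_xubao_stu_cnt / cast(class_xiaozhuan_stu_cnt as double) * 100, 2), '%')
-- or force a fixed scale explicitly
concat(cast(class_xiaozhuan_xubao_stu_cnt / cast(class_xiaozhuan_stu_cnt as double) * 100 as decimal(5,2)), '%')
{code}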



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24088) Converting a decimal to a percentage in Hive exceeds the reserved number of digits

2020-09-30 Thread bianqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bianqi resolved HIVE-24088.
---
Resolution: Invalid

> Converting a decimal to a percentage in Hive exceeds the reserved number of digits
> --
>
> Key: HIVE-24088
> URL: https://issues.apache.org/jira/browse/HIVE-24088
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
> Environment: hive 
>Reporter: niaoshu
>Priority: Blocker
> Attachments: image-2020-08-28-19-10-45-892.png
>
>
> concat(round((class_xiaozhuan_xubao_stu_cnt / cast(class_xiaozhuan_stu_cnt as 
> double)),4) * 100,'%')
> !image-2020-08-28-19-10-45-892.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24038) Exception caused by changing a column's data type in a Hive table

2020-09-30 Thread bianqi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bianqi resolved HIVE-24038.
---
Resolution: Invalid

> Exception caused by changing a column's data type in a Hive table
> ---
>
> Key: HIVE-24038
> URL: https://issues.apache.org/jira/browse/HIVE-24038
> Project: Hive
>  Issue Type: Bug
>Reporter: Zuo Junhao
>Priority: Major
>
> A table has two columns (user_role, label_type) that were defined as int when 
> the (managed) table was created. The values inserted into those columns were 
> strings; because of Hive's schema-on-read validation this raises no error, but 
> a subsequent select shows the corresponding columns as null. I then ran alter 
> table xxx change column xxx xxx string, inserted the data into the table once 
> more, and the next select failed with the following exception: Failed with 
> exception java.io.IOException:java.lang.ClassCastException: 
> org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable
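
A minimal HiveQL sketch of the reported sequence (hypothetical table and column
names; this mirrors the report rather than a verified reproduction, and whether
the final select fails depends on the table's file format):

{code:sql}
-- columns start out as int in the managed table
create table t1 (user_role int, label_type int);
-- string values are inserted; schema-on-read does not reject them,
-- but a later select shows the columns as NULL
insert into t1 values ('admin', 'premium');
select * from t1;
-- the columns are changed to string and the data is inserted again
alter table t1 change column user_role user_role string;
alter table t1 change column label_type label_type string;
insert into t1 values ('admin', 'premium');
-- per the report, this select now fails with:
--   java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
--   org.apache.hadoop.io.IntWritable
select * from t1;
{code}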



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-19253) HMS ignores tableType property for external tables

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19253?focusedWorklogId=492853&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492853
 ]

ASF GitHub Bot logged work on HIVE-19253:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 09:27
Start Date: 30/Sep/20 09:27
Worklog Time Spent: 10m 
  Work Description: szehonCriteo opened a new pull request #1537:
URL: https://github.com/apache/hive/pull/1537


   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492853)
Remaining Estimate: 0h
Time Spent: 10m

> HMS ignores tableType property for external tables
> --
>
> Key: HIVE-19253
> URL: https://issues.apache.org/jira/browse/HIVE-19253
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0, 4.0.0
>Reporter: Alex Kolbasov
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: newbie
> Attachments: HIVE-19253.01.patch, HIVE-19253.02.patch, 
> HIVE-19253.03.patch, HIVE-19253.03.patch, HIVE-19253.04.patch, 
> HIVE-19253.05.patch, HIVE-19253.06.patch, HIVE-19253.07.patch, 
> HIVE-19253.08.patch, HIVE-19253.09.patch, HIVE-19253.10.patch, 
> HIVE-19253.11.patch, HIVE-19253.12.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When someone creates a table using the Thrift API they may think that setting 
> tableType to {{EXTERNAL_TABLE}} creates an external table. And boom - their 
> table is later gone, because HMS silently changes it to a managed table.
> Here is the offending code:
> {code:java}
>   private MTable convertToMTable(Table tbl) throws InvalidObjectException,
>   MetaException {
> ...
> // If the table has property EXTERNAL set, update table type
> // accordingly
> String tableType = tbl.getTableType();
> boolean isExternal = 
> Boolean.parseBoolean(tbl.getParameters().get("EXTERNAL"));
> if (TableType.MANAGED_TABLE.toString().equals(tableType)) {
>   if (isExternal) {
> tableType = TableType.EXTERNAL_TABLE.toString();
>   }
> }
> if (TableType.EXTERNAL_TABLE.toString().equals(tableType)) {
>   if (!isExternal) { // Here!
> tableType = TableType.MANAGED_TABLE.toString();
>   }
> }
> {code}
> So if the EXTERNAL parameter is not set, the table type is changed to managed 
> even if it was external in the first place - which is wrong.
> Moreover, in some places the code looks at the table type property to decide 
> the table type, and other places look at the parameter. HMS should really 
> make up its mind which one to use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-19253) HMS ignores tableType property for external tables

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-19253:
--
Labels: newbie pull-request-available  (was: newbie)

> HMS ignores tableType property for external tables
> --
>
> Key: HIVE-19253
> URL: https://issues.apache.org/jira/browse/HIVE-19253
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0, 4.0.0
>Reporter: Alex Kolbasov
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HIVE-19253.01.patch, HIVE-19253.02.patch, 
> HIVE-19253.03.patch, HIVE-19253.03.patch, HIVE-19253.04.patch, 
> HIVE-19253.05.patch, HIVE-19253.06.patch, HIVE-19253.07.patch, 
> HIVE-19253.08.patch, HIVE-19253.09.patch, HIVE-19253.10.patch, 
> HIVE-19253.11.patch, HIVE-19253.12.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When someone creates a table using the Thrift API they may think that setting 
> tableType to {{EXTERNAL_TABLE}} creates an external table. And boom - their 
> table is later gone, because HMS silently changes it to a managed table.
> Here is the offending code:
> {code:java}
>   private MTable convertToMTable(Table tbl) throws InvalidObjectException,
>   MetaException {
> ...
> // If the table has property EXTERNAL set, update table type
> // accordingly
> String tableType = tbl.getTableType();
> boolean isExternal = 
> Boolean.parseBoolean(tbl.getParameters().get("EXTERNAL"));
> if (TableType.MANAGED_TABLE.toString().equals(tableType)) {
>   if (isExternal) {
> tableType = TableType.EXTERNAL_TABLE.toString();
>   }
> }
> if (TableType.EXTERNAL_TABLE.toString().equals(tableType)) {
>   if (!isExternal) { // Here!
> tableType = TableType.MANAGED_TABLE.toString();
>   }
> }
> {code}
> So if the EXTERNAL parameter is not set, the table type is changed to managed 
> even if it was external in the first place - which is wrong.
> Moreover, in some places the code looks at the table type property to decide 
> the table type, and other places look at the parameter. HMS should really 
> make up its mind which one to use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-19253) HMS ignores tableType property for external tables

2020-09-30 Thread Szehon Ho (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-19253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204602#comment-17204602
 ] 

Szehon Ho commented on HIVE-19253:
--

Hello guys, would it be possible to merge this patch?   This broken API makes 
it quite confusing.

Alex's approach seems reasonably good (accept tableType while keeping the old 
way for compatibility).  I took the liberty of rebasing and submitting a pull 
request as per the new method, hope you guys don't mind. 



> HMS ignores tableType property for external tables
> --
>
> Key: HIVE-19253
> URL: https://issues.apache.org/jira/browse/HIVE-19253
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.0.0, 3.1.0, 4.0.0
>Reporter: Alex Kolbasov
>Assignee: Vihang Karajgaonkar
>Priority: Major
>  Labels: newbie, pull-request-available
> Attachments: HIVE-19253.01.patch, HIVE-19253.02.patch, 
> HIVE-19253.03.patch, HIVE-19253.03.patch, HIVE-19253.04.patch, 
> HIVE-19253.05.patch, HIVE-19253.06.patch, HIVE-19253.07.patch, 
> HIVE-19253.08.patch, HIVE-19253.09.patch, HIVE-19253.10.patch, 
> HIVE-19253.11.patch, HIVE-19253.12.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When someone creates a table using the Thrift API they may think that setting 
> tableType to {{EXTERNAL_TABLE}} creates an external table. And boom - their 
> table is later gone, because HMS silently changes it to a managed table.
> Here is the offending code:
> {code:java}
>   private MTable convertToMTable(Table tbl) throws InvalidObjectException,
>   MetaException {
> ...
> // If the table has property EXTERNAL set, update table type
> // accordingly
> String tableType = tbl.getTableType();
> boolean isExternal = 
> Boolean.parseBoolean(tbl.getParameters().get("EXTERNAL"));
> if (TableType.MANAGED_TABLE.toString().equals(tableType)) {
>   if (isExternal) {
> tableType = TableType.EXTERNAL_TABLE.toString();
>   }
> }
> if (TableType.EXTERNAL_TABLE.toString().equals(tableType)) {
>   if (!isExternal) { // Here!
> tableType = TableType.MANAGED_TABLE.toString();
>   }
> }
> {code}
> So if the EXTERNAL parameter is not set, the table type is changed to managed 
> even if it was external in the first place - which is wrong.
> Moreover, in some places the code looks at the table type property to decide 
> the table type, and other places look at the parameter. HMS should really 
> make up its mind which one to use.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita reassigned HIVE-21375:
-

Assignee: Ádám Szita

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-21375:
--
Summary: Closing TransactionBatch closes FileSystem for other batches in 
Hive streaming v1  (was: Closing TransactionBatch closes FileSystem for other 
batches)

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Reporter: Shawn Weeks
>Priority: Minor
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24211:
--
Labels: pull-request-available  (was: )

> Replace Snapshot invalidate logic with WriteSet check for txn conflict 
> detection
> 
>
> Key: HIVE-24211
> URL: https://issues.apache.org/jira/browse/HIVE-24211
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Issue with concurrent writes on a partitioned table:*
> Concurrent writes to different partitions should execute in parallel without 
> issues. They acquire a shared lock at the table level and an exclusive write 
> lock at the partition level (hive.txn.xlock.write=true).
> However, there is a problem with the Snapshot validation. It compares the 
> valid writeIds seen by the current transaction, recorded before locking, with 
> the actual list of writeIds. The issue is that a writeId in the Snapshot 
> carries no partition information, so concurrent writes to different 
> partitions are seen as writes to the same non-partitioned table, causing the 
> Snapshot to become obsolete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24211?focusedWorklogId=492855&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492855
 ]

ASF GitHub Bot logged work on HIVE-24211:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 09:44
Start Date: 30/Sep/20 09:44
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1533:
URL: https://github.com/apache/hive/pull/1533#discussion_r497380399



##
File path: ql/src/java/org/apache/hadoop/hive/ql/Driver.java
##
@@ -497,38 +497,41 @@ private void runInternal(String command, boolean 
alreadyCompiled) throws Command
 HiveConf.ConfVars.HIVE_TXN_MAX_RETRYSNAPSHOT_COUNT);
 
   try {
-while (!driverTxnHandler.isValidTxnListState() && ++retryShapshotCnt 
<= maxRetrySnapshotCnt) {
-  LOG.info("Re-compiling after acquiring locks, attempt #" + 
retryShapshotCnt);
-  // Snapshot was outdated when locks were acquired, hence regenerate 
context, txn list and retry.
-  // TODO: Lock acquisition should be moved before analyze, this is a 
bit hackish.
-  // Currently, we acquire a snapshot, compile the query with that 
snapshot, and then - acquire locks.
-  // If snapshot is still valid, we continue as usual.
-  // But if snapshot is not valid, we recompile the query.
-  if (driverContext.isOutdatedTxn()) {
-LOG.info("Snapshot is outdated, re-initiating transaction ...");
-driverContext.getTxnManager().rollbackTxn();
-
-String userFromUGI = DriverUtils.getUserFromUGI(driverContext);
-driverContext.getTxnManager().openTxn(context, userFromUGI, 
driverContext.getTxnType());
-lockAndRespond();
+do {
+  driverContext.setOutdatedTxn(false);
+
+  if (!driverTxnHandler.isValidTxnListState()) {
+LOG.info("Re-compiling after acquiring locks, attempt #" + 
retryShapshotCnt);
+// Snapshot was outdated when locks were acquired, hence 
regenerate context, txn list and retry.
+// TODO: Lock acquisition should be moved before analyze, this is 
a bit hackish.
+// Currently, we acquire a snapshot, compile the query with that 
snapshot, and then - acquire locks.
+// If snapshot is still valid, we continue as usual.
+// But if snapshot is not valid, we recompile the query.
+if (driverContext.isOutdatedTxn()) {
+  LOG.info("Snapshot is outdated, re-initiating transaction ...");
+  driverContext.getTxnManager().rollbackTxn();
+
+  String userFromUGI = DriverUtils.getUserFromUGI(driverContext);
+  driverContext.getTxnManager().openTxn(context, userFromUGI, 
driverContext.getTxnType());
+  lockAndRespond();
+}
+driverContext.setRetrial(true);
+driverContext.getBackupContext().addSubContext(context);
+
driverContext.getBackupContext().setHiveLocks(context.getHiveLocks());
+context = driverContext.getBackupContext();
+
+driverContext.getConf().set(ValidTxnList.VALID_TXNS_KEY,
+  driverContext.getTxnManager().getValidTxns().toString());
+
+if (driverContext.getPlan().hasAcidResourcesInQuery()) {
+  compileInternal(context.getCmd(), true);
+  driverTxnHandler.recordValidWriteIds();
+  driverTxnHandler.setWriteIdForAcidFileSinks();
+}
+// Since we're reusing the compiled plan, we need to update its 
start time for current run
+
driverContext.getPlan().setQueryStartTime(driverContext.getQueryDisplay().getQueryStartTime());
   }
-
-  driverContext.setRetrial(true);
-  driverContext.getBackupContext().addSubContext(context);
-  
driverContext.getBackupContext().setHiveLocks(context.getHiveLocks());
-  context = driverContext.getBackupContext();
-
-  driverContext.getConf().set(ValidTxnList.VALID_TXNS_KEY,
-driverContext.getTxnManager().getValidTxns().toString());
-
-  if (driverContext.getPlan().hasAcidResourcesInQuery()) {
-compileInternal(context.getCmd(), true);
-driverTxnHandler.recordValidWriteIds();
-driverTxnHandler.setWriteIdForAcidFileSinks();
-  }
-  // Since we're reusing the compiled plan, we need to update its 
start time for current run
-  
driverContext.getPlan().setQueryStartTime(driverContext.getQueryDisplay().getQueryStartTime());
-}
+} while (driverContext.isOutdatedTxn() && ++retryShapshotCnt <= 
maxRetrySnapshotCnt);

Review comment:
   added comments





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

[jira] [Work logged] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24211?focusedWorklogId=492864&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492864
 ]

ASF GitHub Bot logged work on HIVE-24211:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 10:22
Start Date: 30/Sep/20 10:22
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #1533:
URL: https://github.com/apache/hive/pull/1533#discussion_r497402229



##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
##
@@ -1408,6 +1410,43 @@ private boolean isUpdateOrDelete(Statement stmt, String 
conflictSQLSuffix) throw
 }
   }
 
+  public long getLatestTxnInConflict(long txnid) throws MetaException {

Review comment:
   added javadoc





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492864)
Time Spent: 20m  (was: 10m)

> Replace Snapshot invalidate logic with WriteSet check for txn conflict 
> detection
> 
>
> Key: HIVE-24211
> URL: https://issues.apache.org/jira/browse/HIVE-24211
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Issue with concurrent writes on a partitioned table:*
> Concurrent writes to different partitions should execute in parallel without 
> issues. They acquire a shared lock at the table level and an exclusive write 
> lock at the partition level (hive.txn.xlock.write=true).
> However, there is a problem with the Snapshot validation. It compares the 
> valid writeIds seen by the current transaction, recorded before locking, with 
> the actual list of writeIds. The issue is that a writeId in the Snapshot 
> carries no partition information, so concurrent writes to different 
> partitions are seen as writes to the same non-partitioned table, causing the 
> Snapshot to become obsolete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24212) Refactor to take advantage of list* optimisations in cloud storage connectors

2020-09-30 Thread Rajesh Balamohan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-24212:

Summary: Refactor to take advantage of list* optimisations in cloud storage 
connectors  (was: Refactor to take advantage of listStatus optimisations in 
cloud storage connectors)

> Refactor to take advantage of list* optimisations in cloud storage connectors
> -
>
> Key: HIVE-24212
> URL: https://issues.apache.org/jira/browse/HIVE-24212
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>
> https://issues.apache.org/jira/browse/HADOOP-17022, 
> https://issues.apache.org/jira/browse/HADOOP-17281, 
> https://issues.apache.org/jira/browse/HADOOP-16830 etc. help reduce the 
> number of round trips to remote systems in cloud storage.
> Creating this ticket to do minor refactoring to take advantage of the above 
> optimizations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread Zoltan Matyus (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Matyus reassigned HIVE-24213:



> Incorrect exception in the Merge MapJoinTask into its child MapRedTask 
> optimizer
> 
>
> Key: HIVE-24213
> URL: https://issues.apache.org/jira/browse/HIVE-24213
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Zoltan Matyus
>Assignee: Zoltan Matyus
>Priority: Minor
>
> The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} 
> method throws a {{SemanticException}} if the number of {{FileSinkOperator}}s 
> it finds is not exactly 1. The exception is valid if zero operators are 
> found, but there can be valid use cases where multiple FileSinkOperators 
> exist.
> Example: the MapJoin and its child are used in a common table expression, 
> which is used for multiple inserts.
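
A minimal HiveQL sketch of such a case (hypothetical table names; `dim` is
assumed small enough for the join to be converted to a MapJoin): the joined
intermediate result feeds two inserts, so the merged task ends up with two
FileSinkOperators.

{code:sql}
with joined as (
  select f.id, d.name
  from fact f
  join dim d on f.dim_id = d.id
)
from joined
insert overwrite table out_by_id   select id, name
insert overwrite table out_by_name select name, count(*) group by name;
{code}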



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread Zoltan Matyus (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Matyus updated HIVE-24213:
-
Attachment: HIVE-24213.patch

> Incorrect exception in the Merge MapJoinTask into its child MapRedTask 
> optimizer
> 
>
> Key: HIVE-24213
> URL: https://issues.apache.org/jira/browse/HIVE-24213
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Zoltan Matyus
>Assignee: Zoltan Matyus
>Priority: Minor
> Attachments: HIVE-24213.patch
>
>
> The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} 
> method throws a {{SemanticException}} if the number of {{FileSinkOperator}}s 
> it finds is not exactly 1. The exception is valid if zero operators are 
> found, but there can be valid use cases where multiple FileSinkOperators 
> exist.
> Example: the MapJoin and its child are used in a common table expression, 
> which is used for multiple inserts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread Zoltan Matyus (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24213 started by Zoltan Matyus.

> Incorrect exception in the Merge MapJoinTask into its child MapRedTask 
> optimizer
> 
>
> Key: HIVE-24213
> URL: https://issues.apache.org/jira/browse/HIVE-24213
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Zoltan Matyus
>Assignee: Zoltan Matyus
>Priority: Minor
> Attachments: HIVE-24213.patch
>
>
> The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} 
> method throws a {{SemanticException}} if the number of {{FileSinkOperator}}s 
> it finds is not exactly 1. The exception is valid if zero operators are 
> found, but there can be valid use cases where multiple FileSinkOperators 
> exist.
> Example: the MapJoin and its child are used in a common table expression, 
> which is used for multiple inserts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread Zoltan Matyus (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Matyus updated HIVE-24213:
-
Status: Patch Available  (was: In Progress)

> Incorrect exception in the Merge MapJoinTask into its child MapRedTask 
> optimizer
> 
>
> Key: HIVE-24213
> URL: https://issues.apache.org/jira/browse/HIVE-24213
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Zoltan Matyus
>Assignee: Zoltan Matyus
>Priority: Minor
> Attachments: HIVE-24213.patch
>
>
> The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} 
> method throws a {{SemanticException}} if the number of {{FileSinkOperator}}s 
> it finds is not exactly 1. The exception is valid if zero operators are 
> found, but there can be valid use cases where multiple FileSinkOperators 
> exist.
> Example: the MapJoin and its child are used in a common table expression, 
> which is used for multiple inserts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24212) Refactor to take advantage of list* optimisations in cloud storage connectors

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24212:
--
Labels: pull-request-available  (was: )

> Refactor to take advantage of list* optimisations in cloud storage connectors
> -
>
> Key: HIVE-24212
> URL: https://issues.apache.org/jira/browse/HIVE-24212
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/HADOOP-17022, 
> https://issues.apache.org/jira/browse/HADOOP-17281, 
> https://issues.apache.org/jira/browse/HADOOP-16830 etc. help reduce the 
> number of round trips to remote systems in cloud storage.
> Creating this ticket to do minor refactoring to take advantage of the above 
> optimizations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24212) Refactor to take advantage of list* optimisations in cloud storage connectors

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24212?focusedWorklogId=492887&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492887
 ]

ASF GitHub Bot logged work on HIVE-24212:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 11:19
Start Date: 30/Sep/20 11:19
Worklog Time Spent: 10m 
  Work Description: rbalamohan opened a new pull request #1538:
URL: https://github.com/apache/hive/pull/1538


   https://issues.apache.org/jira/browse/HIVE-24212
   
   Minor refactoring to take advantage of the optimizations listed below.
   
   https://issues.apache.org/jira/browse/HADOOP-17022, 
https://issues.apache.org/jira/browse/HADOOP-17281, 
https://issues.apache.org/jira/browse/HADOOP-16830 etc. help reduce the number 
of round trips to remote systems in cloud storage.
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492887)
Remaining Estimate: 0h
Time Spent: 10m

> Refactor to take advantage of list* optimisations in cloud storage connectors
> -
>
> Key: HIVE-24212
> URL: https://issues.apache.org/jira/browse/HIVE-24212
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://issues.apache.org/jira/browse/HADOOP-17022, 
> https://issues.apache.org/jira/browse/HADOOP-17281, 
> https://issues.apache.org/jira/browse/HADOOP-16830 etc. help reduce the 
> number of round trips to remote systems in cloud storage.
> Creating this ticket to do minor refactoring to take advantage of the above 
> optimizations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24211) Replace Snapshot invalidate logic with WriteSet check for txn conflict detection

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24211?focusedWorklogId=492943&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492943
 ]

ASF GitHub Bot logged work on HIVE-24211:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 13:02
Start Date: 30/Sep/20 13:02
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on pull request #1533:
URL: https://github.com/apache/hive/pull/1533#issuecomment-701375986


   LGTM +1



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492943)
Time Spent: 0.5h  (was: 20m)

> Replace Snapshot invalidate logic with WriteSet check for txn conflict 
> detection
> 
>
> Key: HIVE-24211
> URL: https://issues.apache.org/jira/browse/HIVE-24211
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Denys Kuzmenko
>Assignee: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> *Issue with concurrent writes on a partitioned table:*
> Concurrent writes to different partitions should execute in parallel without 
> issues. They acquire a shared lock at the table level and an exclusive write 
> lock at the partition level (hive.txn.xlock.write=true).
> However, there is a problem with the Snapshot validation. It compares the 
> valid writeIds seen by the current transaction, recorded before locking, with 
> the actual list of writeIds. The issue is that a writeId in the Snapshot 
> carries no partition information, so concurrent writes to different 
> partitions are seen as writes to the same non-partitioned table, causing the 
> Snapshot to become obsolete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-21375 started by Ádám Szita.
-
> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread Zoltan Matyus (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Matyus updated HIVE-24213:
-
Attachment: (was: HIVE-24213.patch)

> Incorrect exception in the Merge MapJoinTask into its child MapRedTask 
> optimizer
> 
>
> Key: HIVE-24213
> URL: https://issues.apache.org/jira/browse/HIVE-24213
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Zoltan Matyus
>Assignee: Zoltan Matyus
>Priority: Minor
>
> The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} 
> method throws a {{SemanticException}} if the number of {{FileSinkOperator}}s 
> it finds is not exactly 1. The exception is valid if zero operators are 
> found, but there can be valid use cases where multiple FileSinkOperators 
> exist.
> Example: the MapJoin and its child are used in a common table expression, 
> which is used for multiple inserts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24214) Reduce alter_table() function overloading in metastore

2020-09-30 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Sharma reassigned HIVE-24214:



> Reduce alter_table() function overloading in metastore
> --
>
> Key: HIVE-24214
> URL: https://issues.apache.org/jira/browse/HIVE-24214
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Minor
>
> Reduce the following overloaded methods to a smaller number of methods:
> void alter_table(String catName, String dbName, String tblName, Table newTable,
>     EnvironmentContext envContext)
>     throws InvalidOperationException, MetaException, TException;
>
> default void alter_table(String catName, String dbName, String tblName, Table newTable)
>     throws InvalidOperationException, MetaException, TException {
>   alter_table(catName, dbName, tblName, newTable, null);
> }
>
> @Deprecated
> void alter_table(String defaultDatabaseName, String tblName, Table table,
>     boolean cascade) throws InvalidOperationException, MetaException, TException;
>
> @Deprecated
> void alter_table_with_environmentContext(String databaseName, String tblName, Table table,
>     EnvironmentContext environmentContext) throws InvalidOperationException, MetaException, TException;
>
> void alter_table(String catName, String databaseName, String tblName, Table table,
>     EnvironmentContext environmentContext, String validWriteIdList)
>     throws InvalidOperationException, MetaException, TException;



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23879) Data has been lost after table location was altered

2020-09-30 Thread Ashish Sharma (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204716#comment-17204716
 ] 

Ashish Sharma commented on HIVE-23879:
--

Currently, ALTER TABLE ... SET LOCATION only updates the metadata and does not 
move the data from the old location to the new one. That is the expected 
behaviour.
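
A small sketch illustrating that only metadata changes, assuming the original 
table directory was hdfs:///dbtest1.db/t1 (derived from the database location 
in the reproduction steps quoted below):

{code:sql}
-- the first insert's files were never moved; pointing the location back makes
-- the old row visible again (and hides the new one)
alter table dbtest1.t1 set location 'hdfs:///dbtest1.db/t1';
select * from dbtest1.t1;   -- returns 1
{code}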

> Data has been lost after table location was altered
> ---
>
> Key: HIVE-23879
> URL: https://issues.apache.org/jira/browse/HIVE-23879
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Demyd
>Assignee: Ashish Sharma
>Priority: Major
>
> When I alter the location of a non-empty table and then insert data into it, 
> I no longer see the old data when querying through HS2, but I can still find 
> it in MapR-FS under the old table location.
> Steps to reproduce:
> {code:sql}
> 1. connect to hs2 by beeline:
>  hive --service beeline -u "jdbc:hive2://:1/;"
> 2. create test db:
>  create database dbtest1 location 'hdfs:///dbtest1.db';
> 3. create test table:
>  create table dbtest1.t1 (id int);
> 4. insert data to table:
>  insert into dbtest1.t1 (id) values (1);
> 5. set new table location:
>  alter table dbtest1.t1 set location 'hdfs:///dbtest1a/t1';
> 6. insert data to table:
>  insert into dbtest1.t1 (id) values (2);
> {code}
> Actual result:
> {code:sql}
> jdbc:hive2://:> select * from dbtest1.t1;
> +--------+
> | t1.id  |
> +--------+
> | 2      |
> +--------+
>  1 row selected (0.097 seconds)
> {code}
> Expected result:
> {code:sql}
> jdbc:hive2://:> select * from dbtest1.t1;
> +--------+
> | t1.id  |
> +--------+
> | 2      |
> | 1      |
> +--------+
>  2 rows selected (0.097 seconds)
>  {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work stopped] (HIVE-23879) Data has been lost after table location was altered

2020-09-30 Thread Ashish Sharma (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-23879 stopped by Ashish Sharma.

> Data has been lost after table location was altered
> ---
>
> Key: HIVE-23879
> URL: https://issues.apache.org/jira/browse/HIVE-23879
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Demyd
>Assignee: Ashish Sharma
>Priority: Major
>
> When I alter the location of a non-empty table and then insert data into it, 
> I no longer see the old data when querying through HS2, but I can still find 
> it in MapR-FS under the old table location.
> Steps to reproduce:
> {code:sql}
> 1. connect to hs2 by beeline:
>  hive --service beeline -u "jdbc:hive2://:1/;"
> 2. create test db:
>  create database dbtest1 location 'hdfs:///dbtest1.db';
> 3. create test table:
>  create table dbtest1.t1 (id int);
> 4. insert data to table:
>  insert into dbtest1.t1 (id) values (1);
> 5. set new table location:
>  alter table dbtest1.t1 set location 'hdfs:///dbtest1a/t1';
> 6. insert data to table:
>  insert into dbtest1.t1 (id) values (2);
> {code}
> Actual result:
> {code:sql}
> jdbc:hive2://:> select * from dbtest1.t1;
> +--------+
> | t1.id  |
> +--------+
> | 2      |
> +--------+
>  1 row selected (0.097 seconds)
> {code}
> Expected result:
> {code:sql}
> jdbc:hive2://:> select * from dbtest1.t1;
> +--------+
> | t1.id  |
> +--------+
> | 2      |
> | 1      |
> +--------+
>  2 rows selected (0.097 seconds)
>  {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24178) managedlocation is missing in SHOW CREATE DATABASE

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24178?focusedWorklogId=492952&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492952
 ]

ASF GitHub Bot logged work on HIVE-24178:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 13:18
Start Date: 30/Sep/20 13:18
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #1508:
URL: https://github.com/apache/hive/pull/1508#issuecomment-701384899


   LGTM !  Good catch.  I also linked your JIRA (HIVE-24178) to the root JIRA 
(HIVE-22995)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492952)
Time Spent: 20m  (was: 10m)

> managedlocation is missing in SHOW CREATE DATABASE
> --
>
> Key: HIVE-24178
> URL: https://issues.apache.org/jira/browse/HIVE-24178
> Project: Hive
>  Issue Type: Bug
>Reporter: Csaba Ringhofer
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The output of SHOW CREATE DATABASE contains the location but not the managed 
> location, so the database it would create would actually be different.
> To reproduce:
> create database db1 location "/test-warehouse/a" managedlocation 
> "test-warehouse/b";
> show create database db1;
> result:
> +----------------------------------------------+
> |                createdb_stmt                 |
> +----------------------------------------------+
> | CREATE DATABASE `db1`                        |
> | LOCATION                                     |
> |   'hdfs://localhost:20500/test-warehouse/a'  |
> +----------------------------------------------+
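
For reference, a sketch of what the statement would presumably look like once 
the managed location is emitted as well (assuming the MANAGEDLOCATION clause 
and a fully qualified path):

{code:sql}
CREATE DATABASE `db1`
LOCATION
  'hdfs://localhost:20500/test-warehouse/a'
MANAGEDLOCATION
  'hdfs://localhost:20500/test-warehouse/b'
{code}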



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24213:
--
Labels: pull-request-available  (was: )

> Incorrect exception in the Merge MapJoinTask into its child MapRedTask 
> optimizer
> 
>
> Key: HIVE-24213
> URL: https://issues.apache.org/jira/browse/HIVE-24213
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Zoltan Matyus
>Assignee: Zoltan Matyus
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} 
> method throws a {{SemanticException}} if the number of {{FileSinkOperator}}s 
> it finds is not exactly 1. The exception is valid if zero operators are 
> found, but there can be valid use cases where multiple FileSinkOperators 
> exist.
> Example: the MapJoin and its child are used in a common table expression, 
> which is used for multiple inserts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24213?focusedWorklogId=492954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492954
 ]

ASF GitHub Bot logged work on HIVE-24213:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 13:23
Start Date: 30/Sep/20 13:23
Worklog Time Spent: 10m 
  Work Description: zmatyus opened a new pull request #1539:
URL: https://github.com/apache/hive/pull/1539


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492954)
Remaining Estimate: 0h
Time Spent: 10m

> Incorrect exception in the Merge MapJoinTask into its child MapRedTask 
> optimizer
> 
>
> Key: HIVE-24213
> URL: https://issues.apache.org/jira/browse/HIVE-24213
> Project: Hive
>  Issue Type: Bug
>  Components: Physical Optimizer
>Affects Versions: 4.0.0
>Reporter: Zoltan Matyus
>Assignee: Zoltan Matyus
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} 
> method throws a {{SemanticException}} if the number of {{FileSinkOperator}}s 
> it finds is not exactly 1. The exception is valid if zero operators are 
> found, but there can be valid use cases where multiple FileSinkOperators 
> exist.
> Example: the MapJoin and its child are used in a common table expression, 
> which is used for multiple inserts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24082) Expose information whether AcidUtils.ParsedDelta contains statementId

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24082?focusedWorklogId=492961&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492961
 ]

ASF GitHub Bot logged work on HIVE-24082:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 13:32
Start Date: 30/Sep/20 13:32
Worklog Time Spent: 10m 
  Work Description: harmandeeps commented on a change in pull request #1438:
URL: https://github.com/apache/hive/pull/1438#discussion_r497511777



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -1031,8 +1031,12 @@ public Path getPath() {
   return path;
 }
 
+public boolean hasStatementId() {

Review comment:
   In the case of statementId = -1, which means the statement ID is not present, 
it returns 0.
   If statementId = 0, it also returns 0.
   
   So doesn't that bring us back to the original issue? Am I missing something?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492961)
Time Spent: 2h  (was: 1h 50m)

> Expose information whether AcidUtils.ParsedDelta contains statementId
> -
>
> Key: HIVE-24082
> URL: https://issues.apache.org/jira/browse/HIVE-24082
> Project: Hive
>  Issue Type: Improvement
>Reporter: Piotr Findeisen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> In [Presto|https://prestosql.io] we support reading ORC ACID tables by 
> leveraging AcidUtils rather than duplicating the file name parsing logic in 
> our code.
> To do this fully correctly, we need to know whether 
> {{org.apache.hadoop.hive.ql.io.AcidUtils.ParsedDelta}} contains 
> {{statementId}} information or not. 
> Currently, the getter of that property does not give us access to this 
> information.
> [https://github.com/apache/hive/blob/468907eab36f78df3e14a24005153c9a23d62555/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L804-L806]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24203) Implement stats annotation rule for the LateralViewJoinOperator

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24203?focusedWorklogId=492970&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492970
 ]

ASF GitHub Bot logged work on HIVE-24203:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 13:51
Start Date: 30/Sep/20 13:51
Worklog Time Spent: 10m 
  Work Description: okumin commented on a change in pull request #1531:
URL: https://github.com/apache/hive/pull/1531#discussion_r497526561



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java
##
@@ -2921,6 +2920,77 @@ public Object process(Node nd, Stack stack, 
NodeProcessorCtx procCtx,
 }
   }
 
+  /**
+   * LateralViewJoinOperator joins the output of select with the output of 
UDTF.

Review comment:
   @zabetak Thanks for taking a look!
   I added a description. Please feel free to ask me if something doesn't make 
sense.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492970)
Time Spent: 40m  (was: 0.5h)

> Implement stats annotation rule for the LateralViewJoinOperator
> ---
>
> Key: HIVE-24203
> URL: https://issues.apache.org/jira/browse/HIVE-24203
> Project: Hive
>  Issue Type: Improvement
>  Components: Physical Optimizer
>Affects Versions: 4.0.0, 3.1.2, 2.3.7
>Reporter: okumin
>Assignee: okumin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> StatsRulesProcFactory doesn't have any rules to handle a JOIN by LATERAL VIEW.
> This can cause an underestimation when the UDTF in a LATERAL VIEW generates 
> multiple rows.
> HIVE-20262 has already added the rule for UDTF.
> This issue would add the rule for LateralViewJoinOperator.
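
A minimal HiveQL sketch of the kind of query affected (hypothetical table 
`orders(id int, items array<string>)`):

{code:sql}
select o.id, item
from orders o
lateral view explode(o.items) t as item;
-- if each order has ~10 items, the lateral view join emits ~10x the input rows;
-- without a stats rule for LateralViewJoinOperator the row-count estimate stays
-- near the input's row count, which can mislead downstream planning
{code}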



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24215) Total function count is incorrect in Replication Metrics

2020-09-30 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi reassigned HIVE-24215:
--


> Total function count is incorrect in Replication Metrics
> 
>
> Key: HIVE-24215
> URL: https://issues.apache.org/jira/browse/HIVE-24215
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21611) Date.getTime() can be changed to System.currentTimeMillis()

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21611?focusedWorklogId=492975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492975
 ]

ASF GitHub Bot logged work on HIVE-21611:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 14:10
Start Date: 30/Sep/20 14:10
Worklog Time Spent: 10m 
  Work Description: belugabehr closed pull request #595:
URL: https://github.com/apache/hive/pull/595


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492975)
Time Spent: 1h 40m  (was: 1.5h)

> Date.getTime() can be changed to System.currentTimeMillis()
> ---
>
> Key: HIVE-21611
> URL: https://issues.apache.org/jira/browse/HIVE-21611
> Project: Hive
>  Issue Type: Bug
>Reporter: bd2019us
>Assignee: Hunter Logan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 1.patch
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Hello,
> I found that System.currentTimeMillis() can be used here instead of new 
> Date().getTime().
> new Date() is just a thin wrapper around the lightweight method 
> System.currentTimeMillis(), so performance suffers significantly if it is 
> invoked too many times.
> According to my local testing in the same environment, 
> System.currentTimeMillis() achieves a speedup of about 5x (435 ms vs 2073 
> ms) when the two methods are each invoked 5,000,000 times.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21611) Date.getTime() can be changed to System.currentTimeMillis()

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21611?focusedWorklogId=492976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492976
 ]

ASF GitHub Bot logged work on HIVE-21611:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 14:10
Start Date: 30/Sep/20 14:10
Worklog Time Spent: 10m 
  Work Description: bd2019us opened a new pull request #595:
URL: https://github.com/apache/hive/pull/595


   Hello,
   new Date() is just a thin wrapper around System.currentTimeMillis(). Using 
System.currentTimeMillis() can help speed up the system.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492976)
Time Spent: 1h 50m  (was: 1h 40m)

> Date.getTime() can be changed to System.currentTimeMillis()
> ---
>
> Key: HIVE-21611
> URL: https://issues.apache.org/jira/browse/HIVE-21611
> Project: Hive
>  Issue Type: Bug
>Reporter: bd2019us
>Assignee: Hunter Logan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 1.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Hello,
> I found that System.currentTimeMillis() can be used here instead of new 
> Date().getTime().
> new Date() is just a thin wrapper around the lightweight method 
> System.currentTimeMillis(), so performance suffers significantly if it is 
> invoked too many times.
> According to my local testing in the same environment, 
> System.currentTimeMillis() achieves a speedup of about 5x (435 ms vs 2073 
> ms) when the two methods are each invoked 5,000,000 times.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24215) Total function count is incorrect in Replication Metrics

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24215?focusedWorklogId=492979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492979
 ]

ASF GitHub Bot logged work on HIVE-24215:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 14:15
Start Date: 30/Sep/20 14:15
Worklog Time Spent: 10m 
  Work Description: aasha opened a new pull request #1540:
URL: https://github.com/apache/hive/pull/1540


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492979)
Remaining Estimate: 0h
Time Spent: 10m

> Total function count is incorrect in Replication Metrics
> 
>
> Key: HIVE-24215
> URL: https://issues.apache.org/jira/browse/HIVE-24215
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-24215.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-24215) Total function count is incorrect in Replication Metrics

2020-09-30 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-24215 started by Aasha Medhi.
--
> Total function count is incorrect in Replication Metrics
> 
>
> Key: HIVE-24215
> URL: https://issues.apache.org/jira/browse/HIVE-24215
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-24215.01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24215) Total function count is incorrect in Replication Metrics

2020-09-30 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-24215:
---
Attachment: HIVE-24215.01.patch
Status: Patch Available  (was: In Progress)

> Total function count is incorrect in Replication Metrics
> 
>
> Key: HIVE-24215
> URL: https://issues.apache.org/jira/browse/HIVE-24215
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-24215.01.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24215) Total function count is incorrect in Replication Metrics

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24215:
--
Labels: pull-request-available  (was: )

> Total function count is incorrect in Replication Metrics
> 
>
> Key: HIVE-24215
> URL: https://issues.apache.org/jira/browse/HIVE-24215
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24215.01.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23618) Add notification events for default/check constraints and enable replication.

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?focusedWorklogId=492987&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492987
 ]

ASF GitHub Bot logged work on HIVE-23618:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 14:33
Start Date: 30/Sep/20 14:33
Worklog Time Spent: 10m 
  Work Description: sankarh commented on pull request #1237:
URL: https://github.com/apache/hive/pull/1237#issuecomment-701430847


   +1, LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492987)
Time Spent: 3h 10m  (was: 3h)

> Add notification events for default/check constraints and enable replication.
> -
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> This should follow similar approach of notNull/Unique constraints. This will 
> also include event replication for these constraints.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23618) Add notification events for default/check constraints and enable replication.

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?focusedWorklogId=492986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492986
 ]

ASF GitHub Bot logged work on HIVE-23618:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 14:33
Start Date: 30/Sep/20 14:33
Worklog Time Spent: 10m 
  Work Description: sankarh merged pull request #1237:
URL: https://github.com/apache/hive/pull/1237


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492986)
Time Spent: 3h  (was: 2h 50m)

> Add notification events for default/check constraints and enable replication.
> -
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> This should follow similar approach of notNull/Unique constraints. This will 
> also include event replication for these constraints.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23618) Add notification events for default/check constraints and enable replication.

2020-09-30 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan resolved HIVE-23618.
-
   Fix Version/s: 4.0.0
Target Version/s: 4.0.0
  Resolution: Fixed

Merged to master!
Thanks [~adeshrao] for the patch and [~aasha] for the review!

> Add notification events for default/check constraints and enable replication.
> -
>
> Key: HIVE-23618
> URL: https://issues.apache.org/jira/browse/HIVE-23618
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> This should follow similar approach of notNull/Unique constraints. This will 
> also include event replication for these constraints.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24208) LLAP: query job stuck due to race conditions

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24208?focusedWorklogId=492993&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-492993
 ]

ASF GitHub Bot logged work on HIVE-24208:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 14:40
Start Date: 30/Sep/20 14:40
Worklog Time Spent: 10m 
  Work Description: pgaref commented on pull request #1534:
URL: https://github.com/apache/hive/pull/1534#issuecomment-701434972


   Hey @bymm, thanks for fixing this!
   I still haven't looked at the 2.3 branch and why this happens; however,
   it would be super helpful if you could add a simple test case reproducing the 
behaviour :) 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 492993)
Time Spent: 20m  (was: 10m)

> LLAP: query job stuck due to race conditions
> 
>
> Key: HIVE-24208
> URL: https://issues.apache.org/jira/browse/HIVE-24208
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.4
>Reporter: Yuriy Baltovskyy
>Assignee: Yuriy Baltovskyy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When issuing an LLAP query, sometimes the TEZ job on LLAP server never ends 
> and it never returns the data reader.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-21375:
--
Affects Version/s: 3.2.0

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Affects Versions: 3.2.0
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.
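
A hedged sketch of one way to address this: reference-count connections per UGI and only call FileSystem.closeAllForUGI once the last one closes. Class and method names below are illustrative, not the actual HiveEndPoint change.

{code:java}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.UserGroupInformation;

// Sketch: close the cached FileSystem handles for a UGI only when the last
// open connection/TransactionBatch that uses that UGI has been closed.
public class UgiFileSystemRefCounter {
  private static final ReentrantLock refCountLock = new ReentrantLock();
  private static final Map<UserGroupInformation, Long> ugiConnectionRefCount = new HashMap<>();

  public static void incRef(UserGroupInformation ugi) {
    refCountLock.lock();
    try {
      ugiConnectionRefCount.merge(ugi, 1L, Long::sum);
    } finally {
      refCountLock.unlock();
    }
  }

  public static void decRefAndMaybeClose(UserGroupInformation ugi) throws IOException {
    refCountLock.lock();
    try {
      long remaining = ugiConnectionRefCount.merge(ugi, -1L, Long::sum);
      if (remaining <= 0) {
        ugiConnectionRefCount.remove(ugi);
        FileSystem.closeAllForUGI(ugi);  // safe now: no other open batches share this UGI
      }
    } finally {
      refCountLock.unlock();
    }
  }
}
{code}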



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21052) Make sure transactions get cleaned if they are aborted before addPartitions is called

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21052?focusedWorklogId=493001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493001
 ]

ASF GitHub Bot logged work on HIVE-21052:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 15:00
Start Date: 30/Sep/20 15:00
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on pull request #1415:
URL: https://github.com/apache/hive/pull/1415#issuecomment-701447242


   @vpnvishv, thank you for your efforts, the patch looks good! Could you please 
submit a similar pull request for the master branch? 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493001)
Time Spent: 20m  (was: 10m)

> Make sure transactions get cleaned if they are aborted before addPartitions 
> is called
> -
>
> Key: HIVE-21052
> URL: https://issues.apache.org/jira/browse/HIVE-21052
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.0.0, 3.1.1
>Reporter: Jaume M
>Assignee: Jaume M
>Priority: Critical
>  Labels: pull-request-available
> Attachments: Aborted Txn w_Direct Write.pdf, HIVE-21052.1.patch, 
> HIVE-21052.10.patch, HIVE-21052.11.patch, HIVE-21052.12.patch, 
> HIVE-21052.2.patch, HIVE-21052.3.patch, HIVE-21052.4.patch, 
> HIVE-21052.5.patch, HIVE-21052.6.patch, HIVE-21052.7.patch, 
> HIVE-21052.8.patch, HIVE-21052.9.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the transaction is aborted between openTxn and addPartitions and data has 
> been written on the table the transaction manager will think it's an empty 
> transaction and no cleaning will be done.
> This is currently an issue in the streaming API and in micromanaged tables. 
> As proposed by [~ekoifman] this can be solved by:
> * Writing an entry with a special marker to TXN_COMPONENTS at openTxn and 
> when addPartitions is called remove this entry from TXN_COMPONENTS and add 
> the corresponding partition entry to TXN_COMPONENTS.
> * If the cleaner finds an entry with a special marker in TXN_COMPONENTS 
> indicating that a transaction was opened and then aborted, it must generate 
> jobs for the worker for every possible partition available.
> cc [~ewohlstadter]
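
A rough sketch of the proposed marker flow, using plain JDBC purely for illustration; the column names and the '_MARKER_' value below are placeholders, not the actual metastore schema or the patch's implementation.

{code:java}
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;

// Conceptual sketch only: column names and the '_MARKER_' value are placeholders.
public class TxnMarkerSketch {

  // openTxn(): record that the txn exists before any partition is known.
  public static void recordOpenTxnMarker(Connection db, long txnId) throws SQLException {
    try (PreparedStatement ps = db.prepareStatement(
        "INSERT INTO TXN_COMPONENTS (TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION) VALUES (?, ?, ?, ?)")) {
      ps.setLong(1, txnId);
      ps.setString(2, "_MARKER_");
      ps.setString(3, "_MARKER_");
      ps.setNull(4, Types.VARCHAR);
      ps.executeUpdate();
    }
  }

  // addPartitions(): swap the marker for real partition entries so the cleaner
  // sees exactly what was written.
  public static void replaceMarkerWithPartition(Connection db, long txnId, String dbName,
      String table, String partition) throws SQLException {
    try (PreparedStatement del = db.prepareStatement(
        "DELETE FROM TXN_COMPONENTS WHERE TC_TXNID = ? AND TC_DATABASE = '_MARKER_'")) {
      del.setLong(1, txnId);
      del.executeUpdate();
    }
    try (PreparedStatement ins = db.prepareStatement(
        "INSERT INTO TXN_COMPONENTS (TC_TXNID, TC_DATABASE, TC_TABLE, TC_PARTITION) VALUES (?, ?, ?, ?)")) {
      ins.setLong(1, txnId);
      ins.setString(2, dbName);
      ins.setString(3, table);
      ins.setString(4, partition);
      ins.executeUpdate();
    }
  }
}
{code}

The cleaner side would then treat any leftover marker row belonging to an aborted transaction as a signal to schedule cleanup work for every possible partition of the table.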



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24082) Expose information whether AcidUtils.ParsedDelta contains statementId

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24082?focusedWorklogId=493012&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493012
 ]

ASF GitHub Bot logged work on HIVE-24082:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 15:29
Start Date: 30/Sep/20 15:29
Worklog Time Spent: 10m 
  Work Description: findepi commented on a change in pull request #1438:
URL: https://github.com/apache/hive/pull/1438#discussion_r497602930



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -1031,8 +1031,12 @@ public Path getPath() {
   return path;
 }
 
+public boolean hasStatementId() {

Review comment:
   you mean that `getStatementId()` returns 0 in two cases:
   
   - not set
   - set to 0
   
   ?
   
   Yes, I think this could be changed too. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493012)
Time Spent: 2h 10m  (was: 2h)

> Expose information whether AcidUtils.ParsedDelta contains statementId
> -
>
> Key: HIVE-24082
> URL: https://issues.apache.org/jira/browse/HIVE-24082
> Project: Hive
>  Issue Type: Improvement
>Reporter: Piotr Findeisen
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> In [Presto|https://prestosql.io] we support reading ORC ACID tables by 
> leveraging AcidUtils rather than duplicating the file name parsing logic in our 
> code.
> To do this fully correctly, we need information whether 
> {{org.apache.hadoop.hive.ql.io.AcidUtils.ParsedDelta}} contains 
> {{statementId}} information or not. 
> Currently, a getter of that property does not allow us to access this 
> information.
> [https://github.com/apache/hive/blob/468907eab36f78df3e14a24005153c9a23d62555/ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java#L804-L806]
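
As an illustrative pattern only (not the actual AcidUtils API), the ambiguity can be removed by pairing the primitive getter with a presence check, so that "no statementId" and "statementId == 0" are distinguishable:

{code:java}
import java.util.OptionalInt;

// Illustrative pattern, not the real ParsedDelta class: distinguish
// "statementId not present" from "statementId == 0".
public class ParsedDeltaView {
  private final Integer statementId;  // null when the delta directory name carries no statement id

  public ParsedDeltaView(Integer statementId) {
    this.statementId = statementId;
  }

  public boolean hasStatementId() {
    return statementId != null;
  }

  // External readers (e.g. other engines reusing the parsing logic) can then do:
  public OptionalInt statementId() {
    return hasStatementId() ? OptionalInt.of(statementId) : OptionalInt.empty();
  }
}
{code}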



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeric

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?focusedWorklogId=493020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493020
 ]

ASF GitHub Bot logged work on HIVE-24157:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 15:43
Start Date: 30/Sep/20 15:43
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #1497:
URL: https://github.com/apache/hive/pull/1497#discussion_r497613491



##
File path: ql/src/test/queries/clientnegative/strict_numeric_to_timestamp.q
##
@@ -0,0 +1,2 @@
+set hive.strict.timestamp.conversion=true;
+select cast(123 as timestamp);

Review comment:
   I've added a few more tests, including one with a `struct`; however, it 
doesn't feel that much different from the others (given the way the check 
works). I might be too much "inside the box", so if you have any suggestions 
on what tests to add, please let me know.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493020)
Time Spent: 40m  (was: 0.5h)

> Strict mode to fail on CAST timestamp <-> numeric
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There is some interest in enforcing that CAST numeric <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeric <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeric to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-09-30 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar reassigned HIVE-24217:



> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-09-30 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-24217:
-
Attachment: HPL_SQL storedproc HMS storage.pdf

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-09-30 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-24217:
-
Description: 
HPL/SQL procedures are currently stored in text files. The goal of this Jira is 
to implement a Metastore backend for storing and loading these procedures. This 
is an incremental step towards having fully capable stored procedures in Hive.

 

See the attached design for more information.

  was:
HPL/SQL procedures are currently stored in text files. The goal of this Jira is 
to implement a Metastore backend for storing and loading these procedures.

 

See the attached design for more information.


> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-09-30 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-24217:
-
Description: 
HPL/SQL procedures are currently stored in text files. The goal of this Jira is 
to implement a Metastore backend for storing and loading these procedures.

 

See the attached design for more information.

  was:HPL/SQL procedures are currently stored in text files. The goal of this 
Jira is to implement a Metastore backend for storing and loading these 
procedures.


> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-21375:
--
Labels: pull-request-available  (was: )

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Affects Versions: 3.2.0
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?focusedWorklogId=493048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493048
 ]

ASF GitHub Bot logged work on HIVE-21375:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 16:13
Start Date: 30/Sep/20 16:13
Worklog Time Spent: 10m 
  Work Description: szlta opened a new pull request #1541:
URL: https://github.com/apache/hive/pull/1541


   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493048)
Remaining Estimate: 0h
Time Spent: 10m

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Affects Versions: 3.2.0
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?focusedWorklogId=493064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493064
 ]

ASF GitHub Bot logged work on HIVE-21375:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 16:37
Start Date: 30/Sep/20 16:37
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1541:
URL: https://github.com/apache/hive/pull/1541#discussion_r497649749



##
File path: 
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
##
@@ -394,10 +404,12 @@ public Void run() throws Exception {
 return null;
   }
 });
-try {
-  FileSystem.closeAllForUGI(ugi);
-} catch (IOException exception) {
-  LOG.error("Could not clean up file-system handles for UGI: " + ugi, 
exception);
+if (decRefForUgi(ugi) == 0) {
+  try {
+  FileSystem.closeAllForUGI(ugi);

Review comment:
   nit: indent





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493064)
Time Spent: 20m  (was: 10m)

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Affects Versions: 3.2.0
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?focusedWorklogId=493068&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493068
 ]

ASF GitHub Bot logged work on HIVE-21375:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 16:38
Start Date: 30/Sep/20 16:38
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1541:
URL: https://github.com/apache/hive/pull/1541#discussion_r497650656



##
File path: 
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
##
@@ -1107,4 +1114,37 @@ private static void setHiveConf(HiveConf conf, 
HiveConf.ConfVars var, boolean va
 conf.setBoolVar(var, value);
   }
 
+  private static void incRefForUgi(UserGroupInformation ugi) {
+refCountLock.lock();
+try {
+  Long prevCount = ugiConnectionRefCount.putIfAbsent(ugi, 1L);
+  if (prevCount != null) {

Review comment:
   How can this be null?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493068)
Time Spent: 0.5h  (was: 20m)

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Affects Versions: 3.2.0
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?focusedWorklogId=493066&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493066
 ]

ASF GitHub Bot logged work on HIVE-24217:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 16:38
Start Date: 30/Sep/20 16:38
Worklog Time Spent: 10m 
  Work Description: zeroflag opened a new pull request #1542:
URL: https://github.com/apache/hive/pull/1542


   work in progress



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493066)
Remaining Estimate: 0h
Time Spent: 10m

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24217) HMS storage backend for HPL/SQL stored procedures

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24217:
--
Labels: pull-request-available  (was: )

> HMS storage backend for HPL/SQL stored procedures
> -
>
> Key: HIVE-24217
> URL: https://issues.apache.org/jira/browse/HIVE-24217
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, hpl/sql, Metastore
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
>  Labels: pull-request-available
> Attachments: HPL_SQL storedproc HMS storage.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HPL/SQL procedures are currently stored in text files. The goal of this Jira 
> is to implement a Metastore backend for storing and loading these procedures. 
> This is an incremental step towards having fully capable stored procedures in 
> Hive.
>  
> See the attached design for more information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21375) Closing TransactionBatch closes FileSystem for other batches in Hive streaming v1

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21375?focusedWorklogId=493069&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493069
 ]

ASF GitHub Bot logged work on HIVE-21375:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 16:40
Start Date: 30/Sep/20 16:40
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1541:
URL: https://github.com/apache/hive/pull/1541#discussion_r497651467



##
File path: 
hcatalog/streaming/src/java/org/apache/hive/hcatalog/streaming/HiveEndPoint.java
##
@@ -1107,4 +1114,37 @@ private static void setHiveConf(HiveConf conf, 
HiveConf.ConfVars var, boolean va
 conf.setBoolVar(var, value);
   }
 
+  private static void incRefForUgi(UserGroupInformation ugi) {
+refCountLock.lock();
+try {
+  Long prevCount = ugiConnectionRefCount.putIfAbsent(ugi, 1L);
+  if (prevCount != null) {

Review comment:
   Sorry my mistake





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493069)
Time Spent: 40m  (was: 0.5h)

> Closing TransactionBatch closes FileSystem for other batches in Hive 
> streaming v1
> -
>
> Key: HIVE-21375
> URL: https://issues.apache.org/jira/browse/HIVE-21375
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Streaming
>Affects Versions: 3.2.0
>Reporter: Shawn Weeks
>Assignee: Ádám Szita
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The patch in HIVE-13151 added FileSystem.closeAllForUGI(ugi); to the close 
> method of HiveEndPoint for the legacy Streaming API. This seems to have a 
> side effect of closing the FileSystem for all open TransactionBatches as used 
> by NiFi and Storm when writing to multiple partitions. Setting 
> fs.hdfs.impl.disable.cache=true negates the issue but at a performance cost.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24210) PartitionManagementTask fails if one of tables dropped after fetching TableMeta

2020-09-30 Thread Naveen Gangam (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204925#comment-17204925
 ] 

Naveen Gangam commented on HIVE-24210:
--

[~nareshpr] Could you please add a reproducer or some description of the 
problem or the solution. This makes it easier to understand the problem and 
helps with the code review. Thank you

> PartitionManagementTask fails if one of tables dropped after fetching 
> TableMeta
> ---
>
> Key: HIVE-24210
> URL: https://issues.apache.org/jira/browse/HIVE-24210
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> 2020-09-21T10:45:15,875 ERROR [pool-4-thread-150]: 
> metastore.PartitionManagementTask (PartitionManagementTask.java:run(163)) - 
> Exception while running partition discovery task for table: null
> org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
> hive.default.test_table table not found
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:3391)
>  
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3315)
>  
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3291)
>  
>  at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
>  at java.lang.reflect.Method.invoke(Method.java:498) 
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  
>  at com.sun.proxy.$Proxy30.get_table_req(Unknown Source) ~[?:?]
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1804)
>  
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1791)
>  
>  at 
> org.apache.hadoop.hive.metastore.PartitionManagementTask.run(PartitionManagementTask.java:130){code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24210) PartitionManagementTask fails if one of tables dropped after fetching TableMeta

2020-09-30 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-24210:
--
Description: 
After fetching tableMeta based on configured dbPattern & tablePattern for PMT

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]

If one of the tables is dropped before AutoPartition Discovery or MSCK is 
scheduled, the entire PMT run stops with the exception below, even though MSCK 
could still run for the other valid tables.
{code:java}
2020-09-21T10:45:15,875 ERROR [pool-4-thread-150]: 
metastore.PartitionManagementTask (PartitionManagementTask.java:run(163)) - 
Exception while running partition discovery task for table: null
org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
hive.default.test_table table not found
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:3391)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3315)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3291)
 
 at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
 at java.lang.reflect.Method.invoke(Method.java:498) 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 
 at com.sun.proxy.$Proxy30.get_table_req(Unknown Source) ~[?:?]
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1804)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1791)
 
 at 
org.apache.hadoop.hive.metastore.PartitionManagementTask.run(PartitionManagementTask.java:130){code}
Exception is thrown from here.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]

  was:
 
{code:java}
2020-09-21T10:45:15,875 ERROR [pool-4-thread-150]: 
metastore.PartitionManagementTask (PartitionManagementTask.java:run(163)) - 
Exception while running partition discovery task for table: null
org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
hive.default.test_table table not found
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:3391)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3315)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3291)
 
 at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
 at java.lang.reflect.Method.invoke(Method.java:498) 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 
 at com.sun.proxy.$Proxy30.get_table_req(Unknown Source) ~[?:?]
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1804)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1791)
 
 at 
org.apache.hadoop.hive.metastore.PartitionManagementTask.run(PartitionManagementTask.java:130){code}


> PartitionManagementTask fails if one of tables dropped after fetching 
> TableMeta
> ---
>
> Key: HIVE-24210
> URL: https://issues.apache.org/jira/browse/HIVE-24210
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After fetching tableMeta based on configured dbPattern & tablePattern for PMT
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]
> If one of the tables is dropped before AutoPartition Discovery or MSCK is 
> scheduled, the entire PMT run stops with the exception below, even though 
> MSCK could still run for the other valid tables.
> {code:java}
> 2020-09-21T10:45:15,875 ERROR [pool-4-thread-150]: 
> metastore.PartitionManagementTask (PartitionManagementTask.java:run(163)) - 
> Exception while running partition discovery task for table: null
> org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
> hive.default.test_table table not found
>  at 
> org.apache.hadoop.hive.metastore.HiveMet

[jira] [Commented] (HIVE-24210) PartitionManagementTask fails if one of tables dropped after fetching TableMeta

2020-09-30 Thread Naresh P R (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204942#comment-17204942
 ] 

Naresh P R commented on HIVE-24210:
---

Updated Description. Please let me know if that is ok.

> PartitionManagementTask fails if one of tables dropped after fetching 
> TableMeta
> ---
>
> Key: HIVE-24210
> URL: https://issues.apache.org/jira/browse/HIVE-24210
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After fetching tableMeta based on configured dbPattern & tablePattern for PMT
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]
> If one of the tables is dropped before AutoPartition Discovery or MSCK is 
> scheduled, the entire PMT run stops with the exception below, even though 
> MSCK could still run for the other valid tables.
> {code:java}
> 2020-09-21T10:45:15,875 ERROR [pool-4-thread-150]: 
> metastore.PartitionManagementTask (PartitionManagementTask.java:run(163)) - 
> Exception while running partition discovery task for table: null
> org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
> hive.default.test_table table not found
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:3391)
>  
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3315)
>  
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3291)
>  
>  at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
>  at java.lang.reflect.Method.invoke(Method.java:498) 
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  
>  at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  
>  at com.sun.proxy.$Proxy30.get_table_req(Unknown Source) ~[?:?]
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1804)
>  
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1791)
>  
>  at 
> org.apache.hadoop.hive.metastore.PartitionManagementTask.run(PartitionManagementTask.java:130){code}
> Exception is thrown from here.
> [https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]
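
A minimal sketch of one way to keep the task going when a table disappears between listing and lookup; the helper shape below is illustrative, not the exact structure of PartitionManagementTask:

{code:java}
import java.util.List;

import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.NoSuchObjectException;
import org.apache.hadoop.hive.metastore.api.Table;
import org.apache.hadoop.hive.metastore.api.TableMeta;

// Sketch only: skip tables that were dropped between fetching TableMeta and
// looking the table up, instead of letting one NoSuchObjectException abort the
// whole partition-management run.
public class SkipDroppedTablesSketch {
  public static void processTables(IMetaStoreClient msc, List<TableMeta> candidates) throws Exception {
    for (TableMeta meta : candidates) {
      Table table;
      try {
        table = msc.getTable(meta.getDbName(), meta.getTableName());
      } catch (NoSuchObjectException e) {
        // Table disappeared after listing; log and continue with the rest.
        continue;
      }
      // ... schedule partition discovery / MSCK for 'table' here ...
    }
  }
}
{code}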



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24210) PartitionManagementTask fails if one of tables dropped after fetching TableMeta

2020-09-30 Thread Naresh P R (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naresh P R updated HIVE-24210:
--
Description: 
After fetching tableMeta based on configured dbPattern & tablePattern for PMT

https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L125

If one of the tables is dropped before AutoPartition Discovery or MSCK is 
scheduled, the entire PMT run stops with the exception below, even though MSCK 
could still run for the other valid tables.
{code:java}
2020-09-21T10:45:15,875 ERROR [pool-4-thread-150]: 
metastore.PartitionManagementTask (PartitionManagementTask.java:run(163)) - 
Exception while running partition discovery task for table: null
org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
hive.default.test_table table not found
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:3391)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3315)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3291)
 
 at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
 at java.lang.reflect.Method.invoke(Method.java:498) 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 
 at com.sun.proxy.$Proxy30.get_table_req(Unknown Source) ~[?:?]
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1804)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1791)
 
 at 
org.apache.hadoop.hive.metastore.PartitionManagementTask.run(PartitionManagementTask.java:130){code}
Exception is thrown from here.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]

  was:
After fetching tableMeta based on configured dbPattern & tablePattern for PMT

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]

If one of the tables is dropped before AutoPartition Discovery or MSCK is 
scheduled, the entire PMT run stops with the exception below, even though MSCK 
could still run for the other valid tables.
{code:java}
2020-09-21T10:45:15,875 ERROR [pool-4-thread-150]: 
metastore.PartitionManagementTask (PartitionManagementTask.java:run(163)) - 
Exception while running partition discovery task for table: null
org.apache.hadoop.hive.metastore.api.NoSuchObjectException: 
hive.default.test_table table not found
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_core(HiveMetaStore.java:3391)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getTableInternal(HiveMetaStore.java:3315)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_req(HiveMetaStore.java:3291)
 
 at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
 at java.lang.reflect.Method.invoke(Method.java:498) 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 
 at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 
 at com.sun.proxy.$Proxy30.get_table_req(Unknown Source) ~[?:?]
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1804)
 
 at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:1791)
 
 at 
org.apache.hadoop.hive.metastore.PartitionManagementTask.run(PartitionManagementTask.java:130){code}
Exception is thrown from here.

[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/PartitionManagementTask.java#L130]


> PartitionManagementTask fails if one of tables dropped after fetching 
> TableMeta
> ---
>
> Key: HIVE-24210
> URL: https://issues.apache.org/jira/browse/HIVE-24210
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After fetching tableMeta based on configured dbPattern & tablePattern for PMT
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apa

[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=493165&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493165
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 20:07
Start Date: 30/Sep/20 20:07
Worklog Time Spent: 10m 
  Work Description: jcamachor merged pull request #1492:
URL: https://github.com/apache/hive/pull/1492


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493165)
Time Spent: 3h 10m  (was: 3h)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.
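
A tiny illustration of the rewrite itself (plain Java over constants, not the actual Calcite-based simplification in the CBO): the conjunction collapses to the equality when the equality constant appears in the IN list, and to FALSE otherwise.

{code:java}
import java.util.Set;

// Illustrative logic only: AND(col = c, col IN (S)) simplifies to col = c when
// c is a member of S, and to FALSE when it is not.
public class InEqualsSimplification {
  enum Result { EQUALS_ONLY, ALWAYS_FALSE }

  static Result simplify(int equalsConstant, Set<Integer> inList) {
    return inList.contains(equalsConstant) ? Result.EQUALS_ONLY : Result.ALWAYS_FALSE;
  }

  public static void main(String[] args) {
    // AND($1 = 1999, $1 IN (1998, 1999))  ==>  $1 = 1999
    System.out.println(simplify(1999, Set.of(1998, 1999)));  // EQUALS_ONLY
    // AND($1 = 1997, $1 IN (1998, 1999))  ==>  FALSE
    System.out.println(simplify(1997, Set.of(1998, 1999)));  // ALWAYS_FALSE
  }
}
{code}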



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-30 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-24154:
---
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24202) Clean up local HS2 HMS cache code (II)

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24202:
--
Labels: pull-request-available  (was: )

> Clean up local HS2 HMS cache code (II)
> --
>
> Key: HIVE-24202
> URL: https://issues.apache.org/jira/browse/HIVE-24202
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Follow-up for HIVE-24183 (split into different JIRAs).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24202) Clean up local HS2 HMS cache code (II)

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24202?focusedWorklogId=493206&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493206
 ]

ASF GitHub Bot logged work on HIVE-24202:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 23:08
Start Date: 30/Sep/20 23:08
Worklog Time Spent: 10m 
  Work Description: jcamachor opened a new pull request #1543:
URL: https://github.com/apache/hive/pull/1543


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493206)
Remaining Estimate: 0h
Time Spent: 10m

> Clean up local HS2 HMS cache code (II)
> --
>
> Key: HIVE-24202
> URL: https://issues.apache.org/jira/browse/HIVE-24202
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Follow-up for HIVE-24183 (split into different JIRAs).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeric

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?focusedWorklogId=493223&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493223
 ]

ASF GitHub Bot logged work on HIVE-24157:
-

Author: ASF GitHub Bot
Created on: 30/Sep/20 23:39
Start Date: 30/Sep/20 23:39
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1497:
URL: https://github.com/apache/hive/pull/1497#discussion_r497858468



##
File path: ql/src/test/queries/clientnegative/strict_numeric_to_timestamp.q
##
@@ -0,0 +1,2 @@
+set hive.strict.timestamp.conversion=true;
+select cast(123 as timestamp);

Review comment:
   Maybe a couple with a comparison between a timestamp and a numeric (e.g., `=` or 
`<`), which should lead to a cast to double being introduced (yeah, the comparison 
is effectively done in double) and then the query failing.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493223)
Time Spent: 50m  (was: 40m)

> Strict mode to fail on CAST timestamp <-> numeric
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> There is some interest in enforcing that CAST numeric <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeric <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeric to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24202) Clean up local HS2 HMS cache code (II)

2020-09-30 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-24202:
---
Status: Patch Available  (was: In Progress)

> Clean up local HS2 HMS cache code (II)
> --
>
> Key: HIVE-24202
> URL: https://issues.apache.org/jira/browse/HIVE-24202
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Follow-up for HIVE-24183 (split into different JIRAs).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23863) UGI doAs privilege action to make calls to Ranger Service

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23863?focusedWorklogId=493249&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493249
 ]

ASF GitHub Bot logged work on HIVE-23863:
-

Author: ASF GitHub Bot
Created on: 01/Oct/20 00:52
Start Date: 01/Oct/20 00:52
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1289:
URL: https://github.com/apache/hive/pull/1289


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493249)
Time Spent: 1h 20m  (was: 1h 10m)

> UGI doAs privilege action to make calls to Ranger Service
> --
>
> Key: HIVE-23863
> URL: https://issues.apache.org/jira/browse/HIVE-23863
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23863.01.patch, HIVE-23863.02.patch, 
> HIVE-23863.03.patch, UGI and Replication.pdf
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-21611) Date.getTime() can be changed to System.currentTimeMillis()

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21611?focusedWorklogId=493250&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493250
 ]

ASF GitHub Bot logged work on HIVE-21611:
-

Author: ASF GitHub Bot
Created on: 01/Oct/20 00:52
Start Date: 01/Oct/20 00:52
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1334:
URL: https://github.com/apache/hive/pull/1334#issuecomment-701722885


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493250)
Time Spent: 2h  (was: 1h 50m)

> Date.getTime() can be changed to System.currentTimeMillis()
> ---
>
> Key: HIVE-21611
> URL: https://issues.apache.org/jira/browse/HIVE-21611
> Project: Hive
>  Issue Type: Bug
>Reporter: bd2019us
>Assignee: Hunter Logan
>Priority: Major
>  Labels: pull-request-available
> Attachments: 1.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Hello,
> I found that System.currentTimeMillis() can be used here instead of new 
> Date().getTime().
> Since new Date() is just a thin wrapper around the lightweight method 
> System.currentTimeMillis(), performance suffers when it is invoked a large number 
> of times.
> According to my local testing in the same environment, System.currentTimeMillis() 
> is about 5 times faster (435 ms vs 2073 ms) when each method is invoked 
> 5,000,000 times.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23948) Improve Query Results Cache

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23948?focusedWorklogId=493248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493248
 ]

ASF GitHub Bot logged work on HIVE-23948:
-

Author: ASF GitHub Bot
Created on: 01/Oct/20 00:52
Start Date: 01/Oct/20 00:52
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1335:
URL: https://github.com/apache/hive/pull/1335#issuecomment-701722878


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493248)
Time Spent: 40m  (was: 0.5h)

> Improve Query Results Cache
> ---
>
> Key: HIVE-23948
> URL: https://issues.apache.org/jira/browse/HIVE-23948
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hunter Logan
>Assignee: Hunter Logan
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Creating a Jira for this github PR from before github was actively used
> [https://github.com/apache/hive/pull/652]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24209) Incorrect search argument conversion for NOT BETWEEN operation when vectorization is enabled

2020-09-30 Thread Ganesha Shreedhara (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205215#comment-17205215
 ] 

Ganesha Shreedhara commented on HIVE-24209:
---

[~ashutoshc] Thanks for reviewing. Please help with pushing this fix to master. 

> Incorrect search argument conversion for NOT BETWEEN operation when 
> vectorization is enabled
> 
>
> Key: HIVE-24209
> URL: https://issues.apache.org/jira/browse/HIVE-24209
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Ganesha Shreedhara
>Assignee: Ganesha Shreedhara
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24209.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We skip adding the GenericUDFOPNot UDF to the filter expression for the NOT 
> BETWEEN operation when vectorization is enabled, because of the improvement done 
> as part of HIVE-15884. However, this is not handled during the conversion of the 
> filter expression to a search argument, so an incorrect predicate gets pushed down 
> to the storage layer, leading to incorrect split generation and incorrect results.
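
A minimal sketch of the kind of query affected (table, column, and file format are 
hypothetical; the point is only that the pushed-down search argument must keep the 
negation of the BETWEEN range):

    set hive.vectorized.execution.enabled=true;
    -- the search argument pushed to the storage layer must represent NOT BETWEEN,
    -- otherwise splits containing rows outside [100, 200] can be pruned away and
    -- the query silently returns an incomplete result
    select id, amount
    from sales_orc
    where amount not between 100 and 200;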



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23852) Natively support Date type in ReduceSink operator

2020-09-30 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23852?focusedWorklogId=493310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493310
 ]

ASF GitHub Bot logged work on HIVE-23852:
-

Author: ASF GitHub Bot
Created on: 01/Oct/20 06:19
Start Date: 01/Oct/20 06:19
Worklog Time Spent: 10m 
  Work Description: abstractdog merged pull request #1274:
URL: https://github.com/apache/hive/pull/1274


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 493310)
Time Spent: 1.5h  (was: 1h 20m)

> Natively support Date type in ReduceSink operator
> -
>
> Key: HIVE-23852
> URL: https://issues.apache.org/jira/browse/HIVE-23852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is currently no native support, meaning that these types end up being 
> serialized as multi-key columns, which is much slower (iterating through batch 
> columns instead of writing a value directly).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23852) Natively support Date type in ReduceSink operator

2020-09-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-23852:

Fix Version/s: 4.0.0

> Natively support Date type in ReduceSink operator
> -
>
> Key: HIVE-23852
> URL: https://issues.apache.org/jira/browse/HIVE-23852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is currently no native support, meaning that these types end up being 
> serialized as multi-key columns, which is much slower (iterating through batch 
> columns instead of writing a value directly).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-23852) Natively support Date type in ReduceSink operator

2020-09-30 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor resolved HIVE-23852.
-
Resolution: Fixed

> Natively support Date type in ReduceSink operator
> -
>
> Key: HIVE-23852
> URL: https://issues.apache.org/jira/browse/HIVE-23852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is currently no native support, meaning that these types end up being 
> serialized as multi-key columns, which is much slower (iterating through batch 
> columns instead of writing a value directly).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23852) Natively support Date type in ReduceSink operator

2020-09-30 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205286#comment-17205286
 ] 

László Bodor commented on HIVE-23852:
-

PR merged to master, thanks [~pgaref] for the patch and [~mustafaiman] for the 
review!

> Natively support Date type in ReduceSink operator
> -
>
> Key: HIVE-23852
> URL: https://issues.apache.org/jira/browse/HIVE-23852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> There is currently no native support, meaning that these types end up being 
> serialized as multi-key columns, which is much slower (iterating through batch 
> columns instead of writing a value directly).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)