[jira] [Updated] (HIVE-22013) "Show table extended" query fails with Wrong FS error for partition in customized location

2022-04-25 Thread Ganesha Shreedhara (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ganesha Shreedhara updated HIVE-22013:
--
Summary: "Show table extended" query fails with Wrong FS error for 
partition in customized location  (was: "Show table extended" should not 
compute FS statistics)

> "Show table extended" query fails with Wrong FS error for partition in 
> customized location
> --
>
> Key: HIVE-22013
> URL: https://issues.apache.org/jira/browse/HIVE-22013
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Ganesha Shreedhara
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some of the `show table extended` statements, the following codepath is invoked:
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java#L421]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java#L449]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java#L468]
> 1. It is not clear why this invokes stats computation; should it be removed?
>  2. Even if #1 is needed, it breaks when {{tblPath}} and {{partitionPaths}} are different (i.e. when the two are on different filesystems, or configured via a router, etc.); see the sketch after the stack trace.
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://xyz/blah/tables/location/, expected: hdfs://zzz..
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:657)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:698)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:106)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:763)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:759)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:759)
>   at 
> org.apache.hadoop.hive.ql.metadata.formatting.TextMetaDataFormatter.writeFileSystemStats(TextMetaDataFormatter.java
> {noformat}
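A hedged reproduction sketch (paths, table, and column names are hypothetical, modeled on the error above): the table lives on one namespace (hdfs://zzz) while a partition is added at a customized location on another (hdfs://xyz), so the stats computation lists the partition path through the table's filesystem and fails.

{code:sql}
-- Hypothetical two-namespace setup mirroring the "Wrong FS" error above.
CREATE TABLE t (c INT) PARTITIONED BY (p INT)
  LOCATION 'hdfs://zzz/warehouse/t';
ALTER TABLE t ADD PARTITION (p = 1)
  LOCATION 'hdfs://xyz/blah/tables/location/p1';
-- Fails with "Wrong FS" while computing filesystem statistics:
SHOW TABLE EXTENDED LIKE 't' PARTITION (p = 1);
{code}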



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-22013) "Show table extended" should not compute FS statistics

2022-04-25 Thread Ganesha Shreedhara (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527889#comment-17527889
 ] 

Ganesha Shreedhara commented on HIVE-22013:
---

[~rajesh.balamohan] Please review the [pull 
request|https://github.com/apache/hive/pull/3231]. 

> "Show table extended" should not compute FS statistics
> --
>
> Key: HIVE-22013
> URL: https://issues.apache.org/jira/browse/HIVE-22013
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Ganesha Shreedhara
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In some of the `show table extended` statements, the following codepath is invoked:
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java#L421]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java#L449]
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/formatting/TextMetaDataFormatter.java#L468]
> 1. It is not clear why this invokes stats computation; should it be removed?
>  2. Even if #1 is needed, it breaks when {{tblPath}} and {{partitionPaths}} are different (i.e. when the two are on different filesystems, or configured via a router, etc.).
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://xyz/blah/tables/location/, expected: hdfs://zzz..
>   at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:657)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:194)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:698)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:106)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:763)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:759)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:759)
>   at 
> org.apache.hadoop.hive.ql.metadata.formatting.TextMetaDataFormatter.writeFileSystemStats(TextMetaDataFormatter.java
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26175) single quote in a comment causes parsing errors

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26175?focusedWorklogId=762105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762105
 ]

ASF GitHub Bot logged work on HIVE-26175:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 03:52
Start Date: 26/Apr/22 03:52
Worklog Time Spent: 10m 
  Work Description: renjianting opened a new pull request, #3244:
URL: https://github.com/apache/hive/pull/3244

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 762105)
Time Spent: 50m  (was: 40m)

> single quote in a comment causes parsing errors
> ---
>
> Key: HIVE-26175
> URL: https://issues.apache.org/jira/browse/HIVE-26175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Parser
>Affects Versions: 3.1.3
>Reporter: renjianting
>Assignee: renjianting
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A single quote in a comment causes parsing errors, such as:
> {code:java}
> select 1 -- I'm xxx
> from tbl; {code}
> Running a task like this will result in the following error:
> {code:java}
> NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) 
>  at 
> org.antlr.runtime.DFA.noViableAlt(DFA.java:158)   
>   at 
> org.antlr.runtime.DFA.predict(DFA.java:116)   
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020)  
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) 
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)   
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)  
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) 
>   at 
> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)  
>  

[jira] [Work logged] (HIVE-26175) single quote in a comment causes parsing errors

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26175?focusedWorklogId=762100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762100
 ]

ASF GitHub Bot logged work on HIVE-26175:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 03:43
Start Date: 26/Apr/22 03:43
Worklog Time Spent: 10m 
  Work Description: renjianting closed pull request #3241: HIVE-26175: 
single quote in a comment causes parsing errors
URL: https://github.com/apache/hive/pull/3241




Issue Time Tracking
---

Worklog Id: (was: 762100)
Time Spent: 0.5h  (was: 20m)

> single quote in a comment causes parsing errors
> ---
>
> Key: HIVE-26175
> URL: https://issues.apache.org/jira/browse/HIVE-26175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Parser
>Affects Versions: 3.1.3
>Reporter: renjianting
>Assignee: renjianting
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> A single quote in a comment causes parsing errors, such as:
> {code:java}
> select 1 -- I'm xxx
> from tbl; {code}
> Running a task like this will result in the following error:
> {code:java}
> NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) 
>  at 
> org.antlr.runtime.DFA.noViableAlt(DFA.java:158)   
>   at 
> org.antlr.runtime.DFA.predict(DFA.java:116)   
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020)  
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) 
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)   
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)  
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) 
>   at 
> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)  
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   

[jira] [Work logged] (HIVE-26175) single quote in a comment causes parsing errors

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26175?focusedWorklogId=762101&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762101
 ]

ASF GitHub Bot logged work on HIVE-26175:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 03:43
Start Date: 26/Apr/22 03:43
Worklog Time Spent: 10m 
  Work Description: renjianting opened a new pull request, #3241:
URL: https://github.com/apache/hive/pull/3241

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 762101)
Time Spent: 40m  (was: 0.5h)

> single quote in a comment causes parsing errors
> ---
>
> Key: HIVE-26175
> URL: https://issues.apache.org/jira/browse/HIVE-26175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Parser
>Affects Versions: 3.1.3
>Reporter: renjianting
>Assignee: renjianting
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A single quote in a comment causes parsing errors, such as:
> {code:java}
> select 1 -- I'm xxx
> from tbl; {code}
> Running a task like this will result in the following error:
> {code:java}
> NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) 
>  at 
> org.antlr.runtime.DFA.noViableAlt(DFA.java:158)   
>   at 
> org.antlr.runtime.DFA.predict(DFA.java:116)   
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020)  
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) 
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)   
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)  
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) 
>   at 
> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)  
> 

[jira] [Work logged] (HIVE-26175) single quote in a comment causes parsing errors

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26175?focusedWorklogId=762096&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762096
 ]

ASF GitHub Bot logged work on HIVE-26175:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 03:25
Start Date: 26/Apr/22 03:25
Worklog Time Spent: 10m 
  Work Description: renjianting commented on PR #3241:
URL: https://github.com/apache/hive/pull/3241#issuecomment-1109265044

   @kgyrtkirk #3241 #




Issue Time Tracking
---

Worklog Id: (was: 762096)
Time Spent: 20m  (was: 10m)

> single quote in a comment causes parsing errors
> ---
>
> Key: HIVE-26175
> URL: https://issues.apache.org/jira/browse/HIVE-26175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Parser
>Affects Versions: 3.1.3
>Reporter: renjianting
>Assignee: renjianting
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A single quote in a comment causes parsing errors, such as:
> {code:java}
> select 1 -- I'm xxx
> from tbl; {code}
> Running a task like this will result in the following error:
> {code:java}
> NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) 
>  at 
> org.antlr.runtime.DFA.noViableAlt(DFA.java:158)   
>   at 
> org.antlr.runtime.DFA.predict(DFA.java:116)   
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020)  
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) 
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)   
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)  
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) 
>   at 
> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)  
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   

[jira] [Work logged] (HIVE-26159) hive cli is unavailable from hive command

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26159?focusedWorklogId=762087&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762087
 ]

ASF GitHub Bot logged work on HIVE-26159:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 02:05
Start Date: 26/Apr/22 02:05
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on PR #3227:
URL: https://github.com/apache/hive/pull/3227#issuecomment-1109220423

   @nrg4878 I have only used the CLI for some basic DDL and DML tests, so I 
cannot determine whether all operations work fine in the CLI.
   
   To retain the current behavior, can we change the default service of the hive 
shell to beeline?




Issue Time Tracking
---

Worklog Id: (was: 762087)
Time Spent: 1h 20m  (was: 1h 10m)

> hive cli is unavailable from hive command
> -
>
> Key: HIVE-26159
> URL: https://issues.apache.org/jira/browse/HIVE-26159
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Hive CLI is a convenient tool for connecting to the Hive metastore service, but 
> now Hive CLI cannot start even when we use the *--service cli* option; this 
> appears to be a bug introduced by HIVE-24348.
> *Steps to reproduce:*
> {code:bash}
> hive@hive:/root$ /usr/share/hive/bin/hive --service cli --hiveconf 
> hive.metastore.uris=thrift://hive:9084
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Beeline version 4.0.0-alpha-2-SNAPSHOT by Apache Hive
> beeline> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25952) Drop HiveRelMdPredicates::getPredicates(Project...) to use that of RelMdPredicates

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25952?focusedWorklogId=762071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762071
 ]

ASF GitHub Bot logged work on HIVE-25952:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 00:20
Start Date: 26/Apr/22 00:20
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3024: [WIP] 
HIVE-25952 projection get predicates (constant finder instead of hive variant)
URL: https://github.com/apache/hive/pull/3024




Issue Time Tracking
---

Worklog Id: (was: 762071)
Time Spent: 1h 20m  (was: 1h 10m)

> Drop HiveRelMdPredicates::getPredicates(Project...) to use that of 
> RelMdPredicates
> --
>
> Key: HIVE-25952
> URL: https://issues.apache.org/jira/browse/HIVE-25952
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This method differs between Hive and Calcite; the idea of this ticket is to 
> unify the two implementations and then drop the override in HiveRelMdPredicates 
> in favour of the method in RelMdPredicates.
> After applying HIVE-25966, the only remaining difference is in the test for 
> constant expressions, which can be summarized as follows (a sketch follows the table):
> ||Expression Type||Is Constant for Hive?||Is Constant for Calcite?||
> |InputRef|False|False|
> |Call|True if the function is deterministic (arguments are not checked), false otherwise|True if the function is deterministic and all operands are constants, false otherwise|
> |CorrelatedVariable|False|False|
> |LocalRef|False|False|
> |Over|False|False|
> |DynamicParameter|False|True|
> |RangeRef|False|False|
> |FieldAccess|False|Given expr.field, true if expr is constant, false otherwise|
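To make the Call row concrete, here is a hedged SQL illustration (table and column names are hypothetical):

{code:sql}
-- upper(c1) is a deterministic call whose operand c1 is an InputRef:
-- Hive's override treats the call as constant (operands are not checked),
-- while Calcite does not (not all of its operands are constants).
SELECT * FROM t WHERE upper(c1) = 'X';
{code}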



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25953) Drop HiveRelMdPredicates::getPredicates(Join...) to use that of RelMdPredicates

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25953?focusedWorklogId=762070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762070
 ]

ASF GitHub Bot logged work on HIVE-25953:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 00:20
Start Date: 26/Apr/22 00:20
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on PR #3033:
URL: https://github.com/apache/hive/pull/3033#issuecomment-1109164077

   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 762070)
Time Spent: 1h 50m  (was: 1h 40m)

> Drop HiveRelMdPredicates::getPredicates(Join...) to use that of 
> RelMdPredicates
> ---
>
> Key: HIVE-25953
> URL: https://issues.apache.org/jira/browse/HIVE-25953
> Project: Hive
>  Issue Type: Sub-task
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The goal of the ticket is to unify the two implementations and remove the 
> override in HiveRelMdPredicates.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25929) Let secret config properties to be propagated to Tez

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25929?focusedWorklogId=762072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-762072
 ]

ASF GitHub Bot logged work on HIVE-25929:
-

Author: ASF GitHub Bot
Created on: 26/Apr/22 00:20
Start Date: 26/Apr/22 00:20
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #3019: 
HIVE-25929: Let secret config properties to be propagated to Tez
URL: https://github.com/apache/hive/pull/3019




Issue Time Tracking
---

Worklog Id: (was: 762072)
Time Spent: 1h 40m  (was: 1.5h)

> Let secret config properties to be propagated to Tez
> 
>
> Key: HIVE-25929
> URL: https://issues.apache.org/jira/browse/HIVE-25929
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> History in chronological order:
> HIVE-10508: removed some passwords from the config that is propagated to 
> execution engines
> HIVE-9013: introduced hive.conf.hidden.list, which is used instead of the 
> hardcoded list from HIVE-10508
> The problem with HIVE-9013 is that it introduced a common method for removing 
> sensitive data from the Configuration, which absolutely makes sense in most 
> cases (e.g. the set command showing sensitive data), but it can cause issues, 
> e.g. while using non-secure cloud connectors on a cluster, where, instead of 
> going through the Hadoop credential provider API (which is considered the 
> secure and proper way), passwords/secrets appear in the Configuration object 
> (like "fs.azure.account.oauth2.client.secret"; see the sketch after this list)
> 2 possible solutions:
> 1. introduce a new property like "hive.conf.hidden.list.exec.engines", 
> defaulting to "hive.conf.hidden.list" (configurable, but perhaps just more 
> confusing for users, who would need to understand and maintain a new config 
> property on the cluster)
> 2. simply revert DAGUtils to use the old stripHivePasswordDetails introduced 
> by HIVE-10508 (convenient and less confusing for users, but not configurable)
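A hedged sketch of the problematic scenario (property value and table name are made up): the secret is set directly in the session configuration instead of going through the credential provider API, so it lives in the Configuration object that gets stripped before reaching the execution engine.

{code:sql}
-- Hypothetical session using a non-secure cloud connector:
-- the secret ends up in the Configuration object itself.
SET fs.azure.account.oauth2.client.secret=not-a-real-secret;
-- If the key is covered by hive.conf.hidden.list, it is stripped from the
-- config propagated to Tez, and the query below cannot reach the storage:
SELECT * FROM some_abfs_backed_table LIMIT 1;
{code}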



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26173) Upgrade derby to 10.14.2.0

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26173?focusedWorklogId=761919&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761919
 ]

ASF GitHub Bot logged work on HIVE-26173:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 17:21
Start Date: 25/Apr/22 17:21
Worklog Time Spent: 10m 
  Work Description: hemanthboyina opened a new pull request, #3243:
URL: https://github.com/apache/hive/pull/3243

   What changes were proposed in this pull request?
   Upgrading derby version to the latest
   
   Why are the changes needed?
   It will fix the CVEs
   
   Does this PR introduce any user-facing change?
   No
   
   How was this patch tested?
   Manual tests




Issue Time Tracking
---

Worklog Id: (was: 761919)
Remaining Estimate: 0h
Time Spent: 10m

> Upgrade derby to 10.14.2.0
> --
>
> Key: HIVE-26173
> URL: https://issues.apache.org/jira/browse/HIVE-26173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Upgrade Derby from 10.14.1.0 to 10.14.2.0 to fix the vulnerability 
> CVE-2018-1313



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26173) Upgrade derby to 10.14.2.0

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26173:
--
Labels: pull-request-available  (was: )

> Upgrade derby to 10.14.2.0
> --
>
> Key: HIVE-26173
> URL: https://issues.apache.org/jira/browse/HIVE-26173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Upgrade Derby from 10.14.1.0 to 10.14.2.0 to fix the vulnerability 
> CVE-2018-1313



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Resolved] (HIVE-23872) SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark client for Spark session

2022-04-25 Thread renjianting (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

renjianting resolved HIVE-23872.

Resolution: Abandoned

> SemanticException Failed to get a spark session: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark 
> client for Spark session
> --
>
> Key: HIVE-23872
> URL: https://issues.apache.org/jira/browse/HIVE-23872
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2
>Reporter: renjianting
>Priority: Blocker
>
>  When using the Hive on Spark engine:
>     FAILED: SemanticException Failed to get a spark session: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create Spark 
> client for Spark session
> hadoop version: 2.7.3 / hive version: 3.1.2 / spark version 3.0.0



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26172) Upgrade ant to 1.10.12

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26172?focusedWorklogId=761915&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761915
 ]

ASF GitHub Bot logged work on HIVE-26172:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 17:13
Start Date: 25/Apr/22 17:13
Worklog Time Spent: 10m 
  Work Description: hemanthboyina opened a new pull request, #3242:
URL: https://github.com/apache/hive/pull/3242

   
   
   ### What changes were proposed in this pull request?
   Upgrading ant version to the latest 
   
   
   ### Why are the changes needed?
   It will fix the CVEs 
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Manual tests 




Issue Time Tracking
---

Worklog Id: (was: 761915)
Remaining Estimate: 0h
Time Spent: 10m

> Upgrade ant to 1.10.12
> --
>
> Key: HIVE-26172
> URL: https://issues.apache.org/jira/browse/HIVE-26172
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Upgrade ant from 1.10.9 to 1.10.12 to fix the vulnerability CVE-2021-36373



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26172) Upgrade ant to 1.10.12

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26172:
--
Labels: pull-request-available  (was: )

> Upgrade ant to 1.10.12
> --
>
> Key: HIVE-26172
> URL: https://issues.apache.org/jira/browse/HIVE-26172
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Upgrade ant from 1.10.9 to 1.10.12 to fix the vulnerability CVE-2021-36373



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26175) single quote in a comment causes parsing errors

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26175:
--
Labels: pull-request-available  (was: )

> single quote in a comment causes parsing errors
> ---
>
> Key: HIVE-26175
> URL: https://issues.apache.org/jira/browse/HIVE-26175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Parser
>Affects Versions: 3.1.3
>Reporter: renjianting
>Assignee: renjianting
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A single quote in a comment causes parsing errors, such as:
> {code:java}
> select 1 -- I'm xxx
> from tbl; {code}
> Running a task like this will result in the following error:
> {code:java}
> NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) 
>  at 
> org.antlr.runtime.DFA.noViableAlt(DFA.java:158)   
>   at 
> org.antlr.runtime.DFA.predict(DFA.java:116)   
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020)  
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) 
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)   
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)  
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) 
>   at 
> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)  
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)  
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)  
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)   
>

[jira] [Work logged] (HIVE-26175) single quote in a comment causes parsing errors

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26175?focusedWorklogId=761911&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761911
 ]

ASF GitHub Bot logged work on HIVE-26175:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 17:00
Start Date: 25/Apr/22 17:00
Worklog Time Spent: 10m 
  Work Description: renjianting opened a new pull request, #3241:
URL: https://github.com/apache/hive/pull/3241

   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 761911)
Remaining Estimate: 0h
Time Spent: 10m

> single quote in a comment causes parsing errors
> ---
>
> Key: HIVE-26175
> URL: https://issues.apache.org/jira/browse/HIVE-26175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Parser
>Affects Versions: 3.1.3
>Reporter: renjianting
>Assignee: renjianting
>Priority: Major
> Fix For: 4.0.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A single quote in a comment causes parsing errors, such as:
> {code:java}
> select 1 -- I'm xxx
> from tbl; {code}
> Running a task like this will result in the following error:
> {code:java}
> NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) 
>  at 
> org.antlr.runtime.DFA.noViableAlt(DFA.java:158)   
>   at 
> org.antlr.runtime.DFA.predict(DFA.java:116)   
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020)  
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) 
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)   
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)  
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) 
>   at 
> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)  
> 

[jira] [Assigned] (HIVE-26175) single quote in a comment causes parsing errors

2022-04-25 Thread renjianting (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

renjianting reassigned HIVE-26175:
--


> single quote in a comment causes parsing errors
> ---
>
> Key: HIVE-26175
> URL: https://issues.apache.org/jira/browse/HIVE-26175
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI, Parser
>Affects Versions: 3.1.3
>Reporter: renjianting
>Assignee: renjianting
>Priority: Major
> Fix For: 4.0.0
>
>
> A single quote in a comment causes parsing errors, such as:
> {code:java}
> select 1 -- I'm xxx
> from tbl; {code}
> Running a task like this will result in the following error:
> {code:java}
> NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) 
>  at 
> org.antlr.runtime.DFA.noViableAlt(DFA.java:158)   
>   at 
> org.antlr.runtime.DFA.predict(DFA.java:116)   
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020)  
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044)
>at 
> org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) 
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396)
>   at 
> org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420)
>   at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220)   
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74)  
>   at 
> org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) 
>   at 
> org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)  
>   at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)  
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)  
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)  
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)   
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)  
> 

[jira] [Updated] (HIVE-26174) disable rename table across dbs when on different filesystem

2022-04-25 Thread Adrian Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Wang updated HIVE-26174:
---
Summary: disable rename table across dbs when on different filesystem  
(was: ALTER TABLE RENAME TO should check new db location)

> disable rename table across dbs when on different filesystem
> 
>
> Key: HIVE-26174
> URL: https://issues.apache.org/jira/browse/HIVE-26174
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adrian Wang
>Assignee: Adrian Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if we run 
> ALTER TABLE db1.table1 RENAME TO db2.table2;
> and with `db1` and `db2` on different filesystems, for example `db1` at 
> `"hdfs:/user/hive/warehouse/db1.db"` and `db2` at 
> `"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` will end up under the 
> location `hdfs:/s3warehouse/db2.db/table2`, which looks quite strange (see the 
> sketch below).
> The idea is to ban this kind of operation; we already seem to intend to ban it, 
> but the check ran after the filesystem scheme had been changed, so it always 
> passed.
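A minimal reproduction sketch using the locations from the description (the column definition is made up):

{code:sql}
CREATE DATABASE db1 LOCATION 'hdfs:/user/hive/warehouse/db1.db';
CREATE DATABASE db2 LOCATION 's3://bucket/s3warehouse/db2.db';
CREATE TABLE db1.table1 (id INT);
-- Before the fix, the renamed table keeps the source filesystem's scheme:
-- it lands at hdfs:/s3warehouse/db2.db/table2 instead of being rejected.
ALTER TABLE db1.table1 RENAME TO db2.table2;
{code}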



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26150) OrcRawRecordMerger reads each row twice

2022-04-25 Thread Alessandro Solimando (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17527591#comment-17527591
 ] 

Alessandro Solimando commented on HIVE-26150:
-

I first discovered the issue while writing a unit test where there were no 
delete records ([this 
test|https://github.com/apache/hive/blob/61d4ff2be48b20df9fd24692c372ee9c2606babe/ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java#L436]).

I then checked some other tests, including those that are failing in the 
description of the JIRA ticket; _testRecordReaderNewBaseAndDelta_ includes some 
delete operations (it creates a _delete_delta_ file, as sketched below), is that 
what you mean?
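For context, a hedged sketch of how a _delete_delta_ file arises (table name and values are made up, assuming ACID is enabled in the session):

{code:sql}
-- On a transactional (ACID) ORC table, a DELETE writes a delete_delta_*
-- directory; these are the delete events the merger reconciles with the base.
CREATE TABLE acid_tbl (id INT) STORED AS ORC
  TBLPROPERTIES ('transactional' = 'true');
INSERT INTO acid_tbl VALUES (1), (2);
DELETE FROM acid_tbl WHERE id = 1;
{code}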


> OrcRawRecordMerger reads each row twice
> ---
>
> Key: HIVE-26150
> URL: https://issues.apache.org/jira/browse/HIVE-26150
> Project: Hive
>  Issue Type: Bug
>  Components: ORC, Transactions
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Priority: Major
>
> OrcRawRecordMerger reads each row twice. The issue does not surface because the 
> merger is only used with the parameter "collapseEvents" set to true, which 
> filters out one of the two rows.
> collapseEvents true and false should produce the same result: in the current 
> ACID implementation each event has a distinct rowid, so two identical rows can 
> only appear because of this bug.
> In order to reproduce the issue, it is sufficient to set the second parameter 
> to false 
> [here|https://github.com/apache/hive/blob/61d4ff2be48b20df9fd24692c372ee9c2606babe/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L2103-L2106],
>  run the tests in TestOrcRawRecordMerger, and observe two tests failing:
> {code:bash}
> mvn test -Dtest=TestOrcRawRecordMerger -pl ql
> {code}
> {noformat}
> [INFO] Results:
> [INFO]
> [ERROR] Failures:
> [ERROR]   TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta:1332 Found 
> unexpected row: (0,ignore.1)
> [ERROR]   TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta:1208 Found 
> unexpected row: (0,ignore.1)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26174) ALTER TABLE RENAME TO should check new db location

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26174:
--
Labels: pull-request-available  (was: )

> ALTER TABLE RENAME TO should check new db location
> --
>
> Key: HIVE-26174
> URL: https://issues.apache.org/jira/browse/HIVE-26174
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adrian Wang
>Assignee: Adrian Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if we run 
> ALTER TABLE db1.table1 RENAME TO db2.table2;
> and with `db1` and `db2` on different filesystems, for example `db1` at 
> `"hdfs:/user/hive/warehouse/db1.db"` and `db2` at 
> `"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` will end up under the 
> location `hdfs:/s3warehouse/db2.db/table2`, which looks quite strange.
> The idea is to ban this kind of operation; we already seem to intend to ban it, 
> but the check ran after the filesystem scheme had been changed, so it always 
> passed.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26174) ALTER TABLE RENAME TO should check new db location

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26174?focusedWorklogId=761868&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761868
 ]

ASF GitHub Bot logged work on HIVE-26174:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 16:00
Start Date: 25/Apr/22 16:00
Worklog Time Spent: 10m 
  Work Description: adrian-wang opened a new pull request, #3240:
URL: https://github.com/apache/hive/pull/3240

   ### What changes were proposed in this pull request?
   
   1. Disable renaming a table across dbs when the dbs are on different filesystems.
   2. Adjust the placement of a comment so it matches the code better.
   
   
   ### Why are the changes needed?
   
   Currently, if we run `ALTER TABLE db1.table1 RENAME TO db2.table2;` with 
`db1` and `db2` on different filesystems, for example `db1` at 
`"hdfs:/user/hive/warehouse/db1.db"` and `db2` at 
`"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` will end up under the 
location `hdfs:/s3warehouse/db2.db/table2`, which looks quite strange.
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes.
   Before this patch, if we run
   ```
   CREATE DATABASE db1 LOCATION '/user/hive/warehouse/db1.db';
   CREATE DATABASE db2 LOCATION 's3://bucket/s3warehouse/db2.db';
   CREATE TABLE db1.table1 (...);
   

Issue Time Tracking
---

Worklog Id: (was: 761868)
Remaining Estimate: 0h
Time Spent: 10m

> ALTER TABLE RENAME TO should check new db location
> --
>
> Key: HIVE-26174
> URL: https://issues.apache.org/jira/browse/HIVE-26174
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adrian Wang
>Assignee: Adrian Wang
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, if we run 
> ALTER TABLE db1.table1 RENAME TO db2.table2;
> and with `db1` and `db2` on different filesystems, for example `db1` at 
> `"hdfs:/user/hive/warehouse/db1.db"` and `db2` at 
> `"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` will end up under the 
> location `hdfs:/s3warehouse/db2.db/table2`, which looks quite strange.
> The idea is to ban this kind of operation; we already seem to intend to ban it, 
> but the check ran after the filesystem scheme had been changed, so it always 
> passed.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26174) ALTER TABLE RENAME TO should check new db location

2022-04-25 Thread Adrian Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrian Wang reassigned HIVE-26174:
--


> ALTER TABLE RENAME TO should check new db location
> --
>
> Key: HIVE-26174
> URL: https://issues.apache.org/jira/browse/HIVE-26174
> Project: Hive
>  Issue Type: Improvement
>Reporter: Adrian Wang
>Assignee: Adrian Wang
>Priority: Major
>
> Currently, if we run 
> ALTER TABLE db1.table1 RENAME TO db2.table2;
> and with `db1` and `db2` on different filesystems, for example `db1` at 
> `"hdfs:/user/hive/warehouse/db1.db"` and `db2` at 
> `"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` will end up under the 
> location `hdfs:/s3warehouse/db2.db/table2`, which looks quite strange.
> The idea is to ban this kind of operation; we already seem to intend to ban it, 
> but the check ran after the filesystem scheme had been changed, so it always 
> passed.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761857&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761857
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 15:29
Start Date: 25/Apr/22 15:29
Worklog Time Spent: 10m 
  Work Description: asolimando commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857758546


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java:
##
@@ -1214,6 +1214,58 @@ public FixNullabilityShuttle(RexBuilder rexBuilder,
 }
   }
 
+  /**
+   * Find disjunction (OR) in an expression (at any level of nesting).
+   *
+   * Example 1: OR(=($0, $1), IS NOT NULL($2))):INTEGER (OR in the top-level 
expression)
+   * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2))
+   *   this is equivalent to OR((<>($0, $1), IS NULL($2))
+   * Example 3: AND(OR(=($0, $1), IS NOT NULL($2))) (OR in inner expression)
+   */

Review Comment:
   > I had in mind compiling with -Pjavadoc profile and checking for new errors 
in this class. Actually I am afraid of <> symbols as well as $. Don't remember 
if it is fine to use them like that.
   
   Indeed '<' and '>' are illegal, the rest did not highlight any issue when 
compiling with `-Pjavadoc`. I took the chance to improve the formatting of the 
list of examples. 





Issue Time Tracking
---

Worklog Id: (was: 761857)
Time Spent: 3.5h  (was: 3h 20m)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
>  
> The reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorrelation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
>           HiveTableScan(table=[[default, test1]], table:alias=[test1])
>         HiveProject(month=[$0])
>           HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
>             HiveTableScan(table=[[default, test2]], table:alias=[test2])
>     HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>  
> Then, the HEP planner will keep expanding the filter expression with 
> redundant expressions, such as the following, where the identical CAST 
> expression is present multiple times:
>  
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 202110)){noformat}
>  
> The problem seems to come from a bad interaction of at least 
> _HiveFilterProjectTransposeRule_ and 
> {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably the UNION part can be removed and the reproducer simplified even 
> further.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761845&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761845
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 15:16
Start Date: 25/Apr/22 15:16
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857739177


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java:
##
@@ -1257,15 +1249,26 @@ public Void visitCall(RexCall call) {
 }
 
 public boolean hasDisjunction(RexNode node) {
-  // clear the state
-  inNegation = false;
-  hasDisjunction = false;
-
   node.accept(this);
   return hasDisjunction;
 }
   }
 
+  /**
+   * Find disjunction (OR) in an expression (at any level of nesting).

Review Comment:
   `Find` implies that we are going to return a disjunctive expression. 
Suggestion:
   `Returns whether the expression has disjunctions (OR) at any level of 
nesting.`



##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java:
##
@@ -1257,15 +1249,26 @@ public Void visitCall(RexCall call) {
 }
 
 public boolean hasDisjunction(RexNode node) {
-  // clear the state
-  inNegation = false;
-  hasDisjunction = false;
-
   node.accept(this);
   return hasDisjunction;
 }
   }
 
+  /**
+   * Find disjunction (OR) in an expression (at any level of nesting).
+   *
+   * Example 1: OR(=($0, $1), IS NOT NULL($2)) (OR in the top-level expression)
+   * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2)))
+   *   this is equivalent to OR(<>($0, $1), IS NULL($2))
+   * Example 3: AND(OR(=($0, $1), IS NOT NULL($2))) (OR in an inner expression)
+   *
+   * @param node the expression where to look for disjunctions.
+   * @return true if the given expression contains a disjunction, false otherwise.
+   */
+  public static boolean hasDisjuction(RexNode node) {
+return new DisjunctivePredicatesFinder().hasDisjunction(node);

Review Comment:
   nit to get rid of the `DisjunctivePredicateFinder#hasDisjunction` method:
   ```
   DisjunctivePredicateFinder finder = new DisjunctivePredicateFinder();
   node.accept(finder);
   return finder.hasDisjunction;
   ```
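   For readers following the thread, here is a self-contained sketch of what 
such a finder can look like; this is illustrative only (not the PR's actual 
code), with the negation handling mirroring the De Morgan examples in the 
javadoc under review:
   ```
   import org.apache.calcite.rex.RexCall;
   import org.apache.calcite.rex.RexVisitorImpl;

   /**
    * Illustrative finder: reports true if OR appears at any nesting level,
    * treating AND under an odd number of NOTs as OR (De Morgan's laws).
    */
   class DisjunctionFinderSketch extends RexVisitorImpl<Void> {
     private boolean inNegation = false;
     boolean hasDisjunction = false;

     DisjunctionFinderSketch() {
       super(true); // deep: visit operands recursively
     }

     @Override
     public Void visitCall(RexCall call) {
       switch (call.getKind()) {
       case OR:
         if (!inNegation) {
           hasDisjunction = true; // a plain OR
         }
         break;
       case AND:
         if (inNegation) {
           hasDisjunction = true; // NOT(AND(...)) is a disguised OR
         }
         break;
       case NOT: {
         boolean saved = inNegation;
         inNegation = !inNegation;
         super.visitCall(call); // visit the negated operand with the flag flipped
         inNegation = saved;
         return null;
       }
       default:
         break;
       }
       return super.visitCall(call);
     }
   }
   ```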



##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java:
##
@@ -1214,6 +1214,58 @@ public FixNullabilityShuttle(RexBuilder rexBuilder,
 }
   }
 
+  /**
+   * Find disjunction (OR) in an expression (at any level of nesting).
+   *
+   * Example 1: OR(=($0, $1), IS NOT NULL($2)) (OR in the top-level expression)
+   * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2)))
+   *   this is equivalent to OR(<>($0, $1), IS NULL($2))
+   * Example 3: AND(OR(=($0, $1), IS NOT NULL($2))) (OR in an inner expression)
+   */

Review Comment:
   I had in mind compiling with the `-Pjavadoc` profile and checking for new 
errors in this class. Actually, I am afraid of `<>` symbols as well as `$`; I 
don't remember if it is fine to use them like that.
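   For reference, wrapping the examples in a literal block keeps `<>` and `$` 
javadoc-safe; a sketch reusing the wording suggested earlier:
   ```
   /**
    * Returns whether the expression has disjunctions (OR) at any level of nesting.
    *
    * <pre>{@code
    * Example 1: OR(=($0, $1), IS NOT NULL($2))        (OR at the top level)
    * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2)))  (equivalent to OR(<>($0, $1), IS NULL($2)))
    * Example 3: AND(OR(=($0, $1), IS NOT NULL($2)))   (OR in an inner expression)
    * }</pre>
    */
   ```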





Issue Time Tracking
---

Worklog Id: (was: 761845)
Time Spent: 3h 20m  (was: 3h 10m)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorre
> lation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-d
> 

[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761831
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:50
Start Date: 25/Apr/22 14:50
Worklog Time Spent: 10m 
  Work Description: asolimando commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857716612


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java:
##
@@ -1214,6 +1214,58 @@ public FixNullabilityShuttle(RexBuilder rexBuilder,
 }
   }
 
+  /**
+   * Find disjunction (OR) in an expression (at any level of nesting).
+   *
+   * Example 1: OR(=($0, $1), IS NOT NULL($2)) (OR in the top-level expression)
+   * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2)))
+   *   this is equivalent to OR(<>($0, $1), IS NULL($2))
+   * Example 3: AND(OR(=($0, $1), IS NOT NULL($2))) (OR in an inner expression)
+   */

Review Comment:
   I get no errors while compiling (without `-Dmaven.javadoc.skip` of course).
   
   Is there anything specific you are referring to?





Issue Time Tracking
---

Worklog Id: (was: 761831)
Time Spent: 3h 10m  (was: 3h)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorre
> lation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-d
> d':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), 
> _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-
> 16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
>           HiveTableScan(table=[[default, test1]], table:alias=[test1])
>         HiveProject(month=[$0])
>           HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
>             HiveTableScan(table=[[default, test2]], table:alias=[test2])
>     HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>  
> Then, the HEP planner will keep expanding the filter expression with 
> redundant expressions, such as the following, where the identical CAST 
> expression is present multiple times:
>  
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, 
> _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> 202110)){noformat}
>  
> The problem seems to come from a bad interaction of at least 
> _HiveFilterProjectTransposeRule_ and 
> {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably the UNION part can be removed and the reproducer simplified 
> even further.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26159) hive cli is unavailable from hive command

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26159?focusedWorklogId=761830&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761830
 ]

ASF GitHub Bot logged work on HIVE-26159:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:48
Start Date: 25/Apr/22 14:48
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on PR #3227:
URL: https://github.com/apache/hive/pull/3227#issuecomment-1108675056

   @wecharyu The CLI code is not being maintained in the newer code base. I do 
not know what functionality works and what doesn't. How much testing have you 
done with the fat client?
   Also, with this proposed change, how can users retain the current behaviour 
(of using beeline)? 




Issue Time Tracking
---

Worklog Id: (was: 761830)
Time Spent: 1h 10m  (was: 1h)

> hive cli is unavailable from hive command
> -
>
> Key: HIVE-26159
> URL: https://issues.apache.org/jira/browse/HIVE-26159
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Hive CLI is a convenient tool for connecting to the Hive metastore service, 
> but currently the Hive CLI cannot start even if we use the *--service cli* 
> option; this appears to be a bug introduced by HIVE-24348.
> *Steps to reproduce:*
> {code:bash}
> hive@hive:/root$ /usr/share/hive/bin/hive --service cli --hiveconf 
> hive.metastore.uris=thrift://hive:9084
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Beeline version 4.0.0-alpha-2-SNAPSHOT by Apache Hive
> beeline> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761816
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:18
Start Date: 25/Apr/22 14:18
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857679191


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -2547,6 +2547,9 @@ public static enum ConfVars {
 "If this config is true only pushed down filters remain in the 
operator tree, \n" +
 "and the original filter is removed. If this config is false, the 
original filter \n" +
 "is also left in the operator tree at the original place."),
+
HIVE_JOIN_DISJ_TRANSITIVE_PREDICATES_PUSHDOWN("hive.optimize.join.disjunctive.transitive.predicates.pushdown",
+true, "Whether to transitively infer disjunctive predicates across 
joins. \n"
++ "Disjunctive predicates can lead to OOM in transitive 
inference."),

Review Comment:
   Suggestion: Disjunctive predicates are hard to simplify, and pushing them 
down may in some cases lead to infinite rule matching, causing stack overflow 
and OOM errors.
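   For illustration, toggling the flag programmatically would look roughly like 
this (the `ConfVars` constant is the one from the diff above; a sketch, not a 
tested snippet):
   ```
   HiveConf conf = new HiveConf();
   // Keep transitive inference of disjunctive predicates off to avoid the
   // runaway rule matching described above.
   conf.setBoolVar(HiveConf.ConfVars.HIVE_JOIN_DISJ_TRANSITIVE_PREDICATES_PUSHDOWN, false);
   ```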





Issue Time Tracking
---

Worklog Id: (was: 761816)
Time Spent: 3h  (was: 2h 50m)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorre
> lation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-d
> d':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), 
> _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-
> 16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
>           HiveTableScan(table=[[default, test1]], table:alias=[test1])
>         HiveProject(month=[$0])
>           HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
>             HiveTableScan(table=[[default, test2]], table:alias=[test2])
>     HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>  
> Then, the HEP planner will keep expanding the filter expression with 
> redundant expressions, such as the following, where the identical CAST 
> expression is present multiple times:
>  
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, 
> _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> 202110)){noformat}
>  
> The problem seems to come from a bad interaction of at least 
> _HiveFilterProjectTransposeRule_ and 
> {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably the UNION part can be removed and the reproducer simplified 
> even further.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761815
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:14
Start Date: 25/Apr/22 14:14
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857675116


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java:
##
@@ -1214,6 +1214,58 @@ public FixNullabilityShuttle(RexBuilder rexBuilder,
 }
   }
 
+  /**
+   * Find disjunction (OR) in an expression (at any level of nesting).
+   *
+   * Example 1: OR(=($0, $1), IS NOT NULL($2)) (OR in the top-level expression)
+   * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2)))
+   *   this is equivalent to OR(<>($0, $1), IS NULL($2))
+   * Example 3: AND(OR(=($0, $1), IS NOT NULL($2))) (OR in an inner expression)
+   */

Review Comment:
   Please ensure that we do not have illegal characters in the javadoc; i.e., 
we don't generate new javadoc errors.





Issue Time Tracking
---

Worklog Id: (was: 761815)
Time Spent: 2h 50m  (was: 2h 40m)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorre
> lation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-d
> d':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), 
> _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-
> 16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
>           HiveTableScan(table=[[default, test1]], table:alias=[test1])
>         HiveProject(month=[$0])
>           HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
>             HiveTableScan(table=[[default, test2]], table:alias=[test2])
>     HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>  
> Then, the HEP planner will keep expanding the filter expression with 
> redundant expressions, such as the following, where the identical CAST 
> expression is present multiple times:
>  
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, 
> _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> 202110)){noformat}
>  
> The problem seems to come from a bad interaction of at least 
> _HiveFilterProjectTransposeRule_ and 
> {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably the UNION part can be removed and the reproducer simplified 
> even further.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761813
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:12
Start Date: 25/Apr/22 14:12
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857673196


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveCalciteUtil.java:
##
@@ -1214,6 +1214,58 @@ public FixNullabilityShuttle(RexBuilder rexBuilder,
 }
   }
 
+  /**
+   * Find disjunction (OR) in an expression (at any level of nesting).
+   *
+   * Example 1: OR(=($0, $1), IS NOT NULL($2)) (OR in the top-level expression)
+   * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2)))
+   *   this is equivalent to OR(<>($0, $1), IS NULL($2))
+   * Example 3: AND(OR(=($0, $1), IS NOT NULL($2))) (OR in an inner expression)
+   */
+  public static class DisjunctivePredicatesFinder extends RexVisitorImpl<Void> {

Review Comment:
   How about making this class private and expose only a method? `public static 
hasDisjunction`?
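   For concreteness, a sketch of the shape this suggestion implies (a 
hypothetical final form, combining this comment with the inlining nit above):
   ```
   private static class DisjunctivePredicatesFinder extends RexVisitorImpl<Void> {
     boolean hasDisjunction = false;

     DisjunctivePredicatesFinder() {
       super(true); // deep traversal
     }

     // ... visitCall(...) logic as in the diff above ...
   }

   // The visitor stays an implementation detail; callers get one static entry point.
   public static boolean hasDisjunction(RexNode node) {
     DisjunctivePredicatesFinder finder = new DisjunctivePredicatesFinder();
     node.accept(finder);
     return finder.hasDisjunction;
   }
   ```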





Issue Time Tracking
---

Worklog Id: (was: 761813)
Time Spent: 2h 40m  (was: 2.5h)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorre
> lation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-d
> d':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), 
> _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-
> 16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
>           HiveTableScan(table=[[default, test1]], table:alias=[test1])
>         HiveProject(month=[$0])
>           HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
>             HiveTableScan(table=[[default, test2]], table:alias=[test2])
>     HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>  
> Then, the HEP planner will keep expanding the filter expression with 
> redundant expressions, such as the following, where the identical CAST 
> expression is present multiple times:
>  
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, 
> _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> 202110)){noformat}
>  
> The problem seems to come from a bad interaction of at least 
> _HiveFilterProjectTransposeRule_ and 
> {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably the UNION part can be removed and the reproducer simplified 
> even further.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26160) Materialized View rewrite does not check tables scanned in sub-query expressions

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26160?focusedWorklogId=761805&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761805
 ]

ASF GitHub Bot logged work on HIVE-26160:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:07
Start Date: 25/Apr/22 14:07
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3229:
URL: https://github.com/apache/hive/pull/3229#discussion_r857661697


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveSubQueryVisitor.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.optimizer.calcite;
+
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.RelVisitor;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexSubQuery;
+import org.apache.calcite.rex.RexVisitorImpl;
+
+public class HiveSubQueryVisitor extends RelVisitor {

Review Comment:
   Do we really need this class? Can't we somehow exploit the 
`SubQueryRemoveRule`? If we do need it, then we probably want to add some basic 
javadoc.



##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveSubQueryVisitor.java:
##
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.optimizer.calcite;
+
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.RelVisitor;
+import org.apache.calcite.rel.core.Filter;
+import org.apache.calcite.rel.core.Project;
+import org.apache.calcite.rex.RexNode;
+import org.apache.calcite.rex.RexSubQuery;
+import org.apache.calcite.rex.RexVisitorImpl;
+
+public class HiveSubQueryVisitor extends RelVisitor {
+
+  @Override
+  public void visit(RelNode node, int ordinal, RelNode parent) {
+if (node instanceof Filter) {
+  visit((Filter) node);
+} else if (node instanceof Project) {
+  visit((Project) node);
+}
+

Review Comment:
   Why do we need to focus only on Filter/Project? Why not subqueries in 
`Join` or elsewhere? Can't we use `RelNode#accept(RexShuttle)` for more 
uniform access?
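   To make the idea concrete, here is a rough sketch of a shuttle-based 
collector (a sketch assuming Calcite's `RelHomogeneousShuttle`, 
`RelNode#accept(RexShuttle)` and `RexSubQuery#rel` behave as in recent Calcite 
releases; this is not the PR's code):
   ```
   import java.util.ArrayList;
   import java.util.List;

   import org.apache.calcite.rel.RelHomogeneousShuttle;
   import org.apache.calcite.rel.RelNode;
   import org.apache.calcite.rex.RexNode;
   import org.apache.calcite.rex.RexShuttle;
   import org.apache.calcite.rex.RexSubQuery;

   /** Collects sub-query expressions from every operator, not just Filter/Project. */
   class SubQueryCollector extends RelHomogeneousShuttle {
     final List<RexSubQuery> subQueries = new ArrayList<>();

     @Override
     public RelNode visit(RelNode other) {
       RelNode visited = super.visit(other); // recurse into the inputs first
       return visited.accept(new RexShuttle() {
         @Override
         public RexNode visitSubQuery(RexSubQuery subQuery) {
           subQueries.add(subQuery);
           subQuery.rel.accept(SubQueryCollector.this); // descend into the sub-plan
           return super.visitSubQuery(subQuery);
         }
       });
     }
   }
   ```
   Running `plan.accept(new SubQueryCollector())` would then surface sub-queries 
wherever they occur, including in `Join` conditions.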



##
ql/src/test/queries/clientpositive/materialized_view_rewrite_by_text_9.q:
##
@@ -0,0 +1,25 @@
+set hive.support.concurrency=true;
+set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
+set hive.materializedview.rewriting=false;
+
+create table t1(col0 int) STORED AS ORC
+  TBLPROPERTIES ('transactional'='true');
+
+create table t2(col0 int) STORED AS ORC
+  TBLPROPERTIES ('transactional'='true');
+
+insert into t1(col0) values (1), (NULL);
+insert into t2(col0) values (1), (2), (3), (NULL);
+
+create materialized view mat1 as
+select col0 from t1 where col0 = 1 union select col0 from t1 where col0 = 2;
+
+

Issue Time Tracking
---

Worklog Id: (was: 761805)
Time Spent: 20m  (was: 10m)

> Materialized View rewrite does not check tables scanned in sub-query 
> expressions
> 
>
> Key: HIVE-26160
> URL: https://issues.apache.org/jira/browse/HIVE-26160
> Project: Hive
>  Issue Type: Bug
>  

[jira] [Work logged] (HIVE-25980) Reduce fs calls in HiveMetaStoreChecker.checkTable

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25980?focusedWorklogId=761801&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761801
 ]

ASF GitHub Bot logged work on HIVE-25980:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:02
Start Date: 25/Apr/22 14:02
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3053:
URL: https://github.com/apache/hive/pull/3053#issuecomment-1108618380

   @cravani: I was on a long PTO, and I do not remember our status here. Are 
you waiting for me?




Issue Time Tracking
---

Worklog Id: (was: 761801)
Time Spent: 6h  (was: 5h 50m)

> Reduce fs calls in HiveMetaStoreChecker.checkTable
> --
>
> Key: HIVE-25980
> URL: https://issues.apache.org/jira/browse/HIVE-25980
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> MSCK REPAIR TABLE for a table with many partitions can perform slowly on 
> cloud storage such as S3; one case we found had the slowness concentrated in 
> HiveMetaStoreChecker.checkTable.
> {code:java}
> "HiveServer2-Background-Pool: Thread-382" #382 prio=5 os_prio=0 
> tid=0x7f97fc4a4000 nid=0x5c2a runnable [0x7f97c41a8000]
>java.lang.Thread.State: RUNNABLE
>   at java.net.SocketInputStream.socketRead0(Native Method)
>   at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>   at java.net.SocketInputStream.read(SocketInputStream.java:171)
>   at java.net.SocketInputStream.read(SocketInputStream.java:141)
>   at 
> sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:464)
>   at 
> sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68)
>   at 
> sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1341)
>   at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73)
>   at 
> sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:957)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157)
>   at 
> com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
>   at 
> com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:82)
>   at 
> com.amazonaws.thirdparty.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
>   at 
> com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
>   at 
> com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1331)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
>   at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)

[jira] [Work logged] (HIVE-26167) QueryStateMap in SessionState is not maintained correctly

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26167?focusedWorklogId=761800&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761800
 ]

ASF GitHub Bot logged work on HIVE-26167:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 14:00
Start Date: 25/Apr/22 14:00
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3234:
URL: https://github.com/apache/hive/pull/3234#discussion_r857660531


##
ql/src/java/org/apache/hadoop/hive/ql/Driver.java:
##
@@ -532,6 +522,10 @@ private void prepareContext() throws 
CommandProcessorException {
 context.setHDFSCleanup(true);
 
 driverTxnHandler.setContext(context);
+
+if (SessionState.get() != null) {
+  
SessionState.get().addQueryState(getConf().get(HiveConf.ConfVars.HIVEQUERYID.varname),
 getQueryState());

Review Comment:
   Would it be better to use `driverContext.getQueryState().getQueryId()` 
instead of depending on the conf?
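   Sketching the suggested variant (a fragment; `driverContext` and 
`addQueryState` are assumed from the surrounding diff):
   ```
   if (SessionState.get() != null) {
     // Take the id from the query state itself instead of re-reading the conf.
     QueryState queryState = driverContext.getQueryState();
     SessionState.get().addQueryState(queryState.getQueryId(), queryState);
   }
   ```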





Issue Time Tracking
---

Worklog Id: (was: 761800)
Time Spent: 40m  (was: 0.5h)

> QueryStateMap in SessionState is not maintained correctly
> -
>
> Key: HIVE-26167
> URL: https://issues.apache.org/jira/browse/HIVE-26167
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When the Driver is created, the QueryStateMap is also initialized with the 
> query ID and the current QueryState object. This record is kept in the map 
> until the execution of the query is completed. 
> There are many unit tests that initialise the driver object once during the 
> setup phase, and use the same object to execute all the different queries. As 
> a consequence, after the first execution, the QueryStateMap will be cleaned 
> and all subsequent queries will run into a null pointer exception while 
> trying to fetch the current QueryState from the SessionState. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26159) hive cli is unavailable from hive command

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26159?focusedWorklogId=761797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761797
 ]

ASF GitHub Bot logged work on HIVE-26159:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 13:52
Start Date: 25/Apr/22 13:52
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3227:
URL: https://github.com/apache/hive/pull/3227#issuecomment-1108607796

   @wecharyu: There was some code to start HiveServer2 inside the 
BeeLine process when it is started in CLI mode.
   I am not sure that part survived though, so it would be important to check.
   
   Thanks,
   Peter




Issue Time Tracking
---

Worklog Id: (was: 761797)
Time Spent: 1h  (was: 50m)

> hive cli is unavailable from hive command
> -
>
> Key: HIVE-26159
> URL: https://issues.apache.org/jira/browse/HIVE-26159
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Hive CLI is a convenient tool for connecting to the Hive metastore service, 
> but currently the Hive CLI cannot start even if we use the *--service cli* 
> option; this appears to be a bug introduced by HIVE-24348.
> *Steps to reproduce:*
> {code:bash}
> hive@hive:/root$ /usr/share/hive/bin/hive --service cli --hiveconf 
> hive.metastore.uris=thrift://hive:9084
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Beeline version 4.0.0-alpha-2-SNAPSHOT by Apache Hive
> beeline> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26173) Upgrade derby to 10.14.2.0

2022-04-25 Thread Hemanth Boyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Boyina reassigned HIVE-26173:
-


> Upgrade derby to 10.14.2.0
> --
>
> Key: HIVE-26173
> URL: https://issues.apache.org/jira/browse/HIVE-26173
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Major
>
> Upgrade Derby from 10.14.1.0 to 10.14.2.0 to fix the vulnerability 
> CVE-2018-1313.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26168) EXPLAIN DDL command output is not deterministic

2022-04-25 Thread Harshit Gupta (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harshit Gupta reassigned HIVE-26168:


Assignee: Harshit Gupta

> EXPLAIN DDL command output is not deterministic 
> 
>
> Key: HIVE-26168
> URL: https://issues.apache.org/jira/browse/HIVE-26168
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Stamatis Zampetakis
>Assignee: Harshit Gupta
>Priority: Minor
>
> The EXPLAIN DDL command (HIVE-24596) can be used to recreate the schema for a 
> given query in order to debug planner issues. This is achieved by fetching 
> information from the metastore and outputting a series of DDL commands. 
> The output commands, though, may appear in a different order across runs 
> since there is no mechanism to enforce an explicit order.
> Consider for instance the following scenario.
> {code:sql}
> CREATE TABLE customer
> (
> `c_custkey` bigint,
> `c_name` string,
> `c_address` string
> );
> INSERT INTO customer VALUES (1, 'Bob', '12 avenue Mansart'), (2, 'Alice', '24 
> avenue Mansart');
> EXPLAIN DDL SELECT c_custkey FROM customer WHERE c_name = 'Bob'; 
> {code}
> +Result 1+
> {noformat}
> ALTER TABLE default.customer UPDATE STATISTICS 
> SET('numRows'='2','rawDataSize'='48' );
> ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address 
> SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
> -- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE 
> NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF 
> ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey 
> SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
> -- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE 
> NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED 
> ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name 
> SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
> -- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE 
> NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS 
> SExMoAIChJLg1AGD1aCNBg== 
> {noformat}
> +Result 2+
> {noformat}
> ALTER TABLE default.customer UPDATE STATISTICS 
> SET('numRows'='2','rawDataSize'='48' );
> ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey 
> SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
> -- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE 
> NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
> ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address 
> SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
> -- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE 
> NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF 
>  
> ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name 
> SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
> -- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE 
> NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS 
> SExMoAIChJLg1AGD1aCNBg== 
> {noformat}
> The two results are equivalent but the statements appear in a different 
> order. This is not a big issue because the results remain correct, but it may 
> lead to test flakiness, so it might be worth addressing.
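One direction for a fix, sketched under the assumption that the formatter 
buffers its statements before printing (the class and variable names here are 
hypothetical, and the statements are abbreviated from the examples above):
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class DeterministicDdlOrder {
  public static void main(String[] args) {
    // Hypothetical buffered EXPLAIN DDL output, one ALTER statement per entry.
    List<String> columnStatsDdl = new ArrayList<>(Arrays.asList(
        "ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name ...",
        "ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address ...",
        "ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey ..."));
    // Emitting in lexicographic order removes the dependency on metastore
    // iteration order, making runs comparable and tests stable.
    columnStatsDdl.sort(Comparator.naturalOrder());
    columnStatsDdl.forEach(System.out::println);
  }
}
{code}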



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26159) hive cli is unavailable from hive command

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26159?focusedWorklogId=761782&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761782
 ]

ASF GitHub Bot logged work on HIVE-26159:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 13:05
Start Date: 25/Apr/22 13:05
Worklog Time Spent: 10m 
  Work Description: wecharyu commented on PR #3227:
URL: https://github.com/apache/hive/pull/3227#issuecomment-1108544251

   @pvary thanks for the information. My concern is that Beeline connects to 
HiveServer2 rather than the metastore; what if I just start the metastore 
service for testing?




Issue Time Tracking
---

Worklog Id: (was: 761782)
Time Spent: 50m  (was: 40m)

> hive cli is unavailable from hive command
> -
>
> Key: HIVE-26159
> URL: https://issues.apache.org/jira/browse/HIVE-26159
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Hive CLI is a convenient tool for connecting to the Hive metastore service, 
> but currently the Hive CLI cannot start even if we use the *--service cli* 
> option; this appears to be a bug introduced by HIVE-24348.
> *Steps to reproduce:*
> {code:bash}
> hive@hive:/root$ /usr/share/hive/bin/hive --service cli --hiveconf 
> hive.metastore.uris=thrift://hive:9084
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Beeline version 4.0.0-alpha-2-SNAPSHOT by Apache Hive
> beeline> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761778
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 13:02
Start Date: 25/Apr/22 13:02
Worklog Time Spent: 10m 
  Work Description: asolimando commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857601445


##
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##
@@ -2547,6 +2547,9 @@ public static enum ConfVars {
 "If this config is true only pushed down filters remain in the 
operator tree, \n" +
 "and the original filter is removed. If this config is false, the 
original filter \n" +
 "is also left in the operator tree at the original place."),
+
HIVE_JOIN_PUSH_TRANSITIVE_PREDICATES_CONSERVATIVE("hive.optimize.join.transitive.predicates.conservative",
+false, "Whether to avoid pushing predicates that are hard to simplify. 
\n"

Review Comment:
   Since we are focusing on the disjunctive predicates (as per another 
comment), let's opt for 
`hive.optimize.join.disjunctive.transitive.predicates.pushdown`.
   
   As discussed offline, let's leave the property disabled by default, because 
it's hard to judge how much benefit/harm there will be for Hive in the wild; we 
can revisit the default value later on if we have more evidence that this will 
help.





Issue Time Tracking
---

Worklog Id: (was: 761778)
Time Spent: 2.5h  (was: 2h 20m)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorre
> lation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-d
> d':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), 
> _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-
> 16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
>           HiveTableScan(table=[[default, test1]], table:alias=[test1])
>         HiveProject(month=[$0])
>           HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
>             HiveTableScan(table=[[default, test2]], table:alias=[test2])
>     HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>  
> Then, the HEP planner will keep expanding the filter expression with 
> redundant expressions, such as the following, where the identical CAST 
> expression is present multiple times:
>  
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, 
> _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> 202110)){noformat}
>  
> The problem seems to come from a bad interaction of at least 
> _HiveFilterProjectTransposeRule_ and 
> {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably the UNION part can be removed and the reproducer simplified 
> even further.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

[jira] [Updated] (HIVE-26135) Invalid Anti join conversion may cause missing results

2022-04-25 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-26135:

Description: 
Right now I think the following is needed to trigger the issue:
* a left outer join
* only left-hand-side columns are selected
* a conditional which uses some UDF
* the nullness of the UDF result is checked

Repro SQL; if the conversion happens, the row with 'a' will be missing. The 
conversion is invalid because the UDF can return NULL even for a matched row 
(here cast('a' as float) is NULL although 'a' joins), so a NULL result does not 
imply the absence of a match, which is what an anti join tests.
{code}
drop table if exists t;
drop table if exists n;

create table t(a string) stored as orc;
create table n(a string) stored as orc;

insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);


explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a 
is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
float) is null;


select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;

{code}

resultset with hive.auto.convert.anti.join enabled:
{code}
+--+
| n.a  |
+--+
| b|
| 3|
+--+
{code}

correct resultset with hive.auto.convert.anti.join disabled:
{code}
+---+
|  n.a  |
+---+
| a |
| b |
| 3 |
| NULL  |
+---+
{code}


workaround could be to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}


  was:
Right now I think the following is needed to trigger the issue:
* a left outer join
* only left-hand-side columns are selected
* a conditional which uses some UDF
* the nullness of the UDF result is checked

Repro SQL; if the conversion happens, the row with 'a' will be missing
{code}
drop table if exists t;
drop table if exists n;

create table t(a string) stored as orc;
create table n(a string) stored as orc;

insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);


explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a 
is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
float) is null;


select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
null;

{code}



workaround could be to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}



> Invalid Anti join conversion may cause missing results
> --
>
> Key: HIVE-26135
> URL: https://issues.apache.org/jira/browse/HIVE-26135
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now I think the following is needed to trigger the issue:
> * a left outer join
> * only left-hand-side columns are selected
> * a conditional which uses some UDF
> * the nullness of the UDF result is checked
> Repro SQL; if the conversion happens, the row with 'a' will be missing. The 
> conversion is invalid because the UDF can return NULL even for a matched row 
> (here cast('a' as float) is NULL although 'a' joins), so a NULL result does 
> not imply the absence of a match, which is what an anti join tests.
> {code}
> drop table if exists t;
> drop table if exists n;
> create table t(a string) stored as orc;
> create table n(a string) stored as orc;
> insert into t values ('a'),('1'),('2'),(null);
> insert into n values ('a'),('b'),('1'),('3'),(null);
> explain select n.* from n left outer join t on (n.a=t.a) where 
> assert_true(t.a is null) is null;
> explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as 
> float) is null;
> select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
> null;
> set hive.auto.convert.anti.join=false;
> select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is 
> null;
> {code}
> resultset with hive.auto.convert.anti.join enabled:
> {code}
> +--+
> | n.a  |
> +--+
> | b|
> | 3|
> +--+
> {code}
> correct resultset with hive.auto.convert.anti.join disabled:
> {code}
> +---+
> |  n.a  |
> +---+
> | a |
> | b |
> | 3 |
> | NULL  |
> +---+
> {code}
> workaround could be to disable the feature:
> {code}
> set hive.auto.convert.anti.join=false;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761754
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 12:30
Start Date: 25/Apr/22 12:30
Worklog Time Spent: 10m 
  Work Description: asolimando commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857573180


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinPushTransitivePredicatesRule.java:
##
@@ -143,28 +138,82 @@ private ImmutableList<RexNode> getValidPreds(RelOptCluster cluster, RelNode chil
   }
 }
 
-// We need to filter i) those that have been pushed already as stored in 
the join,
-// and ii) those that were already in the subtree rooted at child
-ImmutableList<RexNode> toPush = 
HiveCalciteUtil.getPredsNotPushedAlready(predicatesToExclude,
-child, valids);
-return toPush;
+// We need to filter:
+//  i) those that have been pushed already as stored in the join,
+//  ii) those that were already in the subtree rooted at child.
+List<RexNode> toPush = 
HiveCalciteUtil.getPredsNotPushedAlready(predicatesToExclude, child, valids);
+
+// If we run the rule in conservative mode, we also filter:
+//  iii) predicates that are not safe for transitive inference.
+//
+// There is no formal definition of safety for predicate inference, only 
an empirical one.
+// An unsafe predicate in this context is one that when pushed across join 
operands, can lead
+// to redundant predicates that cannot be simplified (by means of 
predicates merging with other existing ones).
+// This situation can lead to an OOM for cases where lack of 
simplification allows inferring new predicates
+// (from LHS to RHS and vice-versa) recursively, predicates which are 
redundant, but that RexSimplify cannot handle.
+// This notion can be relaxed as soon as RexSimplify gets more powerful, 
and it can handle such cases.
+if (HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_JOIN_PUSH_TRANSITIVE_PREDICATES_CONSERVATIVE)) {
+  toPush = toPush.stream()
+  .filter(unsafeOperatorsFinder::isSafe)
+  .collect(Collectors.toList());
+}
+
+return ImmutableList.copyOf(toPush);
   }
 
-  private RexNode getTypeSafePred(RelOptCluster cluster, RexNode rex, 
RelDataType rType) {
-RexNode typeSafeRex = rex;
-if ((typeSafeRex instanceof RexCall) && 
HiveCalciteUtil.isComparisonOp((RexCall) typeSafeRex)) {
-  RexBuilder rb = cluster.getRexBuilder();
-      List<RexNode> fixedPredElems = new ArrayList<RexNode>();
-  RelDataType commonType = cluster.getTypeFactory().leastRestrictive(
-  RexUtil.types(((RexCall) rex).getOperands()));
-  for (RexNode rn : ((RexCall) rex).getOperands()) {
-fixedPredElems.add(rb.ensureType(commonType, rn, true));
-  }
+  //~ Inner Classes --
+
+  /**
+   * Finds unsafe operators in an expression (at any level of nesting).
+   * At the moment, the only unsafe operator is OR.
+   *
+   * Example 1: OR(=($0, $1), IS NOT NULL($2)):INTEGER (OR in the top-level expression)
+   * Example 2: NOT(AND(=($0, $1), IS NOT NULL($2)))
+   *   this is equivalent to OR(<>($0, $1), IS NULL($2))
+   * Example 3: AND(OR(=($0, $1), IS NOT NULL($2))) (OR in inner expression)
+   */
+  private static class UnsafeOperatorsFinder extends RexVisitorImpl {

Review Comment:
   Agreed, it's better to start with what we have right now; we can always make 
it more generic if needed later on.
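   
   For reference, a minimal sketch of such an OR-detecting visitor (assuming 
Calcite's RexVisitorImpl API; the class and method names here are illustrative, 
not the actual patch):
   ```java
   import org.apache.calcite.rex.RexCall;
   import org.apache.calcite.rex.RexNode;
   import org.apache.calcite.rex.RexVisitorImpl;
   import org.apache.calcite.sql.SqlKind;

   /** Flags an expression as unsafe if it contains an OR at any nesting level. */
   class OrFinder extends RexVisitorImpl<Void> {
     private boolean foundOr;

     OrFinder() {
       super(true); // deep = true, so operands of nested calls are visited too
     }

     @Override
     public Void visitCall(RexCall call) {
       if (call.getKind() == SqlKind.OR) {
         foundOr = true;
       }
       return super.visitCall(call); // keep descending into the operands
     }

     boolean isSafe(RexNode node) {
       foundOr = false;
       node.accept(this);
       return !foundOr;
     }
   }
   ```
   Note this sketch only catches a literal OR; per Example 2 above, a NOT over 
an AND is semantically an OR as well, so a complete implementation has to 
account for negation too.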





Issue Time Tracking
---

Worklog Id: (was: 761754)
Time Spent: 2h 20m  (was: 2h 10m)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps.

[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761749=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761749
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 12:28
Start Date: 25/Apr/22 12:28
Worklog Time Spent: 10m 
  Work Description: asolimando commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857572044


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinPushTransitivePredicatesRule.java:
##
@@ -65,21 +66,15 @@
  */
 public class HiveJoinPushTransitivePredicatesRule extends RelOptRule {
 
-  public static final HiveJoinPushTransitivePredicatesRule INSTANCE_JOIN =
-  new HiveJoinPushTransitivePredicatesRule(HiveJoin.class, 
HiveRelFactories.HIVE_FILTER_FACTORY);
-
-  public static final HiveJoinPushTransitivePredicatesRule INSTANCE_SEMIJOIN =
-  new HiveJoinPushTransitivePredicatesRule(HiveSemiJoin.class, 
HiveRelFactories.HIVE_FILTER_FACTORY);
-
-  public static final HiveJoinPushTransitivePredicatesRule INSTANCE_ANTIJOIN =
-  new HiveJoinPushTransitivePredicatesRule(HiveAntiJoin.class, 
HiveRelFactories.HIVE_FILTER_FACTORY);
-
+  private final HiveConf conf;

Review Comment:
   This is a great idea, I am doing that right away.





Issue Time Tracking
---

Worklog Id: (was: 761749)
Time Spent: 2h 10m  (was: 2h)

> OOM due to recursive application of CBO rules
> -
>
> Key: HIVE-25758
> URL: https://issues.apache.org/jira/browse/HIVE-25758
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Query Planning
>Affects Versions: 4.0.0
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
>  
> Reproducing query is as follows:
> {code:java}
> create table test1 (act_nbr string);
> create table test2 (month int);
> create table test3 (mth int, con_usd double);
> EXPLAIN
>SELECT c.month,
>   d.con_usd
>FROM
>  (SELECT 
> cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 
> 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
>   FROM test1
>   UNION ALL
>   SELECT month
>   FROM test2
>   WHERE month = 202110) c
>JOIN test3 d ON c.month = d.mth; {code}
>  
> Different plans are generated during the first CBO steps, last being:
> {noformat}
> 2021-12-01T08:28:08,598 DEBUG [a18191bb-3a2b-4193-9abf-4e37dd1996bb main] 
> parse.CalcitePlanner: Plan after decorrelation:
> HiveProject(month=[$0], con_usd=[$2])
>   HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], 
> cost=[not available])
>     HiveProject(month=[$0])
>       HiveUnion(all=[true])
>         
> HiveProject(month=[CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), 
> _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER])
>           HiveTableScan(table=[[default, test1]], table:alias=[test1])
>         HiveProject(month=[$0])
>           HiveFilter(condition=[=($0, CAST(202110):INTEGER)])
>             HiveTableScan(table=[[default, test2]], table:alias=[test2])
>     HiveTableScan(table=[[default, test3]], table:alias=[d]){noformat}
>  
> Then, the HEP planner will keep expanding the filter expression with 
> redundant expressions, such as the following, where the identical CAST 
> expression is present multiple times:
>  
> {noformat}
> rel#118:HiveFilter.HIVE.[].any(input=HepRelVertex#39,condition=IN(CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
>  _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, 
> _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 
> 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", 
> _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER, 
> 202110)){noformat}
>  
> The problem seems to come from a bad interaction of at least 
> _HiveFilterProjectTransposeRule_ and 
> {_}HiveJoinPushTransitivePredicatesRule{_}, possibly more.
> Most probably the UNION part can be removed and the reproducer simplified 
> even further.
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26145) Disable notification cleaner if interval is zero

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26145?focusedWorklogId=761741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761741
 ]

ASF GitHub Bot logged work on HIVE-26145:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 12:14
Start Date: 25/Apr/22 12:14
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #3215:
URL: https://github.com/apache/hive/pull/3215#discussion_r857554725


##
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java:
##
@@ -174,12 +174,17 @@ public class DbNotificationListener extends 
TransactionalMetaStoreEventListener
 
   //cleaner is a static object, use static synchronized to make sure its 
thread-safe
   private static synchronized void init(Configuration conf) throws 
MetaException {
-if (cleaner == null) {
+long freq = MetastoreConf.getTimeVar(conf, 
MetastoreConf.ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL, TimeUnit.MILLISECONDS);
+if (cleaner == null && freq > 0) {
   cleaner =

Review Comment:
   The `cleaner` object is accessed in many places in this class, and in order 
to ensure that we are not going to hit an NPE we need to review all calls 
carefully. For instance, it seems that configuration changes (`onConfigChange`) 
may try to update some aspects of the `cleaner`, assuming that the latter 
is initialized.





Issue Time Tracking
---

Worklog Id: (was: 761741)
Time Spent: 0.5h  (was: 20m)

> Disable notification cleaner if interval is zero
> 
>
> Key: HIVE-26145
> URL: https://issues.apache.org/jira/browse/HIVE-26145
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Janos Kovacs
>Assignee: Janos Kovacs
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Many of the housekeeping/background tasks can be turned off when multiple 
> instances are running in parallel. 
> Some are controlled via the housekeeping node configuration; others are not 
> started if their frequency is set to zero.
> The DB-Notification cleaner unfortunately doesn't have this functionality, 
> which makes all instances race for the lock on the backend HMS database. 
> The goal is to make it possible to turn the cleaner off when multiple 
> instances are running (i.e. to be able to bind it to the housekeeping 
> instance).  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26145) Disable notification cleaner if interval is zero

2022-04-25 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527445#comment-17527445
 ] 

Stamatis Zampetakis commented on HIVE-26145:


Looking at the available configuration properties and their description:
* metastore.event.db.listener.clean.interval
* metastore.event.db.listener.clean.startup.wait.interval
it's not very intuitive which one should be used to disable the cleaner and how.

Setting the {{clean.interval}} to zero could mean that you want the cleaner to 
run ASAP without sleeping at all, rather than deactivating it completely. On the 
other hand, if you set the {{startup.wait.interval}} to 100*365 days, you are sure 
that the cleaner will not run in the next 100 years, which I think is as good 
as being disabled.
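
For illustration, a sketch of the two alternatives (values are hypothetical, and 
the zero-disables semantics for {{clean.interval}} is the proposal under 
discussion, not current behavior):
{code}
# proposed: a non-positive interval disables the cleaner thread entirely
metastore.event.db.listener.clean.interval=0s

# alternative: push the first run so far out that it effectively never happens
metastore.event.db.listener.clean.startup.wait.interval=36500d
{code}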

> Disable notification cleaner if interval is zero
> 
>
> Key: HIVE-26145
> URL: https://issues.apache.org/jira/browse/HIVE-26145
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Janos Kovacs
>Assignee: Janos Kovacs
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Many of the housekeeping/background tasks can be turned off when multiple 
> instances are running in parallel. 
> Some are controlled via the housekeeping node configuration; others are not 
> started if their frequency is set to zero.
> The DB-Notification cleaner unfortunately doesn't have this functionality, 
> which makes all instances race for the lock on the backend HMS database. 
> The goal is to make it possible to turn the cleaner off when multiple 
> instances are running (i.e. to be able to bind it to the housekeeping 
> instance).  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-26167) QueryStateMap in SessionState is not maintained correctly

2022-04-25 Thread László Pintér (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Pintér updated HIVE-26167:
-
Summary: QueryStateMap in SessionState is not maintained correctly  (was: 
QueryStateMap in SessionState is maintained correctly)

> QueryStateMap in SessionState is not maintained correctly
> -
>
> Key: HIVE-26167
> URL: https://issues.apache.org/jira/browse/HIVE-26167
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Pintér
>Assignee: László Pintér
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When the Driver is initialized, the QueryStateMap is also initialized with the 
> query ID and the current queryState object. This record is kept in the map 
> until the execution of the query is completed. 
> There are many unit tests that initialise the driver object once during the 
> setup phase, and use the same object to execute all the different queries. As 
> a consequence, after the first execution, the QueryStateMap will be cleaned 
> and all subsequent queries will run into a null pointer exception while trying 
> to fetch the current QueryState from the SessionState. 
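> A minimal sketch of the failing test pattern (driver creation and queries are 
> illustrative, not actual test code):
> {code:java}
> Driver driver = createDriver(queryState); // hypothetical helper, called once in setup
> driver.run("select 1"); // completion removes the entry from the QueryStateMap
> driver.run("select 2"); // fetching the current QueryState from SessionState
>                         // now yields null -> NullPointerException
> {code}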



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26158) TRANSLATED_TO_EXTERNAL partition tables cannot query partition data after rename table

2022-04-25 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527440#comment-17527440
 ] 

Zoltan Haindrich commented on HIVE-26158:
-

[~sanguines] I've also just bumped into the exact same thing - let me know if 
you would like to pick this up.
I'll probably post a patch for it in the next couple of days.

> TRANSLATED_TO_EXTERNAL partition tables cannot query partition data after 
> rename table
> --
>
> Key: HIVE-26158
> URL: https://issues.apache.org/jira/browse/HIVE-26158
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0, 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: tanghui
>Assignee: Zoltan Haindrich
>Priority: Major
>
> After the patch is applied, the partitioned table location and the HDFS data 
> directory are displayed correctly, but the partition locations in the SDS 
> table of the Hive metastore database still point at the old table location, 
> so the partition query returns no data.
>  
> in beeline:
> 
> set hive.create.as.external.legacy=true;
> CREATE TABLE part_test(
> c1 string
> ,c2 string
> ) PARTITIONED BY (dat string);
> insert into part_test values ("11","th","20220101");
> insert into part_test values ("22","th","20220102");
> alter table part_test rename to part_test11;
> -- this result is empty:
> select * from part_test11 where dat="20220101";
> ||part_test.c1||part_test.c2||part_test.dat||
> | | | |
> -
> SDS in the Hive metabase:
> select SDS.LOCATION from TBLS,SDS where TBLS.TBL_NAME="part_test11" AND 
> TBLS.TBL_ID=SDS.CD_ID;
> ---
> |*LOCATION*|
> |hdfs://nameservice1/warehouse/tablespace/external/hive/part_test11|
> |hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220101|
> |hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220102|
> ---
>  
> We need to fix the partition locations of the table in SDS to ensure that 
> the query results are correct.
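> A hypothetical illustration of the kind of metastore-side repair the fix has 
> to perform (paths follow the repro above; not something to run against a 
> production metastore):
> {code}
> UPDATE SDS
>    SET LOCATION = REPLACE(LOCATION, '/hive/part_test/', '/hive/part_test11/')
>  WHERE LOCATION LIKE '%/hive/part_test/%';
> {code}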



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Assigned] (HIVE-26172) Upgrade ant to 1.10.12

2022-04-25 Thread Hemanth Boyina (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hemanth Boyina reassigned HIVE-26172:
-


> Upgrade ant to 1.10.12
> --
>
> Key: HIVE-26172
> URL: https://issues.apache.org/jira/browse/HIVE-26172
> Project: Hive
>  Issue Type: Improvement
>Reporter: Hemanth Boyina
>Assignee: Hemanth Boyina
>Priority: Major
>
> Upgrade ant from 1.10.9 to 1.10.12 to fix the vulnerability CVE-2021-36373



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25758) OOM due to recursive application of CBO rules

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25758?focusedWorklogId=761726=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761726
 ]

ASF GitHub Bot logged work on HIVE-25758:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 11:40
Start Date: 25/Apr/22 11:40
Worklog Time Spent: 10m 
  Work Description: zabetak commented on code in PR #2966:
URL: https://github.com/apache/hive/pull/2966#discussion_r857510400


##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinPushTransitivePredicatesRule.java:
##
@@ -145,19 +138,25 @@ private ImmutableList<RexNode> getValidPreds(RelNode child, Set predicat
   }
 }
 
-// We need to filter i) those that have been pushed already as stored in 
the join,
-// ii) those that were already in the subtree rooted at child,
-// iii) predicates that are not safe for transitive inference.
+// We need to filter:
+//  i) those that have been pushed already as stored in the join,
+//  ii) those that were already in the subtree rooted at child.
+    List<RexNode> toPush = HiveCalciteUtil.getPredsNotPushedAlready(predicatesToExclude, child, valids);
+
+// If we run the rule in conservative mode, we also filter:
+//  iii) predicates that are not safe for transitive inference.
 //
 // There is no formal definition of safety for predicate inference, only 
an empirical one.
 // An unsafe predicate in this context is one that when pushed across join 
operands, can lead
 // to redundant predicates that cannot be simplified (by means of 
predicates merging with other existing ones).
 // This situation can lead to an OOM for cases where lack of 
simplification allows inferring new predicates
-// (from LHS to RHS) recursively, predicates which are redundant, but that 
RexSimplify cannot handle.
+// (from LHS to RHS and vice-versa) recursively, predicates which are 
redundant, but that RexSimplify cannot handle.
 // This notion can be relaxed as soon as RexSimplify gets more powerful, 
and it can handle such cases.
-    List<RexNode> toPush = HiveCalciteUtil.getPredsNotPushedAlready(predicatesToExclude, child, valids).stream()
-        .filter(unsafeOperatorsFinder::isSafe)
-        .collect(Collectors.toList());
+if (HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_JOIN_PUSH_TRANSITIVE_PREDICATES_CONSERVATIVE)) {

Review Comment:
   Good idea to add a configuration flag for this change.



##
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveJoinPushTransitivePredicatesRule.java:
##
@@ -143,28 +138,82 @@ private ImmutableList<RexNode> getValidPreds(RelOptCluster cluster, RelNode chil
   }
 }
 
-    // We need to filter i) those that have been pushed already as stored in the join,
-    // and ii) those that were already in the subtree rooted at child
-    ImmutableList<RexNode> toPush = HiveCalciteUtil.getPredsNotPushedAlready(predicatesToExclude,
-        child, valids);
-    return toPush;
+    // We need to filter:
+    //  i) those that have been pushed already as stored in the join,
+    //  ii) those that were already in the subtree rooted at child.
+    List<RexNode> toPush = HiveCalciteUtil.getPredsNotPushedAlready(predicatesToExclude, child, valids);
+
+// If we run the rule in conservative mode, we also filter:
+//  iii) predicates that are not safe for transitive inference.
+//
+// There is no formal definition of safety for predicate inference, only 
an empirical one.
+// An unsafe predicate in this context is one that when pushed across join 
operands, can lead
+// to redundant predicates that cannot be simplified (by means of 
predicates merging with other existing ones).
+// This situation can lead to an OOM for cases where lack of 
simplification allows inferring new predicates
+// (from LHS to RHS and vice-versa) recursively, predicates which are 
redundant, but that RexSimplify cannot handle.
+// This notion can be relaxed as soon as RexSimplify gets more powerful, 
and it can handle such cases.
+if (HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_JOIN_PUSH_TRANSITIVE_PREDICATES_CONSERVATIVE)) {
+  toPush = toPush.stream()
+  .filter(unsafeOperatorsFinder::isSafe)
+  .collect(Collectors.toList());
+}
+
+return ImmutableList.copyOf(toPush);
   }
 
-  private RexNode getTypeSafePred(RelOptCluster cluster, RexNode rex, 
RelDataType rType) {
-RexNode typeSafeRex = rex;
-if ((typeSafeRex instanceof RexCall) && 
HiveCalciteUtil.isComparisonOp((RexCall) typeSafeRex)) {
-  RexBuilder rb = cluster.getRexBuilder();
-      List<RexNode> fixedPredElems = new ArrayList<RexNode>();
-  RelDataType commonType = cluster.getTypeFactory().leastRestrictive(
-  RexUtil.types(((RexCall) rex).getOperands()));
-  for (RexNode rn : ((RexCall) rex).getOperands()) {
-

[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761718=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761718
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 11:29
Start Date: 25/Apr/22 11:29
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857525828


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
##
@@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String 
dbName, boolean deleteData,
* @param maxBatchSize
* @throws TException
*/
-  private void dropDatabaseCascadePerTable(String catName, String dbName, 
List tableList,
-   boolean deleteData, int 
maxBatchSize) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-for (Table table : new TableIterable(this, catName, dbName, tableList, 
maxBatchSize)) {
+  private void dropDatabaseCascadePerTable(DropDatabaseRequest req, 
List tableList, int maxBatchSize) 
+  throws TException {
+String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), 
req.getName(), conf);
+for (Table table : new TableIterable(
+this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) {
   boolean success = false;
   HiveMetaHook hook = getHook(table);
-  if (hook == null) {
-continue;
-  }
   try {
-hook.preDropTable(table);
-client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), deleteData, null);
-hook.commitDropTable(table, deleteData);
+if (hook != null) {
+  hook.preDropTable(table);
+}
+boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean(
+  table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false"));
+EnvironmentContext context = null;
+if (req.isSetTxnId()) {
+  context = new EnvironmentContext();
+  context.putToProperties("txnId", String.valueOf(req.getTxnId()));
+  req.setDeleteManagedDir(false);
+}
+client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), 
+req.isDeleteData() && !isSoftDelete, context);
+if (hook != null) {
+  hook.commitDropTable(table, req.isDeleteData());
+}
 success = true;
   } finally {
-if (!success) {
+if (!success && hook != null) {
   hook.rollbackDropTable(table);
 }
   }
 }
-client.drop_database(dbNameWithCatalog, deleteData, true);
+client.drop_database_req(req);
   }
 
   /**
* Handles dropDatabase by invoking drop_database in HMS.
* Useful when table list in DB can fit in memory, it will retrieve all 
tables at once and
* call drop_database once. Also handles drop_table hooks.
-   * @param catName
-   * @param dbName
+   * @param req
* @param tableList
-   * @param deleteData
* @throws TException
*/
-  private void dropDatabaseCascadePerDb(String catName, String dbName, 
List tableList,
-boolean deleteData) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-List tables = getTableObjectsByName(catName, dbName, tableList);
+  private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List 
tableList) throws TException {

Review Comment:
   That remains if soft delete is disabled; otherwise we'll be locking just the 
tables, not the DB, with an appropriate type of lock.  
   allTablesWithSuffix is an optimization: if all tables under the DB are 
soft-delete eligible - grab just a table-level lock. 





Issue Time Tracking
---

Worklog Id: (was: 761718)
Time Spent: 1h 40m  (was: 1.5h)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761719=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761719
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 11:29
Start Date: 25/Apr/22 11:29
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857525828


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
##
@@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String 
dbName, boolean deleteData,
* @param maxBatchSize
* @throws TException
*/
-  private void dropDatabaseCascadePerTable(String catName, String dbName, 
List tableList,
-   boolean deleteData, int 
maxBatchSize) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-for (Table table : new TableIterable(this, catName, dbName, tableList, 
maxBatchSize)) {
+  private void dropDatabaseCascadePerTable(DropDatabaseRequest req, 
List tableList, int maxBatchSize) 
+  throws TException {
+String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), 
req.getName(), conf);
+for (Table table : new TableIterable(
+this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) {
   boolean success = false;
   HiveMetaHook hook = getHook(table);
-  if (hook == null) {
-continue;
-  }
   try {
-hook.preDropTable(table);
-client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), deleteData, null);
-hook.commitDropTable(table, deleteData);
+if (hook != null) {
+  hook.preDropTable(table);
+}
+boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean(
+  table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false"));
+EnvironmentContext context = null;
+if (req.isSetTxnId()) {
+  context = new EnvironmentContext();
+  context.putToProperties("txnId", String.valueOf(req.getTxnId()));
+  req.setDeleteManagedDir(false);
+}
+client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), 
+req.isDeleteData() && !isSoftDelete, context);
+if (hook != null) {
+  hook.commitDropTable(table, req.isDeleteData());
+}
 success = true;
   } finally {
-if (!success) {
+if (!success && hook != null) {
   hook.rollbackDropTable(table);
 }
   }
 }
-client.drop_database(dbNameWithCatalog, deleteData, true);
+client.drop_database_req(req);
   }
 
   /**
* Handles dropDatabase by invoking drop_database in HMS.
* Useful when table list in DB can fit in memory, it will retrieve all 
tables at once and
* call drop_database once. Also handles drop_table hooks.
-   * @param catName
-   * @param dbName
+   * @param req
* @param tableList
-   * @param deleteData
* @throws TException
*/
-  private void dropDatabaseCascadePerDb(String catName, String dbName, 
List tableList,
-boolean deleteData) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-List tables = getTableObjectsByName(catName, dbName, tableList);
+  private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List 
tableList) throws TException {

Review Comment:
   That remains if soft delete is disabled; otherwise we'll be locking just the 
tables, not the DB, with an appropriate type of lock.  
   allTablesWithSuffix is an optimization: if all tables under the DB are 
soft-delete eligible - grab just a DB-level lock. 





Issue Time Tracking
---

Worklog Id: (was: 761719)
Time Spent: 1h 50m  (was: 1h 40m)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26171) HMSHandler get_all_tables method can not retrieve tables from remote database

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26171?focusedWorklogId=761716=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761716
 ]

ASF GitHub Bot logged work on HIVE-26171:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 11:26
Start Date: 25/Apr/22 11:26
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3238:
URL: https://github.com/apache/hive/pull/3238#issuecomment-1108447729

   @zhangbutao: I think having a test would help prevent the situation where 
some change inadvertently reverts the fix. If writing a test is not 
prohibitively hard, I would suggest adding one, maybe by mocking the 
`DataConnectorProviderFactory` or something around that.




Issue Time Tracking
---

Worklog Id: (was: 761716)
Time Spent: 20m  (was: 10m)

> HMSHandler get_all_tables method can not retrieve tables from remote database
> -
>
> Key: HIVE-26171
> URL: https://issues.apache.org/jira/browse/HIVE-26171
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> At present, the get_all_tables method in HMSHandler does not get tables from 
> remote databases. However, other components like Presto, and some jobs we 
> developed, use this API instead of _get_tables_, which can retrieve tables 
> from both native and remote databases.
> {code:java}
> // get_all_tables can only get tables from native databases
> public List<String> get_all_tables(final String dbname) throws MetaException {
> {code}
> {code:java}
> // get_tables can get tables from both native and remote databases
> public List<String> get_tables(final String dbname, final String pattern)
> {code}
> I think we should fix get_all_tables to make it retrieve tables from remote 
> databases.
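> A minimal sketch of the fix idea (illustrative, not the actual patch; it 
> assumes "*" is the match-all pattern accepted by get_tables):
> {code:java}
> public List<String> get_all_tables(final String dbname) throws MetaException {
>   // Delegate to the pattern-based path, which already resolves tables in
>   // remote (data-connector-backed) databases as well as native ones.
>   return get_tables(dbname, "*");
> }
> {code}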



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26162) Documentation upgrade

2022-04-25 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527431#comment-17527431
 ] 

Peter Vary commented on HIVE-26162:
---

[~florianc]: I am not aware of any such documentation.
OTOH, I think the possible serdeproperties are very much dependent on the SerDe 
used. So I would try to check the SerDe documentation, if any, or create a new 
one based on the SerDe code if I do not find it.

> Documentation upgrade
> -
>
> Key: HIVE-26162
> URL: https://issues.apache.org/jira/browse/HIVE-26162
> Project: Hive
>  Issue Type: Wish
>Reporter: Florian CASTELAIN
>Priority: Major
>
> Hello.
>  
> I have been looking for specific elements in the documentation, more 
> specifically, the list of serdeproperties.
> So I was looking for an exhaustive list of serdeproperties and I cannot find 
> one at all. 
> This is very surprising as one would expect a tool to describe all of its 
> features.
> Is it planned to create such a list? If it already exists, where is it? The 
> official docs do not contain it (or it is well hidden, in which case it 
> should be made more accessible).
>  
> Thank you.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-25907) IOW Directory queries fails to write data to final path when query result cache is enabled

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25907?focusedWorklogId=761714=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761714
 ]

ASF GitHub Bot logged work on HIVE-25907:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 11:18
Start Date: 25/Apr/22 11:18
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on PR #2978:
URL: https://github.com/apache/hive/pull/2978#issuecomment-1108438926

   @kgyrtkirk Could you please review the changes?




Issue Time Tracking
---

Worklog Id: (was: 761714)
Time Spent: 1h 50m  (was: 1h 40m)

> IOW Directory queries fails to write data to final path when query result 
> cache is enabled
> --
>
> Key: HIVE-25907
> URL: https://issues.apache.org/jira/browse/HIVE-25907
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> INSERT OVERWRITE DIRECTORY queries fail to write the data to the specified 
> directory location when query result cache is enabled.
> *Steps to reproduce*
> {code:java}
> 1. create a data file with the following data
> 1 abc 10.5
> 2 def 11.5
> 2. create table pointing to that data
> create external table iowd(strct struct)
> row format delimited
> fields terminated by '\t'
> collection items terminated by ' '
> location '';
> 3. run the following query
> set hive.query.results.cache.enabled=true;
> INSERT OVERWRITE DIRECTORY "" SELECT * FROM iowd;
> {code}
> After executing the above query, the destination directory is expected to 
> contain the data from table iowd, but due to HIVE-21386 this no longer 
> happens.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26159) hive cli is unavailable from hive command

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26159?focusedWorklogId=761713=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761713
 ]

ASF GitHub Bot logged work on HIVE-26159:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 11:16
Start Date: 25/Apr/22 11:16
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3227:
URL: https://github.com/apache/hive/pull/3227#issuecomment-1108437630

   @wecharyu: Retriggered the tests so we can have a green run.
   OTOH, I seem to remember that there were plans to use BeeLine and the 
embedded driver instead of the CLI.
   
   Have you tried to issue the same commands from the BeeLine prompt as you 
have done with the CLI? If everything works as expected, it should work 
more-or-less the same way as with the CLI.




Issue Time Tracking
---

Worklog Id: (was: 761713)
Time Spent: 40m  (was: 0.5h)

> hive cli is unavailable from hive command
> -
>
> Key: HIVE-26159
> URL: https://issues.apache.org/jira/browse/HIVE-26159
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Hive CLI is a convenient tool to connect to the Hive metastore service, but 
> now Hive CLI cannot start even if we use the *--service cli* option; it 
> appears to be a regression from HIVE-24348.
> *Steps to reproduce:*
> {code:bash}
> hive@hive:/root$ /usr/share/hive/bin/hive --service cli --hiveconf 
> hive.metastore.uris=thrift://hive:9084
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Beeline version 4.0.0-alpha-2-SNAPSHOT by Apache Hive
> beeline> 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26155) Create a new connection pool for compaction

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26155?focusedWorklogId=761704=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761704
 ]

ASF GitHub Bot logged work on HIVE-26155:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 11:00
Start Date: 25/Apr/22 11:00
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3223:
URL: https://github.com/apache/hive/pull/3223#issuecomment-1108419421

   Wait!
   This could cause deadlocks.




Issue Time Tracking
---

Worklog Id: (was: 761704)
Time Spent: 50m  (was: 40m)

> Create a new connection pool for compaction
> ---
>
> Key: HIVE-26155
> URL: https://issues.apache.org/jira/browse/HIVE-26155
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: Antal Sinkovits
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: compaction, pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently the TxnHandler uses 2 connection pools to communicate with the HMS 
> database: the default one and one for mutexing. If compaction is configured 
> incorrectly 
> (e.g. too many Initiators are running on the same db) then compaction can use 
> up all the connections in the default connection pool and all user queries 
> can get stuck.
> We should have a separate connection pool (configurable size) just for 
> compaction-related activities.
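> A hypothetical sketch of the idea (HikariCP-style; Hive's actual pooling goes 
> through its own data source provider, and the method and parameter names here 
> are illustrative only):
> {code:java}
> import javax.sql.DataSource;
> import com.zaxxer.hikari.HikariConfig;
> import com.zaxxer.hikari.HikariDataSource;
>
> static DataSource createCompactorPool(String jdbcUrl, String user, String pass, int maxConn) {
>   HikariConfig cfg = new HikariConfig();
>   cfg.setJdbcUrl(jdbcUrl);          // same backend RDBMS as the default pool
>   cfg.setUsername(user);
>   cfg.setPassword(pass);
>   cfg.setMaximumPoolSize(maxConn);  // the "configurable size" from the description
>   cfg.setPoolName("compactor");     // keeps compactor traffic off the default pool
>   return new HikariDataSource(cfg);
> }
> {code}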



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HIVE-26150) OrcRawRecordMerger reads each row twice

2022-04-25 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-26150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17527423#comment-17527423
 ] 

Peter Vary commented on HIVE-26150:
---

[~asolimando]: Does this happen during a normal read of delete deltas?

> OrcRawRecordMerger reads each row twice
> ---
>
> Key: HIVE-26150
> URL: https://issues.apache.org/jira/browse/HIVE-26150
> Project: Hive
>  Issue Type: Bug
>  Components: ORC, Transactions
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Priority: Major
>
> OrcRawRecordMerger reads each row twice; the issue does not surface because 
> the merger is only used with the parameter "collapseEvents" set to true, 
> which filters out one of the two rows.
> collapseEvents true and false should produce the same result: in the current 
> ACID implementation each event has a distinct rowid, so two identical rows 
> can only appear because of this bug.
> In order to reproduce the issue, it is sufficient to set the second parameter 
> to false 
> [here|https://github.com/apache/hive/blob/61d4ff2be48b20df9fd24692c372ee9c2606babe/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L2103-L2106],
>  and run tests in TestOrcRawRecordMerger and observe two tests failing:
> {code:bash}
> mvn test -Dtest=TestOrcRawRecordMerger -pl ql
> {code}
> {noformat}
> [INFO] Results:
> [INFO]
> [ERROR] Failures:
> [ERROR]   TestOrcRawRecordMerger.testRecordReaderNewBaseAndDelta:1332 Found 
> unexpected row: (0,ignore.1)
> [ERROR]   TestOrcRawRecordMerger.testRecordReaderOldBaseAndDelta:1208 Found 
> unexpected row: (0,ignore.1)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26145) Disable notification cleaner if interval is zero

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26145?focusedWorklogId=761698=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761698
 ]

ASF GitHub Bot logged work on HIVE-26145:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 10:53
Start Date: 25/Apr/22 10:53
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3215:
URL: https://github.com/apache/hive/pull/3215#discussion_r857500634


##
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java:
##
@@ -174,12 +174,17 @@ public class DbNotificationListener extends 
TransactionalMetaStoreEventListener
 
   //cleaner is a static object, use static synchronized to make sure its 
thread-safe
   private static synchronized void init(Configuration conf) throws 
MetaException {
-if (cleaner == null) {
+long freq = MetastoreConf.getTimeVar(conf, 
MetastoreConf.ConfVars.EVENT_DB_LISTENER_CLEAN_INTERVAL, TimeUnit.MILLISECONDS);
+if (cleaner == null && freq > 0) {
   cleaner =
   new CleanerThread(conf, RawStoreProxy.getProxy(conf, conf,
   MetastoreConf.getVar(conf, ConfVars.RAW_STORE_IMPL)));
   cleaner.start();
-}
+  LOG.info("Scheduling notification log cleanup service with " +

Review Comment:
   Nit: Could we use parametrized logging here instead of concatenating the 
strings?
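   For illustration, the parametrized SLF4J form would look like:
   ```java
   LOG.info("Scheduling notification log cleanup service with frequency {} ms", freq);
   ```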





Issue Time Tracking
---

Worklog Id: (was: 761698)
Time Spent: 20m  (was: 10m)

> Disable notification cleaner if interval is zero
> 
>
> Key: HIVE-26145
> URL: https://issues.apache.org/jira/browse/HIVE-26145
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Janos Kovacs
>Assignee: Janos Kovacs
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Many of the housekeeping/background tasks can be turned off when multiple 
> instances are running in parallel. 
> Some are controlled via the housekeeping node configuration; others are not 
> started if their frequency is set to zero.
> The DB-Notification cleaner unfortunately doesn't have this functionality, 
> which makes all instances race for the lock on the backend HMS database. 
> The goal is to make it possible to turn the cleaner off when multiple 
> instances are running (i.e. to be able to bind it to the housekeeping 
> instance).  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26107) Worker shouldn't inject duplicate entries in `ready for cleaning` state into the compaction queue

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26107?focusedWorklogId=761659=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761659
 ]

ASF GitHub Bot logged work on HIVE-26107:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:58
Start Date: 25/Apr/22 08:58
Worklog Time Spent: 10m 
  Work Description: klcopp commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r857405715


##
ql/src/java/org/apache/hadoop/hive/ql/DriverTxnHandler.java:
##
@@ -303,8 +303,15 @@ void setWriteIdForAcidFileSinks() throws 
SemanticException, LockException {
 
   private void allocateWriteIdForAcidAnalyzeTable() throws LockException {
 if (driverContext.getPlan().getAcidAnalyzeTable() != null) {
+  // Inside a compaction transaction only stats gathering runs, which does not require a new write id,
+  // and for duplicate compaction detection it is necessary not to increment it.
+  boolean isWithinCompactionTxn = 
Boolean.parseBoolean(SessionState.get().getHiveVariables().get(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG));

Review Comment:
   Wouldn't it be a bit nicer to put the flag in DriverContext or something, 
instead of using a session state variable? This is just a suggestion so feel 
free to take it or leave it...
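   A sketch of the suggestion (field and accessor names are hypothetical, not 
existing Hive API):
   ```java
   // on DriverContext: a plain boolean instead of a stringly-typed session variable
   private boolean insideCompactionTxn;
   public boolean isInsideCompactionTxn() { return insideCompactionTxn; }

   // call site in allocateWriteIdForAcidAnalyzeTable():
   boolean isWithinCompactionTxn = driverContext.isInsideCompactionTxn();
   ```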





Issue Time Tracking
---

Worklog Id: (was: 761659)
Time Spent: 1.5h  (was: 1h 20m)

> Worker shouldn't inject duplicate entries in `ready for cleaning` state into 
> the compaction queue
> -
>
> Key: HIVE-26107
> URL: https://issues.apache.org/jira/browse/HIVE-26107
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> How to reproduce:
> 1) create an acid table and load some data;
> 2) manually trigger the compaction for the table several times;
> 3) inspect compaction_queue: there are multiple entries in 'ready for 
> cleaning' state for the same table.
>  
> Expected behavior: all compaction requests after the first one should be 
> rejected until the table is changed again.
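> For illustration, the behavior can be observed with existing Hive syntax 
> (table name hypothetical):
> {code}
> ALTER TABLE t COMPACT 'major';
> ALTER TABLE t COMPACT 'major'; -- expected after the fix: rejected until t changes
> SHOW COMPACTIONS;              -- before the fix: duplicate 'ready for cleaning' rows
> {code}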



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26107) Worker shouldn't inject duplicate entries in `ready for cleaning` state into the compaction queue

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26107?focusedWorklogId=761657=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761657
 ]

ASF GitHub Bot logged work on HIVE-26107:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:53
Start Date: 25/Apr/22 08:53
Worklog Time Spent: 10m 
  Work Description: klcopp commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r857400842


##
ql/src/java/org/apache/hadoop/hive/ql/DriverTxnHandler.java:
##
@@ -303,8 +303,15 @@ void setWriteIdForAcidFileSinks() throws 
SemanticException, LockException {
 
   private void allocateWriteIdForAcidAnalyzeTable() throws LockException {
 if (driverContext.getPlan().getAcidAnalyzeTable() != null) {
+  // Inside a compaction transaction only stats gathering runs, which does not require a new write id,
+  // and for duplicate compaction detection it is necessary not to increment it.
+  boolean isWithinCompactionTxn = 
Boolean.parseBoolean(SessionState.get().getHiveVariables().get(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG));

Review Comment:
   I understand, too bad:)





Issue Time Tracking
---

Worklog Id: (was: 761657)
Time Spent: 1h 20m  (was: 1h 10m)

> Worker shouldn't inject duplicate entries in `ready for cleaning` state into 
> the compaction queue
> -
>
> Key: HIVE-26107
> URL: https://issues.apache.org/jira/browse/HIVE-26107
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> How to reproduce:
> 1) create an acid table and load some data;
> 2) manually trigger the compaction for the table several times;
> 3) inspect compaction_queue: there are multiple entries in 'ready for 
> cleaning' state for the same table.
>  
> Expected behavior: all compaction requests after the first one should be 
> rejected until the table is changed again.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761654=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761654
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:42
Start Date: 25/Apr/22 08:42
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857390546


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
##
@@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String 
dbName, boolean deleteData,
* @param maxBatchSize
* @throws TException
*/
-  private void dropDatabaseCascadePerTable(String catName, String dbName, 
List tableList,
-   boolean deleteData, int 
maxBatchSize) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-for (Table table : new TableIterable(this, catName, dbName, tableList, 
maxBatchSize)) {
+  private void dropDatabaseCascadePerTable(DropDatabaseRequest req, 
List tableList, int maxBatchSize) 
+  throws TException {
+String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), 
req.getName(), conf);
+for (Table table : new TableIterable(
+this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) {
   boolean success = false;
   HiveMetaHook hook = getHook(table);
-  if (hook == null) {
-continue;
-  }
   try {
-hook.preDropTable(table);
-client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), deleteData, null);
-hook.commitDropTable(table, deleteData);
+if (hook != null) {
+  hook.preDropTable(table);
+}
+boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean(
+  table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false"));
+EnvironmentContext context = null;
+if (req.isSetTxnId()) {
+  context = new EnvironmentContext();
+  context.putToProperties("txnId", String.valueOf(req.getTxnId()));
+  req.setDeleteManagedDir(false);
+}
+client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), 
+req.isDeleteData() && !isSoftDelete, context);
+if (hook != null) {
+  hook.commitDropTable(table, req.isDeleteData());
+}
 success = true;
   } finally {
-if (!success) {
+if (!success && hook != null) {
   hook.rollbackDropTable(table);
 }
   }
 }
-client.drop_database(dbNameWithCatalog, deleteData, true);
+client.drop_database_req(req);
   }
 
   /**
* Handles dropDatabase by invoking drop_database in HMS.
* Useful when table list in DB can fit in memory, it will retrieve all 
tables at once and
* call drop_database once. Also handles drop_table hooks.
-   * @param catName
-   * @param dbName
+   * @param req
* @param tableList
-   * @param deleteData
* @throws TException
*/
-  private void dropDatabaseCascadePerDb(String catName, String dbName, 
List tableList,
-boolean deleteData) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-List tables = getTableObjectsByName(catName, dbName, tableList);
+  private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List 
tableList) throws TException {

Review Comment:
   How does this work together with:
   ```
 // We want no lock here, as the database lock will cover the 
tables,
 // and putting a lock will actually cause us to deadlock on 
ourselves.
   ```
   
   Wouldn't it cause issues with the locks?





Issue Time Tracking
---

Worklog Id: (was: 761654)
Time Spent: 1.5h  (was: 1h 20m)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26107) Worker shouldn't inject duplicate entries in `ready for cleaning` state into the compaction queue

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26107?focusedWorklogId=761652=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761652
 ]

ASF GitHub Bot logged work on HIVE-26107:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:35
Start Date: 25/Apr/22 08:35
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r857385048


##
ql/src/test/queries/clientpositive/acid_insert_overwrite_update.q:
##
@@ -26,7 +26,6 @@ insert overwrite table sequential_update 
values(current_timestamp, 0, current_ti
 delete from sequential_update where seq=2;
 select distinct IF(seq==0, 'LOOKS OKAY', 'BROKEN'), 
regexp_extract(INPUT__FILE__NAME, '.*/(.*)/[^/]*', 1) from sequential_update;
 
-alter table sequential_update compact 'major';

Review Comment:
   It turned out that the Q test running environment doesn't start the HMS 
background threads at all, so the issued compactions are never processed. From 
now on it is not possible to initiate a second compaction on the same table 
with the same write id before the previous one is cleaned up; as a result, the 
second compaction request was refused in this test. BTW this makes me question 
whether the compaction commands in these tests are needed at all, but I removed 
only the second one to make the test green again.





Issue Time Tracking
---

Worklog Id: (was: 761652)
Time Spent: 1h 10m  (was: 1h)

> Worker shouldn't inject duplicate entries in `ready for cleaning` state into 
> the compaction queue
> -
>
> Key: HIVE-26107
> URL: https://issues.apache.org/jira/browse/HIVE-26107
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> How to reproduce:
> 1) create an acid table and load some data;
> 2) manually trigger the compaction for the table several times;
> 3) inspect compaction_queue: there are multiple entries in 'ready for 
> cleaning' state for the same table.
>  
> Expected behavior: all compaction requests after the first one should be 
> rejected until the table is changed again.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HIVE-19018) beeline -e now requires semicolon even when used with query from command line

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-19018:
--
Labels: pull-request-available  (was: )

> beeline -e now requires semicolon even when used with query from command line
> -
>
> Key: HIVE-19018
> URL: https://issues.apache.org/jira/browse/HIVE-19018
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.0.0
>
> Attachments: HIVE-19018.1.patch
>
>
> Right now if you execute {{beeline -u "jdbc:hive2://" -e "select 3"}}, the 
> beeline console will wait for you to enter ';'. It's a regression from the 
> old behavior. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26107) Worker shouldn't inject duplicate entries in `ready for cleaning` state into the compaction queue

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26107?focusedWorklogId=761650=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761650
 ]

ASF GitHub Bot logged work on HIVE-26107:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:29
Start Date: 25/Apr/22 08:29
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r857379938


##
ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java:
##
@@ -235,18 +235,18 @@ private void loadData(boolean isVectorized) throws 
Exception {
 runStatementOnDriver("export table Tstage to '" + getWarehouseDir() 
+"/2'");
 runStatementOnDriver("load data inpath '" + getWarehouseDir() + "/2/data' 
overwrite into table T");
 String[][] expected3 = new String[][] {
-{"{\"writeid\":5,\"bucketid\":536870912,\"rowid\":0}\t5\t6", 
"t/base_005/00_0"},

Review Comment:
   Check line 222
   `runStatementOnDriver("alter table T compact 'major'")`
   Since stats gathering no longer increases the write id, (from now on) 
neither do compactions.





Issue Time Tracking
---

Worklog Id: (was: 761650)
Time Spent: 1h  (was: 50m)

> Worker shouldn't inject duplicate entries in `ready for cleaning` state into 
> the compaction queue
> -
>
> Key: HIVE-26107
> URL: https://issues.apache.org/jira/browse/HIVE-26107
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> How to reproduce:
> 1) create an acid table and load some data;
> 2) manually trigger the compaction for the table several times;
> 3) inspect compaction_queue: there are multiple entries in 'ready for 
> cleaning' state for the same table.
>  
> Expected behavior: all compaction requests after the first one should be 
> rejected until the table is changed again.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761649&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761649
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:25
Start Date: 25/Apr/22 08:25
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857376124


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
##
@@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String 
dbName, boolean deleteData,
* @param maxBatchSize
* @throws TException
*/
-  private void dropDatabaseCascadePerTable(String catName, String dbName, 
List<String> tableList,
-   boolean deleteData, int 
maxBatchSize) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-for (Table table : new TableIterable(this, catName, dbName, tableList, 
maxBatchSize)) {
+  private void dropDatabaseCascadePerTable(DropDatabaseRequest req, 
List<String> tableList, int maxBatchSize) 
+  throws TException {
+String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), 
req.getName(), conf);
+for (Table table : new TableIterable(
+this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) {
   boolean success = false;
   HiveMetaHook hook = getHook(table);
-  if (hook == null) {
-continue;
-  }
   try {
-hook.preDropTable(table);
-client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), deleteData, null);
-hook.commitDropTable(table, deleteData);
+if (hook != null) {
+  hook.preDropTable(table);
+}
+boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean(
+  table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false"));
+EnvironmentContext context = null;
+if (req.isSetTxnId()) {
+  context = new EnvironmentContext();
+  context.putToProperties("txnId", String.valueOf(req.getTxnId()));
+  req.setDeleteManagedDir(false);
+}
+client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), 
+req.isDeleteData() && !isSoftDelete, context);
+if (hook != null) {
+  hook.commitDropTable(table, req.isDeleteData());
+}
 success = true;
   } finally {
-if (!success) {
+if (!success && hook != null) {
   hook.rollbackDropTable(table);
 }
   }
 }
-client.drop_database(dbNameWithCatalog, deleteData, true);
+client.drop_database_req(req);
   }
 
   /**
* Handles dropDatabase by invoking drop_database in HMS.
* Useful when table list in DB can fit in memory, it will retrieve all 
tables at once and
* call drop_database once. Also handles drop_table hooks.
-   * @param catName
-   * @param dbName
+   * @param req
* @param tableList
-   * @param deleteData
* @throws TException
*/
-  private void dropDatabaseCascadePerDb(String catName, String dbName, 
List<String> tableList,
-boolean deleteData) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-List<Table> tables = getTableObjectsByName(catName, dbName, tableList);
+  private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List<String> 
tableList) throws TException {

Review Comment:
   if the DB has a mix of soft-delete (prefixed) and managed tables, we acquire 
exclusive locks on the managed/external tables and remove them as usual; however, for 
soft-delete tables we acquire an excl_write lock and delegate cleanup to the 
cleaner process. Read locks are only removed on soft-delete tables.  
   Note: if lockless reads are enabled we do not remove the db folder. 
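   A sketch of that per-table lock selection (names taken from the diff and from ql's 
WriteEntity; the helper itself is illustrative, and isTableSoftDeleteEnabled is assumed 
to accept a plain Configuration):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.ql.hooks.WriteEntity;
import org.apache.hadoop.hive.ql.io.AcidUtils;
import org.apache.hadoop.hive.ql.metadata.Table;

final class DropLockSketch {
  // Pick the lock type per table when dropping a database with mixed table kinds.
  static WriteEntity.WriteType lockTypeFor(Table table, Configuration conf) {
    return AcidUtils.isTableSoftDeleteEnabled(table, conf)
        ? WriteEntity.WriteType.DDL_EXCL_WRITE   // soft delete: cleanup delegated to the Cleaner
        : WriteEntity.WriteType.DDL_EXCLUSIVE;   // hard delete: data removed as part of the drop
  }
}
{code}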





Issue Time Tracking
---

Worklog Id: (was: 761649)
Time Spent: 1h 20m  (was: 1h 10m)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26107) Worker shouldn't inject duplicate entries in `ready for cleaning` state into the compaction queue

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26107?focusedWorklogId=761643&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761643
 ]

ASF GitHub Bot logged work on HIVE-26107:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:13
Start Date: 25/Apr/22 08:13
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r857366146


##
ql/src/java/org/apache/hadoop/hive/ql/DriverTxnHandler.java:
##
@@ -303,8 +303,15 @@ void setWriteIdForAcidFileSinks() throws 
SemanticException, LockException {
 
   private void allocateWriteIdForAcidAnalyzeTable() throws LockException {
 if (driverContext.getPlan().getAcidAnalyzeTable() != null) {
+  //Inside a compaction transaction, only stats gathering is running, which 
does not require a new write id,
+  //and for duplicate compaction detection it is necessary not to 
increment it.
+  boolean isWithinCompactionTxn = 
Boolean.parseBoolean(SessionState.get().getHiveVariables().get(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG));

Review Comment:
   Unfortunately that won't work. This marker is set up within 
   `org.apache.hadoop.hive.ql.txn.compactor.Worker.StatsUpdater#gatherStats`
   method, which does not run in a Compaction Txn. The reason I introduced this flag is 
to prevent the stats gathering from increasing the write id if it was initiated 
as a part of a compaction.
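   A condensed sketch of that flow (the flag constant and SessionState accessor are the 
ones referenced in the diff; the surrounding StatsUpdater wiring is abbreviated):
{code:java}
// Inside the compaction Worker, before running "analyze table ... compute statistics":
SessionState.get().getHiveVariables()
    .put(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG, "true");
try {
  // ... run the analyze statement on the driver; allocateWriteIdForAcidAnalyzeTable()
  // sees the flag and skips allocating a fresh write id ...
} finally {
  SessionState.get().getHiveVariables()
      .remove(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG);
}
{code}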





Issue Time Tracking
---

Worklog Id: (was: 761643)
Time Spent: 50m  (was: 40m)

> Worker shouldn't inject duplicate entries in `ready for cleaning` state into 
> the compaction queue
> -
>
> Key: HIVE-26107
> URL: https://issues.apache.org/jira/browse/HIVE-26107
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> How to reproduce:
> 1) create an acid table and load some data;
> 2) manually trigger the compaction for the table several times;
> 3) inspect compaction_queue: There are multiple entries in 'ready for 
> cleaning' state for the same table.
>  
> Expected behavior: All compaction requests after the first one should be 
> rejected until the table is changed again.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761637
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 08:00
Start Date: 25/Apr/22 08:00
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857355770


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
##
@@ -1534,43 +1538,50 @@ public void dropDatabase(String catalogName, String 
dbName, boolean deleteData,
* @param maxBatchSize
* @throws TException
*/
-  private void dropDatabaseCascadePerTable(String catName, String dbName, 
List<String> tableList,
-   boolean deleteData, int 
maxBatchSize) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-for (Table table : new TableIterable(this, catName, dbName, tableList, 
maxBatchSize)) {
+  private void dropDatabaseCascadePerTable(DropDatabaseRequest req, 
List<String> tableList, int maxBatchSize) 
+  throws TException {
+String dbNameWithCatalog = prependCatalogToDbName(req.getCatalogName(), 
req.getName(), conf);
+for (Table table : new TableIterable(
+this, req.getCatalogName(), req.getName(), tableList, maxBatchSize)) {
   boolean success = false;
   HiveMetaHook hook = getHook(table);
-  if (hook == null) {
-continue;
-  }
   try {
-hook.preDropTable(table);
-client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), deleteData, null);
-hook.commitDropTable(table, deleteData);
+if (hook != null) {
+  hook.preDropTable(table);
+}
+boolean isSoftDelete = req.isSoftDelete() && Boolean.parseBoolean(
+  table.getParameters().getOrDefault(SOFT_DELETE_TABLE, "false"));
+EnvironmentContext context = null;
+if (req.isSetTxnId()) {
+  context = new EnvironmentContext();
+  context.putToProperties("txnId", String.valueOf(req.getTxnId()));
+  req.setDeleteManagedDir(false);
+}
+client.drop_table_with_environment_context(dbNameWithCatalog, 
table.getTableName(), 
+req.isDeleteData() && !isSoftDelete, context);
+if (hook != null) {
+  hook.commitDropTable(table, req.isDeleteData());
+}
 success = true;
   } finally {
-if (!success) {
+if (!success && hook != null) {
   hook.rollbackDropTable(table);
 }
   }
 }
-client.drop_database(dbNameWithCatalog, deleteData, true);
+client.drop_database_req(req);
   }
 
   /**
* Handles dropDatabase by invoking drop_database in HMS.
* Useful when table list in DB can fit in memory, it will retrieve all 
tables at once and
* call drop_database once. Also handles drop_table hooks.
-   * @param catName
-   * @param dbName
+   * @param req
* @param tableList
-   * @param deleteData
* @throws TException
*/
-  private void dropDatabaseCascadePerDb(String catName, String dbName, 
List<String> tableList,
-boolean deleteData) throws TException {
-String dbNameWithCatalog = prependCatalogToDbName(catName, dbName, conf);
-List<Table> tables = getTableObjectsByName(catName, dbName, tableList);
+  private void dropDatabaseCascadePerDb(DropDatabaseRequest req, List<String> 
tableList) throws TException {

Review Comment:
   What happens when the tables inside the db have different configurations? 
Some of the tables are soft delete, and some of the tables are hard delete. 
Also, what happens if the db and the table soft delete configurations 
differ?





Issue Time Tracking
---

Worklog Id: (was: 761637)
Time Spent: 1h 10m  (was: 1h)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761632&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761632
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:55
Start Date: 25/Apr/22 07:55
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857351954


##
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:
##
@@ -1457,39 +1458,42 @@ public void dropDatabase(String name)
 
   @Override
   public void dropDatabase(String name, boolean deleteData, boolean 
ignoreUnknownDb)
-  throws NoSuchObjectException, InvalidOperationException, MetaException, 
TException {
+  throws TException {

Review Comment:
   Could we make the old methods deprecated?
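   One possible shape for that (a sketch; it assumes the request-based overload added in 
this PR and the Thrift-generated setters of DropDatabaseRequest):
{code:java}
/**
 * @deprecated Use {@link #dropDatabase(DropDatabaseRequest)} instead.
 */
@Deprecated
@Override
public void dropDatabase(String name, boolean deleteData, boolean ignoreUnknownDb)
    throws TException {
  DropDatabaseRequest req = new DropDatabaseRequest();
  req.setName(name);
  req.setDeleteData(deleteData);
  req.setIgnoreUnknownDb(ignoreUnknownDb);
  dropDatabase(req);
}
{code}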





Issue Time Tracking
---

Worklog Id: (was: 761632)
Time Spent: 1h  (was: 50m)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761631
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:54
Start Date: 25/Apr/22 07:54
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857351082


##
ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java:
##
@@ -3914,4 +3914,72 @@ private void testRenamePartition(boolean blocking) 
throws Exception {
 driver.getFetchTask().fetch(res);
 Assert.assertEquals("Expecting 1 rows and found " + res.size(), 1, 
res.size());
   }
+
+  @Test
+  public void testDropDatabaseNonBlocking() throws Exception {
+dropDatabaseNonBlocking(false, false);
+  }
+  @Test

Review Comment:
   nit: newlines





Issue Time Tracking
---

Worklog Id: (was: 761631)
Time Spent: 50m  (was: 40m)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761630&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761630
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:53
Start Date: 25/Apr/22 07:53
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857350352


##
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java:
##
@@ -1685,7 +1688,89 @@ public void testDropWithBaseMultiplePartitions() throws 
Exception {
   }
 }
   }
+  
+  @Test
+  public void testDropDatabaseCascadePerTableNonBlocking() throws Exception {
+MetastoreConf.setLongVar(hiveConf, 
MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX, 1);
+dropDatabaseCascadeNonBlocking();
+  }
+  @Test
+  public void testDropDatabaseCascadePerDbNonBlocking() throws Exception {
+dropDatabaseCascadeNonBlocking();
+  }
+  private void dropDatabaseCascadeNonBlocking() throws Exception {

Review Comment:
   Nit: newline



##
ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java:
##
@@ -1685,7 +1688,89 @@ public void testDropWithBaseMultiplePartitions() throws 
Exception {
   }
 }
   }
+  
+  @Test
+  public void testDropDatabaseCascadePerTableNonBlocking() throws Exception {
+MetastoreConf.setLongVar(hiveConf, 
MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX, 1);
+dropDatabaseCascadeNonBlocking();
+  }
+  @Test

Review Comment:
   Nit: newline





Issue Time Tracking
---

Worklog Id: (was: 761630)
Time Spent: 40m  (was: 0.5h)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761629&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761629
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:53
Start Date: 25/Apr/22 07:53
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857349784


##
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:
##
@@ -661,16 +662,36 @@ public void dropDatabase(String name, boolean deleteData, 
boolean ignoreUnknownDb
*/
   public void dropDatabase(String name, boolean deleteData, boolean 
ignoreUnknownDb, boolean cascade)
   throws HiveException, NoSuchObjectException {
+dropDatabase(
+  new DropDatabaseDesc(name, ignoreUnknownDb, cascade, deleteData));
+  }
+
+  public void dropDatabase(DropDatabaseDesc desc) 

Review Comment:
   Nit: I would guess that we do not need the new line here





Issue Time Tracking
---

Worklog Id: (was: 761629)
Time Spent: 0.5h  (was: 20m)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26149) Non blocking DROP DATABASE implementation

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26149?focusedWorklogId=761628&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761628
 ]

ASF GitHub Bot logged work on HIVE-26149:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:50
Start Date: 25/Apr/22 07:50
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3220:
URL: https://github.com/apache/hive/pull/3220#discussion_r857347623


##
ql/src/java/org/apache/hadoop/hive/ql/ddl/database/drop/DropDatabaseAnalyzer.java:
##
@@ -49,28 +52,37 @@ public void analyzeInternal(ASTNode root) throws 
SemanticException {
 String databaseName = unescapeIdentifier(root.getChild(0).getText());
 boolean ifExists = root.getFirstChildWithType(HiveParser.TOK_IFEXISTS) != 
null;
 boolean cascade = root.getFirstChildWithType(HiveParser.TOK_CASCADE) != 
null;
+boolean isSoftDelete = HiveConf.getBoolVar(conf, 
HiveConf.ConfVars.HIVE_ACID_LOCKLESS_READS_ENABLED);
 
 Database database = getDatabase(databaseName, !ifExists);
 if (database == null) {
   return;
 }
-
 // if cascade=true, then we need to authorize the drop table action as 
well, and add the tables to the outputs
+boolean allTablesWithSuffix = false;
 if (cascade) {
   try {
-for (Table table : db.getAllTableObjects(databaseName)) {
+List<Table> tables = db.getAllTableObjects(databaseName);
+allTablesWithSuffix = tables.stream().allMatch(
+table -> AcidUtils.isTableSoftDeleteEnabled(table, conf));
+for (Table table : tables) {
   // We want no lock here, as the database lock will cover the tables,
   // and putting a lock will actually cause us to deadlock on 
ourselves.
-  outputs.add(new WriteEntity(table, 
WriteEntity.WriteType.DDL_NO_LOCK));
+  outputs.add(
+new WriteEntity(table, isSoftDelete && !allTablesWithSuffix ?

Review Comment:
   Nit: Could we create boolean variables with descriptive names? It is hard to 
follow what happens here.
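   For example (a sketch; the branch values are illustrative, since the ternary is 
truncated in the diff above):
{code:java}
boolean useSoftDeleteSemantics = isSoftDelete && !allTablesWithSuffix;
WriteEntity.WriteType writeType = useSoftDeleteSemantics
    ? WriteEntity.WriteType.DDL_EXCL_WRITE
    : WriteEntity.WriteType.DDL_NO_LOCK;
outputs.add(new WriteEntity(table, writeType));
{code}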





Issue Time Tracking
---

Worklog Id: (was: 761628)
Time Spent: 20m  (was: 10m)

> Non blocking DROP DATABASE implementation
> -
>
> Key: HIVE-26149
> URL: https://issues.apache.org/jira/browse/HIVE-26149
> Project: Hive
>  Issue Type: Task
>Reporter: Denys Kuzmenko
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26107) Worker shouldn't inject duplicate entries in `ready for cleaning` state into the compaction queue

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26107?focusedWorklogId=761626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761626
 ]

ASF GitHub Bot logged work on HIVE-26107:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:33
Start Date: 25/Apr/22 07:33
Worklog Time Spent: 10m 
  Work Description: klcopp commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r857334379


##
ql/src/java/org/apache/hadoop/hive/ql/DriverTxnHandler.java:
##
@@ -303,8 +303,15 @@ void setWriteIdForAcidFileSinks() throws 
SemanticException, LockException {
 
   private void allocateWriteIdForAcidAnalyzeTable() throws LockException {
 if (driverContext.getPlan().getAcidAnalyzeTable() != null) {
+  //Inside a compaction transaction, only stats gathering is running, which 
does not require a new write id,
+  //and for duplicate compaction detection it is necessary not to 
increment it.
+  boolean isWithinCompactionTxn = 
Boolean.parseBoolean(SessionState.get().getHiveVariables().get(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG));

Review Comment:
   Yes, exactly!





Issue Time Tracking
---

Worklog Id: (was: 761626)
Time Spent: 40m  (was: 0.5h)

> Worker shouldn't inject duplicate entries in `ready for cleaning` state into 
> the compaction queue
> -
>
> Key: HIVE-26107
> URL: https://issues.apache.org/jira/browse/HIVE-26107
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> How to reproduce:
> 1) create an acid table and load some data;
> 2) manually trigger the compaction for the table several times;
> 3) inspect compaction_queue: There are multiple entries in 'ready for 
> cleaning' state for the same table.
>  
> Expected behavior: All compaction requests after the first one should be 
> rejected until the table is changed again.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26148) Keep MetaStoreFilterHook interface compatibility after introducing catalogs

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26148?focusedWorklogId=761625&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761625
 ]

ASF GitHub Bot logged work on HIVE-26148:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:31
Start Date: 25/Apr/22 07:31
Worklog Time Spent: 10m 
  Work Description: pvary commented on code in PR #3218:
URL: https://github.com/apache/hive/pull/3218#discussion_r857332648


##
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/AuthorizationMetaStoreFilterHook.java:
##
@@ -46,11 +46,18 @@ public AuthorizationMetaStoreFilterHook(Configuration conf) 
{
   }
 
   @Override
-  public List<String> filterTableNames(String catName, String dbName, 
List<String> tableList)
+  public List<String> filterTableNames(String dbName, List<String> tableList)
   throws MetaException {
 List<HivePrivilegeObject> listObjs = getHivePrivObjects(dbName, tableList);
 return getTableNames(getFilteredObjects(listObjs));
   }
+
+  @Override
+  public List<String> filterTableNames(String catName, String dbName, 
List<String> tableList)
+  throws MetaException {
+return filterTableNames(dbName, tableList);

Review Comment:
   This seems problematic to me.
   If we ignore the catalog name, that could become a serious hidden issue.
   
   Do we have a better solution to this?
   Maybe throw an exception if the catalog name is not the default? Or do the 
filtering correctly with the catalog info as well?
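   The fail-fast variant could look like this (a sketch; it uses the default catalog 
constant from the metastore's Warehouse class):
{code:java}
@Override
public List<String> filterTableNames(String catName, String dbName, List<String> tableList)
    throws MetaException {
  // Refuse loudly instead of silently dropping the catalog information.
  if (catName != null && !Warehouse.DEFAULT_CATALOG_NAME.equals(catName)) {
    throw new MetaException("Table filtering is only supported in the default catalog, got: " + catName);
  }
  return filterTableNames(dbName, tableList);
}
{code}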





Issue Time Tracking
---

Worklog Id: (was: 761625)
Time Spent: 0.5h  (was: 20m)

> Keep MetaStoreFilterHook interface compatibility after introducing catalogs
> ---
>
> Key: HIVE-26148
> URL: https://issues.apache.org/jira/browse/HIVE-26148
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Wechar
>Assignee: Wechar
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hive 3.0 introduced the catalog concept. When we upgraded the hive dependency 
> version from 2.3 to 3.x, we found that some interfaces of *MetaStoreFilterHook* 
> are not compatible:
> {code:bash}
>  git show ba8a99e115 -- 
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
> {code}
> {code:bash}
> --- 
> a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
> +++ 
> b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
>/**
> * Filter given list of tables
> -   * @param dbName
> -   * @param tableList
> +   * @param catName catalog name
> +   * @param dbName database name
> +   * @param tableList list of table returned by the metastore
> * @return List of filtered table names
> */
> -  public List<String> filterTableNames(String dbName, List<String> 
> tableList) throws MetaException;
> +  List<String> filterTableNames(String catName, String dbName, List<String> 
> tableList)
> +  throws MetaException;
> {code}
> We can retain the previous interfaces and implement them using the default 
> catalog.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26114) jdbc connection hiveserver2 using dfs command with prefix space will cause exception

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26114?focusedWorklogId=761622&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761622
 ]

ASF GitHub Bot logged work on HIVE-26114:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:26
Start Date: 25/Apr/22 07:26
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3176:
URL: https://github.com/apache/hive/pull/3176#issuecomment-1108177418

   @ming95: Please ping me later if we have further results here. I expect 
to be busy this week and might forget things.




Issue Time Tracking
---

Worklog Id: (was: 761622)
Time Spent: 1h 20m  (was: 1h 10m)

> jdbc connection hiveserver2 using dfs command with prefix space will cause 
> exception
> -
>
> Key: HIVE-26114
> URL: https://issues.apache.org/jira/browse/HIVE-26114
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.3.8, 3.1.2
>Reporter: shezm
>Assignee: shezm
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> {code:java}
>         Connection con = 
> DriverManager.getConnection("jdbc:hive2://10.214.35.115:1/");
>         Statement stmt = con.createStatement();
>         // dfs command with prefix space or "\n"
>         ResultSet res = stmt.executeQuery(" dfs -ls /");
>         //ResultSet res = stmt.executeQuery("\ndfs -ls /"); {code}
> it will cause an exception
> {code:java}
> Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: 
> Error while processing statement: null
>     at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
>     at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
>     at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
>     at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:375)
>     at com.ne.gdc.whitemane.shezm.TestJdbc.main(TestJdbc.java:30)
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while 
> processing statement: null
>     at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
>     at 
> org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
>     at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>     at com.sun.proxy.$Proxy43.executeStatementAsync(Unknown Source)
>     at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:530)
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:605)
>     at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> But when I execute SQL with the prefix "\n", it works fine:
> {code:java}
> ResultSet res = stmt.executeQuery("\n select 1"); {code}
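A simplified, standalone illustration of the failure mode (a sketch, not the exact 
HiveServer2 code path):
{code:java}
public class LeadingWhitespaceDemo {
  public static void main(String[] args) {
    String statement = " dfs -ls /";
    // HiveServer2 picks the command processor from the first token of the statement.
    // With a leading space or newline the first token is empty, so no processor
    // ("dfs", "set", ...) matches and the operation fails with a "null" message.
    String[] tokens = statement.split("\\s+");
    System.out.println("first token: '" + tokens[0] + "'");   // prints ''
    String[] trimmed = statement.trim().split("\\s+");
    System.out.println("after trim: '" + trimmed[0] + "'");   // prints 'dfs'
  }
}
{code}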



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

[jira] [Work logged] (HIVE-26114) jdbc connection hiveserver2 using dfs command with prefix space will cause exception

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26114?focusedWorklogId=761621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761621
 ]

ASF GitHub Bot logged work on HIVE-26114:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:26
Start Date: 25/Apr/22 07:26
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3176:
URL: https://github.com/apache/hive/pull/3176#issuecomment-1108176377

   Also started a flaky test checker for the failed test, to see if it is 
indeed flaky: http://ci.hive.apache.org/job/hive-flaky-check/559/




Issue Time Tracking
---

Worklog Id: (was: 761621)
Time Spent: 1h 10m  (was: 1h)

> jdbc connection hiveserver2 using dfs command with prefix space will cause 
> exception
> -
>
> Key: HIVE-26114
> URL: https://issues.apache.org/jira/browse/HIVE-26114
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.3.8, 3.1.2
>Reporter: shezm
>Assignee: shezm
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {code:java}
>         Connection con = 
> DriverManager.getConnection("jdbc:hive2://10.214.35.115:1/");
>         Statement stmt = con.createStatement();
>         // dfs command with prefix space or "\n"
>         ResultSet res = stmt.executeQuery(" dfs -ls /");
>         //ResultSet res = stmt.executeQuery("\ndfs -ls /"); {code}
> it will cause an exception
> {code:java}
> Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: 
> Error while processing statement: null
>     at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
>     at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
>     at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
>     at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:375)
>     at com.ne.gdc.whitemane.shezm.TestJdbc.main(TestJdbc.java:30)
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while 
> processing statement: null
>     at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
>     at 
> org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
>     at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>     at com.sun.proxy.$Proxy43.executeStatementAsync(Unknown Source)
>     at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:530)
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:605)
>     at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> But when I execute SQL with the prefix "\n", it works fine:
> {code:java}
> ResultSet res = stmt.executeQuery("\n select 1"); {code}


[jira] [Work logged] (HIVE-26107) Worker shouldn't inject duplicate entries in `ready for cleaning` state into the compaction queue

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26107?focusedWorklogId=761618=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761618
 ]

ASF GitHub Bot logged work on HIVE-26107:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:23
Start Date: 25/Apr/22 07:23
Worklog Time Spent: 10m 
  Work Description: veghlaci05 commented on code in PR #3172:
URL: https://github.com/apache/hive/pull/3172#discussion_r857326564


##
ql/src/java/org/apache/hadoop/hive/ql/DriverTxnHandler.java:
##
@@ -303,8 +303,15 @@ void setWriteIdForAcidFileSinks() throws 
SemanticException, LockException {
 
   private void allocateWriteIdForAcidAnalyzeTable() throws LockException {
 if (driverContext.getPlan().getAcidAnalyzeTable() != null) {
+  //Inside a compaction transaction, only stats gathering is running, which 
does not require a new write id,
+  //and for duplicate compaction detection it is necessary not to 
increment it.
+  boolean isWithinCompactionTxn = 
Boolean.parseBoolean(SessionState.get().getHiveVariables().get(Constants.INSIDE_COMPACTION_TRANSACTION_FLAG));

Review Comment:
   I think you meant 
   `driverContext.getTxnType().equals(TxnType.COMPACTION)`
   otherwise I don't understand your point :)
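   The check being discussed, as a sketch (TxnType here is the metastore's 
org.apache.hadoop.hive.metastore.api.TxnType):
{code:java}
// Alternative guard: rely on the transaction type opened by the compaction Worker.
private boolean isWithinCompactionTxn(DriverContext driverContext) {
  return TxnType.COMPACTION.equals(driverContext.getTxnType());
}
{code}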





Issue Time Tracking
---

Worklog Id: (was: 761618)
Time Spent: 0.5h  (was: 20m)

> Worker shouldn't inject duplicate entries in `ready for cleaning` state into 
> the compaction queue
> -
>
> Key: HIVE-26107
> URL: https://issues.apache.org/jira/browse/HIVE-26107
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> How to reproduce:
> 1) create an acid table and load some data;
> 2) manually trigger the compaction for the table several times;
> 3) inspect compaction_queue: There are multiple entries in 'ready for 
> cleaning' state for the same table.
>  
> Expected behavior: All compaction requests after the first one should be 
> rejected until the table is changed again.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Work logged] (HIVE-26114) jdbc connection hiveserver2 using dfs command with prefix space will cause exception

2022-04-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26114?focusedWorklogId=761620&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-761620
 ]

ASF GitHub Bot logged work on HIVE-26114:
-

Author: ASF GitHub Bot
Created on: 25/Apr/22 07:23
Start Date: 25/Apr/22 07:23
Worklog Time Spent: 10m 
  Work Description: pvary commented on PR #3176:
URL: https://github.com/apache/hive/pull/3176#issuecomment-1108173812

   @ming95: You can rerun the tests from the jenkins UI. I restarted them, 
so if we have a green run, we can merge.
   
   Thanks,
   Peter




Issue Time Tracking
---

Worklog Id: (was: 761620)
Time Spent: 1h  (was: 50m)

> jdbc connection hiveserver2 using dfs command with prefix space will cause 
> exception
> -
>
> Key: HIVE-26114
> URL: https://issues.apache.org/jira/browse/HIVE-26114
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 2.3.8, 3.1.2
>Reporter: shezm
>Assignee: shezm
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.2.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> {code:java}
>         Connection con = 
> DriverManager.getConnection("jdbc:hive2://10.214.35.115:1/");
>         Statement stmt = con.createStatement();
>         // dfs command with prefix space or "\n"
>         ResultSet res = stmt.executeQuery(" dfs -ls /");
>         //ResultSet res = stmt.executeQuery("\ndfs -ls /"); {code}
> it will cause an exception
> {code:java}
> Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: 
> Error while processing statement: null
>     at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
>     at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
>     at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:244)
>     at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:375)
>     at com.ne.gdc.whitemane.shezm.TestJdbc.main(TestJdbc.java:30)
> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while 
> processing statement: null
>     at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:380)
>     at 
> org.apache.hive.service.cli.operation.HiveCommandOperation.runInternal(HiveCommandOperation.java:118)
>     at org.apache.hive.service.cli.operation.Operation.run(Operation.java:320)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:530)
>     at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:517)
>     at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at 
> org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
>     at com.sun.proxy.$Proxy43.executeStatementAsync(Unknown Source)
>     at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:310)
>     at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:530)
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>     at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:605)
>     at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>  {code}
> But when I execute SQL with the prefix "\n", it works fine:
> {code:java}
> ResultSet res = stmt.executeQuery("\n select 1"); {code}