[jira] [Work logged] (HIVE-26437) dump unpartitioned Tables in parallel

2022-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26437?focusedWorklogId=819028&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-819028
 ]

ASF GitHub Bot logged work on HIVE-26437:
-

Author: ASF GitHub Bot
Created on: 21/Oct/22 07:06
Start Date: 21/Oct/22 07:06
Worklog Time Spent: 10m 
  Work Description: pudidic commented on PR #3644:
URL: https://github.com/apache/hive/pull/3644#issuecomment-1286547794

   +1 Looks good to me. I tested on my laptop and the failing test passed. 
Please re-trigger the CI system to make sure it works.




Issue Time Tracking
---

Worklog Id: (was: 819028)
Time Spent: 2h 20m  (was: 2h 10m)

> dump unpartitioned Tables in parallel
> -
>
> Key: HIVE-26437
> URL: https://issues.apache.org/jira/browse/HIVE-26437
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Amit Saonerkar
>Assignee: Amit Saonerkar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work logged] (HIVE-26629) Misleading error message with hive.metastore.limit.partition.request

2022-10-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26629?focusedWorklogId=819029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-819029
 ]

ASF GitHub Bot logged work on HIVE-26629:
-

Author: ASF GitHub Bot
Created on: 21/Oct/22 07:15
Start Date: 21/Oct/22 07:15
Worklog Time Spent: 10m 
  Work Description: sonarcloud[bot] commented on PR #3693:
URL: https://github.com/apache/hive/pull/3693#issuecomment-1286555886

   Kudos, SonarCloud Quality Gate passed!
   (https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=3693)
   
   0 Bugs (rating A)
   0 Vulnerabilities (rating A)
   0 Security Hotspots (rating A)
   0 Code Smells (rating A)
   No Coverage information
   No Duplication information




Issue Time Tracking
---

Worklog Id: (was: 819029)
Time Spent: 0.5h  (was: 20m)

> Misleading error message with hive.metastore.limit.partition.request 
> -
>
> Key: HIVE-26629
> URL: https://issues.apache.org/jira/browse/HIVE-26629
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Reporter: Miklos Szurap
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Dropping partitions from a table fails with a misleading error message saying 
> that "partition not found":
> {code}
> 0: jdbc:hive2://nightly-71x-zx-1.nightly-71x-> alter table t1p drop partition 
> (p1>0);
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10006]: Partition not found (p1 > 0) (state=42000,code=10006)
> {code}
> However, the partitions do exist; the real error message is only visible in 
> the HiveServer2 logs:
> {code}
> Caused by: MetaException(message:Number of partitions scanned (=2) on table 
> 't1p' exceeds limit (=1). This is c

[jira] [Updated] (HIVE-22187) Refactor the SemanticAnalyzers

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-22187:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Refactor the SemanticAnalyzers
> --
>
> Key: HIVE-22187
> URL: https://issues.apache.org/jira/browse/HIVE-22187
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>  Labels: refactor-semanticanalyzer
>
> The SemanticAnalyzers are generally huge classes, many thousands of lines 
> each. They should be refactored into more manageable structures.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-7661) Observed performance issues while sorting using Hive's Parallel Order by clause while retaining pre-existing sort order.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-7661:
--

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Observed performance issues while sorting using Hive's Parallel Order by 
> clause while retaining pre-existing sort order.
> 
>
> Key: HIVE-7661
> URL: https://issues.apache.org/jira/browse/HIVE-7661
> Project: Hive
>  Issue Type: Bug
>  Components: Logical Optimizer
>Affects Versions: 0.12.0
> Environment: Cloudera 5.0
> hive-0.12.0-cdh5.0.0
> Red Hat Linux
>Reporter: Vishal Kamath
>Priority: Major
>  Labels: performance
> Fix For: 0.12.1
>
>
> Improve Hive's sampling logic to accommodate use cases that require 
> retaining the pre-existing sort order of the underlying source table. 
> In order to support the parallel order by clause, Hive samples the source 
> table based on the values of hive.optimize.sampling.orderby.number and 
> hive.optimize.sampling.orderby.percent. 
> This works with reasonable performance when sorting on columns with a 
> random distribution of data, but it has severe performance issues when the 
> pre-existing sort order must be retained. 
> Let us try to understand this with an example. 
> insert overwrite table lineitem_temp_report 
> select 
>   l_orderkey, l_partkey, l_suppkey, l_linenumber, l_quantity, 
> l_extendedprice, l_discount, l_tax, l_returnflag, l_linestatus, l_shipdate, 
> l_commitdate, l_receiptdate, l_shipinstruct, l_shipmode, l_comment
> from 
>   lineitem
> order by l_orderkey, l_partkey, l_suppkey;
> Sample data set for lineitem table. The first column represents the 
> l_orderKey and is sorted.
>  
> l_orderkey|l_partkey|l_suppkey|l_linenumber|l_quantity|l_extendedprice|l_discount|l_tax|l_returnflag|l_linestatus|l_shipdate|l_commitdate|l_receiptdate|l_shipinstruct|l_shipmode|l_comment
> 197|1771022|96040|2|8|8743.52|0.09|0.02|A|F|1995-04-17|1995-07-01|1995-0
> 197|1771022|96040|2|8|4-27|DELIVER IN PERSON|SHIP|y blithely even 
> 197|1771022|96040|2|8|deposits. blithely fina|
> 197|1558290|83306|3|17|22919.74|0.06|0.02|N|O|1995-08-02|1995-06-23|1995
> 197|1558290|83306|3|17|-08-03|COLLECT COD|REG AIR|ts. careful|
> 197|179355|29358|4|25|35858.75|0.04|0.01|N|F|1995-06-13|1995-05-23|1995-
> 197|179355|29358|4|25|06-24|TAKE BACK RETURN|FOB|s-- quickly final 
> 197|179355|29358|4|25|accounts|
> 197|414653|39658|5|14|21946.82|0.09|0.01|R|F|1995-05-08|1995-05-24|1995-
> 197|414653|39658|5|14|05-12|TAKE BACK RETURN|RAIL|use slyly slyly silent 
> 197|414653|39658|5|14|depo|
> 197|1058800|8821|6|1|1758.75|0.07|0.05|N|O|1995-07-15|1995-06-21|1995-08
> 197|1058800|8821|6|1|-11|COLLECT COD|RAIL| even, thin dependencies sno|
> 198|560609|60610|1|33|55096.14|0.07|0.02|N|O|1998-01-05|1998-03-20|1998-
> 198|560609|60610|1|33|01-10|TAKE BACK RETURN|TRUCK|carefully caref|
> 198|152287|77289|2|20|26785.60|0.03|0.00|N|O|1998-01-15|1998-03-31|1998-
> 198|152287|77289|2|20|01-25|DELIVER IN PERSON|FOB|carefully final 
> 198|152287|77289|2|20|escapades a|
> 224|1899665|74720|3|41|68247.37|0.07|0.04|A|F|1994-09-01|1994-09-15|1994
> 224|1899665|74720|3|41|-09-02|TAKE BACK RETURN|SHIP|after the furiou|
> When we sort on a presorted column, or do a multi-column sort while 
> retaining the sort order of the source table, we don't see an equal 
> distribution of data to the reducers. The source table "lineitem" has 600 
> million rows. Out of 100 reducers, 99 complete in less than 40 seconds; the 
> last reducer does the bulk of the work, processing nearly 570 million rows. 
> So, let us understand what is going wrong here.
> On a table having 600 million records with the orderkey column sorted, I 
> created a temp table with 10% sampling:  
> insert overwrite table sampTempTbl (select * from lineitem tablesample (10 
> percent) t);
> select min(l_orderkey), max(l_orderkey) from sampTempTbl;
> 12306309,142321700
> whereas on the source table, the orderkey range (select min(l_orderkey), 
> max(l_orderkey) from lineitem)  is 1 and 6  
> So naturally the bulk of the records will be directed to a single reducer. 
> One way to work around this problem is to increase 
> hive.optimize.sampling.orderby.number to a larger value (as close as 
> possible to the number of rows in the input source table). But then we will 
> have to provide a higher heap size (hive-env.sh) for Hive, 

[jira] [Updated] (HIVE-20289) The function of row_number have different result

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20289:
---
Fix Version/s: (was: 2.3.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> The function of row_number have different result 
> -
>
> Key: HIVE-20289
> URL: https://issues.apache.org/jira/browse/HIVE-20289
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.0
> Environment: hive 2.3.3
> hadoop 2.7.6
>Reporter: guangdong
>Priority: Minor
>
> 1. Create table like this:
> create table src(
> name string
> ,buy_time string
> ,consumption int );
> 2. Then insert data:
> insert into src values('zzz','2018-08-01',20),('zzz','2018-08-01',10);
> 3. When I execute the SQL in Hive 2.3.3, the result is:
> hive> select consumption, row_number() over(distribute by name sort by 
> buy_time desc) from src;
> Query ID = dwetl_20180801210808_692d5d70-a136-4525-9cdb-b6269e6c3069
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=
> Starting Job = job_1531984581474_944267, Tracking URL = 
> http://hadoop-jr-nn02.pekdc1.jdfin.local:8088/proxy/application_1531984581474_944267/
> Kill Command = /soft/hadoop/bin/hadoop job  -kill job_1531984581474_944267
> Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 
> 1
> 2018-08-01 21:09:08,855 Stage-1 map = 0%,  reduce = 0%
> 2018-08-01 21:09:16,026 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.12 
> sec
> 2018-08-01 21:09:22,210 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 4.09 sec
> MapReduce Total cumulative CPU time: 4 seconds 90 msec
> Ended Job = job_1531984581474_944267
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 2  Reduce: 1   Cumulative CPU: 4.09 sec   HDFS Read: 437 
> HDFS Write: 10 SUCCESS
> Total MapReduce CPU Time Spent: 4 seconds 90 msec
> OK
> 20	1
> 10	2
> Time taken: 80.135 seconds, Fetched: 2 row(s)
> 4. When I execute the SQL in Hive 0.14, the result is:
> > select consumption, row_number() over(distribute by name sort by buy_time 
> > desc) from src;
> Query ID = dwetl_2018080121_7812d9f0-328d-4125-ba99-0f577f4cca9a
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=
> Starting Job = job_1531984581474_944597, Tracking URL = 
> http://hadoop-jr-nn02.pekdc1.jdfin.local:8088/proxy/application_1531984581474_944597/
> Kill Command = /soft/hadoop/bin/hadoop job  -kill job_1531984581474_944597
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 
> 1
> 2018-08-01 21:22:26,467 Stage-1 map = 0%,  reduce = 0%
> 2018-08-01 21:22:34,839 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 1.13 
> sec
> 2018-08-01 21:22:40,984 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 
> 3.28 sec
> MapReduce Total cumulative CPU time: 3 seconds 280 msec
> Ended Job = job_1531984581474_944597
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 3.28 sec   HDFS Read: 233 
> HDFS Write: 10 SUCCESS
> Total MapReduce CPU Time Spent: 3 seconds 280 msec
> OK
> I expect both versions to return the same result. How can I achieve that?
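The behavior reported above is expected when the sort key has ties: both rows share buy_time '2018-08-01', so any ordering of the tied rows is a valid answer and different Hive versions or plans may disagree. A small sketch (hypothetical Python, not Hive code) shows the nondeterminism and the usual fix, which is adding a unique tie-breaker to the sort:

```python
# row_number() over a sort key with ties is not deterministic: the tied rows
# may come back in either order. A unique tie-breaker column stabilizes it.
rows = [("zzz", "2018-08-01", 20), ("zzz", "2018-08-01", 10)]

def row_numbers(rows, sort_key):
    # Assign 1-based row numbers after sorting; returns (consumption, rn).
    ordered = sorted(rows, key=sort_key)
    return [(r[2], n + 1) for n, r in enumerate(ordered)]

# Sorting by buy_time alone keeps the tied rows in whatever order the engine
# happened to produce them, so the numbering flips with the input order.
print(row_numbers(rows, lambda r: r[1]))        # [(20, 1), (10, 2)]
print(row_numbers(rows[::-1], lambda r: r[1]))  # [(10, 1), (20, 2)]

# A deterministic tie-breaker (e.g. sort by buy_time, then consumption)
# yields the same numbering regardless of input order.
print(row_numbers(rows, lambda r: (r[1], r[2])))
print(row_numbers(rows[::-1], lambda r: (r[1], r[2])))
```

In HiveQL terms, the equivalent fix would be something like `row_number() over(distribute by name sort by buy_time desc, consumption)`.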



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-4587) DDLTask.showTableProperties() returns 0 on error

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-4587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-4587:
--
Fix Version/s: (was: 0.10.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> DDLTask.showTableProperties() returns 0 on error
> 
>
> Key: HIVE-4587
> URL: https://issues.apache.org/jira/browse/HIVE-4587
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Reporter: Eugene Koifman
>Priority: Minor
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> It outputs an error message but returns status 0, which is not right.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-7980) Hive on spark issue..

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-7980:
--

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Hive on spark issue..
> -
>
> Key: HIVE-7980
> URL: https://issues.apache.org/jira/browse/HIVE-7980
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Spark
>Affects Versions: spark-branch
> Environment: Test Environment is..
> . hive 0.14.0(spark branch version)
> . spark 
> (http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar)
> . hadoop 2.4.0 (yarn)
>Reporter: alton.jung
>Assignee: Chao Sun
>Priority: Major
> Fix For: spark-branch
>
>
> I followed this 
> guide(https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started)
>  and compiled Hive from the spark branch. In the next step I met the below 
> error.
> (*I typed the Hive query on Beeline; I used a simple query with "order 
> by" to invoke the parallel work: 
>ex) select * from test where id = 1 order by id;
> )
> [Error list is]
> 2014-09-04 02:58:08,796 ERROR spark.SparkClient 
> (SparkClient.java:execute(158)) - Error generating Spark Plan
> java.lang.NullPointerException
>   at 
> org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
>   at 
> org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
>   at 
> org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
>   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
> 2014-09-04 02:58:11,108 ERROR ql.Driver (SessionState.java:printError(696)) - 
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> 2014-09-04 02:58:11,182 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824527954 end=1409824691182 duration=163228 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,223 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogBegin(108)) -  from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,224 INFO  log.PerfLogger 
> (PerfLogger.java:PerfLogEnd(135)) -  start=1409824691223 end=1409824691224 duration=1 
> from=org.apache.hadoop.hive.ql.Driver>
> 2014-09-04 02:58:11,306 ERROR operation.Operation 
> (SQLOperation.java:run(199)) - Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
>   at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:284)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:146)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
>   at 
> org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:114

[jira] [Updated] (HIVE-10951) Describe a non-partitioned table table

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10951:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Describe a non-partitioned table table 
> ---
>
> Key: HIVE-10951
> URL: https://issues.apache.org/jira/browse/HIVE-10951
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Priority: Major
> Fix For: hbase-metastore-branch
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-13734) How to initialize hive metastore database

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-13734:
---
Fix Version/s: (was: 2.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> How to initialize hive metastore database
> -
>
> Key: HIVE-13734
> URL: https://issues.apache.org/jira/browse/HIVE-13734
> Project: Hive
>  Issue Type: Test
>  Components: Configuration, Database/Schema
>Affects Versions: 2.0.0
>Reporter: Lijiayong
>Assignee: Damien Carol
>Priority: Major
>  Labels: mesosphere
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> When running "hive", there is an error: Exception in thread "main" 
> java.lang.RuntimeException: Hive metastore database is not initialized. 
> Please use schematool to create the schema.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-24277) Temporary table with constraints is persisted in HMS

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24277:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Temporary table with constraints is persisted in HMS
> 
>
> Key: HIVE-24277
> URL: https://issues.apache.org/jira/browse/HIVE-24277
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 4.0.0
>Reporter: Adesh Kumar Rao
>Assignee: Adesh Kumar Rao
>Priority: Major
>
> Run below in a session
> {noformat}
> 0: jdbc:hive2://zk1-nikhil.q5dzd3jj30bupgln50> create temporary table ttemp 
> (id int default 0);
> INFO  : Compiling 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb): 
> create temporary table ttemp (id int default 0)
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb); 
> Time taken: 0.625 seconds
> INFO  : Executing 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb): 
> create temporary table ttemp (id int default 0)
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing 
> command(queryId=hive_20201015050509_99267861-56f7-4940-ae3f-5a895dc3d2cb); 
> Time taken: 4.02 seconds
> INFO  : OK
> No rows affected (5.32 seconds)
> {noformat}
> Running "show tables" in another session returns the temporary table in 
> the output:
> {noformat}
> 0: jdbc:hive2://zk1-nikhil.q5dzd3jj30bupgln50> show tables
> . . . . . . . . . . . . . . . . . . . . . . .> ;
> INFO  : Compiling 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7): 
> show tables
> INFO  : Semantic Analysis Completed (retrial = false)
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from 
> deserializer)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7); 
> Time taken: 0.065 seconds
> INFO  : Executing 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7): 
> show tables
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : Completed executing 
> command(queryId=hive_20201015050554_7882c055-f084-4919-9a18-800d3fe4dcf7); 
> Time taken: 0.057 seconds
> INFO  : OK
> +--+
> | tab_name |
> +--+
> | ttemp|
> +--+
> {noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-23613) Cleanup FindBugs

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-23613:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Cleanup FindBugs
> 
>
> Key: HIVE-23613
> URL: https://issues.apache.org/jira/browse/HIVE-23613
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Priority: Major
>
> Let this Jira be an umbrella for the effort of cleaning up the enormous 
> number of FindBugs warnings in the Hive project.
> Old way using findbugs:
> {code}
> mvn -DskipTests test-compile findbugs:findbugs -DskipTests=true
> {code}
> New way using spotbugs (where -pl use the module you want to check):
> {code}
> mvn -Pspotbugs 
> -Dorg.slf4j.simpleLogger.log.org.apache.maven.plugin.surefire.SurefirePlugin=INFO
>  -pl :hive-storage-api -am test-compile 
> com.github.spotbugs:spotbugs-maven-plugin:4.0.0:check
> {code}
> The generated report files can be located with:
> {code}
> find . -name findbugsXml.xml
> find . -name spotbugsXml.xml
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-17320) OrcRawRecordMerger.discoverKeyBounds logic can be simplified

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-17320:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> OrcRawRecordMerger.discoverKeyBounds logic can be simplified
> 
>
> Key: HIVE-17320
> URL: https://issues.apache.org/jira/browse/HIVE-17320
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Eugene Koifman
>Priority: Major
>
> With HIVE-17089 we never have any insert events in the deltas, 
> so if we know the min/max key for every split of the base, we can use them 
> to filter delete events, since all files are sorted by RecordIdentifier. 
> Thus we should be able to create a SARG for all delete deltas, and 
> the code can be simplified since the min/max key now never has to be null.
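The filtering idea described above can be sketched in a few lines (hypothetical Python, not the Hive implementation): because delete events are sorted by record identifier, the min/max key of a base split bounds a contiguous range of delete events, which a binary search can extract directly, much as a SARG would.

```python
from bisect import bisect_left, bisect_right

# Sorted delete-event keys, standing in for RecordIdentifier-ordered files.
delete_keys = [3, 8, 15, 21, 42, 57, 90]

def relevant_deletes(min_key, max_key):
    # Only delete events within [min_key, max_key] can affect this split.
    lo = bisect_left(delete_keys, min_key)
    hi = bisect_right(delete_keys, max_key)
    return delete_keys[lo:hi]

print(relevant_deletes(10, 60))    # [15, 21, 42, 57]
print(relevant_deletes(100, 200))  # [] -- split sees no delete events
```

Since the bounds are always known under this scheme, no null-handling branch is needed, which is the simplification the ticket proposes.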



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26530) HS2 OOM-OperationManager.queryIdOperation does not properly clean up multiple queryIds

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26530:
---
Fix Version/s: (was: 3.1.2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> HS2 OOM-OperationManager.queryIdOperation does not properly clean up multiple 
> queryIds
> --
>
> Key: HIVE-26530
> URL: https://issues.apache.org/jira/browse/HIVE-26530
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.1.2
>Reporter: Fachuan Bai
>Assignee: Fachuan Bai
>Priority: Major
>  Labels: pull-request-available
>   Original Estimate: 168h
>  Time Spent: 10m
>  Remaining Estimate: 167h 50m
>
> Version: Hive 3.1.2
> I used Airflow to execute Hive SQL, and HiveServer2 ran out of memory.
> I found the same issue: https://issues.apache.org/jira/browse/HIVE-22275
> But it is only fixed in 4.0.0.





[jira] [Updated] (HIVE-26655) TPC-DS query 17 returns wrong results

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26655:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> TPC-DS query 17 returns wrong results
> -
>
> Key: HIVE-26655
> URL: https://issues.apache.org/jira/browse/HIVE-26655
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sungwoo Park
>Priority: Major
>
> When tested with 100GB ORC tables, the number of rows returned by query 17 is 
> not stable. It returns fewer rows than the correct result (55 rows).
>  





[jira] [Updated] (HIVE-23616) Llap VectorizedParquetRecordReader split parquet chunk error

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-23616:
---
Fix Version/s: (was: 3.1.2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Llap VectorizedParquetRecordReader split parquet chunk error
> 
>
> Key: HIVE-23616
> URL: https://issues.apache.org/jira/browse/HIVE-23616
> Project: Hive
>  Issue Type: Bug
>  Components: llap, Tez
>Affects Versions: 3.1.2
>Reporter: hezhang
>Assignee: hezhang
>Priority: Blocker
> Attachments: HIVE-23616.patch
>
>
> LlapCacheAwareFs has a bug when splitting consecutive chunks.
> {code:java}
> Task failed, taskId=task_1588830091075_0817_1_00_03, 
> diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( 
> failure ) : java.lang.AssertionError: Lower bound for offset 8456998 is 
> [7891192, 12033926) at 
> org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.getAndValidateMissingChunks(LlapCacheAwareFs.java:384)
>  at 
> org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:259)
>  at java.io.DataInputStream.read(DataInputStream.java:149) at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:102)
>  at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
>  at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
>  at 
> org.apache.parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(ParquetFileReader.java:1174)
>  at 
> org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:805)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:423)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:401)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:353)
>  at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:92)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>  at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>  at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
>  at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>  at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116) 
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:419)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>  at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250) at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
>  at java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
>  at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
>  at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at 
> org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:111)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745) , errorMessage=Cann

[jira] [Updated] (HIVE-17957) Hive 2.4.0 Release Planning

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-17957:
---
Fix Version/s: (was: 2.4.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Hive 2.4.0 Release Planning
> ---
>
> Key: HIVE-17957
> URL: https://issues.apache.org/jira/browse/HIVE-17957
> Project: Hive
>  Issue Type: Task
>Affects Versions: 2.4.0
>Reporter: Sahil Takiar
>Priority: Major
>






[jira] [Updated] (HIVE-22377) Refactor CalcitePlanner

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-22377:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Refactor CalcitePlanner
> ---
>
> Key: HIVE-22377
> URL: https://issues.apache.org/jira/browse/HIVE-22377
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Miklos Gergely
>Assignee: Miklos Gergely
>Priority: Major
>
> CalcitePlanner is a 5000+ line class which is trying to do too many things on 
> its own. It extends SemanticAnalyzer, though it is not a SemanticAnalyzer. It 
> should have a better design.





[jira] [Updated] (HIVE-26649) Hive metabase performance issues due to slow queries

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26649:
---
Fix Version/s: (was: 2.3.3)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Hive metabase performance issues due to slow queries
> 
>
> Key: HIVE-26649
> URL: https://issues.apache.org/jira/browse/HIVE-26649
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 2.3.3
> Environment: metastore db :mysql 5.X
> hive:2.3.3
>Reporter: yihangqiao
>Assignee: yihangqiao
>Priority: Major
>  Labels: metastore, patch, performance
> Attachments: image-2022-10-19-14-42-33-073.png
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When the Hive metastore database uses MySQL, the metastore initiates a large 
> amount of DirectSQL during peak Hive query periods, which causes performance 
> problems in the database. The fundamental reason is that some DirectSQL 
> statements perform poorly and generate a large number of slow queries at the 
> DB level.
> For example for the following Hive query:
> {code:java}
> select * from imd_fcac_safe.fcac_dw_loan_details where ds='2021-10-10' and 
> sysid='MCFCM' {code}
> where ds and sysid are the primary and secondary partitions of the 
> imd_fcac_safe.fcac_dw_loan_details table, respectively
>  
> The Hive statement will generate the DirectSQL query as follows:
> {code:java}
> explain select PARTITIONS.PART_ID from PARTITIONS  inner join TBLS on 
> PARTITIONS.TBL_ID = TBLS.TBL_ID     and TBLS.TBL_NAME = 
> 'fcac_dw_loan_details'   inner join DBS on TBLS.DB_ID = DBS.DB_ID      and 
> DBS.NAME = 'imd_fcac_safe' inner join PARTITION_KEY_VALS FILTER0 on 
> FILTER0.PART_ID = PARTITIONS.PART_ID and FILTER0.INTEGER_IDX = 0 inner join 
> PARTITION_KEY_VALS FILTER1 on FILTER1.PART_ID = PARTITIONS.PART_ID and 
> FILTER1.INTEGER_IDX = 1 where ( ((FILTER0.PART_KEY_VAL = '2021-10-10') and 
> (FILTER1.PART_KEY_VAL = 'MCFCM')) ) {code}
> !image-2022-10-19-14-42-33-073.png!
>  
> Problems with this statement:
> There is no TBL_ID field in the PARTITION_KEY_VALS table, so the join can 
> touch same-named partitions of unrelated tables when performing an associated 
> query; there is also no index on the PARTITION_KEY_VALS columns, so the query 
> cannot be accelerated by an index.





[jira] [Updated] (HIVE-17014) Password File Encryption for HiveServer2 Client

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-17014:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Password File Encryption for HiveServer2 Client
> ---
>
> Key: HIVE-17014
> URL: https://issues.apache.org/jira/browse/HIVE-17014
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Vlad Gudikov
>Assignee: Vlad Gudikov
>Priority: Major
> Fix For: 2.1.2
>
> Attachments: PasswordFileEncryption.docx.pdf
>
>
> The main point of this issue is to encrypt the password file that is used for 
> a Beeline connection via the -w option. Any ideas or proposals would be great.





[jira] [Updated] (HIVE-24948) Enhancing performance of OrcInputFormat.getSplits with bucket pruning

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24948:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Enhancing performance of OrcInputFormat.getSplits with bucket pruning
> -
>
> Key: HIVE-24948
> URL: https://issues.apache.org/jira/browse/HIVE-24948
> Project: Hive
>  Issue Type: Bug
>  Components: ORC, Query Processor, Tez
>Reporter: Eugene Chung
>Assignee: Eugene Chung
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24948_3.1.2.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The summarized flow of generating input splits at Tez AM is like below; (by 
> calling HiveSplitGenerator.initialize())
>  # Perform dynamic partition pruning
>  # Get the list of InputSplit by calling InputFormat.getSplits()
>  
> [https://github.com/apache/hive/blob/624f62aadc08577cafaa299cfcf17c71fa6cdb3a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java#L260-L260]
>  # Perform bucket pruning with the list above if it's possible
>  
> [https://github.com/apache/hive/blob/624f62aadc08577cafaa299cfcf17c71fa6cdb3a/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java#L299-L301]
> But I observed that action 2, getting the list of InputSplits, can introduce 
> significant overhead when the inputs are ORC files in HDFS.
> For example, suppose there is an ORC table T partitioned by 'log_date', and 
> each partition is bucketed by a column 'q'. There are 240 buckets in each 
> partition and the size of each bucket (an ORC file) is, let's say, 100MB.
> The SQL is like this.  
> {noformat}
> set hive.tez.bucket.pruning=true;
> select count(*) from T
> where log_date between '2020-01-01' and '2020-06-30'
> and q = 'foobar';{noformat}
> It means there are 240 * 183(days) = 43680 ORC files in the input paths, but 
> thanks to bucket pruning, only 183 files should be processed.
> In my company's environment, the whole processing time of the SQL was roughly 
> 5 minutes. However, I've checked that it took more than 3 minutes to make the 
> list of OrcSplit for 43680 ORC files. The logs with tez.am.log.level=DEBUG 
> showed like below;
> {noformat}
> 2021-03-25 01:21:31,850 [DEBUG] [InputInitializer {Map 1} #0] 
> |orc.OrcInputFormat|: getSplits started
> ...
> 2021-03-25 01:24:51,435 [DEBUG] [InputInitializer {Map 1} #0] 
> |orc.OrcInputFormat|: getSplits finished
> 2021-03-25 01:24:51,444 [INFO] [InputInitializer {Map 1} #0] 
> |io.HiveInputFormat|: number of splits 43680
> 2021-03-25 01:24:51,444 [DEBUG] [InputInitializer {Map 1} #0] 
> |log.PerfLogger|:  end=1616602891776 duration=199668 
> from=org.apache.hadoop.hive.ql.io.HiveInputFormat>
> ...
> 2021-03-25 01:26:03,385 [INFO] [Dispatcher thread {Central}] 
> |app.DAGAppMaster|: DAG completed, dagId=dag_1615862187190_731117_1, 
> dagState=SUCCEEDED {noformat}
> 43680 - 183 = 43497 InputSplits, which consume about 60% of the entire 
> processing time, are simply discarded by action 3, pruneBuckets().
>  
> With bucket pruning, I think making the whole list of ORC input splits is not 
> necessary.
> Therefore, I suggest that the flow would be like this;
>  # Perform dynamic partition pruning
>  # Get the list of InputSplit by calling InputFormat.getSplits()
>  ## OrcInputFormat.getSplits() returns the bucket-pruned list if BitSet from 
> FixedBucketPruningOptimizer exists
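The suggested flow can be sketched as below. This is an illustrative model only, not Hive's actual OrcInputFormat API: the class, the "bucket_<id>" file-name convention, and the helper methods are hypothetical. The point is that when the pruning BitSet from FixedBucketPruningOptimizer is available, non-matching bucket files are skipped before any split is created, instead of building 43680 splits and discarding 43497 of them afterwards.

```java
import java.util.BitSet;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch: prune bucket files before split generation when a
// pruning BitSet is available; a null BitSet means no pruning info, so all
// files are kept (the pre-HIVE-24948 behavior).
public class BucketPrunedSplits {

    // Hypothetical helper: assume each bucket file is named "bucket_<id>".
    static int bucketId(String fileName) {
        return Integer.parseInt(fileName.substring(fileName.lastIndexOf('_') + 1));
    }

    static List<String> pruneFiles(List<String> bucketFiles, BitSet keptBuckets) {
        if (keptBuckets == null) {
            return bucketFiles;            // no pruning info: keep everything
        }
        return bucketFiles.stream()
                .filter(f -> keptBuckets.get(bucketId(f)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        BitSet keep = new BitSet();
        keep.set(7);                       // only bucket 7 survives pruning
        List<String> files = List.of("bucket_0", "bucket_7", "bucket_42");
        System.out.println(pruneFiles(files, keep));
    }
}
```

With 240 buckets per partition and a single-bucket predicate, this ordering would reduce the split-generation work by roughly a factor of 240 in the example above.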





[jira] [Updated] (HIVE-4564) Distinct along with order by is not working when table name is part of column name in order by clause

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-4564:
--

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Distinct along with order by is not working when table name is part of column 
> name in order by clause
> -
>
> Key: HIVE-4564
> URL: https://issues.apache.org/jira/browse/HIVE-4564
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0
>Reporter: chandra sekhar gunturi
>Priority: Major
>  Labels: distinct, hive,, order
> Fix For: 0.9.1
>
>
> I have following table named 'region'.
> hive> desc region; 
> r_regionkey int 
> r_name string 
> r_comment string
> When we use the distinct and order by clause combination with columns in 
> table_name.column_name format, the query throws a SemanticException.
> For example, the following query throws error. 
> hive> select distinct region.r_name from region order by region.r_name; 
> FAILED: SemanticException [Error 10004]: Line 1:51 Invalid table alias or 
> column reference 'region': (possible column names are: _col0)
> The same query works fine if it is written without the table name in the 
> order by clause. 
> The following query works fine for the region table: 
> hive> select distinct region.r_name from region order by r_name;
> This is a common situation in real-world queries.
> For example, I want to find out what are all the cities my employees are from.
> >> SELECT DISTINCT CITY.NAME FROM EMPLOYEE, CITY WHERE EMPLOYEE.CID=CITY.CID 
> >> ORDER BY CITY.NAME 
> Here we are forced to use CITY.NAME as it may conflict with EMPLOYEE.NAME. 





[jira] [Updated] (HIVE-26570) Incorrect results on sum(nvl(col,0)) when vectorization is ON

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26570:
---
Fix Version/s: (was: All Versions)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Incorrect results on sum(nvl(col,0)) when vectorization is ON
> -
>
> Key: HIVE-26570
> URL: https://issues.apache.org/jira/browse/HIVE-26570
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Affects Versions: 2.1.1
>Reporter: jhuchuan
>Priority: Major
>
> 1.
> create table testdb.lc_appl
> (
>   loan_no string,
>   fee_amt decimal(16,2)
> )
> clustered by (loan_no)
> into 5 buckets
> stored as orc
> tblproperties('transactional'='true');
>  
> 2.
> insert into testdb.lc_appl
> values ('a',12.12)
> insert into testdb.lc_appl
> values ('b',13.13)
>  
> set hive.vectorized.execution.enabled=false;
> select loan_no ,sum(fee_amt),sum(nvl(fee_amt,0))
> from testdb.lc_appl 
> group by loan_no 
> --correct result
> a 12.12 12.12 
> b 13.13 13.13
>  
>  
> set hive.vectorized.execution.enabled=true;
> select loan_no ,sum(fee_amt),sum(nvl(fee_amt,0))
> from testdb.lc_appl 
> group by loan_no 
> --incorrect result
> a 12.12 0
> b 13.13 0
>  
> 3. Whether hive.vectorized.execution.enabled is true or false, the result 
> below is always correct:
> select loan_no ,sum(fee_amt),sum(coalesce(fee_amt,0.00))
> from testdb.lc_appl 
> group by loan_no 
> --correct result
> a 12.12 12.12 
> b 13.13 13.13





[jira] [Updated] (HIVE-18853) Alter table change column shouldn't let define same constraint on a column with existing constraint

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-18853:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Alter table change column shouldn't let define same constraint on a column 
> with existing constraint
> ---
>
> Key: HIVE-18853
> URL: https://issues.apache.org/jira/browse/HIVE-18853
> Project: Hive
>  Issue Type: Bug
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> {code:sql}
> create table ttest(i int constraint nn1 not null enable);
> alter table ttest change i i int constraint nn2 not null enable;
> {code}
> The above statements will end up creating multiple not null constraints on 
> column i.
> {code:sql}
> desc formatted ttest;
> # Not Null Constraints
> Table:constraints.ttest
> Constraint Name:  nn1
> Column Name:  i
> Constraint Name:  nn2
> Column Name:  i
> {code}





[jira] [Updated] (HIVE-21774) Support partition level filtering for events with multiple partitions

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-21774:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Support partition level filtering for events with multiple partitions
> -
>
> Key: HIVE-21774
> URL: https://issues.apache.org/jira/browse/HIVE-21774
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> Some events in Hive can span multiple partitions, tables, or even databases. 
> Events related to transactions can span multiple databases. When a 
> transaction performs a write operation, it is added to the write notification 
> log table. During the dump of a commit transaction event, all the entries 
> present in the write notification log table for that transaction are read and 
> added to the commit transaction message. If a partition filter is supplied 
> for the dump, only those partitions which are part of the policy should be 
> added to the commit txn message.
>  * All the events which are not partition level will be added to the list of 
> events to be dumped.
>  * Pass the filter condition for the policy to the commit transaction message 
> handler (for events which are not partition level).
>  * During the dump of a commit transaction event, extract the events added to 
> the write notification log table and compare them with the filter condition.
>  * If an event from the write notification log satisfies the filter 
> condition, then add it to the commit transaction message.
>  * If the filter condition is null, then add all the events from the write 
> notification log table to the commit transaction message.
>  * For events which do not have partition level info, like open txn, abort 
> txn etc., just dump the events without any filtering. So it may happen that 
> some events which are not related to any of the satisfying partitions get 
> replayed.
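The filtering rule in the bullets above can be sketched as follows. This is a hypothetical model, not the actual replication code: the class, method, and the string representation of write notification log entries are illustrative assumptions.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch of the rule described above: entries from the write
// notification log are added to the commit transaction message only if they
// satisfy the partition filter; a null filter means "no filtering", so every
// entry is dumped.
public class CommitTxnEventFilter {

    static List<String> selectForDump(List<String> writeLogPartitions,
                                      Predicate<String> partitionFilter) {
        if (partitionFilter == null) {
            return writeLogPartitions;   // no filter: add all entries
        }
        return writeLogPartitions.stream()
                .filter(partitionFilter)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> parts = List.of("ds=2021-10-10", "ds=2021-10-11");
        // Keep only partitions covered by the (hypothetical) policy filter.
        System.out.println(selectForDump(parts, p -> p.endsWith("10")));
    }
}
```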





[jira] [Updated] (HIVE-19693) Create hive API on Java 1.9 based

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-19693:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Create hive API on Java 1.9 based
> -
>
> Key: HIVE-19693
> URL: https://issues.apache.org/jira/browse/HIVE-19693
> Project: Hive
>  Issue Type: Improvement
>  Components: API
>Affects Versions: 2.3.2
>Reporter: Murtaza Hatim Zaveri
>Assignee: Murtaza Hatim Zaveri
>Priority: Major
>  Labels: newbie
> Fix For: 0.10.1
>
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
>






[jira] [Updated] (HIVE-26484) Introduce gRPC Proto Definition

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26484:
---
Fix Version/s: (was: All Versions)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Introduce gRPC Proto Definition
> ---
>
> Key: HIVE-26484
> URL: https://issues.apache.org/jira/browse/HIVE-26484
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Rohan Sonecha
>Assignee: Rohan Sonecha
>Priority: Minor
>
> Native support for gRPC in Hive Metastore will give users access to many 
> features, including built-in streaming RPC support, default use of the HTTP/2 
> protocol, fine-grained security, and more.
>  
> This PR will introduce the complete proto file and necessary modifications to 
> pom.xml files to enable gRPC support. It will be followed with PRs that will 
> contain subsets of the Hive Metastore methods along with Thrift to gRPC 
> mappers and unit tests. 





[jira] [Updated] (HIVE-21773) Supporting external table replication with partition filter.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-21773:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Supporting external table replication with partition filter.
> 
>
> Key: HIVE-21773
> URL: https://issues.apache.org/jira/browse/HIVE-21773
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> Hive external table replication is done differently from managed table 
> replication. For external tables, a list is created of the locations of the 
> tables and partitions to be replicated. If a partition location is within the 
> table location, the partition location is not added to the list; for 
> partitions with locations outside the table, the partition location is added 
> to the list. In the case of an incremental dump, the data-related events are 
> ignored and just the metadata-related events are dumped. The list of 
> locations is prepared and used for replication. During load, the events are 
> replayed and then the distcp tasks are created, one for each location present 
> in the list.
> For partition level replication, not all partitions will be present in the 
> dump. So even if the partition locations are within the table location, each 
> partition location will be added to the list.
>  * If a where condition is present in the REPL DUMP command, then add the 
> location for each satisfying partition even though the partition location is 
> within the table location.
>  * If the table is not mentioned in the where clause, then follow the older 
> behavior.
>  * If the table is mentioned with a key but the key does not match any of the 
> partition columns, then fail the repl dump.
>  * If the table is mentioned with the key, then even if all the partitions 
> satisfy the filter condition, add the location for each partition. This is to 
> avoid copying partitions which are added using alter after the dump.
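The location-list rule above can be sketched as follows. This is a hypothetical simplification, not the actual repl dump code: the class name, method, and the string-prefix test for "inside the table location" are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the location-list rule: without a partition filter,
// only partition locations *outside* the table location are listed (the table
// directory already covers the rest); with a filter, every satisfying
// partition location is listed explicitly and the table directory is not
// copied wholesale.
public class ExternalTableLocations {

    static List<String> locationsToCopy(String tableLocation,
                                        List<String> partitionLocations,
                                        boolean partitionFilterPresent) {
        List<String> result = new ArrayList<>();
        if (!partitionFilterPresent) {
            result.add(tableLocation);   // table dir covers in-table partitions
        }
        for (String partLoc : partitionLocations) {
            boolean insideTable = partLoc.startsWith(tableLocation + "/");
            if (partitionFilterPresent || !insideTable) {
                result.add(partLoc);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> parts = List.of("/warehouse/t/ds=1", "/ext/ds=2");
        System.out.println(locationsToCopy("/warehouse/t", parts, false));
        System.out.println(locationsToCopy("/warehouse/t", parts, true));
    }
}
```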





[jira] [Updated] (HIVE-17814) Reduce Memory footprint for large database bootstrap replication load

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-17814:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Reduce Memory footprint for large database bootstrap replication load 
> --
>
> Key: HIVE-17814
> URL: https://issues.apache.org/jira/browse/HIVE-17814
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Anishek Agarwal
>Assignee: Anishek Agarwal
>Priority: Major
>
> As part of HIVE-16896 we are doing dynamic Query Task generation for 
> bootstrap repl load. This was done since the number of tasks for large 
> databases would generate a very large graph with hundreds of thousands of 
> objects, which would put additional memory pressure on Hive. 
> The execution hooks, however, still keep a reference to the query plan, which 
> gets dynamically modified, so at the end of all task execution Hive will have 
> the whole DAG in memory, which is what we have to prevent. Additionally, for 
> post-execution Hive hooks we are also storing the TaskRunner objects for each 
> task that is executed. 
> We have to handle these issues to prevent excessive memory usage for 
> replication, specifically bootstrap replication. 





[jira] [Updated] (HIVE-25464) Refactor GenericUDFToUnixTimeStamp

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25464:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Refactor GenericUDFToUnixTimeStamp
> --
>
> Key: HIVE-25464
> URL: https://issues.apache.org/jira/browse/HIVE-25464
> Project: Hive
>  Issue Type: Sub-task
>  Components: UDF
>Affects Versions: All Versions
>Reporter: Sruthi Mooriyathvariam
>Assignee: Sruthi Mooriyathvariam
>Priority: Minor
>
> Remove redundant code and refactor the entire GenericUDFToUnixTimeStamp code



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-13524) if the source files contain date / time /timestamp values in single column not able to read values by using timestamp datatype.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-13524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-13524:
---
Fix Version/s: (was: 1.1.1)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> if the source files contain date / time /timestamp values in single column 
> not able to read values by using timestamp datatype.
> ---
>
> Key: HIVE-13524
> URL: https://issues.apache.org/jira/browse/HIVE-13524
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline, CLI
>Affects Versions: 1.1.0, 1.1.1
>Reporter: William
>Priority: Minor
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> If the source files contain date / time / timestamp values in a single 
> column, we are not able to read the values using the timestamp / date data 
> type.
> The reason for the request is to support BI tools' calculations on date 
> fields, and also to make the Hive DDL support both Hive and Impala on date 
> fields (Impala doesn't have a date data type).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26401) Refine the log of add_partitions if the partition already exists

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26401:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Refine the log of add_partitions if the partition already exists
> 
>
> Key: HIVE-26401
> URL: https://issues.apache.org/jira/browse/HIVE-26401
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wechar
>Assignee: Wechar
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently {{*add_partitions_xxx*}} will log the complete information of a 
> partition if it already exists, see in 
> [HMSHandler.java#L4320|https://github.com/apache/hive/blob/e3751ab545370f9b252d0b4a07bc315037541a95/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L4320]:
> {code:java}
> if (!shouldAdd) {
>   LOG.info("Not adding partition {} as it already exists", part);
>   return false;
> }
> {code}
> It will print a long message including the columns of the partition. We 
> think this is unnecessary for the following two reasons:
> {color:red}1. The long message is redundant.{color}
> We can get enough information from just 
> *cat_name.db_name.tbl_name[part_col1=part_val1/part_col2=part_val2...]*
> {color:red}2. The long message is not friendly to store and query.{color}
> This log message takes up a lot of log space, especially when the user needs 
> to run *MSCK REPAIR TABLE* regularly, since the old partitions will already 
> exist. 
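The concise identifier proposed in point 1 can be sketched as follows. This is a minimal stdlib-only sketch; the class and method names are illustrative, not Hive's actual logging code:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Builds the short partition identifier suggested above, e.g.
// cat_name.db_name.tbl_name[part_col1=part_val1/part_col2=part_val2],
// instead of logging the full Partition object with all its columns.
public class PartitionLogName {
    public static String render(String cat, String db, String tbl, Map<String, String> partSpec) {
        String spec = partSpec.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("/"));
        return cat + "." + db + "." + tbl + "[" + spec + "]";
    }

    public static void main(String[] args) {
        Map<String, String> spec = new LinkedHashMap<>(); // preserves column order
        spec.put("ds", "2022-10-21");
        spec.put("hr", "07");
        System.out.println(render("hive", "default", "events", spec));
        // hive.default.events[ds=2022-10-21/hr=07]
    }
}
```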



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-10503) Aggregate stats cache: follow up optimizations

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10503:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Aggregate stats cache: follow up optimizations
> --
>
> Key: HIVE-10503
> URL: https://issues.apache.org/jira/browse/HIVE-10503
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.2.0
>Reporter: Vaibhav Gumashta
>Priority: Major
> Fix For: 1.3.0
>
>
> Some follow up work items:
> 1. Estimate cache nodes from memory size - currently the user needs to 
> specify size based on #nodes.
> 2. Make the AggregateStatsCache#add method asynchronous - adding to cache can 
> happen in a new thread.
> 3. Based on perf testing, explore an alternate data structure for the node 
> list per cache key.
> 4. Explore ideas to reduce locking granularity of the value list per cache 
> key.
> 5. There is an O(n*n) loop while finding the match - that should go away.
> 6. Single call to DB to get aggregate for columns not in cache.
> 7. Organize metrics capturing in a better way.
> 8. Address concerns on TTL causing stale data in cache.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25296) Replace parquet-hadoop-bundle dependency with the actual parquet modules

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25296:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Replace parquet-hadoop-bundle dependency with the actual parquet modules
> 
>
> Key: HIVE-25296
> URL: https://issues.apache.org/jira/browse/HIVE-25296
> Project: Hive
>  Issue Type: Improvement
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The parquet-hadoop-bundle is not a real dependency but a mere packaging
> of three parquet modules to create an uber jar. The Parquet community
> created this artificial module on request from HIVE-5783, but the
> benefits, if any, are unclear.
> On the contrary, using the uber dependency has some drawbacks:
> * Parquet source code cannot be attached easily in IDEs, which makes 
> debugging sessions cumbersome.
> * Finding concrete dependencies on Parquet is not possible just by 
> inspecting the pom files.
> * Extra maintenance cost for the Parquet community, which adds additional 
> verification steps during a release.
> The goal of this JIRA is to replace the uber dependency with concrete 
> dependencies to the respective modules:
> * parquet-common
> * parquet-column
> * parquet-hadoop



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-16894) Multi-threaded execution of bootstrap dump of tables / functions

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-16894:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Multi-threaded execution of bootstrap dump of tables / functions
> 
>
> Key: HIVE-16894
> URL: https://issues.apache.org/jira/browse/HIVE-16894
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Anishek Agarwal
>Assignee: Anishek Agarwal
>Priority: Major
>
> After completing HIVE-16893 the bootstrap process will dump a single table at 
> a time and hence will be very time consuming while not optimally utilizing 
> the available resources. Since there is no dependency between dumps of 
> different tables, we should be able to do this in parallel.
> Bootstrap dump at db level does:
> * bootstrap of all tables (scope of current jira) 
> ** bootstrap of all partitions in a table. 
> * bootstrap of all functions (scope of current jira) 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26195) Keep Kafka handler naming style consistent with others

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26195:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Keep Kafka handler naming style consistent with others
> --
>
> Key: HIVE-26195
> URL: https://issues.apache.org/jira/browse/HIVE-26195
> Project: Hive
>  Issue Type: Improvement
>  Components: StorageHandler
>Affects Versions: 4.0.0-alpha-2
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Keep Kafka handler naming style consistent with others (JDBC, Hbase, Kudu, 
> Druid)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-7603) Getting exception while running Hive queries using Oozie's Hive action

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-7603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-7603:
--
Fix Version/s: (was: 0.12.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Getting exception while running Hive queries using Oozie's Hive action
> --
>
> Key: HIVE-7603
> URL: https://issues.apache.org/jira/browse/HIVE-7603
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.12.0
> Environment: Oozie 4.0,Hive.12,CDH5
>Reporter: Anuroopa George
>Priority: Major
>
> Getting the following exception:
> [main] ERROR hive.ql.exec.DDLTask  - 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: 
> Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient 
> The queries work fine when submitted at the Hive prompt. I also updated the 
> Oozie share lib for Hive 0.12. After updating Oozie lib/hive to Hive 0.12, 
> I am still facing this issue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-10849) LLAP: Serialize handling of requests / events for a query within daemons

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10849:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> LLAP: Serialize handling of requests / events for a query within daemons
> 
>
> Key: HIVE-10849
> URL: https://issues.apache.org/jira/browse/HIVE-10849
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
>Priority: Major
> Fix For: llap
>
>
> It's possible for requests to come in out of order with multiple listeners 
> and sending threads. Serializing processing of these events per DAG should 
> simplify the code.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-16904) during repl load for large number of partitions the metadata file can be huge and can lead to out of memory

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-16904:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> during repl load for large number of partitions the metadata file can be huge 
> and can lead to out of memory 
> 
>
> Key: HIVE-16904
> URL: https://issues.apache.org/jira/browse/HIVE-16904
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Anishek Agarwal
>Assignee: Anishek Agarwal
>Priority: Major
>
> The metadata pertaining to a table and its partitions is stored in a single 
> file. During repl load all the data is loaded into memory in one shot and 
> then individual partitions are processed. This can lead to huge memory 
> overhead as the entire file is read into memory. Try to deserialize the 
> partition objects with some sort of streaming JSON deserializer. 
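The streaming idea can be sketched as follows. This is a stdlib-only sketch, not Hive's repl-load code: the real metadata file is JSON and would be walked with a streaming JSON parser (e.g. Jackson's JsonParser); a line-per-record format stands in here so the example stays self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Processes one partition record at a time instead of materializing the
// whole metadata file in memory, which is the point of the ticket.
public class StreamingPartitionLoad {
    static int loadPartitions(BufferedReader reader) throws IOException {
        int loaded = 0;
        String line;
        while ((line = reader.readLine()) != null) { // only one record held at a time
            // deserialize and load the single partition described by `line` here
            loaded++;
        }
        return loaded;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a large metadata file: one partition per line.
        String file = "part=1\npart=2\npart=3\n";
        System.out.println(loadPartitions(new BufferedReader(new StringReader(file)))); // 3
    }
}
```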



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25930) hive-3.1.1: fix for current_timestamp being 8 hours off due to timezone

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25930:
---
Fix Version/s: (was: 3.1.2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

>  hive-3.1.1: fix for current_timestamp being 8 hours off due to timezone
> ---
>
> Key: HIVE-25930
> URL: https://issues.apache.org/jira/browse/HIVE-25930
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Planning
>Affects Versions: 3.1.1
>Reporter: lkl
>Priority: Major
> Attachments: 1644206926(1).png
>
>
> current_timestamp defaults to the UTC timezone, which is inconsistent with 
> the system timezone



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-10117) LLAP: Use task number, attempt number to cache plans

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10117:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> LLAP: Use task number, attempt number to cache plans
> 
>
> Key: HIVE-10117
> URL: https://issues.apache.org/jira/browse/HIVE-10117
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>Priority: Major
> Fix For: llap
>
>
> Use the task number and attempt number instead of relying only on thread 
> locals. This can be used to share the work between Inputs / Processor / 
> Outputs in Tez.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-24713) HS2 never shut down after reconnecting to Zookeeper

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24713:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> HS2 never shut down after reconnecting to Zookeeper
> ---
>
> Key: HIVE-24713
> URL: https://issues.apache.org/jira/browse/HIVE-24713
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Eugene Chung
>Assignee: Eugene Chung
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> While using ZooKeeper discovery mode, the problem that HS2 never notices it 
> has been deregistered from ZooKeeper always happens.
> Reproduction is simple.
>  # Find one of the zk servers which holds the DeRegisterWatcher watches of 
> HS2 instances. If the version of ZK server is 3.5.0 or above, it's easily 
> found with [http://zk-server:8080/commands/watches] (ZK AdminServer feature)
>  # Check which HS2 instance is watching on the ZK server found at 1, say it's 
> _hs2-of-2_
>  # Restart the ZK server found at 1
>  # Deregister _hs2-of-2_ with the command
> {noformat}
> hive --service hiveserver2 -deregister hs2-of-2{noformat}
>  # _hs2-of-2_ never learns that it must shut down, because the watch event 
> of DeregisterWatcher was already fired at the time of step 3.
> The reason of the problem is explained at 
> [https://zookeeper.apache.org/doc/r3.3.3/zookeeperProgrammers.html#sc_WatchRememberThese]
> I added some logging to DeRegisterWatcher and checked what events occurred 
> at the time of step 3 (restarting the ZK server):
>  # WatchedEvent state:Disconnected type:None path:null
>  # WatchedEvent[WatchedEvent state:SyncConnected type:None path:null]
>  # WatchedEvent[WatchedEvent state:SaslAuthenticated type:None path:null]
>  # WatchedEvent[WatchedEvent state:SyncConnected type:NodeDataChanged
>  path:/hiveserver2/serverUri=hs2-of-2:1;version=3.1.2;sequence=000711]
> As the zk manual says, watches are one-time triggers. When the connection to 
> the ZK server was reestablished, state:SyncConnected type:NodeDataChanged for 
> the path is fired and it's the end. *DeregisterWatcher must be registered 
> again for the same znode to get a future NodeDeleted event.*
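The one-time-trigger behavior described above can be sketched with a small stdlib-only model. This is not the ZooKeeper client API; the class and method names are invented for illustration:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Minimal model of ZooKeeper's one-time-trigger watches: a registered
// watcher fires for the next event only and is then discarded, so a
// watcher that wants to see a later NodeDeleted must re-register itself
// after every callback, as the ticket says DeregisterWatcher must.
public class OneShotWatch {
    private final Deque<Consumer<String>> watchers = new ArrayDeque<>();

    public void register(Consumer<String> w) { watchers.add(w); }

    public void fire(String event) {
        // Drain the current watchers; they do NOT survive the event.
        Deque<Consumer<String>> current = new ArrayDeque<>(watchers);
        watchers.clear();
        current.forEach(w -> w.accept(event));
    }

    // Replays the sequence from the ticket: the reconnect-time
    // NodeDataChanged consumes the watch, so only a self-re-registering
    // watcher still observes the later NodeDeleted.
    public static String runScenario() {
        OneShotWatch znode = new OneShotWatch();
        StringBuilder seen = new StringBuilder();
        Consumer<String>[] self = new Consumer[1];
        self[0] = ev -> { seen.append(ev).append(";"); znode.register(self[0]); };
        znode.register(self[0]);
        znode.fire("NodeDataChanged"); // fired when the connection is reestablished
        znode.fire("NodeDeleted");     // seen only because we re-registered
        return seen.toString();
    }

    public static void main(String[] args) {
        System.out.println(runScenario()); // NodeDataChanged;NodeDeleted;
    }
}
```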



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-18215) Possible code optimization exists for "INSERT OVERWRITE on MM table. SELECT FROM (SELECT .. UNION ALL SELECT ..)

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-18215:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Possible code optimization exists for "INSERT OVERWRITE on MM table. SELECT 
> FROM (SELECT .. UNION ALL SELECT ..)
> ---
>
> Key: HIVE-18215
> URL: https://issues.apache.org/jira/browse/HIVE-18215
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Steve Yeom
>Assignee: Steve Yeom
>Priority: Minor
>
> removeTempOrDuplicateFiles(.) has an opportunity for a performance 
> optimization in the test case "INSERT OVERWRITE on MM table. SELECT FROM 
> (SELECT .. UNION ALL SELECT ..)" from dp_counter_mm.q.
> This is MM-table specific: we can avoid calling fs.exists() by creating a 
> specific mmDirectories list for the current SELECT statement (out of the two 
> SELECTs in our test case from dp_counter_mm.q) from the IOW union-all query.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-18049) Enable Hive on Tez to provide globally sorted clustered table

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-18049:
---
Fix Version/s: (was: 2.1.1)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Enable Hive on Tez to provide globally sorted clustered table
> -
>
> Key: HIVE-18049
> URL: https://issues.apache.org/jira/browse/HIVE-18049
> Project: Hive
>  Issue Type: New Feature
>  Components: Hive, Tez
>Reporter: LingXiao Lan
>Priority: Major
> Attachments: CombinedPartitioner.txt, HIVE-18049.1.patch, 
> tez-0.8.5.txt
>
>
> {code:sql}
> CREATE TABLE `test`(
>`time` int,
>`userid` bigint)
>  CLUSTERED BY (
>userid)
>  SORTED BY (
>userid ASC)
>  INTO 4 BUCKETS
>  ;
> {code}
> I'm new to Hive. When I tried to use Hive to store my data, I ran into the 
> following issue.
> When inserting data into this table, the data is sorted into 4 buckets 
> automatically. But because Hive uses a hash partitioner by default, the data 
> is only sorted within each bucket and isn't sorted across buckets. 
> Sometimes we need the data to be globally sorted, to optimize indexing, for 
> example.
> If we can sample the table first and use TotalOrderPartitioner, this work 
> could be done. The difficulty is how to automatically decide when to use 
> TotalOrderPartitioner and when not, because an insertion query can be 
> complex, which results in a complex DAG in Tez.
> I have implemented a temporary version. It uses a custom partitioner which 
> combines the hash partitioner and the total-order partitioner. A physical 
> optimizer is added to Hive to decide which partitioner to choose. But in 
> order to reduce the workload, this version had to modify Tez source code, 
> which is not actually necessary.
> I'm wondering if we can implement a more general version which addresses 
> this issue.
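The global-sort idea rests on range partitioning from sampled cut points: bucket i holds only keys smaller than every key in bucket i+1, so sorted buckets are globally sorted. A minimal sketch with illustrative names, not the Hadoop/Tez TotalOrderPartitioner implementation:

```java
import java.util.Arrays;

// Range partitioner over sample-derived cut points. With buckets that are
// each internally sorted (as SORTED BY guarantees), range assignment makes
// the concatenation of buckets 0..n-1 globally sorted.
public class RangePartitioner {
    private final long[] cutPoints; // sorted boundaries from sampling, length = buckets - 1

    RangePartitioner(long[] cutPoints) { this.cutPoints = cutPoints; }

    int bucketOf(long key) {
        int idx = Arrays.binarySearch(cutPoints, key);
        // Exact hit goes to the higher bucket; miss uses the insertion point.
        return idx >= 0 ? idx + 1 : -(idx + 1);
    }

    public static void main(String[] args) {
        RangePartitioner p = new RangePartitioner(new long[]{100, 200, 300}); // 4 buckets
        System.out.println(p.bucketOf(50));   // 0
        System.out.println(p.bucketOf(150));  // 1
        System.out.println(p.bucketOf(300));  // 3
    }
}
```

A hash partitioner, by contrast, assigns `hash(key) % buckets`, which scatters adjacent keys across buckets and is why the data is sorted only within each bucket.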



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-20454) extend inheritPerms to ACID in Hive 1.X

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20454:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> extend inheritPerms to ACID in Hive 1.X
> ---
>
> Key: HIVE-20454
> URL: https://issues.apache.org/jira/browse/HIVE-20454
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Priority: Major
> Fix For: 1.3.0
>
> Attachments: HIVE-20454.02.patch
>
>
> CompactorMR.commitJob() does a rename() in way that doesn't respect 
> {{HiveConf.ConfVars.HIVE_WAREHOUSE_SUBDIR_INHERIT_PERMS}}.
> Few other places where Acid write does the same.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26563) Add extra columns in Show Compactions output and sort the output

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26563:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Add extra columns in Show Compactions output and sort the output
> 
>
> Key: HIVE-26563
> URL: https://issues.apache.org/jira/browse/HIVE-26563
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 3.0.0
>Reporter: KIRTI RUGE
>Assignee: KIRTI RUGE
>Priority: Major
>
> SHOW COMPACTIONS needs reformatting in the following aspects:
> 1. Add all of the columns below: 
>    host information, duration, next_txn_id, txn_id, commit_time, 
> highest_write_id, cleaner start, tbl_properties
> 2. Sort the output so that it displays the most recent element at the 
> start (either completed or in progress)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26618) Add setting to turn on/off removing sections of a query plan known never produces rows

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26618:
---
Fix Version/s: (was: 4.0.0)
   (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Add setting to turn on/off removing sections of a query plan known never 
> produces rows
> --
>
> Key: HIVE-26618
> URL: https://issues.apache.org/jira/browse/HIVE-26618
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-26524 introduced an optimization to remove sections of the query plan 
> that are known to never produce rows.
> Add a setting to the Hive conf to turn this optimization on/off.
> When the optimization is turned off, restore the legacy behavior:
> * represent the empty result operator with {{HiveSortLimit}} 0
> * disable {{HiveRemoveEmptySingleRules}}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-16865) Handle replication bootstrap of large databases

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-16865:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Handle replication bootstrap of large databases
> ---
>
> Key: HIVE-16865
> URL: https://issues.apache.org/jira/browse/HIVE-16865
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Anishek Agarwal
>Assignee: Anishek Agarwal
>Priority: Major
>
> For larger databases, make sure that we can handle replication bootstrap.
> * Assume a large database can have close to a million tables, or a few 
> tables with a few hundred thousand partitions. 
> * For function replication: if a primary warehouse has a large number of 
> custom functions defined such that the same binary file incorporates most of 
> these functions, then on the replica warehouse there might be a problem in 
> loading all these functions, as the same jar on the primary will be copied 
> over for each function, so each function will have a local copy of the jar; 
> loading all these jars might lead to excessive memory usage. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26559) Skip unnecessary get all partition operations when where condition with 1=0 in CBO.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26559:
---
Fix Version/s: (was: All Versions)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Skip unnecessary get all partition operations when where condition with 1=0 
> in CBO.
> ---
>
> Key: HIVE-26559
> URL: https://issues.apache.org/jira/browse/HIVE-26559
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Affects Versions: All Versions
>Reporter: shuaiqi.guo
>Assignee: shuaiqi.guo
>Priority: Major
> Attachments: HIVE-26559.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> In some cases, queries are executed with a WHERE condition of "1=0" just to 
> get the table schema. E.g.
> {noformat}
> SELECT
>   *
> FROM
>   table_with_millions_of_partitions
> WHERE
>   1=0
> {noformat}
> In actual production, it looks more like:
> {noformat}
> SELECT
>   *
> FROM
>   table_with_millions_of_partitions
> WHERE
>   partition_col1 = value1
>   and partition_col2 = value2
>   and if(some conditions, true, false)
> {noformat}
>  
> When the CBO optimizer optimizes the execution plan of this query, it 
> fetches all the partitions of table_with_millions_of_partitions. This is 
> useless work and causes HiveServer2 to fail when the number of partitions is 
> very high.
>  
> This patch skips the unnecessary get-all-partitions operation when pruneNode 
> is always false.





[jira] [Updated] (HIVE-24708) org.apache.thrift.transport.TTransportException: null;Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.StatsTa

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24708:
---
Fix Version/s: (was: 0.13.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> org.apache.thrift.transport.TTransportException: null;Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.StatsTask
> 
>
> Key: HIVE-24708
> URL: https://issues.apache.org/jira/browse/HIVE-24708
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 3.1.2
> Environment: centos7
> hadoop:3.1.4
> hbase:2.2.6
> mysql:5.7.32
> hive:3.1.2
>  
>Reporter: tiki
>Priority: Major
> Attachments: hive错误日志.txt
>
>
> h2. 1. Original configuration
> The following was configured in hive-site.xml:
>  
> {code:xml}
> <property>
>   <name>hive.metastore.uris</name>
>   <value>thrift://hadoop-server-004:9083,hadoop-server-005:9083</value>
> </property>
> {code}
> After that, inserting data in Hive prints an Error log entry, although the 
> data is inserted successfully. Hive's hive.log log file shows the following 
> errors:
> {code:java}
> org.apache.thrift.transport.TTransportException: null
> ..
> 2021-01-27T20:00:17,086 ERROR [HiveServer2-Background-Pool: Thread-510] 
> exec.StatsTask: Failed to run stats task 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.transport.TTransportException
> ..
> Caused by: org.apache.thrift.transport.TTransportException
> ..
> 2021-01-27T20:00:17,089 ERROR [HiveServer2-Background-Pool: Thread-510] 
> operation.Operation: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.StatsTask
> ..
> {code}
> See the attachment for the full error log.
> h2. 2. Modified configuration
> After changing the configuration to:
>  
> {code:xml}
> <property>
>   <name>hive.metastore.uris</name>
>   <value></value>
> </property>
> {code}
> inserting data works normally.





[jira] [Updated] (HIVE-26113) Align HMS and metastore tables's schema

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26113:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Align HMS and metastore tables's schema
> ---
>
> Key: HIVE-26113
> URL: https://issues.apache.org/jira/browse/HIVE-26113
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> HMS tables should be in sync with those exposed by Hive metastore via _sysdb_.
> At the moment there are some discrepancies for the existing tables; the 
> present ticket aims at bridging this gap.





[jira] [Updated] (HIVE-19566) Vectorization: Fix NULL / Wrong Results issues in Complex Type Functions

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-19566:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Vectorization: Fix NULL / Wrong Results issues in Complex Type Functions
> 
>
> Key: HIVE-19566
> URL: https://issues.apache.org/jira/browse/HIVE-19566
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Priority: Critical
>
> Write new unit tests that use random data and intentional isRepeating 
> batches to check for NULL and Wrong Results in vectorized Complex Type 
> functions:
>  * index
>  * (StructField)





[jira] [Updated] (HIVE-3906) URI_Escape and URI_UnEscape UDF

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-3906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-3906:
--
Fix Version/s: (was: 0.8.1)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> URI_Escape and URI_UnEscape UDF
> ---
>
> Key: HIVE-3906
> URL: https://issues.apache.org/jira/browse/HIVE-3906
> Project: Hive
>  Issue Type: New Feature
>  Components: UDF
>Affects Versions: 0.8.1
> Environment: Hadoop 0.20.1
> Java 1.6.0
>Reporter: Liu Zongquan
>Priority: Major
>  Labels: patch
> Attachments: HIVE-3906.1.patch.txt, udf_uri_escape.q, 
> udf_uri_escape.q.out, udf_uri_unescape.q, udf_uri_unescape.q.out
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Current releases of Hive lack a function that encodes URL or form 
> parameters, i.e. escapes the URI.
> The function URI_ESCAPE(uri) would return the encoded form of the URI, which 
> would be useful in HiveQL. It is always advisable to encode URL and form 
> parameters; a plain form parameter is vulnerable to cross-site attacks and 
> SQL injection and may lead a web application to unpredicted output.
> Functionality :-
> Function Name: URI_ESCAPE (uri)
> Returns the encoded form of the uri.
> Example: hive> SELECT URI_ESCAPE('http://www.example.com?a=l&t');
> -> 'http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t'
> Usage :-
> Case 1 : To get encoded uri corresponding to a particular uri
> hive> SELECT URI_ESCAPE('http://google.com/resource?key=value1 & value2');
> -> 'http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2'
> Case 2 : To query a table to get encoded form of the urls corresponding to 
> users
> Table :- USER_URLS
> userid |url
> USR1|http://www.example.com?a=l&t   
> USR00010|http://search.barnesandnoble.com/booksearch/first book.pdf   
>
> USR00100|http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4
> USR01000|http://google.com/resource?key=value
> USR1|http://google.com/resource?key=value1 & value2
> USR10001|ftp://eau.ww.eesd.gov.calgary/home/smith/budget.wk1
> USR10010|gopher://gopher.voa.gov
> USR10100|http://www.apple.com/index.html
> USR11000|file:/data/letters/to_mom.txt
> USR11001|http://www.cuug.ab.ca:8001/~branderr/csce.html 
> Query : select userid,url,uri_escape(uri) from USER_URLS;
> Result :-
> USR1|http://www.example.com?a=l&t|http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t
>
> USR00010|http://search.barnesandnoble.com/booksearch/first 
> book.pdf|http%3A%2F%2Fsearch.barnesandnoble.com%2Fbooksearch%2Ffirst%20book.pdf 
> 
> USR00100|http://abc.dev.domain.com/0007AC/ads/800x480 15sec 
> h.264.mp4|http%3A%2F%2Fabc.dev.domain.com%2F0007AC%2Fads%2F800x480%2015sec%20h.264.mp4
> USR01000|http://google.com/resource?key=value|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue
> USR1|http://google.com/resource?key=value1 & 
> value2|http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2
> USR10001|ftp://eau.ww.eesd.gov.calgary/home/smith/budget.wk1|ftp%3A%2F%2Feau.ww.eesd.gov.calgary%2Fhome%2Fsmith%2Fbudget.wk1
> USR10010|gopher://gopher.voa.gov|gopher%3A%2F%2Fgopher.voa.gov
> USR10100|http://www.apple.com/index.html|http%3A%2F%2Fwww.apple.com%2Findex.html
> USR11000|file:/data/letters/to_mom.txt|file%3A%2Fdata%2Fletters%2Fto_mom.txt
> USR11001|http://www.cuug.ab.ca:8001/~branderr/csce.html|http%3A%2F%2Fwww.cuug.ab.ca%3A8001%2F%7Ebranderr%2Fcsce.html
> Current releases of Hive also lack a function that decodes an encoded URI.
> The function URI_UNESCAPE(uri) would return the decoded form of the encoded 
> URI, which would be useful in HiveQL. This function converts the specified 
> string by replacing any escape sequences with their unescaped 
> representation.
> Functionality :-
> Function Name: URI_UNESCAPE (uri)
> Returns the decoded form of the encoded uri.
> Example: hive> SELECT 
> URI_UNESCAPE('http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t');
> -> 'http://www.example.com?a=l&t'
> Usage :-
> Case 1 : To get decoded uri corresponding to a particular encoded uri
> hive> SELECT 
> URI_UNESCAPE('http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2');
> -> 'http://google.com/resource?key=value1 & value2'
> Case 2 : To query a table to get decoded form of the encoded urls 
> corresponding to users
> Table :- USER_URLS
> userid |encodedurl
> USR1|http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t
> USR00010|http://search.barnesandnoble.com/booksearch/first%20bo

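The URI_ESCAPE/URI_UNESCAPE semantics proposed above can be sketched with Python's standard urllib.parse module. This is only an illustration of the expected behavior, not Hive code; the hypothetical helper names uri_escape/uri_unescape stand in for the proposed UDFs, and the fully percent-encoded style (no characters treated as safe) is assumed to match the examples in the description.

```python
from urllib.parse import quote, unquote

def uri_escape(uri: str) -> str:
    # safe="" escapes reserved characters such as ':' and '/' too,
    # matching the fully-encoded outputs shown in the examples above.
    return quote(uri, safe="")

def uri_unescape(uri: str) -> str:
    # Replace every %XX escape sequence with its unescaped character.
    return unquote(uri)

print(uri_escape("http://www.example.com?a=l&t"))
# http%3A%2F%2Fwww.example.com%3Fa%3Dl%26t
print(uri_unescape("http%3A%2F%2Fgoogle.com%2Fresource%3Fkey%3Dvalue1%20%26%20value2"))
# http://google.com/resource?key=value1 & value2
```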
[jira] [Updated] (HIVE-19672) Column Names mismatch between native Druid Tables and Hive External table map

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-19672:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Column Names mismatch between native Druid Tables and Hive External table map
> -
>
> Key: HIVE-19672
> URL: https://issues.apache.org/jira/browse/HIVE-19672
> Project: Hive
>  Issue Type: Bug
>  Components: Druid integration
>Affects Versions: 3.0.0
>Reporter: Slim Bouguerra
>Priority: Major
>
> Druid column names are case sensitive while Hive is case insensitive.
> This implies that any Druid datasource that has upper-case characters in a 
> column name will not return the expected results.
> One possible fix is to try to remap the column names before issuing Json 
> Query to Druid.





[jira] [Updated] (HIVE-26236) count(1) with subquery count(distinct) gives wrong results with hive.optimize.distinct.rewrite=true and cbo on

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26236:
---
Fix Version/s: (was: 3.1.0)
   (was: 3.1.2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> count(1) with subquery count(distinct) gives wrong results with 
> hive.optimize.distinct.rewrite=true and cbo on
> --
>
> Key: HIVE-26236
> URL: https://issues.apache.org/jira/browse/HIVE-26236
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Logical Optimizer
>Affects Versions: All Versions
>Reporter: honghui.Liu
>Assignee: honghui.Liu
>Priority: Major
>  Labels: count(distinct
>
> It gives a wrong result when hive.optimize.distinct.rewrite is true (the 
> default in all 3.x versions). The query below returns 2, while the expected 
> result is 1.
> {code:java}
> create table count_distinct(a int, b int);
> insert into table count_distinct values (1,2),(2,3);
> set hive.execution.engine=tez;
> set hive.cbo.enable=true;
> set hive.optimize.distinct.rewrite=true;
> select count(1) from ( 
>       select count(distinct a) from count_distinct
> ) tmp; {code}
> Before CBO optimization, the RelNode tree looks like this:
> {code:java}
> HiveProject(_o__c0=[$0])
>   HiveAggregate(group=[{}], agg#0=[count($0)])
>     HiveProject($f0=[1])
>       HiveProject(_o__c0=[$0])
>         HiveAggregate(group=[{}], agg#0=[count(DISTINCT $0)])
>           HiveProject($f0=[$0])
>             HiveTableScan(table=[[default.count_distinct]], 
> table:alias=[count_distinct]) {code}
> After optimization by HiveExpandDistinctAggregatesRule, the RelNode tree 
> looks like this:
> {code:java}
> HiveProject(_o__c0=[$0])
>   HiveAggregate(group=[{}], agg#0=[count($0)])
>     HiveProject($f0=[1])
>       HiveProject(_o__c0=[$0])
>         HiveAggregate(group=[{}], agg#0=[count($0)])
>           HiveAggregate(group=[{0}])
>             HiveProject($f0=[$0])
>               HiveProject($f0=[$0])
>                 HiveTableScan(table=[[default.count_distinct]], 
> table:alias=[count_distinct]) {code}
> count(distinct xx) is converted to count(xx) from (select xx from table_name 
> group by xx).
> After projection pruning, the RelNode tree looks like this: 
> {code:java}
> HiveAggregate(group=[{}], agg#0=[count()])
>   HiveProject(DUMMY=[0])
>     HiveAggregate(group=[{}])
>       HiveAggregate(group=[{0}])
>         HiveProject(a=[$0])
>           HiveTableScan(table=[[default.count_distinct]], 
> table:alias=[count_distinct]) {code}
> In this case, an error occurs in the execution plan.
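The rewrite and the wrong result described above can be illustrated with a small Python simulation of the two plans (a sketch only, under the assumption that the pruned plan ends up counting the grouped rows; Hive itself is not involved):

```python
# Rows (a, b) of the count_distinct table from the reproduction above.
rows = [(1, 2), (2, 3)]

# Correct plan: the inner subquery computes count(distinct a) -> one row.
inner_result = [len({a for a, _ in rows})]

# Outer query: count(1) over that single inner row -> 1 (expected result).
expected = len(inner_result)

# Buggy plan: after projection pruning the outer count effectively runs
# over the "group by a" rows, counting distinct values of a -> 2.
buggy = len({a for a, _ in rows})

print(expected, buggy)  # 1 2
```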





[jira] [Updated] (HIVE-10694) LLAP: Add counters for time lost per query due to preemption

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10694:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> LLAP: Add counters for time lost per query due to preemption
> 
>
> Key: HIVE-10694
> URL: https://issues.apache.org/jira/browse/HIVE-10694
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>Priority: Major
> Fix For: llap
>
>






[jira] [Updated] (HIVE-20528) hive onprem-s3 replication is slow

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20528:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> hive onprem-s3 replication is slow
> --
>
> Key: HIVE-20528
> URL: https://issues.apache.org/jira/browse/HIVE-20528
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> Hive on-prem to S3 replication was initiated for a 1TB TPC-DS schema (~250GB 
> ORC file size), and 16 hours into the run it still reports only 10% 
> complete. Out of the 24 tables, only 4 tables are replicated.
> The same schema in the on-prem to on-prem case gets done in about 2 hours.





[jira] [Updated] (HIVE-17678) appendPartition in HiveMetaStoreClient does not conform to the IMetaStoreClient.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-17678:
---
Fix Version/s: (was: 1.1.0)
   (was: 2.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> appendPartition in HiveMetaStoreClient does not conform to the 
> IMetaStoreClient.
> 
>
> Key: HIVE-17678
> URL: https://issues.apache.org/jira/browse/HIVE-17678
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Nithish
>Assignee: Nithish
>Priority: Major
>  Labels: Metastore
> Attachments: HIVE-17678.1.patch
>
>
> {code:java}
> Partition appendPartition(String dbName, String tableName, String partName) 
> {code}
> in HiveMetaStoreClient does not conform with the declaration 
> {code:java}
> Partition appendPartition(String tableName, String dbName, String name) 
> {code} 
> in IMetaStoreClient
> *Positions for dbName and tableName are interchanged.*





[jira] [Updated] (HIVE-17579) repl load without providing the database name in the command fails.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-17579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-17579:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> repl load without providing the database name in the command fails.
> ---
>
> Key: HIVE-17579
> URL: https://issues.apache.org/jira/browse/HIVE-17579
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Anishek Agarwal
>Assignee: Anishek Agarwal
>Priority: Major
>
> repl dump [databasename] => [hdfs location]
> If we run {{repl load [hdfs location]}}, this fails. It should pick the 
> database name from the metadata file in this case, but instead it leads to a 
> HiveException.





[jira] [Updated] (HIVE-21638) Hive execute miss stage

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-21638:
---
Fix Version/s: (was: 2.3.2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Hive execute miss stage
> ---
>
> Key: HIVE-21638
> URL: https://issues.apache.org/jira/browse/HIVE-21638
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.3.2
>Reporter: ann
>Priority: Major
>  Labels: pull-request-available
> Attachments: stage-miss-bugfix.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When query execution finished, some stages were missed because a status 
> check failed.
> After execution finishes, Hive needs to check whether all stages were 
> executed, and throw an exception if not, to avoid an incorrect final result.





[jira] [Updated] (HIVE-26112) Missing scripts for metastore

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26112:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Missing scripts for metastore
> -
>
> Key: HIVE-26112
> URL: https://issues.apache.org/jira/browse/HIVE-26112
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 4.0.0-alpha-2
>Reporter: Alessandro Solimando
>Assignee: Alessandro Solimando
>Priority: Blocker
>
> The versions of the scripts for _metastore_ and _standalone-metastore_ 
> should be in sync, but at the moment the metastore side is missing the 3.2.0 
> scripts (in _metastore/scripts/upgrade/hive_), while they are present in the 
> standalone-metastore counterpart(s):
> * hive-schema-3.2.0.*.sql
> * upgrade-3.1.0-to-3.2.0.*.sql
> * upgrade-3.2.0-to-4.0.0-alpha-1.*.sql
> * upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.*.sql





[jira] [Updated] (HIVE-10469) Create table Like in HcatClient does not create partitions

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10469:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Create table Like in HcatClient does not create partitions
> --
>
> Key: HIVE-10469
> URL: https://issues.apache.org/jira/browse/HIVE-10469
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog
>Reporter: Antoni Ivanov
>Priority: Major
> Fix For: 0.12.1
>
>
> When using HCatClient#createTableLike, the created table is missing 
> partitions although the original table does have them.
> This is unlike the HCatalog REST API: 
> https://hive.apache.org/javadocs/hcat-r0.5.0/rest.html 
> or an Impala/Hive SQL query. 





[jira] [Updated] (HIVE-24778) Unify hive.strict.timestamp.conversion and hive.strict.checks.type.safety properties

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24778:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Unify hive.strict.timestamp.conversion and hive.strict.checks.type.safety 
> properties
> 
>
> Key: HIVE-24778
> URL: https://issues.apache.org/jira/browse/HIVE-24778
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 4.0.0
>Reporter: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The majority of strict type checks can be controlled by 
> {{hive.strict.checks.type.safety}} property. HIVE-24157 introduced another 
> property, namely  {{hive.strict.timestamp.conversion}}, to control the 
> implicit comparisons between numerics and timestamps.
> The name and description of {{hive.strict.checks.type.safety}} imply that the 
> property covers all strict checks so having others for specific cases appears 
> confusing and can easily lead to unexpected behavior.
> The goal of this issue is to unify those properties to facilitate 
> configuration and improve code reuse.





[jira] [Updated] (HIVE-21147) Remove Contrib RegexSerDe

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-21147:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Remove Contrib RegexSerDe
> -
>
> Key: HIVE-21147
> URL: https://issues.apache.org/jira/browse/HIVE-21147
> Project: Hive
>  Issue Type: Improvement
>  Components: Serializers/Deserializers
>Affects Versions: 4.0.0
>Reporter: David Mollitor
>Priority: Major
>
> https://github.com/apache/hive/blob/f37c5de6c32b9395d1b34fa3c02ed06d1bfbf6eb/contrib/src/java/org/apache/hadoop/hive/contrib/serde2/RegexSerDe.java
> https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java
> Merge any differences in functionality and remove the version in the 
> 'contrib' library.





[jira] [Updated] (HIVE-10266) Boolean expression True and True returns False

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10266:
---
Fix Version/s: (was: 0.13.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Boolean expression True and True returns False
> --
>
> Key: HIVE-10266
> URL: https://issues.apache.org/jira/browse/HIVE-10266
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 0.14.0
>Reporter: ckran
>Priority: Major
>
> A Hive query with a Boolean expression whose day and month calculations each 
> evaluate to TRUE returns FALSE when they are combined with AND. 
> create table datest (cntr int, date date ) row format delimited fields 
> terminated by ',' stored as textfile ;
> insert into table datest values (1,'2015-04-8') ;
> select
> ((DAY('2015-05-25') - DAY(DATE)) < 25), 
> ((MONTH('2015-05-25') - MONTH(DATE)) = 1) ,
> ((DAY('2015-05-25') - DAY(DATE)) < 25) AND ((MONTH('2015-05-25') - 
> MONTH(DATE)) = 1) 
> from datest 
> Returns values
> True | True | False 
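For comparison, the same arithmetic evaluated outside Hive (a Python sketch of the intended semantics, not Hive code) yields TRUE for the conjunction:

```python
from datetime import date

ref = date(2015, 5, 25)
d = date(2015, 4, 8)  # the row inserted into datest above

day_check = (ref.day - d.day) < 25        # 25 - 8 = 17 < 25 -> True
month_check = (ref.month - d.month) == 1  # 5 - 4 = 1 -> True

# TRUE AND TRUE must be TRUE; the Hive 0.14 query above returned FALSE.
print(day_check, month_check, day_check and month_check)  # True True True
```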





[jira] [Updated] (HIVE-20063) Global limit concurrent connections of HiveServer2

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20063:
---
Fix Version/s: (was: 2.4.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Global limit concurrent connections of HiveServer2
> --
>
> Key: HIVE-20063
> URL: https://issues.apache.org/jira/browse/HIVE-20063
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Affects Versions: 2.3.2, 3.0.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
>Priority: Critical
>
> HS2 should have the ability to configure a global concurrent connection 
> limit for HiveServer2. It should reject new connections once the limit is 
> reached.
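The requested behavior amounts to a global counting limit with non-blocking rejection. A minimal, hypothetical sketch of that policy follows (HiveServer2 itself is Java; this Python class, including the invented name ConnectionLimiter, only illustrates the idea):

```python
import threading

class ConnectionLimiter:
    """Reject new connections once a global concurrent limit is reached."""

    def __init__(self, max_connections: int):
        self._slots = threading.Semaphore(max_connections)

    def try_open(self) -> bool:
        # Non-blocking acquire: False means the limit is reached and the
        # incoming connection should be rejected immediately.
        return self._slots.acquire(blocking=False)

    def close(self) -> None:
        # Release a slot when a connection terminates.
        self._slots.release()

limiter = ConnectionLimiter(2)
print(limiter.try_open())  # True
print(limiter.try_open())  # True
print(limiter.try_open())  # False: limit reached, reject
limiter.close()
print(limiter.try_open())  # True: a slot was freed
```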





[jira] [Updated] (HIVE-5864) Hive Table filter Not working (ERROR:SemanticException MetaException)

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-5864:
--
Fix Version/s: (was: 0.12.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Hive Table filter Not working (ERROR:SemanticException MetaException)
> -
>
> Key: HIVE-5864
> URL: https://issues.apache.org/jira/browse/HIVE-5864
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.12.0
>Reporter: ashok Kumar
>Priority: Major
>
> hive -e "select * from  where year=2013 and month=11 and day=05 and 
> hour=22"
> Logging initialized using configuration in 
> jar:file:/usr/lib/hive/lib/hive-common-0.12.0.jar!/hive-log4j.properties
> FAILED: SemanticException MetaException(message:Filtering is supported only 
> on partition keys of type string)
> and I am able to execute hive -e "select * from   limit 3"
> I upgraded Hive from 0.10 to 0.12; this query worked in Hive 0.10. 





[jira] [Updated] (HIVE-24988) Add support for complex types columns for Dynamic Partition pruning Optimisation

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24988:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Add support for complex types columns for Dynamic Partition pruning 
> Optimisation
> 
>
> Key: HIVE-24988
> URL: https://issues.apache.org/jira/browse/HIVE-24988
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>
> DynamicPartitionPruningOptimization fails for complex types.  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-20529) Statistics update in S3 is taking time at target side during REPL Load

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20529:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Statistics update in S3 is taking time at target side during REPL Load
> --
>
> Key: HIVE-20529
> URL: https://issues.apache.org/jira/browse/HIVE-20529
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>
> The statistics operations access the file system to get the number of files 
> created by the operation. In S3 it causes 2-3 seconds of delay. The file list 
> can be obtained from the event info in the replication directory and can be 
> used to update the statistics.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25867) Partition filter condition should pushed down to metastore query if it is equivalence Predicate

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25867:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Partition filter condition should pushed down to metastore query if it is 
> equivalence Predicate
> ---
>
> Key: HIVE-25867
> URL: https://issues.apache.org/jira/browse/HIVE-25867
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Reporter: shezm
>Assignee: shezm
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When the column type of the partition differs from the column type in the HQL
> query, the metastore will not push the filter down to the RDBMS; instead it
> fetches all PARTITIONS.PART_NAME values of the Hive table and then filters them
> according to the HQL expression.
> https://github.com/apache/hive/blob/5b112aa6dcc4e374c0a7c2b24042f24ae6815da1/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java#L1316
> If the Hive table has too many partitions and multiple HQL queries run at the
> same time, the RDBMS's CPU IO_WAIT increases and performance suffers.
> If the partition filter condition in HQL is an equivalence predicate, the
> metastore should push it down to the RDBMS, which can optimize the query
> performance of large Hive tables.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-20534) File operation at target side during S3 replication slowing down the replication

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20534:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> File operation at target side during S3 replication slowing down the 
> replication
> 
>
> Key: HIVE-20534
> URL: https://issues.apache.org/jira/browse/HIVE-20534
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>
> 1. During create partition there is a check for the existence of the partition
> location (in the add-partitions core method in metastore.java). It is not
> required, as we would already have created the directory and copied the
> required files into it.
> 2. Creating the qualified directory name (convertAddSpecToMetaPartition method
> in hive.java): the file system is accessed to check whether the provided path
> is fully qualified. It is unclear why this takes 1-2 seconds.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-20132) External Table: Alter Table Change column is not supported.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20132:
---
Fix Version/s: (was: 3.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> External Table: Alter Table Change column is not supported.
> ---
>
> Key: HIVE-20132
> URL: https://issues.apache.org/jira/browse/HIVE-20132
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Dileep Kumar Chiguruvada
>Assignee: Ashutosh Chauhan
>Priority: Major
>
> External Table: Alter Table Change column is not supported.
> It fails with "ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, 
> ADDCOLS] to a non-native table"
> {code}
> 0: jdbc:hive2://ctr-e138-1518143905142-404953> alter table calcs change 
> column key string int;
> Error: Error while compiling statement: FAILED: SemanticException [Error 
> 10134]: ALTER TABLE can only be used for [ADDPROPS, DROPPROPS, ADDCOLS] to a 
> non-native table  calcs (state=42000,code=10134)
> {code}
> This is very much required for upgraded clusters, where managed tables (in 2.6
> clusters) are automatically converted to external tables (in 3.0.0).
> One such use case is storage-handler tables, where we might need to alter
> columns.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25357) Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues to unblock the pre-commit tests

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25357:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Fix the checkstyle issue in HiveIcebergMetaHook and the iceberg test issues 
> to unblock the pre-commit tests
> ---
>
> Key: HIVE-25357
> URL: https://issues.apache.org/jira/browse/HIVE-25357
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Marta Kuczora
>Assignee: Marta Kuczora
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> [ERROR] 
> /home/jenkins/agent/workspace/hive-precommit_master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergMetaHook.java:221:3:
>  Cyclomatic Complexity is 13 (max allowed is 12). [CyclomaticComplexity]
> This issue probably came in with 
> [this|https://github.com/apache/hive/commit/76c49b9df957c8c05b81a4016282c03648b728b9]
>  commit 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-16008) TRUNC of a Day and Hour is not fetching the expected output in Hive

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-16008:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> TRUNC of a Day and Hour is not fetching the expected output in Hive
> ---
>
> Key: HIVE-16008
> URL: https://issues.apache.org/jira/browse/HIVE-16008
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.1.0
> Environment: Cloudera
>Reporter: Vinoth Ragunathan
>Priority: Minor
>  Labels: hive
> Fix For: 1.2.3
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> The TRUNC function for day and for hour does not return the expected result
> in Hive.
> However, the rest of the string units, like 'MM' and '', work as expected.
> I tried the below in Hive
> Query :
> SELECT TRUNC('2016-12-11 01:02:04','DD') FROM sample_08;
> SELECT TRUNC('2016-12-11 01:02:04','HH24') FROM sample_08;
> Result:
> NULL
> Expectation:
> Ideally it should be 2016-12-11 00:00:00 for day
> 2016-12-11 01:00:00 for Hour
> Thanks,
> Vnoth R
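
The expected semantics can be stated concretely with java.time. This is an illustration of the behavior the reporter asks for, not Hive's UDF code; the class and method names are invented.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Invented helper showing the truncation the report expects:
// 'DD' zeroes the time-of-day, 'HH24' zeroes minutes and seconds.
class TruncExpectation {
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    static String trunc(String ts, String unit) {
        LocalDateTime t = LocalDateTime.parse(ts, FMT);
        switch (unit) {
            case "DD":   return FMT.format(t.truncatedTo(ChronoUnit.DAYS));
            case "HH24": return FMT.format(t.truncatedTo(ChronoUnit.HOURS));
            default:     throw new IllegalArgumentException("unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(trunc("2016-12-11 01:02:04", "DD"));   // 2016-12-11 00:00:00
        System.out.println(trunc("2016-12-11 01:02:04", "HH24")); // 2016-12-11 01:00:00
    }
}
```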



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-21049) how get hive query log by thrift server

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-21049:
---
Fix Version/s: (was: 1.2.1)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> how get hive query log by thrift server
> ---
>
> Key: HIVE-21049
> URL: https://issues.apache.org/jira/browse/HIVE-21049
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2, JDBC
>Affects Versions: 1.2.2
>Reporter: feiyuanxing
>Assignee: feiyuanxing
>Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I hope to get the query/execution log, or the application ID and job ID.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25096) beeline can't get the correct hiveserver2 using the zoopkeeper with serviceDiscoveryMode=zooKeeper.

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25096:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> beeline can't get the correct hiveserver2 using the zoopkeeper with 
> serviceDiscoveryMode=zooKeeper.
> ---
>
> Key: HIVE-25096
> URL: https://issues.apache.org/jira/browse/HIVE-25096
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.1.2
> Environment: centos7.4
> x86_64
>Reporter: xiaozhongcheng
>Assignee: hezhang
>Priority: Major
> Attachments: HIVE-25096.patch
>
>
> beeline can't get the correct HiveServer2 using ZooKeeper with
> serviceDiscoveryMode=zooKeeper.
>  
> {code:java}
> // code placeholder
> [root@vhost-120-28 hive]# beeline -u 
> "jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>  --verbose=true
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/wdp/1.0/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/wdp/1.0/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> !connect 
> jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
>  '' [passwd stripped] 
> Connecting to 
> jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> Error: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read 
> HiveServer2 configs from ZooKeeper (state=,code=0)
> java.sql.SQLException: org.apache.hive.jdbc.ZooKeeperHiveClientException: 
> Unable to read HiveServer2 configs from ZooKeeper
> at org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:170)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:208)
> at 
> org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
> at 
> org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:209)
> at org.apache.hive.beeline.Commands.connect(Commands.java:1641)
> at org.apache.hive.beeline.Commands.connect(Commands.java:1536)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:56)
> at 
> org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1384)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1423)
> at org.apache.hive.beeline.BeeLine.connectUsingArgs(BeeLine.java:900)
> at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:795)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1048)
> at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:538)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:520)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read 
> HiveServer2 configs from ZooKeeper
> at 
> org.apache.hive.jdbc.ZooKeeperHiveClientHelper.configureConnParams(ZooKeeperHiveClientHelper.java:147)
> at 
> org.apache.hive.j

[jira] [Updated] (HIVE-8851) Broadcast files for small tables via SparkContext.addFile() and SparkFiles.get() [Spark Branch]

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-8851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-8851:
--

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Broadcast files for small tables via SparkContext.addFile() and 
> SparkFiles.get() [Spark Branch]
> ---
>
> Key: HIVE-8851
> URL: https://issues.apache.org/jira/browse/HIVE-8851
> Project: Hive
>  Issue Type: Sub-task
>  Components: Spark
>Reporter: Xuefu Zhang
>Assignee: Jimmy Xiang
>Priority: Major
> Fix For: spark-branch
>
> Attachments: HIVE-8851.1-spark.patch, HIVE-8851.2-spark.patch
>
>
> Currently, files generated by SparkHashTableSinkOperator for small tables are
> written directly to HDFS with a high replication factor. When a map join
> happens, the map join operator loads these files into hash tables. Since
> multiple partitions can be processed on the same worker node, reading the same
> set of files multiple times is not ideal. The improvement can be made by
> calling SparkContext.addFile() on these files and using SparkFiles.get() to
> download them to the worker node just once.
> Please note that SparkFiles.get() is a static method. Code invoking this
> method needs to be in a static method. That calling method needs to be
> synchronized because it may be called from different threads.
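
The download-once-per-worker pattern described above can be sketched as follows. The class is hypothetical: a counter stands in for the expensive `SparkFiles.get()` download so the idea is self-contained, and the synchronized static method mirrors the threading note in the ticket.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: fetch a small-table file at most once per worker,
// even when several map-join tasks race for it on the same node.
class SmallTableCache {
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();
    static int downloads = 0; // how many times the "download" actually ran

    // Static and synchronized, as the ticket notes SparkFiles.get() is static
    // and may be reached from several task threads at once.
    static synchronized String fetch(String name) {
        return CACHE.computeIfAbsent(name, n -> {
            downloads++;                  // in real code: SparkFiles.get(n)
            return "/local/path/" + n;    // placeholder for the local file path
        });
    }

    public static void main(String[] args) {
        fetch("smalltable.hashtable");
        fetch("smalltable.hashtable");    // second call hits the cache
        System.out.println(downloads);    // the download ran only once
    }
}
```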



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25244) Hive predicate pushdown with Parquet format for `date` as partitioned column name produce empty resultset

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25244:
---
Fix Version/s: (was: 3.1.0)
   (was: 3.2.0)
   (was: 3.1.1)
   (was: 3.1.2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Hive predicate pushdown with Parquet format for `date` as partitioned column 
> name produce empty resultset
> -
>
> Key: HIVE-25244
> URL: https://issues.apache.org/jira/browse/HIVE-25244
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Parquet
>Affects Versions: 3.1.0, 3.1.1, 3.1.2
>Reporter: Aniket Adnaik
>Assignee: Aniket Adnaik
>Priority: Major
> Attachments: test_table3_data.tar.gz
>
>
> Hive predicate pushdown with Parquet format, for a partitioned column whose
> name is the keyword `date`, produces an empty result set.
> If any of the following configs is set to false, then the select query
> returns results:
> hive.optimize.ppd.storage, hive.optimize.ppd, hive.optimize.index.filter.
> Repro steps:
> --
> 1) Create an external partitioned table in Hive
> CREATE EXTERNAL TABLE `test_table3`(`id` string) PARTITIONED BY (`date` 
> string) STORED AS parquet;
> 2) In spark-shell create data frame and write the data parquet file
> import java.sql.Timestamp
> import org.apache.spark.sql.Row
> import org.apache.spark.sql.types._
> import spark.implicits._
> val someDF = Seq(("1", "05172021"),("2", "05172021"), ("3", "06182021"), 
> ("4", "07192021")).toDF("id", "date")
> someDF.write.mode("overwrite").parquet(" path>/hive/warehouse/external/test_table3/date=05172021")
> 3) In Hive change the permissions and add partition to the table
> $> hdfs dfs -chmod -R 777 /hive/warehouse/external/test_table3
> Hive Beeline ->
> ALTER TABLE test_table3 ADD PARTITION(`date`='05172021') LOCATION  ' path>/hive/warehouse/external/test_table3/date=05172021'
> 4) SELECT * FROM test_table3;   <- produces all rows
> SELECT * FROM test_table3 WHERE `date`='05172021';   <--- produces no rows   
> SET hive.optimize.ppd.storage=false;  <--- turn off ppd push down optimization
> SELECT * FROM test_table3 WHERE `date`='05172021'; <--- produces rows after 
> setting above config to false
> Attaching parquet data files for reference:
>  
>  
>  
>   



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-14157) deal with ACID operations (insert, update, delete)

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-14157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-14157:
---
Fix Version/s: (was: 2.1.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> deal with ACID operations (insert, update, delete)
> --
>
> Key: HIVE-14157
> URL: https://issues.apache.org/jira/browse/HIVE-14157
> Project: Hive
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Major
> Attachments: DB2BP_Security_RCAC_0412.pdf
>
>
> INSERT statement
> When you issue an INSERT statement against a table for which row-level access 
> control
> is activated, the rules specified in all the enabled row permissions defined 
> on that table
> determine whether the row can be inserted. To be inserted, the row must 
> conform to the
> enabled row permissions that are defined on the table. A conformant row is a 
> row that, if
> inserted, can be retrieved back by using a SELECT statement by the same user. 
> This
> behavior is identical to how an insert into a symmetric view works. In other 
> words, you
> cannot insert a row that you cannot select. 
> UPDATE statement
> When you issue an UPDATE statement against a table for which row-level access 
> control
> is activated, the rules specified in all the enabled row permissions that are 
> defined on that
> table determine whether the row can be updated. Enabled row permissions are 
> used as
> follows during UPDATE operations:
> 1. The enabled row permissions filter the set of rows to be updated. In other 
> words,
> you cannot update rows that you cannot select.
> 2. The updated rows (if any) must conform to the enabled row permissions. A
> conformant updated row is a row that can be retrieved back using a SELECT
> statement by the same user. This is identical to how an update of a symmetric
> view works. In other words, you cannot update a row such that you can no
> longer select that row.
> DELETE statement
> When a DELETE statement is issued against a table for which row-level access 
> control is
> activated, the rules specified in all the enabled row permissions that are 
> defined on that
> table determine which rows can be deleted. The enabled row permissions filter 
> the set of
> rows to be deleted. In other words, you cannot delete rows that you cannot 
> select.
> MERGE statement
> A MERGE statement can be thought of as both an INSERT and an UPDATE operation.
> The processing of a MERGE follows the processing of INSERT and UPDATE.
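
The rules above reduce to one predicate: an operation may only touch rows the same user could SELECT back. This toy class is an invented illustration of that rule (it is neither Hive nor DB2 code).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model: a single enabled row permission acts as a visibility predicate,
// and INSERT/UPDATE/DELETE are all gated by "could this user select the row".
class RowPermissionModel {
    final List<Integer> rows = new ArrayList<>();
    final Predicate<Integer> visible; // the enabled row permission for this user

    RowPermissionModel(Predicate<Integer> visible) { this.visible = visible; }

    boolean insert(int row) {                // cannot insert a row you cannot select
        if (!visible.test(row)) return false;
        rows.add(row);
        return true;
    }

    boolean update(int oldRow, int newRow) { // filtered target AND conformant result
        if (!visible.test(oldRow) || !rows.contains(oldRow)) return false;
        if (!visible.test(newRow)) return false; // update may not hide the row
        rows.set(rows.indexOf(oldRow), newRow);
        return true;
    }

    boolean delete(int row) {                // cannot delete what you cannot select
        if (!visible.test(row) || !rows.contains(row)) return false;
        rows.remove(Integer.valueOf(row));
        return true;
    }

    public static void main(String[] args) {
        // Permission for this user: only positive rows are visible.
        RowPermissionModel t = new RowPermissionModel(r -> r > 0);
        System.out.println(t.insert(5));     // conformant insert accepted
        System.out.println(t.insert(-1));    // non-conformant insert rejected
        System.out.println(t.update(5, -2)); // rejected: result would be invisible
    }
}
```

MERGE then needs no extra machinery: it is checked as the INSERT and UPDATE paths it decomposes into.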



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25292) to_unix_timestamp & unix_timestamp should support ENGLISH format by default

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25292:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> to_unix_timestamp & unix_timestamp should support ENGLISH format by default
> ---
>
> Key: HIVE-25292
> URL: https://issues.apache.org/jira/browse/HIVE-25292
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients
>Reporter: shezm
>Assignee: shezm
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Hi,
> The to_unix_timestamp function is implemented by GenericUDFToUnixTimeStamp.
> It uses SimpleDateFormat to parse string-typed times.
> But SimpleDateFormat is constructed without a Locale parameter, so the JVM's
> default locale is used. This makes machines with a non-English default locale
> unable to run SQL like:
>  
> {code:java}
> hive> select to_unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> hive> select unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> {code}
>  
> At the same time, I found that in Spark, to_unix_timestamp & unix_timestamp
> also use SimpleDateFormat, and Spark uses Locale.US by default, but this makes
> it impossible to use local-language date names. For example, in a Chinese
> environment, I can parse this correctly in Hive:
>  
> {code:java}
> hive> select to_unix_timestamp('16/三月/2017:12:25:01', 'dd//yyy:HH:mm:ss');
> OK
> 1489638301
> Time taken: 0.147 seconds, Fetched: 1 row(s)
> OK
> {code}
> But spark will return Null.
> Because English dates are the more common case, I think two SimpleDateFormats
> are needed: a second SimpleDateFormat initialized with the Locale.ENGLISH
> parameter.
>  
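
The locale sensitivity described above can be shown directly with SimpleDateFormat. The helper class here is invented, and Locale.CHINESE stands in for the JVM default on a non-English machine.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;

// An English month abbreviation like "Mar" parses only when the formatter
// is built with an English locale, independent of the JVM's default locale.
class LocaleParsing {

    // Returns true if the input can be parsed with the pattern in that locale.
    static boolean parses(String input, String pattern, Locale locale) {
        try {
            new SimpleDateFormat(pattern, locale).parse(input);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String input = "16/Mar/2017:12:25:01";
        String pattern = "dd/MMM/yyyy:HH:mm:ss";

        // The fix suggested in the ticket: fall back to an ENGLISH-locale parser.
        System.out.println(parses(input, pattern, Locale.ENGLISH)); // parses

        // A non-English locale has no month named "Mar", so parsing fails.
        System.out.println(parses(input, pattern, Locale.CHINESE)); // fails
    }
}
```

This matches the proposal above: try the default-locale formatter first, then an ENGLISH-locale formatter as a fallback.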



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-11085) Alter table fail with NPE if schema change

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-11085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-11085:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Alter table fail with NPE if schema change
> --
>
> Key: HIVE-11085
> URL: https://issues.apache.org/jira/browse/HIVE-11085
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Priority: Major
> Fix For: hbase-metastore-branch
>
>
> alter1.q fails. Specifically, the following statements fail:
> create table alter1(a int, b int);
> add jar itests/test-serde/target/hive-it-test-serde-1.3.0-SNAPSHOT.jar;
> alter table alter1 set serde 'org.apache.hadoop.hive.serde2.TestSerDe' with 
> serdeproperties('s1'='9');
> Error stack:
> {code}
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. 
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.metadata.Hive.alterTable(Hive.java:498)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.alterTable(DDLTask.java:3418)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:338)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1660)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1419)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1200)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1067)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1057)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1116)
> at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1090)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:146)
> at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter1(TestCliDriver.java:130)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at junit.framework.TestCase.runTest(TestCase.java:176)
> at junit.framework.TestCase.runBare(TestCase.java:141)
> at junit.framework.TestResult$1.protect(TestResult.java:122)
> at junit.framework.TestResult.runProtected(TestResult.java:142)
> at junit.framework.TestResult.run(TestResult.java:125)
> at junit.framework.TestCase.run(TestCase.java:129)
> at junit.framework.TestSuite.runTest(TestSuite.java:255)
> at junit.framework.TestSuite.run(TestSuite.java:250)
> at 
> org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
> at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
> at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
> Caused by: MetaException(message:java.lang.NullPointerException)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:5301)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_core(HiveMetaStore.java:3443)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_table_with_cascade(HiveMetaStore.java:3395)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_table(HiveMetaStoreClient.java:352)
> 

[jira] [Updated] (HIVE-4565) TestCliDriver and TestParse fail with non Sun Java

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-4565:
--

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> TestCliDriver and TestParse fail with non Sun Java
> --
>
> Key: HIVE-4565
> URL: https://issues.apache.org/jira/browse/HIVE-4565
> Project: Hive
>  Issue Type: Bug
>  Components: Tests
>Affects Versions: 0.11.0
> Environment: RedHat x86 IBM Java 6
>Reporter: Renata Ghisloti Duarte de Souza
>Priority: Minor
> Fix For: 0.11.1
>
> Attachments: HIVE-4565.patch
>
>
> While executing Hive's unit tests, two test cases produce different output 
> under Sun Java and non-Sun Java (such as IBM): TestCliDriver and TestParse.
> The differences are mainly due to the use of HashMaps during the creation of 
> the logical plan in the analyzeInternal method. Sun Java presents the elements 
> of a HashMap in one order, and non-Sun Java in a different order.
> Both outputs are correct and do not affect the final query result. I propose 
> the attached patch to make Hive unit tests compliant with all JVMs.
> The patch adds the output files and a change to ql/build.xml.
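The JVM-dependent behaviour described above can be sketched with a standalone snippet (this is an illustration, not Hive code): `HashMap` iteration order is unspecified and may differ between JVM vendors, while sorting the keys (e.g. via `TreeMap`) gives one canonical order everywhere, which is what deterministic test output requires.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapOrderDemo {
    public static void main(String[] args) {
        // HashMap iteration order is unspecified and may differ between JVM
        // vendors/versions, so any output built from it is non-deterministic.
        Map<String, String> props = new HashMap<>();
        props.put("serialization.format", "1");
        props.put("columns", "a,b");
        props.put("name", "alter1");

        // Sorting the keys yields the same canonical order on every JVM.
        Map<String, String> canonical = new TreeMap<>(props);
        System.out.println(canonical.keySet());
        // prints [columns, name, serialization.format]
    }
}
```

Normalizing the plan output this way (rather than shipping per-JVM golden files) is one possible fix; the patch attached to the ticket takes the golden-file approach instead.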



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25199) beeline -u url -n username -p prompt input password hiveserver always get “anonymous” not the real input password

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25199:
---
Fix Version/s: (was: 2.3.2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> beeline -u url -n username -p prompt input password hiveserver always get 
> “anonymous” not the real input password
> -
>
> Key: HIVE-25199
> URL: https://issues.apache.org/jira/browse/HIVE-25199
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.3.2
>Reporter: chjgan
>Priority: Blocker
> Attachments: image-2021-06-04-15-54-22-714.png, 
> image-2021-06-04-15-55-33-377.png, image-2021-06-04-15-56-27-600.png, 
> image-2021-06-04-16-12-34-681.png
>
>
>  
> !image-2021-06-04-15-54-22-714.png!
> !image-2021-06-04-15-55-33-377.png!
> !image-2021-06-04-15-56-27-600.png!
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-19945) Beeline - run against a different sql engine

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-19945:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Beeline - run against a different sql engine
> 
>
> Key: HIVE-19945
> URL: https://issues.apache.org/jira/browse/HIVE-19945
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Affects Versions: 3.0.0
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> Original idea by [~kgyrtkirk] 
> "I think beeline also support to load different sql drivers (not sure about 
> this...but I think I saw some pointers to this) Anyway...it would be great to 
> be able to execute a test against a different sql engine; like psql."
> something like:
> {code}
> mvn test -Dtest=TestQ -Dqengine=ExternalPSQL 
> -Djdbc.uri=psql://localhost:5432/somedb
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-9282) hive could not able to integrate with spark

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-9282:
--

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> hive could not able to integrate with spark
> ---
>
> Key: HIVE-9282
> URL: https://issues.apache.org/jira/browse/HIVE-9282
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 0.12.0
> Environment: centOS 6.4  and hadoop-1.0.4 and hive-0.12.0 and 
> spark-0.8.0
>Reporter: suraj
>Priority: Major
> Fix For: spark-branch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> I have installed hadoop-1.0.4 and, on top of it, installed everything by 
> following this guide:
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
> with hive-0.12.0 and spark-0.8.0.
> The guide states that a spark-1.2.x assembly is required, but I installed 
> spark-0.8.0 instead.
> Even after compiling the Hive library with Maven, I still get a "wrong FS" 
> error.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25540:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Enable batch update of column stats only for MySql and Postgres 
> 
>
> Key: HIVE-25540
> URL: https://issues.apache.org/jira/browse/HIVE-25540
> Project: Hive
>  Issue Type: Sub-task
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The batch update of partition column stats using direct SQL is tested only 
> for MySQL and Postgres.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26314) Support alter function in Hive DDL

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26314:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Support alter function in Hive DDL
> --
>
> Key: HIVE-26314
> URL: https://issues.apache.org/jira/browse/HIVE-26314
> Project: Hive
>  Issue Type: Task
>  Components: Hive
>Affects Versions: 4.0.0-alpha-1
>Reporter: Wechar
>Assignee: Wechar
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Hive SQL does not support {{*ALTER FUNCTION*}} yet; we can refer to the 
> {{*CREATE [OR REPLACE] FUNCTION*}} syntax of 
> [Spark|https://spark.apache.org/docs/3.1.2/sql-ref-syntax-ddl-create-function.html]
>  to implement it.
> {code:sql}
> CREATE [ TEMPORARY ] FUNCTION [ OR REPLACE ] [IF NOT EXISTS ]
>   [db_name.]function_name AS class_name
>   [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
> {code}
> * *OR REPLACE*
> If specified, the resources for the function are reloaded. This is mainly 
> useful to pick up any changes made to the implementation of the function. 
> This parameter is mutually exclusive with {{*IF NOT EXISTS*}}; the two cannot 
> be specified together.
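The mutual exclusivity above is the kind of check a DDL analyzer performs before touching the metastore. A minimal hypothetical sketch (the class and method names are illustrative assumptions, not Hive's actual parser code):

```java
public class CreateFunctionFlags {
    /**
     * Hypothetical validation: OR REPLACE and IF NOT EXISTS answer the
     * "function already exists" question in contradictory ways, so a DDL
     * layer typically rejects the combination up front.
     */
    static void validate(boolean orReplace, boolean ifNotExists) {
        if (orReplace && ifNotExists) {
            throw new IllegalArgumentException(
                "CREATE FUNCTION: OR REPLACE and IF NOT EXISTS are mutually exclusive");
        }
    }

    public static void main(String[] args) {
        validate(true, false);   // CREATE OR REPLACE FUNCTION ...      -> ok
        validate(false, true);   // CREATE FUNCTION IF NOT EXISTS ...   -> ok
        try {
            validate(true, true); // both flags                         -> rejected
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```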



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-24112) TestMiniLlapLocalCliDriver[dynamic_semijoin_reduction_on_aggcol] is flaky

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24112:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> TestMiniLlapLocalCliDriver[dynamic_semijoin_reduction_on_aggcol] is flaky
> -
>
> Key: HIVE-24112
> URL: https://issues.apache.org/jira/browse/HIVE-24112
> Project: Hive
>  Issue Type: Bug
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: flaky-test
>
> http://ci.hive.apache.org/job/hive-flaky-check/96/



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-22755) Cleaner/Compaction can skip the read locks and use the min open txn id

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-22755:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Cleaner/Compaction can skip the read locks and use the min open txn id
> --
>
> Key: HIVE-22755
> URL: https://issues.apache.org/jira/browse/HIVE-22755
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Slim Bouguerra
>Priority: Major
>
> The minOpenTxnId is used by the Cleaner here
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java#L154
> This currently converts it to open write-ids to clean appropriately.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-20373) Output of 'show compactions' displays double header

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-20373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-20373:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Output of 'show compactions' displays double header
> ---
>
> Key: HIVE-20373
> URL: https://issues.apache.org/jira/browse/HIVE-20373
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: László Bodor
>Priority: Minor
>
> {code}
> +---+---++++--++---+-+--+
> | dbname | tabname | partname | type | state | workerid | starttime | 
> duration | hadoopjobid |
> +---+---++++--++---+-+--+
> | Database | Table | Partition | Type | State | Worker | Start Time | 
> Duration(ms) | HadoopJobId |
> | default | student | --- | MAJOR | working | 
> ctr-e138-1518143905142-435940-01-03.hwx.site-61 | 1534156696000 | --- | 
> job_1534152461533_0030 |
> | default | acid_partitioned | bkt=1 | MAJOR | initiated | --- | --- | --- | 
> --- |
> | default | acid_partitioned | bkt=2 | MAJOR | initiated | --- | --- | --- | 
> --- |
> +---+---++++--++---+-+–+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-10848) LLAP: Better handling of hostnames when sending heartbeats to the AM

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-10848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-10848:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> LLAP: Better handling of hostnames when sending heartbeats to the AM
> 
>
> Key: HIVE-10848
> URL: https://issues.apache.org/jira/browse/HIVE-10848
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Siddharth Seth
>Priority: Major
> Fix For: llap
>
>
> Daemons send an alive message to the listening co-ordinator - along with the 
> daemon's hostname, which is used to keep tasks alive.
> This can be problematic with hostname resolution if the AM and daemons end up 
> using different hostnames.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-24989) Support vectorisation of join with key columns of complex types

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24989:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Support vectorisation of join with key columns of complex types
> ---
>
> Key: HIVE-24989
> URL: https://issues.apache.org/jira/browse/HIVE-24989
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 4.0.0
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>
> Support for complex types is not present in the addKey method.
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected 
> column vector type LISTCaused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected column vector 
> type LIST at 
> org.apache.hadoop.hive.ql.exec.vector.VectorColumnSetInfo.addKey(VectorColumnSetInfo.java:138)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.wrapper.VectorHashKeyWrapperBatch.compileKeyWrapperBatch(VectorHashKeyWrapperBatch.java:913)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.wrapper.VectorHashKeyWrapperBatch.compileKeyWrapperBatch(VectorHashKeyWrapperBatch.java:894)
>  at 
> org.apache.hadoop.hive.ql.exec.vector.VectorMapJoinOperator.initializeOp(VectorMapJoinOperator.java:137)
>  at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:360) at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:549) at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:503) 
> at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:369) at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:332)
>   {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-16921) ability to trace the task tree created during repl load from logs

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-16921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-16921:
---
Fix Version/s: (was: 3.2.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> ability to trace the task tree created during repl load from logs
> -
>
> Key: HIVE-16921
> URL: https://issues.apache.org/jira/browse/HIVE-16921
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: Anishek Agarwal
>Assignee: Anishek Agarwal
>Priority: Major
>
> As part of HIVE-16896, we will be dynamically creating the task tree as we 
> execute the bootstrap load. To make the replication load task tree created in 
> the execution phase easier to debug and understand, we should log marker 
> statements at the default _INFO_ level; these can be searched in the logs to 
> determine the task tree being created.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26492) orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when encounter decimal type value

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26492:
---
Fix Version/s: (was: 3.1.1)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> orc.apache.orc.impl.writer.StructTreeWriter write wrong column stats when 
> encounter decimal type value
> --
>
> Key: HIVE-26492
> URL: https://issues.apache.org/jira/browse/HIVE-26492
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 3.1.1
>Reporter: Yunhong Zheng
>Priority: Major
> Attachments: image-2022-08-22-22-05-53-412.png
>
>
> In _orc.apache.orc.impl.writer.StructTreeWriter_ in hive-exec-3.1.1, the 
> method _writeFileStatistics_ writes a wrong column-stats min value (min 
> equals 0) when it encounters a decimal-type column (like decimal(5, 2) or 
> decimal(14, 2)). Debug screenshot:
> !image-2022-08-22-22-05-53-412.png|width=822,height=249!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-24071) Continue cleaning the NotificationEvents till we have data greater than TTL

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-24071:
---
Fix Version/s: (was: 4.0.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Continue cleaning the NotificationEvents till we have data greater than TTL
> ---
>
> Key: HIVE-24071
> URL: https://issues.apache.org/jira/browse/HIVE-24071
> Project: Hive
>  Issue Type: Bug
>  Components: repl
>Affects Versions: 4.0.0
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>
> Continue cleaning the NotificationEvents till we have data greater than TTL.
> Currently we only clean the notification events once every 2 hours and also 
> strict 1 every time. We should continue deleting until we clear up all 
> the notification events greater than TTL.
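The proposed "keep deleting until nothing older than TTL remains" behaviour can be modelled with a bounded-batch loop. The sketch below is a hypothetical in-memory stand-in; the event representation, batch mechanics, and names are illustrative assumptions, not the HMS cleaner's actual code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class EventCleaner {
    // Each entry stands in for a notification event's creation time (epoch seconds).
    static int cleanExpired(List<Long> events, long nowSec, long ttlSec, int batchSize) {
        int removed = 0;
        while (true) {
            // Collect up to one batch of events older than the TTL.
            List<Long> batch = new ArrayList<>();
            for (Long t : events) {
                if (nowSec - t > ttlSec) {
                    batch.add(t);
                    if (batch.size() == batchSize) break;
                }
            }
            if (batch.isEmpty()) break;  // nothing expired remains -> done
            events.removeAll(batch);     // one bounded delete per round
            removed += batch.size();
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Long> events = new ArrayList<>(Arrays.asList(100L, 200L, 900L, 950L));
        // now=1000, ttl=300 -> the events at 100 and 200 are expired.
        System.out.println(cleanExpired(events, 1000L, 300L, 1));
        // prints 2
    }
}
```

Looping with a bounded batch keeps each delete transaction small while still guaranteeing that a single cleaner run drains every event past the TTL, instead of at most one batch per scheduled interval.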



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-18011) hive lib too much repetition

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-18011:
---
Fix Version/s: (was: 2.3.0)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> hive lib too much repetition
> 
>
> Key: HIVE-18011
> URL: https://issues.apache.org/jira/browse/HIVE-18011
> Project: Hive
>  Issue Type: Task
>  Components: distribution
>Affects Versions: 2.3.0
> Environment: Any Environment
>Reporter: zhaixiaobin
>Priority: Major
>
> *Following is the lib directory of Hive; it contains many duplicate jars:*
> JSON libs alone: gson, jackson, json-1.8.jar   ??(n)
> -rw-r--r-- 1 root root  4368200 Dec  9  2015 accumulo-core-1.6.0.jar
> -rw-r--r-- 1 root root   102069 Dec  9  2015 accumulo-fate-1.6.0.jar
> -rw-r--r-- 1 root root57420 Dec  9  2015 accumulo-start-1.6.0.jar
> -rw-r--r-- 1 root root   117409 Dec  9  2015 accumulo-trace-1.6.0.jar
> -rw-r--r-- 1 root root62983 Dec  9  2015 activation-1.1.jar
> -rw-r--r-- 1 root root   133957 Dec  9  2015 aether-api-0.9.0.M2.jar
> -rw-r--r-- 1 root root26285 Dec 15  2016 
> aether-connector-file-0.9.0.M2.jar
> -rw-r--r-- 1 root root52012 Dec 15  2016 aether-connector-okhttp-0.0.9.jar
> -rw-r--r-- 1 root root   144866 Dec  9  2015 aether-impl-0.9.0.M2.jar
> -rw-r--r-- 1 root root17703 Dec  9  2015 aether-spi-0.9.0.M2.jar
> -rw-r--r-- 1 root root   133588 Dec  9  2015 aether-util-0.9.0.M2.jar
> -rw-r--r-- 1 root root88458 Feb  3  2017 aircompressor-0.3.jar
> -rw-r--r-- 1 root root85912 Sep  8  2016 airline-0.7.jar
> {color:red}-rw-r--r-- 1 root root  1034049 Dec  9  2015 ant-1.6.5.jar{color}
> {color:red}-rw-r--r-- 1 root root  1997485 Dec  9  2015 ant-1.9.1.jar{color}
> -rw-r--r-- 1 root root18336 Dec  9  2015 ant-launcher-1.9.1.jar
> {color:red}-rw-r--r-- 1 root root   374032 Dec  9  2015 
> antlr4-runtime-4.5.jar{color}
> {color:red}-rw-r--r-- 1 root root   167761 Dec  9  2015 
> antlr-runtime-3.5.2.jar{color}
> -rw-r--r-- 1 root root31827 Aug 30  2016 apache-curator-2.7.1.pom
> -rw-r--r-- 1 root root43033 Dec  9  2015 asm-3.1.jar
> -rw-r--r-- 1 root root32693 Dec  9  2015 asm-commons-3.1.jar
> -rw-r--r-- 1 root root21879 Dec  9  2015 asm-tree-3.1.jar
> -rw-r--r-- 1 root root  5222951 Oct 18  2016 avatica-1.8.0.jar
> -rw-r--r-- 1 root root20102 Oct 18  2016 avatica-metrics-1.8.0.jar
> -rw-r--r-- 1 root root   436303 Dec  9  2015 avro-1.7.7.jar
> -rw-r--r-- 1 root root   110600 Dec  9  2015 bonecp-0.8.0.RELEASE.jar
> -rw-r--r-- 1 root root74175 Dec 15  2016 bytebuffer-collections-0.2.5.jar
> -rw-r--r-- 1 root root  4085527 Oct 18  2016 calcite-core-1.10.0.jar
> -rw-r--r-- 1 root root96585 Oct 18  2016 calcite-druid-1.10.0.jar
> -rw-r--r-- 1 root root   481884 Oct 18  2016 calcite-linq4j-1.10.0.jar
> -rw-r--r-- 1 root root60282 Sep  8  2016 classmate-1.0.0.jar
> -rw-r--r-- 1 root root41123 Dec  9  2015 commons-cli-1.2.jar
> -rw-r--r-- 1 root root58160 Dec  9  2015 commons-codec-1.4.jar
> -rw-r--r-- 1 root root   588337 Dec  9  2015 commons-collections-3.2.2.jar
> -rw-r--r-- 1 root root30595 Dec  9  2015 commons-compiler-2.7.6.jar
> -rw-r--r-- 1 root root   378217 Dec  9  2015 commons-compress-1.9.jar
> {color:red}-rw-r--r-- 1 root root   160519 Dec  9  2015 
> commons-dbcp-1.4.jar{color}
> {color:red}-rw-r--r-- 1 root root   167962 Sep  8  2016 
> commons-dbcp2-2.0.1.jar{color}
> -rw-r--r-- 1 root root   112341 Dec  9  2015 commons-el-1.0.jar
> -rw-r--r-- 1 root root   279781 Dec  9  2015 commons-httpclient-3.0.1.jar
> -rw-r--r-- 1 root root   185140 Dec  9  2015 commons-io-2.4.jar
> {color:red}-rw-r--r-- 1 root root   284220 Dec  9  2015 
> commons-lang-2.6.jar{color}
> {color:red}-rw-r--r-- 1 root root   315805 Dec  9  2015 
> commons-lang3-3.1.jar{color}
> -rw-r--r-- 1 root root61829 Dec  9  2015 commons-logging-1.2.jar
> -rw-r--r-- 1 root root   988514 Dec  9  2015 commons-math-2.2.jar
> -rw-r--r-- 1 root root  2213560 Dec 15  2016 commons-math3-3.6.1.jar
> {color:red}-rw-r--r-- 1 root root96221 Dec  9  2015 
> commons-pool-1.5.4.jar{color}
> {color:red}-rw-r--r-- 1 root root   108036 Sep  8  2016 
> commons-pool2-2.2.jar{color}
> -rw-r--r-- 1 root root   415578 Dec  9  2015 commons-vfs2-2.0.jar
> -rw-r--r-- 1 root root79845 Dec  9  2015 compress-lzf-1.0.3.jar
> -rw-r--r-- 1 root root   425111 Sep  8  2016 config-magic-0.9.jar
> -rw-r--r-- 1 root root69500 Aug 30  2016 curator-client-2.7.1.jar
> 
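The highlighted clashes (the two `ant` jars, `antlr` vs `antlr4`, `commons-dbcp` vs `commons-dbcp2`, and so on) can be partly detected automatically. A standalone sketch, assuming the common `<artifact>-<version>.jar` naming convention: it flags same-artifact version clashes like the two `ant` jars, but not renamed successors such as `commons-dbcp` vs `commons-dbcp2`, which still need a human eye.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DuplicateJarFinder {
    // Strip a trailing "-<version>.jar" suffix, assuming the common
    // "<artifact>-<version>.jar" naming convention; otherwise return as-is.
    static String artifact(String jar) {
        Matcher m = Pattern.compile("^(.*?)-\\d[^/]*\\.jar$").matcher(jar);
        return m.matches() ? m.group(1) : jar;
    }

    public static void main(String[] args) {
        List<String> libDir = Arrays.asList(
            "ant-1.6.5.jar", "ant-1.9.1.jar",
            "commons-dbcp-1.4.jar", "commons-dbcp2-2.0.1.jar",
            "commons-lang-2.6.jar", "commons-lang3-3.1.jar",
            "avro-1.7.7.jar");

        // Group by artifact name; any group with >1 entry is a version clash.
        Map<String, List<String>> byArtifact = new TreeMap<>();
        for (String jar : libDir) {
            byArtifact.computeIfAbsent(artifact(jar), k -> new ArrayList<>()).add(jar);
        }
        byArtifact.forEach((name, jars) -> {
            if (jars.size() > 1) System.out.println(name + " -> " + jars);
        });
        // prints ant -> [ant-1.6.5.jar, ant-1.9.1.jar]
    }
}
```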

[jira] [Updated] (HIVE-11227) Kryo exception during table creation in Hive

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-11227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-11227:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Kryo exception during table creation in Hive
> 
>
> Key: HIVE-11227
> URL: https://issues.apache.org/jira/browse/HIVE-11227
> Project: Hive
>  Issue Type: Bug
>  Components: Configuration, Database/Schema, Hive, HiveServer2, 
> Indexing, Locking
>Affects Versions: 0.13.1
> Environment: CentOS 6.5, jdk 1.7, cpu: 2x1.9 GHz 6-core Xeon (24 
> cores), Ram: 64GB-128GB
>Reporter: Akamai
>Priority: Major
> Fix For: 0.14.1
>
> Attachments: Kryo Exception.txt, 
> init_load_hdpextract_user.tlog.clean.log, tlog_detail.20150710.log.clean, 
> trsm_tlog_detail.20150714.log.clean
>
>
> Exception is getting thrown during table creation in Hive:
> Error: java.lang.RuntimeException: 
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered 
> unregistered class ID: 380



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-25931) hive split failed when task run on yarn with fairscheduler because yarn return memory -1

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25931:
---
Fix Version/s: (was: 2.1.1)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> hive split failed when task run on yarn with fairscheduler because yarn 
> return memory -1
> 
>
> Key: HIVE-25931
> URL: https://issues.apache.org/jira/browse/HIVE-25931
> Project: Hive
>  Issue Type: Improvement
>  Components: Tez
>Affects Versions: 2.1.1
>Reporter: lkl
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Hive split computation fails when the task runs on YARN with the fair 
> scheduler because YARN returns a memory value of -1. In general, if the 
> resources are fully used, the task should stay in the ACCEPTED state 
> rather than fail.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-26259) Alter Function does not update resource uris

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26259:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Alter Function does not update resource uris
> 
>
> Key: HIVE-26259
> URL: https://issues.apache.org/jira/browse/HIVE-26259
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.2
>Reporter: Wechar
>Assignee: Wechar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> *Bug Description:*
> The jar of Hive permanent UDF can be loaded based on the resource uris, but 
> we encountered an issue after changing the resource uris through spark-sql:
> {code:sql}
> CREATE OR REPLACE FUNCTION test_db.test_udf AS 'com.xxx.xxx'
> USING JAR 'hdfs://path/to/jar';
> {code}
> Then when we use the UDF `test_db.test_udf`, an error occurred like this:
> {code:sh}
> Error in query: Can not load class 'com.xxx.xxx' when registering the 
> function 'test_db.test_udf'...
> {code}
> *Root Cause:*
> Hive metastore does not update the resource uris while executing 
> `alter_function()`; they should be updated, and doing so has no side 
> effects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HIVE-11840) when multi insert the inputformat becomes OneNullRowInputFormat

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-11840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-11840:
---

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> when multi insert the inputformat becomes OneNullRowInputFormat
> ---
>
> Key: HIVE-11840
> URL: https://issues.apache.org/jira/browse/HIVE-11840
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 0.14.0, 1.2.1
>Reporter: 袁枫
>Priority: Blocker
> Fix For: 0.14.1
>
> Attachments: multi insert, single__insert
>
>
> example:
> from portrait.rec_feature_feedback a 
> insert overwrite table portrait.test1 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('949722CF_12F7_523A_EE21_E3D591B7E755') 
> insert overwrite table portrait.test2 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('test') 
> insert overwrite table portrait.test3 select iid, feedback_15day, 
> feedback_7day, feedback_5day, feedback_3day, feedback_1day where l_date = 
> '2015-09-09' and bid in ('F7734668_CC49_8C4F_24C5_EA8B6728E394')
> A single insert works, but after the multi insert, `select * from test1` 
> returns only NULLs:
> NULL NULL NULL NULL NULL NULL.
> In the output of `explain extended` I see:
> Path -> Alias:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} [a]
> -mr-10007portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Czgc_pc, bid=949722CF_12F7_523A_EE21_E3D591B7E755} [a]
>   Path -> Partition:
> -mr-10006portrait.rec_feature_feedback{l_date=2015-09-09, 
> cid=Cyiyaowang, bid=F7734668_CC49_8C4F_24C5_EA8B6728E394} 
>   Partition
> base file name: bid=F7734668_CC49_8C4F_24C5_EA8B6728E394
> input format: org.apache.hadoop.hive.ql.io.OneNullRowInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid F7734668_CC49_8C4F_24C5_EA8B6728E394
>   cid Cyiyaowang
>   l_date 2015-09-09
> but with a single insert:
> Path -> Alias:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  [a]
>   Path -> Partition:
> 
> hdfs://bfdhadoopcool/warehouse/portrait.db/rec_feature_feedback/l_date=2015-09-09/cid=Czgc_pc/bid=949722CF_12F7_523A_EE21_E3D591B7E755
>  
>   Partition
> base file name: bid=949722CF_12F7_523A_EE21_E3D591B7E755
> input format: org.apache.hadoop.mapred.TextInputFormat
> output format: 
> org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> partition values:
>   bid 949722CF_12F7_523A_EE21_E3D591B7E755
>   cid Czgc_pc
>   l_date 2015-09-09





[jira] [Updated] (HIVE-26148) Keep MetaStoreFilterHook interface compatibility after introducing catalogs

2022-10-21 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26148:
---
Fix Version/s: (was: 4.0.0-alpha-2)

I cleared the fixVersion field since this ticket is still open. Please review 
this ticket and if the fix is already committed to a specific version please 
set the version accordingly and mark the ticket as RESOLVED.

According to the [JIRA 
guidelines|https://cwiki.apache.org/confluence/display/Hive/HowToContribute] 
the fixVersion should be set only when the issue is resolved/closed.

> Keep MetaStoreFilterHook interface compatibility after introducing catalogs
> ---
>
> Key: HIVE-26148
> URL: https://issues.apache.org/jira/browse/HIVE-26148
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Wechar
>Assignee: Wechar
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Hive 3.0 introduced the catalog concept. When we upgraded the Hive dependency 
> version from 2.3 to 3.x, we found that some interfaces of 
> *MetaStoreFilterHook* are not compatible:
> {code:bash}
>  git show ba8a99e115 -- 
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
> {code}
> {code:bash}
> --- 
> a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
> +++ 
> b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
>/**
> * Filter given list of tables
> -   * @param dbName
> -   * @param tableList
> +   * @param catName catalog name
> +   * @param dbName database name
> +   * @param tableList list of table returned by the metastore
> * @return List of filtered table names
> */
> -  public List<String> filterTableNames(String dbName, List<String> 
> tableList) throws MetaException;
> +  List<String> filterTableNames(String catName, String dbName, List<String> 
> tableList)
> +  throws MetaException;
> {code}
> We can retain the previous interfaces and implement them by delegating to 
> the default catalog.
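[Editor's note] The compatibility approach proposed above can be sketched with a Java default method: the old two-argument signature delegates to the new catalog-aware overload using a default catalog name. Interface and constant names below are illustrative, not Hive's actual API.

```java
import java.util.Arrays;
import java.util.List;

public class FilterHookCompat {
    // Assumed default catalog name; Hive's real constant may differ.
    static final String DEFAULT_CATALOG = "hive";

    interface FilterHook {
        // New catalog-aware method (Hive 3.x style) is the only abstract one.
        List<String> filterTableNames(String catName, String dbName, List<String> tableList);

        // Old 2.3-style method retained for source compatibility:
        // it simply delegates with the default catalog.
        default List<String> filterTableNames(String dbName, List<String> tableList) {
            return filterTableNames(DEFAULT_CATALOG, dbName, tableList);
        }
    }

    public static void main(String[] args) {
        // A pass-through hook implemented against only the new method.
        FilterHook hook = (cat, db, tables) -> tables;
        // Callers written against the old signature still compile and run.
        System.out.println(hook.filterTableNames("db1", Arrays.asList("t1", "t2"))); // prints [t1, t2]
    }
}
```

Because the old method has a default body, existing implementations of the new interface need no changes, while legacy callers keep working unchanged.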




