[jira] [Updated] (HIVE-11353) Map env does not reflect in the Local Map Join
[ https://issues.apache.org/jira/browse/HIVE-11353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryu Kobayashi updated HIVE-11353:
---------------------------------
    Fix Version/s: All Versions
       Resolution: Won't Fix
           Status: Resolved  (was: Patch Available)

> Map env does not reflect in the Local Map Join
> ----------------------------------------------
>
>                 Key: HIVE-11353
>                 URL: https://issues.apache.org/jira/browse/HIVE-11353
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ryu Kobayashi
>            Assignee: Ryu Kobayashi
>            Priority: Major
>             Fix For: All Versions
>
>         Attachments: HIVE-11353.1.patch
>
>
> mapreduce.map.env is not reflected when the Local Map Join is run. A sample query follows:
> {code}
> hive> set mapreduce.map.env=AAA=111,BBB=222,CCC=333;
> hive> select
>     >   reflect("java.lang.System", "getenv", "CCC") as CCC,
>     >   a.AAA,
>     >   b.BBB
>     > from (
>     >   SELECT
>     >     reflect("java.lang.System", "getenv", "AAA") as AAA
>     >   from
>     >     foo
>     > ) a
>     > join (
>     >   select
>     >     reflect("java.lang.System", "getenv", "BBB") as BBB
>     >   from
>     >     foo
>     > ) b
>     > limit 1;
> Warning: Map Join MAPJOIN[10][bigTable=?] in task 'Stage-3:MAPRED' is a cross product
> Query ID = root_20150716013643_a8ca1539-68ae-4f13-b9fa-7a8b88f01f13
> Total jobs = 1
> 15/07/16 01:36:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Execution log at: /tmp/root/root_20150716013643_a8ca1539-68ae-4f13-b9fa-7a8b88f01f13.log
> 2015-07-16 01:36:47	Starting to launch local task to process map join; maximum memory = 477102080
> 2015-07-16 01:36:48	Dump the side-table for tag: 0 with group count: 1 into file: file:/tmp/root/9b900f85-d5e4-4632-90bc-19f4bac516ff/hive_2015-07-16_01-36-43_217_8812243019719259041-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable
> 2015-07-16 01:36:48	Uploaded 1 File to: file:/tmp/root/9b900f85-d5e4-4632-90bc-19f4bac516ff/hive_2015-07-16_01-36-43_217_8812243019719259041-1/-local-10003/HashTable-Stage-3/MapJoin-mapfile00--.hashtable (282 bytes)
> 2015-07-16 01:36:48	End of local task; Time Taken: 0.934 sec.
> Execution completed successfully
> MapredLocal task succeeded
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1436962851556_0015, Tracking URL = http://hadoop27:8088/proxy/application_1436962851556_0015/
> Kill Command = /usr/local/hadoop/bin/hadoop job -kill job_1436962851556_0015
> Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
> 2015-07-16 01:36:56,488 Stage-3 map = 0%, reduce = 0%
> 2015-07-16 01:37:01,656 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 1.28 sec
> MapReduce Total cumulative CPU time: 1 seconds 280 msec
> Ended Job = job_1436962851556_0015
> MapReduce Jobs Launched:
> Stage-Stage-3: Map: 1   Cumulative CPU: 1.28 sec   HDFS Read: 5428 HDFS Write: 13 SUCCESS
> Total MapReduce CPU Time Spent: 1 seconds 280 msec
> OK
> 333	null	222
> Time taken: 19.562 seconds, Fetched: 1 row(s)
> {code}
> The attached patch includes code taken from Hadoop.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
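For context, mapreduce.map.env holds a comma-separated list of KEY=VALUE pairs. Below is a minimal, hypothetical sketch of how such a string could be parsed and handed to the child process environment of the local map-join task; this is not the attached HIVE-11353 patch (Hadoop carries its own parsing for this setting), and the class and method names here are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: parse a mapreduce.map.env-style "KEY=VAL,KEY2=VAL2"
// string into a map that a local-task launcher could apply via
// ProcessBuilder.environment() before spawning the child process.
public class MapEnvParser {

    static Map<String, String> parseEnv(String envString) {
        Map<String, String> env = new LinkedHashMap<>();
        if (envString == null || envString.trim().isEmpty()) {
            return env;
        }
        for (String pair : envString.split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) { // skip malformed entries that have no "KEY=" part
                env.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return env;
    }

    public static void main(String[] args) {
        // The setting from the reproduction query above:
        Map<String, String> env = parseEnv("AAA=111,BBB=222,CCC=333");
        System.out.println(env); // {AAA=111, BBB=222, CCC=333}
        // A launcher would then do something like:
        //   new ProcessBuilder(cmd).environment().putAll(env);
    }
}
```

Without a step like this in the local-task launch path, the child JVM never sees the variables, which is why `reflect("java.lang.System", "getenv", ...)` returns null for the side of the join executed locally.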
[jira] [Commented] (HIVE-11353) Map env does not reflect in the Local Map Join
[ https://issues.apache.org/jira/browse/HIVE-11353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837493#comment-17837493 ]

Ryu Kobayashi commented on HIVE-11353:
--------------------------------------
MapReduce has been deprecated, so this ticket will be closed.

> Map env does not reflect in the Local Map Join
> ----------------------------------------------
>
>                 Key: HIVE-11353
>                 URL: https://issues.apache.org/jira/browse/HIVE-11353
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ryu Kobayashi
>            Assignee: Ryu Kobayashi
>            Priority: Major
>
>         Attachments: HIVE-11353.1.patch
[jira] [Updated] (HIVE-28199) Docker quickstart does not work for Hive 3.1.3 on Mac M2
[ https://issues.apache.org/jira/browse/HIVE-28199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-28199:
----------------------------------
    Labels: pull-request-available  (was: )

> Docker quickstart does not work for Hive 3.1.3 on Mac M2
> --------------------------------------------------------
>
>                 Key: HIVE-28199
>                 URL: https://issues.apache.org/jira/browse/HIVE-28199
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ryan Goldenberg
>            Assignee: Ryan Goldenberg
>            Priority: Minor
>              Labels: pull-request-available
>
>
> Quickstart: [https://hive.apache.org/developement/quickstart/#--hiveserver2-metastore]
> On Mac M2, {{docker-compose up}} for {{HIVE_VERSION=3.1.3}} gives the following errors:
> * {{/home/hive/.beeline}} directory issue:
> {quote}metastore  | *** schemaTool failed ***
> metastore  | [WARN] Failed to create directory:
> metastore  | /home/hive/.beeline
> metastore  | No such file or directory
> {quote}
> * Underscore in network name, from {{/tmp/hive/hive.log}} on {{hiveserver2}}:
> {quote}2024-04-02T16:26:24,867 ERROR [main] utils.MetaStoreUtils: Got exception: java.net.URISyntaxException Illegal character in hostname at index 25: thrift://metastore.docker_default:9083
> java.net.URISyntaxException: Illegal character in hostname at index 25: thrift://metastore.docker_default:9083
> {quote}
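The second error can be reproduced with plain java.net.URI, independent of Hive: the URI itself parses (the authority falls back to a registry-based form), but re-parsing the authority as a server-based host:port pair rejects the underscore, which matches the "Illegal character in hostname" exception in the log. A small sketch; the class and method names are mine, not Hive's:

```java
import java.net.URI;
import java.net.URISyntaxException;

// Standalone reproduction of the hostname error: "metastore.docker_default"
// (Docker Compose's default network name contains '_') is not a valid
// server-based authority, so parseServerAuthority() throws.
public class UnderscoreHostname {

    static boolean isValidServerAuthority(String uriString) {
        try {
            new URI(uriString).parseServerAuthority();
            return true;
        } catch (URISyntaxException e) {
            return false; // e.g. "Illegal character in hostname at index 25"
        }
    }

    public static void main(String[] args) {
        // Underscore in the network name -> invalid hostname:
        System.out.println(isValidServerAuthority("thrift://metastore.docker_default:9083")); // false
        // A hyphenated network name parses fine:
        System.out.println(isValidServerAuthority("thrift://metastore.docker-default:9083")); // true
    }
}
```

This is why renaming the Compose network (or service) so the resolved hostname contains no underscore avoids the MetaStoreUtils failure.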
[jira] [Updated] (HIVE-28199) Docker quickstart does not work for Hive 3.1.3 on Mac M2
[ https://issues.apache.org/jira/browse/HIVE-28199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Goldenberg updated HIVE-28199:
-----------------------------------
    Description:
Quickstart: [https://hive.apache.org/developement/quickstart/#--hiveserver2-metastore]

On Mac M2, {{docker-compose up}} for {{HIVE_VERSION=3.1.3}} gives the following errors:
* {{/home/hive/.beeline}} directory issue:
{quote}metastore  | *** schemaTool failed ***
metastore  | [WARN] Failed to create directory:
metastore  | /home/hive/.beeline
metastore  | No such file or directory
{quote}
* Underscore in network name, from {{/tmp/hive/hive.log}} on {{hiveserver2}}:
{quote}2024-04-02T16:26:24,867 ERROR [main] utils.MetaStoreUtils: Got exception: java.net.URISyntaxException Illegal character in hostname at index 25: thrift://metastore.docker_default:9083
java.net.URISyntaxException: Illegal character in hostname at index 25: thrift://metastore.docker_default:9083
{quote}

  was:
Quickstart: [https://hive.apache.org/developement/quickstart/#--hiveserver2-metastore]

On Mac M2, {{docker-compose up}} for {{HIVE_VERSION=3.1.3}} gives the following errors:
* {{/home/hive/.beeline}} directory issue:
metastore  | *** schemaTool failed ***
metastore  | [WARN] Failed to create directory:
metastore  | /home/hive/.beeline
metastore  | No such file or directory
* Underscore in network name, from {{/tmp/hive/hive.log}} on {{hiveserver2}}:
2024-04-02T16:26:24,867 ERROR [main] utils.MetaStoreUtils: Got exception: java.net.URISyntaxException Illegal character in hostname at index 25: thrift://metastore.docker_default:9083
java.net.URISyntaxException: Illegal character in hostname at index 25: thrift://metastore.docker_default:9083

> Docker quickstart does not work for Hive 3.1.3 on Mac M2
> --------------------------------------------------------
>
>                 Key: HIVE-28199
>                 URL: https://issues.apache.org/jira/browse/HIVE-28199
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ryan Goldenberg
>            Priority: Minor
[jira] [Assigned] (HIVE-28199) Docker quickstart does not work for Hive 3.1.3 on Mac M2
[ https://issues.apache.org/jira/browse/HIVE-28199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Goldenberg reassigned HIVE-28199:
--------------------------------------
    Assignee: Ryan Goldenberg

> Docker quickstart does not work for Hive 3.1.3 on Mac M2
> --------------------------------------------------------
>
>                 Key: HIVE-28199
>                 URL: https://issues.apache.org/jira/browse/HIVE-28199
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ryan Goldenberg
>            Assignee: Ryan Goldenberg
>            Priority: Minor
[jira] [Created] (HIVE-28199) Docker quickstart does not work for Hive 3.1.3 on Mac M2
Ryan Goldenberg created HIVE-28199:
-----------------------------------

             Summary: Docker quickstart does not work for Hive 3.1.3 on Mac M2
                 Key: HIVE-28199
                 URL: https://issues.apache.org/jira/browse/HIVE-28199
             Project: Hive
          Issue Type: Bug
            Reporter: Ryan Goldenberg

Quickstart: [https://hive.apache.org/developement/quickstart/#--hiveserver2-metastore]

On Mac M2, {{docker-compose up}} for {{HIVE_VERSION=3.1.3}} gives the following errors:
* {{/home/hive/.beeline}} directory issue:
metastore  | *** schemaTool failed ***
metastore  | [WARN] Failed to create directory:
metastore  | /home/hive/.beeline
metastore  | No such file or directory
* Underscore in network name, from {{/tmp/hive/hive.log}} on {{hiveserver2}}:
2024-04-02T16:26:24,867 ERROR [main] utils.MetaStoreUtils: Got exception: java.net.URISyntaxException Illegal character in hostname at index 25: thrift://metastore.docker_default:9083
java.net.URISyntaxException: Illegal character in hostname at index 25: thrift://metastore.docker_default:9083
[jira] [Commented] (HIVE-28019) Fix query type information in proto files for load and explain queries
[ https://issues.apache.org/jira/browse/HIVE-28019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837401#comment-17837401 ]

Ramesh Kumar Thangarajan commented on HIVE-28019:
-------------------------------------------------
Hi [~zabetak], first of all, thank you very much for the review on this. :)

I agree that HiveOperation was introduced for authorization, and maybe we should not change it to represent the query type. But I still believe we should make the change for the PREHOOK: type: and POSTHOOK: type: output, and also for the HiveProtoLoggingHook. The change to HiveOperation.Explain for explain queries is needed mostly because we use the HiveOperation to print the type in the preexecute and postexecute actions:
[https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/hooks/PreExecutePrinter.java#L69]

At present we report the type information for queries in preexec and postexec as below:
PREHOOK: type: QUERY
POSTHOOK: type: QUERY

I think this is the query type information that is reported along with the other information on the query, so we should not report a different type for explain queries. And if this change causes a loss of information, doesn't that mean users were already relying on a wrong type? Although we could skip this and fix only the HiveProtoLoggingHook to report the right query type, we would then report two different type values for the same query in different places. Keeping them synchronized will also help us test all types of queries completely.

Please let me know if you think my points make sense. I will change the patch so that it does not touch the commandType, and instead add a field that represents explain queries and use it to report the correct query type in the HiveProtoLoggingHook and in the PREHOOK: type: and POSTHOOK: type: output.

> Fix query type information in proto files for load and explain queries
> ----------------------------------------------------------------------
>
>                 Key: HIVE-28019
>                 URL: https://issues.apache.org/jira/browse/HIVE-28019
>             Project: Hive
>          Issue Type: Task
>          Components: HiveServer2
>            Reporter: Ramesh Kumar Thangarajan
>            Assignee: Ramesh Kumar Thangarajan
>            Priority: Major
>              Labels: pull-request-available
>
>
> Certain query types, like LOAD, EXPORT, IMPORT, and EXPLAIN queries, did not produce the right Hive operation type.
[jira] [Commented] (HIVE-27734) Add Iceberg's storage-partitioned join capabilities to Hive's [sorted-]bucket-map-join
[ https://issues.apache.org/jira/browse/HIVE-27734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837264#comment-17837264 ]

Shohei Okumiya commented on HIVE-27734:
---------------------------------------
[~dkuzmenko] I surveyed some optimizations implemented in Hive, potentially useful features of Iceberg, and how to integrate those existing optimizations with Iceberg. I drafted a document and a PR as an example.
* [Design document|https://docs.google.com/document/d/1srEK3atO2T3Apa-FsF6bW__ECY-nFrev_1RZ8EN4UF8/edit?usp=sharing]
* [A sample implementation|https://github.com/apache/hive/pull/5194]

I presume we can take the following actions. I'd be glad to hear other ideas if you have any.
# We may create an umbrella ticket, as this topic seems too big to complete in a single ticket.
# We may share the documents with the Hive dev ML so that Hive and Iceberg experts can be involved.
# Anything else?

> Add Iceberg's storage-partitioned join capabilities to Hive's [sorted-]bucket-map-join
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-27734
>                 URL: https://issues.apache.org/jira/browse/HIVE-27734
>             Project: Hive
>          Issue Type: Improvement
>          Components: Iceberg integration
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Janos Kovacs
>            Assignee: Shohei Okumiya
>            Priority: Major
>
> Iceberg's "data bucketing" is implemented through its rich (function-based) partitioning feature, which helps optimize join operations via so-called storage-partitioned joins.
> doc: [https://docs.google.com/document/d/1foTkDSM91VxKgkEcBMsuAvEjNybjja-uHk-r3vtXWFE/edit#heading=h.82w8qxfl2uwl]
> spark impl.: https://issues.apache.org/jira/browse/SPARK-37375
> Hive does not yet leverage this feature in its bucket-map-join optimization, neither alone nor combined with Iceberg's SortOrder for a sorted-bucket-map-join.
> Customers migrating from the Hive table format to the Iceberg format with a storage-optimized schema will experience performance degradation on large tables, where Iceberg's no-listing performance gain is significantly smaller than the join-time gain of a bucket-join or even a sorted-bucket-join.
>
> {noformat}
> SET hive.query.results.cache.enabled=false;
> SET hive.fetch.task.conversion = none;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.convert.join.bucket.mapjoin.tez=true;
> SET hive.auto.convert.join.noconditionaltask.size=1000;
> -- if you are working with external tables, you need this for BMJ:
> SET hive.disable.unsafe.external.table.operations=false;
>
> -- HIVE BUCKET-MAP-JOIN
> DROP TABLE IF EXISTS default.hivebmjt1 PURGE;
> DROP TABLE IF EXISTS default.hivebmjt2 PURGE;
> CREATE TABLE default.hivebmjt1 (id int, txt string) CLUSTERED BY (id) INTO 8 BUCKETS;
> CREATE TABLE default.hivebmjt2 (id int, txt string);
> INSERT INTO default.hivebmjt1 VALUES (1,'1'),(2,'2'),(3,'3'),(4,'4'),(5,'5'),(6,'6'),(7,'7'),(8,'8');
> INSERT INTO default.hivebmjt2 VALUES (1,'1'),(2,'2'),(3,'3'),(4,'4');
> EXPLAIN SELECT * FROM default.hivebmjt1 f INNER JOIN default.hivebmjt2 d ON f.id = d.id;
> EXPLAIN SELECT * FROM default.hivebmjt1 f LEFT OUTER JOIN default.hivebmjt2 d ON f.id = d.id;
> -- Both are optimized into BMJ
>
> -- ICEBERG BUCKET-MAP-JOIN via Iceberg's storage-partitioned join
> DROP TABLE IF EXISTS default.icespbmjt1 PURGE;
> DROP TABLE IF EXISTS default.icespbmjt2 PURGE;
> CREATE TABLE default.icespbmjt1 (txt string) PARTITIONED BY (id int) STORED BY ICEBERG;
> CREATE TABLE default.icespbmjt2 (txt string) PARTITIONED BY (id int) STORED BY ICEBERG;
> INSERT INTO default.icespbmjt1 VALUES ('1',1),('2',2),('3',3),('4',4);
> INSERT INTO default.icespbmjt2 VALUES ('1',1),('2',2),('3',3),('4',4);
> EXPLAIN SELECT * FROM default.icespbmjt1 f INNER JOIN default.icespbmjt2 d ON f.id = d.id;
> EXPLAIN SELECT * FROM default.icespbmjt1 f LEFT OUTER JOIN default.icespbmjt2 d ON f.id = d.id;
> -- Only the Map-Join is optimized
> {noformat}
[jira] [Commented] (HIVE-28082) HiveAggregateReduceFunctionsRule could generate an inconsistent result
[ https://issues.apache.org/jira/browse/HIVE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837244#comment-17837244 ]

Krisztian Kasa commented on HIVE-28082:
---------------------------------------
Some more details about the issue:
{code}
explain cbo select avg('text');
select avg('text');
{code}
{{avg('text')}} is converted to {{sum('text')/count('text')}}
{code}
HiveProject(_o__c0=[/($0, $1)])
  HiveAggregate(group=[{}], agg#0=[sum($0)], agg#1=[count()])
    HiveProject($f0=[_UTF-16LE'text':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"])
      HiveTableScan(table=[[_dummy_database, _dummy_table]], table:alias=[_dummy_table])
{code}
and {{sum('text')}} throws an exception at execution time, which is logged as a warning:
{code}
2024-04-15T04:47:57,568 WARN [TezTR-671313_1_1_0_0_0] generic.GenericUDAFSum: GenericUDAFSumDouble
java.lang.NumberFormatException: For input string: "text"
	at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
	at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
	at java.lang.Double.parseDouble(Double.java:538)
	at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectInspectorUtils.java:867)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFSum$GenericUDAFSumDouble.iterate(GenericUDAFSum.java:444)
	at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:215)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:620)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:792)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:701)
	at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:766)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:94)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:173)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:155)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:555)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:101)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:83)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:414)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381)
	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82)
	at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
2024-04-15T04:47:57,568 WARN [TezTR-671313_1_1_0_0_0] generic.GenericUDAFSum: GenericUDAFSumDouble ignoring similar exceptions.
{code}
The behavior is similar when CBO is turned off:
{code}
set hive.cbo.enable=false;
select avg('text');
{code}
{code}
2024-04-15T04:55:29,444 WARN [TezTR-126305_1_1_0_0_0] generic.GenericUDAFAverage: Ignoring similar exceptions
java.lang.NumberFormatException: For input string: "text"
	at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043) ~[?:1.8.0_301]
	at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110) ~[?:1.8.0_301]
	at java.lang.Double.parseDouble(Double.java:538) ~[?:1.8.0_301]
	at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getDouble(PrimitiveObjectIn
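Both stack traces bottom out in the same coercion: the string argument is converted to a double via Double.parseDouble, which throws NumberFormatException for non-numeric input; Hive catches it, logs the warning, and the value is effectively skipped by the aggregate. A standalone sketch of that failure mode follows; the class and helper names are mine, invented for illustration, not Hive's:

```java
// Hypothetical sketch of the coercion that GenericUDAFSumDouble and
// GenericUDAFAverage rely on: non-numeric strings fail to parse, and the
// caller swallows the exception (roughly how Hive logs a warning and then
// "ignores similar exceptions").
public class SumCoercion {

    static Double tryParseDouble(String s) {
        try {
            return Double.parseDouble(s);
        } catch (NumberFormatException e) {
            return null; // value is dropped from the aggregate
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseDouble("1.5"));  // 1.5
        System.out.println(tryParseDouble("text")); // null
    }
}
```

Because sum() drops the unparsable value while count() still counts the row, the rewritten sum/count form can disagree with a direct avg() evaluation, which is the inconsistency this ticket describes.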
[jira] [Updated] (HIVE-28198) Trino table is recognized as EXTERNAL_TABLE regardless of external_location parameter
[ https://issues.apache.org/jira/browse/HIVE-28198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mladjan Gadzic updated HIVE-28198:
----------------------------------
    Description:
{code:java}
trino> create table hive.default.test_table(id int);{code}
{code:java}
trino> delete from hive.default.test_table;
Query 20240402_103228_00042_hm8m3, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
0.08 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20240402_103228_00042_hm8m3 failed: Cannot delete from non-managed Hive table{code}
This behavior is tested and works as expected in Hive 3. The table type is stored in the HMS DB, in the {{TBL_TYPE}} field of the {{TBLS}} table. The value is MANAGED_TABLE for Hive 3 and EXTERNAL_TABLE for Hive 4.

  was:
{code:java}
trino> create table hive.default.test_table(id int);{code}
{code:java}
trino> delete from hive.default.test_table;
Query 20240402_103228_00042_hm8m3, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
0.08 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20240402_103228_00042_hm8m3 failed: Cannot delete from non-managed Hive table{code}
This behavior is tested and works as expected in Hive 3. Table type is stored in HMS DB in {{TBLS}} table {{TBL_TYPE}} field. For Hive 3 value is MANAGED_TABLE and EXTERNAL_TABLE for Hive 4.

> Trino table is recognized as EXTERNAL_TABLE regardless of external_location parameter
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-28198
>                 URL: https://issues.apache.org/jira/browse/HIVE-28198
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 4.0.0
>            Reporter: Mladjan Gadzic
>            Priority: Major
[jira] [Created] (HIVE-28198) Trino table is recognized as EXTERNAL_TABLE regardless of external_location parameter
Mladjan Gadzic created HIVE-28198:
----------------------------------

             Summary: Trino table is recognized as EXTERNAL_TABLE regardless of external_location parameter
                 Key: HIVE-28198
                 URL: https://issues.apache.org/jira/browse/HIVE-28198
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 4.0.0
            Reporter: Mladjan Gadzic

{code:java}
trino> create table hive.default.test_table(id int);{code}
{code:java}
trino> delete from hive.default.test_table;
Query 20240402_103228_00042_hm8m3, FAILED, 1 node
Splits: 1 total, 0 done (0.00%)
0.08 [0 rows, 0B] [0 rows/s, 0B/s]
Query 20240402_103228_00042_hm8m3 failed: Cannot delete from non-managed Hive table{code}
This behavior is tested and works as expected in Hive 3. The table type is stored in the HMS DB, in the {{TBL_TYPE}} field of the {{TBLS}} table. The value is MANAGED_TABLE for Hive 3 and EXTERNAL_TABLE for Hive 4.
[jira] [Resolved] (HIVE-28153) Flaky test TestConflictingDataFiles.testMultiFiltersUpdate
[ https://issues.apache.org/jira/browse/HIVE-28153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simhadri Govindappa resolved HIVE-28153.
----------------------------------------
    Fix Version/s: 4.1.0
       Resolution: Fixed

> Flaky test TestConflictingDataFiles.testMultiFiltersUpdate
> ----------------------------------------------------------
>
>                 Key: HIVE-28153
>                 URL: https://issues.apache.org/jira/browse/HIVE-28153
>             Project: Hive
>          Issue Type: Test
>          Components: Test
>            Reporter: Butao Zhang
>            Assignee: Simhadri Govindappa
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.1.0
>
>
> This test has been failing a lot lately, such as
> [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-5063/13/tests/]
>
> And the flaky-test check shows this test is unstable:
> [http://ci.hive.apache.org/job/hive-flaky-check/831/testReport/]
> {code:java}
> 10:29:21 [INFO] T E S T S
> 10:29:21 [INFO] ---
> 10:29:21 [INFO] Running org.apache.iceberg.mr.hive.TestConflictingDataFiles
> 10:36:13 [ERROR] Tests run: 60, Failures: 1, Errors: 0, Skipped: 24, Time elapsed: 399.12 s <<< FAILURE! - in org.apache.iceberg.mr.hive.TestConflictingDataFiles
> 10:36:13 [ERROR] org.apache.iceberg.mr.hive.TestConflictingDataFiles.testMultiFiltersUpdate[fileFormat=PARQUET, engine=tez, catalog=HIVE_CATALOG, isVectorized=false, formatVersion=1]  Time elapsed: 11.781 s  <<< FAILURE!
> 10:36:13 java.lang.AssertionError: expected:<12> but was:<13>
> 10:36:13 	at org.junit.Assert.fail(Assert.java:89)
> 10:36:13 	at org.junit.Assert.failNotEquals(Assert.java:835)
> 10:36:13 	at org.junit.Assert.assertEquals(Assert.java:647)
> 10:36:13 	at org.junit.Assert.assertEquals(Assert.java:633)
> 10:36:13 	at org.apache.iceberg.mr.hive.TestConflictingDataFiles.testMultiFiltersUpdate(TestConflictingDataFiles.java:135)
> 10:36:13 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 10:36:13 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 10:36:13 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 10:36:13 	at java.lang.reflect.Method.invoke(Method.java:498)
> 10:36:13 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 10:36:13 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 10:36:13 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 10:36:13 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 10:36:13 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 10:36:13 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> 10:36:13 	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> 10:36:13 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
> 10:36:13 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
> 10:36:13 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 10:36:13 	at java.lang.Thread.run(Thread.java:748) {code}
[jira] [Commented] (HIVE-28153) Flaky test TestConflictingDataFiles.testMultiFiltersUpdate
[ https://issues.apache.org/jira/browse/HIVE-28153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837165#comment-17837165 ] Simhadri Govindappa commented on HIVE-28153: Change is merged to master. Thanks, [~dkuzmenko] and [~zhangbutao] for the review! Additionally, I have raised HIVE-28192 to investigate the bug mentioned above. It seems like the IOContext is shared between threads in non-vectorized code flow which is causing duplicate records. > Flaky test TestConflictingDataFiles.testMultiFiltersUpdate > -- > > Key: HIVE-28153 > URL: https://issues.apache.org/jira/browse/HIVE-28153 > Project: Hive > Issue Type: Test > Components: Test >Reporter: Butao Zhang >Assignee: Simhadri Govindappa >Priority: Major > Labels: pull-request-available > > This test has been failing a lot lately, such as > [http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-5063/13/tests/] > > And the flaky test shows this test is unstable: > [http://ci.hive.apache.org/job/hive-flaky-check/831/testReport/] > {code:java} > 10:29:21 [INFO] T E S T S > 10:29:21 [INFO] --- > 10:29:21 [INFO] Running org.apache.iceberg.mr.hive.TestConflictingDataFiles > 10:36:13 [ERROR] Tests run: 60, Failures: 1, Errors: 0, Skipped: 24, Time > elapsed: 399.12 s <<< FAILURE! - in > org.apache.iceberg.mr.hive.TestConflictingDataFiles > 10:36:13 [ERROR] > org.apache.iceberg.mr.hive.TestConflictingDataFiles.testMultiFiltersUpdate[fileFormat=PARQUET, > engine=tez, catalog=HIVE_CATALOG, isVectorized=false, formatVersion=1] Time > elapsed: 11.781 s <<< FAILURE! 
> 10:36:13 java.lang.AssertionError: expected:<12> but was:<13> > 10:36:13 at org.junit.Assert.fail(Assert.java:89) > 10:36:13 at org.junit.Assert.failNotEquals(Assert.java:835) > 10:36:13 at org.junit.Assert.assertEquals(Assert.java:647) > 10:36:13 at org.junit.Assert.assertEquals(Assert.java:633) > 10:36:13 at > org.apache.iceberg.mr.hive.TestConflictingDataFiles.testMultiFiltersUpdate(TestConflictingDataFiles.java:135) > 10:36:13 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 10:36:13 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 10:36:13 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 10:36:13 at java.lang.reflect.Method.invoke(Method.java:498) > 10:36:13 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > 10:36:13 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 10:36:13 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > 10:36:13 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 10:36:13 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 10:36:13 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 10:36:13 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > 10:36:13 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299) > 10:36:13 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293) > 10:36:13 at java.util.concurrent.FutureTask.run(FutureTask.java:266) > 10:36:13 at java.lang.Thread.run(Thread.java:748) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
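The suspected root cause in the comment above, one context object shared by several reader threads, can be illustrated with a minimal, self-contained Java sketch. The names here are hypothetical and this is not Hive's actual IOContext API; it only shows how a ThreadLocal isolates per-thread state that a single shared instance would mix:

```java
// Sketch (hypothetical names, not Hive's IOContext API): a context object
// shared across reader threads accumulates every thread's writes, while a
// ThreadLocal gives each thread its own isolated copy.
public class SharedContextSketch {
    static final StringBuilder sharedLog = new StringBuilder();   // one instance for all threads
    static final ThreadLocal<StringBuilder> localLog =
            ThreadLocal.withInitial(StringBuilder::new);          // one instance per thread

    public static void main(String[] args) throws InterruptedException {
        Runnable reader = () -> {
            sharedLog.append("r");        // visible to every thread
            localLog.get().append("r");   // visible only to the current thread
        };
        Thread t1 = new Thread(reader);
        t1.start();
        t1.join();                        // run sequentially so the result is deterministic
        Thread t2 = new Thread(reader);
        t2.start();
        t2.join();
        // The shared instance saw both threads' writes; the main thread's
        // ThreadLocal copy saw neither of them.
        System.out.println(sharedLog.length() + " " + localLog.get().length()); // prints "2 0"
    }
}
```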
[jira] [Commented] (HIVE-28177) Announce Hive 1.x EOL and remove from downloads space
[ https://issues.apache.org/jira/browse/HIVE-28177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837116#comment-17837116 ] Stamatis Zampetakis commented on HIVE-28177: Hey [~ayushsaxena], I've seen that you sent the announcement to user@ and dev@, thanks for doing that! Are you also planning to tackle the remaining items? > Announce Hive 1.x EOL and remove from downloads space > - > > Key: HIVE-28177 > URL: https://issues.apache.org/jira/browse/HIVE-28177 > Project: Hive > Issue Type: Task > Components: Documentation >Reporter: Stamatis Zampetakis >Priority: Major > > The Hive 1.x release line is officially unsupported. The respective > discussion and vote can be found below: > * https://lists.apache.org/thread/sxcrcf4v9j630tl9domp0bn4m33bdq0s > * [https://lists.apache.org/thread/cyfg2ftrsh9bn0wgycm7ltqsx9yb6fts] > The following tasks are pending: > * Update the Hive website to reflect that Hive 1.x is EOL > * Send an official announcement email to the following lists: user@hive, > dev@hive, announce@apache > * Remove hive-1.2.2 from [https://downloads.apache.org/hive/] (it will be > automatically archived) -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-26220) Shade & relocate dependencies in hive-exec to avoid conflicting with downstream projects
[ https://issues.apache.org/jira/browse/HIVE-26220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko updated HIVE-26220: -- Labels: hive-4.0.1-must (was: ) > Shade & relocate dependencies in hive-exec to avoid conflicting with > downstream projects > > > Key: HIVE-26220 > URL: https://issues.apache.org/jira/browse/HIVE-26220 > Project: Hive > Issue Type: Improvement >Affects Versions: 4.0.0, 4.0.0-alpha-1 >Reporter: Chao Sun >Priority: Blocker > Labels: hive-4.0.1-must > > Currently, projects like Spark, Trino/Presto, Iceberg, etc. depend on > {{hive-exec:core}}, which was removed in HIVE-25531. These projects > use {{hive-exec:core}} because it gives them the flexibility to exclude, shade > & relocate dependencies in {{hive-exec}} that conflict with the ones they > bring in themselves. However, with {{hive-exec}} this is no longer > possible, since it is a fat jar that shades those dependencies but does not > relocate many of them. > In order for the downstream projects to consume {{hive-exec}}, we will need > to make sure all the dependencies in {{hive-exec}} are properly shaded and > relocated, so they won't cause conflicts with those of the downstream projects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
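For reference, relocation in a shaded jar is typically expressed with the Maven Shade Plugin's `<relocation>` rules. A minimal sketch follows; the shaded package prefix is illustrative, not necessarily the prefix Hive will choose:

```xml
<!-- Sketch: relocate a bundled dependency (here Guava) under a
     project-private package so it cannot clash with a downstream
     consumer's own copy of the same library. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.google.common</pattern>
            <shadedPattern>org.apache.hive.shaded.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

With such a rule, the bundled classes and every reference to them inside the fat jar are rewritten to the shaded package, so a downstream project's own Guava version is never shadowed.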
[jira] [Commented] (HIVE-28133) Log the original exception in HiveIOExceptionHandlerUtil#handleRecordReaderException
[ https://issues.apache.org/jira/browse/HIVE-28133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837112#comment-17837112 ] Denys Kuzmenko commented on HIVE-28133: --- Merged to master. Thanks, [~abstractdog], for the review! > Log the original exception in > HiveIOExceptionHandlerUtil#handleRecordReaderException > > > Key: HIVE-28133 > URL: https://issues.apache.org/jira/browse/HIVE-28133 > Project: Hive > Issue Type: Improvement >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (HIVE-28133) Log the original exception in HiveIOExceptionHandlerUtil#handleRecordReaderException
[ https://issues.apache.org/jira/browse/HIVE-28133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Denys Kuzmenko resolved HIVE-28133. --- Fix Version/s: 4.1.0 Resolution: Fixed > Log the original exception in > HiveIOExceptionHandlerUtil#handleRecordReaderException > > > Key: HIVE-28133 > URL: https://issues.apache.org/jira/browse/HIVE-28133 > Project: Hive > Issue Type: Improvement >Reporter: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Fix For: 4.1.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
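The improvement tracked above, not losing the root failure when a record-reader exception is handled, can be sketched in plain Java. The method name and messages below are hypothetical, not the actual HiveIOExceptionHandlerUtil signature; the point is to log the original exception and attach it as the cause before rethrowing:

```java
import java.io.IOException;

// Sketch (hypothetical names, not the real HiveIOExceptionHandlerUtil API):
// log the original exception first, then wrap it with the cause attached,
// so the root failure survives in both the logs and the rethrown stack trace.
public class HandlerSketch {
    static IOException handleRecordReaderException(Exception original) {
        System.err.println("record reader failed: " + original);  // log the original before wrapping
        return new IOException("error while processing record reader", original); // keep it as the cause
    }

    public static void main(String[] args) {
        IOException wrapped = handleRecordReaderException(new RuntimeException("underlying corrupt block"));
        System.out.println(wrapped.getCause().getMessage()); // prints "underlying corrupt block"
    }
}
```

Without the cause (or at least a log line), a bare `new IOException("error while processing record reader")` would discard the only evidence of what actually failed.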
[jira] [Updated] (HIVE-28197) Add deserializer to convert JSON plans to RelNodes
[ https://issues.apache.org/jira/browse/HIVE-28197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-28197: -- Labels: pull-request-available (was: ) > Add deserializer to convert JSON plans to RelNodes > -- > > Key: HIVE-28197 > URL: https://issues.apache.org/jira/browse/HIVE-28197 > Project: Hive > Issue Type: New Feature > Components: Hive >Affects Versions: 4.0.0 >Reporter: Soumyakanti Das >Assignee: Soumyakanti Das >Priority: Major > Labels: pull-request-available > > We have a serializer that converts RelNodes to JSON. With this patch, we will > be able to deserialize JSON plans. -- This message was sent by Atlassian Jira (v8.20.10#820010)