[jira] [Commented] (HIVE-25338) AIOBE in conv UDF if input is empty
[ https://issues.apache.org/jira/browse/HIVE-25338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17385252#comment-17385252 ]

Naresh P R commented on HIVE-25338:
-----------------------------------

Thanks for the analysis, [~zabetak]. I updated the PR to return null for empty input.

> AIOBE in conv UDF if input is empty
> -----------------------------------
>
> Key: HIVE-25338
> URL: https://issues.apache.org/jira/browse/HIVE-25338
> Project: Hive
> Issue Type: Bug
> Reporter: Naresh P R
> Assignee: Naresh P R
> Priority: Major
> Labels: pull-request-available
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Repro:
> {code:java}
> create table test (a string);
> insert into test values ("");
> select conv(a,16,10) from test;
> {code}
> Exception trace:
> {code:java}
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>   at org.apache.hadoop.hive.ql.udf.UDFConv.evaluate(UDFConv.java:160)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
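The fix described in the comment, returning NULL for empty input instead of indexing into an empty array, can be sketched as a guard at the top of the conversion routine. This is an illustrative stand-alone version, not Hive's actual UDFConv code; the class, method name, and signature are assumptions:

```java
// Illustrative sketch of the null-guard fix; not Hive's actual UDFConv implementation.
public class ConvSketch {

    // Stand-in for UDFConv.evaluate(): convert num from fromBase to toBase.
    static String conv(String num, int fromBase, int toBase) {
        if (num == null || num.isEmpty()) {
            // Empty input: return NULL instead of touching value[0],
            // which is what raised the ArrayIndexOutOfBoundsException.
            return null;
        }
        return Long.toString(Long.parseLong(num, fromBase), toBase);
    }

    public static void main(String[] args) {
        System.out.println(conv("", 16, 10));   // null (previously an AIOBE)
        System.out.println(conv("ff", 16, 10)); // 255
    }
}
```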
[jira] [Work logged] (HIVE-25303) CTAS hive.create.as.external.legacy tries to place data files in managed WH path
[ https://issues.apache.org/jira/browse/HIVE-25303?focusedWorklogId=626424&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626424 ]

ASF GitHub Bot logged work on HIVE-25303:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 04:35
Start Date: 22/Jul/21 04:35
Worklog Time Spent: 10m

Work Description: kgyrtkirk commented on pull request #2442:
URL: https://github.com/apache/hive/pull/2442#issuecomment-884650693

this patch doesn't add any tests; please add some.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 626424)
Time Spent: 1h 10m (was: 1h)

> CTAS hive.create.as.external.legacy tries to place data files in managed WH path
> --------------------------------------------------------------------------------
>
> Key: HIVE-25303
> URL: https://issues.apache.org/jira/browse/HIVE-25303
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Standalone Metastore
> Reporter: Sai Hemanth Gantasala
> Assignee: Sai Hemanth Gantasala
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Under legacy table creation mode (hive.create.as.external.legacy=true), when
> a database has been created in a specific LOCATION, in a session where that
> database is USEd, tables created using CREATE TABLE AS SELECT should inherit
> the HDFS path from the database's location. Instead, Hive is trying to write
> the table data into /warehouse/tablespace/managed/hive//
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626418&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626418 ]

ASF GitHub Bot logged work on HIVE-25306:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 04:00
Start Date: 22/Jul/21 04:00
Worklog Time Spent: 10m

Work Description: ashish-kumar-sharma commented on pull request #2445:
URL: https://github.com/apache/hive/pull/2445#issuecomment-884641257

@zabetak @sankarh Addressed all the comments and got a green build as well. Could you please review the PR?

Issue Time Tracking
-------------------

Worklog Id: (was: 626418)
Time Spent: 2h 50m (was: 2h 40m)

> Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
> ----------------------------------------------------------------------------------
>
> Key: HIVE-25306
> URL: https://issues.apache.org/jira/browse/HIVE-25306
> Project: Hive
> Issue Type: Bug
> Components: Query Planning, UDF
> Reporter: Ashish Sharma
> Assignee: Ashish Sharma
> Priority: Major
> Labels: pull-request-available
> Attachments: DB_compare.JPG
>
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> Description:
> Currently Date.java and Timestamp.java use a DateTimeFormatter for parsing to
> convert dates/timestamps from int, string, char, etc. to Date or Timestamp.
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date
> like "1992-13-12" is converted to "2000-01-12".
> Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like
> "1992-13-12" is not converted; instead NULL is returned.
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing
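The LENIENT-vs-STRICT resolution difference discussed in this issue can be seen directly with `java.time`. This is a stand-alone demo, not Hive's code; the `uuuu-MM-dd` pattern is an assumption, and Hive's actual patterns (and therefore its lenient roll-over results) may differ:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class ResolverStyleDemo {
    public static void main(String[] args) {
        // LENIENT rolls out-of-range fields over: month 13 of 1992 resolves
        // to January of the following year instead of being rejected.
        DateTimeFormatter lenient = DateTimeFormatter.ofPattern("uuuu-MM-dd")
                .withResolverStyle(ResolverStyle.LENIENT);
        System.out.println(LocalDate.parse("1992-13-12", lenient)); // 1993-01-12

        // STRICT rejects the invalid month; a caller can map the exception to NULL,
        // which is the behavior the issue proposes.
        DateTimeFormatter strict = DateTimeFormatter.ofPattern("uuuu-MM-dd")
                .withResolverStyle(ResolverStyle.STRICT);
        try {
            LocalDate.parse("1992-13-12", strict);
        } catch (DateTimeParseException e) {
            System.out.println("rejected invalid date");
        }
    }
}
```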
[jira] [Resolved] (HIVE-20475) Hive Thrift Server 2 stops frequently
[ https://issues.apache.org/jira/browse/HIVE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline resolved HIVE-20475.
---------------------------------
Resolution: Duplicate (HIVE-25307)

> Hive Thrift Server 2 stops frequently
> -------------------------------------
>
> Key: HIVE-20475
> URL: https://issues.apache.org/jira/browse/HIVE-20475
> Project: Hive
> Issue Type: Bug
> Environment: HDP 2.6.5.0
> Hive 1.2.1000
> Spark 2.3.0
> Reporter: Vinod Nerella
> Priority: Major
>
> 18/08/28 02:18:05 ERROR TThreadPoolServer: Error occurred during processing of message.
> java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
>   at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>   at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
>   at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:328)
>   at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>   at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>   ... 4 more
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626396 ]

ASF GitHub Bot logged work on HIVE-24918:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 01:33
Start Date: 22/Jul/21 01:33
Worklog Time Spent: 10m

Work Description: hmangla98 commented on a change in pull request #2121:
URL: https://github.com/apache/hive/pull/2121#discussion_r674445700

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/FailoverMetaData.java
## @@ -0,0 +1,206 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.parse.repl.load;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.repl.dump.Utils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.List;
+
+@JsonIgnoreProperties(ignoreUnknown = true)
+public class FailoverMetaData {
+  public static final String FAILOVER_METADATA = "_failovermetadata";
+  private static final Logger LOG = LoggerFactory.getLogger(FailoverMetaData.class);
+
+  private static ObjectMapper JSON_OBJECT_MAPPER = new ObjectMapper();
+
+  @JsonProperty
+  private Long failoverEventId = null;
+  @JsonProperty
+  private Long cursorPoint = null;
+  @JsonProperty
+  private List<Long> abortedTxns;
+  @JsonProperty
+  private List<Long> openTxns;
+  @JsonProperty
+  private List<Long> txnsWithoutLock;
+
+  @JsonIgnore
+  private volatile boolean initialized = false;
+  @JsonIgnore
+  private final Path metadataFile;
+  @JsonIgnore
+  private final HiveConf hiveConf;
+
+  public FailoverMetaData() {
+    metadataFile = null;
+    hiveConf = null;
+  }
+
+  public FailoverMetaData(Path dumpDir, HiveConf hiveConf) {
+    this.hiveConf = hiveConf;
+    this.metadataFile = new Path(dumpDir, FAILOVER_METADATA);
+  }
+
+  private void initializeIfNot() throws SemanticException {
+    if (!initialized) {
+      loadMetadataFromFile();
+      initialized = true;
+    }
+  }
+
+  public void setMetaData(FailoverMetaData otherDMD) {
+    this.failoverEventId = otherDMD.failoverEventId;
+    this.abortedTxns = otherDMD.abortedTxns;
+    this.openTxns = otherDMD.openTxns;
+    this.cursorPoint = otherDMD.cursorPoint;
+    this.txnsWithoutLock = otherDMD.txnsWithoutLock;
+    this.initialized = true;
+  }
+
+  private synchronized void loadMetadataFromFile() throws SemanticException {

Review comment: currently, we're using it from a single thread only, but this is just to make sure that even in the future this task stays sequential in nature.

Issue Time Tracking
-------------------

Worklog Id: (was: 626396)
Time Spent: 7.5h (was: 7h 20m)

> Handle failover case during Repl Dump
> -------------------------------------
>
> Key: HIVE-24918
> URL: https://issues.apache.org/jira/browse/HIVE-24918
> Project: Hive
> Issue Type: New Feature
> Reporter: Haymant Mangla
> Assignee: Haymant Mangla
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7.5h
> Remaining Estimate: 0h
>
> To handle:
> a) Whenever user wants to go ahead with failover, during the next or
> subsequent repl dump operation, upon confirming that there are no pending
> open transaction events, it should create a _failover_ready marker file in
> the dump dir. This marker file would contain the scheduled query name that
> has generated this dump.
> b) Skip next repl dump instances once we have the marker file placed.
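The review exchange above is about keeping the lazy metadata load safe even if it is ever called from more than one thread: a `volatile` flag checked outside the lock, with the load itself `synchronized`. A minimal stand-alone sketch of that pattern, with illustrative names rather than Hive's classes:

```java
// Minimal sketch of the lazy, thread-safe initialize-once pattern discussed
// in the review above (volatile flag + synchronized load); names are illustrative.
public class LazyMeta {
    private volatile boolean initialized = false;
    private String payload;

    private synchronized void load() {
        // Re-check inside the lock so a second caller that raced past the
        // outer check does not load twice.
        if (!initialized) {
            payload = "loaded"; // stand-in for reading the metadata file
            initialized = true;
        }
    }

    public String get() {
        if (!initialized) { // cheap volatile read on the fast path
            load();
        }
        return payload;
    }

    public static void main(String[] args) {
        System.out.println(new LazyMeta().get()); // loaded
    }
}
```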
[jira] [Updated] (HIVE-25365) Insufficient privileges to show partitions when partition columns are authorized
[ https://issues.apache.org/jira/browse/HIVE-25365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HIVE-25365:
----------------------------------
Labels: pull-request-available (was: )

> Insufficient privileges to show partitions when partition columns are authorized
> --------------------------------------------------------------------------------
>
> Key: HIVE-25365
> URL: https://issues.apache.org/jira/browse/HIVE-25365
> Project: Hive
> Issue Type: Bug
> Components: Authorization
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When the privileges of partition columns have been granted to a user, showing
> partitions still needs select privilege on the table, though the user is able
> to query the partition columns.
[jira] [Work logged] (HIVE-25365) Insufficient privileges to show partitions when partition columns are authorized
[ https://issues.apache.org/jira/browse/HIVE-25365?focusedWorklogId=626389&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626389 ]

ASF GitHub Bot logged work on HIVE-25365:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 01:05
Start Date: 22/Jul/21 01:05
Worklog Time Spent: 10m

Work Description: dengzhhu653 opened a new pull request #2515:
URL: https://github.com/apache/hive/pull/2515

… columns are authorized

### What changes were proposed in this pull request?

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

Issue Time Tracking
-------------------

Worklog Id: (was: 626389)
Remaining Estimate: 0h
Time Spent: 10m

> Insufficient privileges to show partitions when partition columns are authorized
> --------------------------------------------------------------------------------
>
> Key: HIVE-25365
> URL: https://issues.apache.org/jira/browse/HIVE-25365
> Project: Hive
> Issue Type: Bug
> Components: Authorization
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When the privileges of partition columns have granted to user, showing
> partitions still needs select privilege on the table, though they are able to
> query from partition columns.
[jira] [Updated] (HIVE-25365) Insufficient privileges to show partitions when partition columns are authorized
[ https://issues.apache.org/jira/browse/HIVE-25365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihua Deng updated HIVE-25365:
-------------------------------
Summary: Insufficient privileges to show partitions when partition columns are authorized
(was: Insufficient priviledges to show partitions when partition columns are authorized)

> Insufficient privileges to show partitions when partition columns are authorized
> --------------------------------------------------------------------------------
>
> Key: HIVE-25365
> URL: https://issues.apache.org/jira/browse/HIVE-25365
> Project: Hive
> Issue Type: Bug
> Components: Authorization
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
>
> When the privileges of partition columns have granted to user, showing
> partitions still needs select privilege on the table, though they are able to
> query from partition columns.
[jira] [Assigned] (HIVE-25365) Insufficient priviledges to show partitions when partition columns are authorized
[ https://issues.apache.org/jira/browse/HIVE-25365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhihua Deng reassigned HIVE-25365:
----------------------------------
Assignee: Zhihua Deng

> Insufficient priviledges to show partitions when partition columns are authorized
> ---------------------------------------------------------------------------------
>
> Key: HIVE-25365
> URL: https://issues.apache.org/jira/browse/HIVE-25365
> Project: Hive
> Issue Type: Bug
> Components: Authorization
> Reporter: Zhihua Deng
> Assignee: Zhihua Deng
> Priority: Minor
>
> When the privileges of partition columns have granted to user, showing
> partitions still needs select privilege on the table, though they are able to
> query from partition columns.
[jira] [Work logged] (HIVE-25114) Optimize get_tables() API call in HMS
[ https://issues.apache.org/jira/browse/HIVE-25114?focusedWorklogId=626380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626380 ]

ASF GitHub Bot logged work on HIVE-25114:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 00:08
Start Date: 22/Jul/21 00:08
Worklog Time Spent: 10m

Work Description: github-actions[bot] commented on pull request #2292:
URL: https://github.com/apache/hive/pull/2292#issuecomment-884575366

This pull request has been automatically marked as stale because it has not had
recent activity. It will be closed if no further activity occurs. Feel free to
reach out on the d...@hive.apache.org list if the patch is in need of reviews.

Issue Time Tracking
-------------------

Worklog Id: (was: 626380)
Time Spent: 20m (was: 10m)

> Optimize get_tables() API call in HMS
> -------------------------------------
>
> Key: HIVE-25114
> URL: https://issues.apache.org/jira/browse/HIVE-25114
> Project: Hive
> Issue Type: Improvement
> Reporter: Sai Hemanth Gantasala
> Assignee: Sai Hemanth Gantasala
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Optimize the get_tables() call in the HMS API. There should be only one call
> to the object store instead of 2 calls to return the table objects.
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626379 ]

ASF GitHub Bot logged work on HIVE-24918:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 00:03
Start Date: 22/Jul/21 00:03
Worklog Time Spent: 10m

Work Description: hmangla98 commented on a change in pull request #2121:
URL: https://github.com/apache/hive/pull/2121#discussion_r674421491

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
## @@ -268,11 +329,35 @@ private boolean shouldDumpAtlasMetadata() {
     return conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_ATLAS_METADATA);
   }

-  private Path getCurrentDumpPath(Path dumpRoot, boolean isBootstrap) throws IOException {
+  private Path getCurrentDumpPath(Path dumpRoot, boolean isBootstrap) throws IOException, HiveException {
     Path lastDumpPath = ReplUtils.getLatestDumpPath(dumpRoot, conf);
     if (lastDumpPath != null && shouldResumePreviousDump(lastDumpPath, isBootstrap)) {
       //Resume previous dump
       LOG.info("Resuming the dump with existing dump directory {}", lastDumpPath);
+      FileSystem fs = lastDumpPath.getFileSystem(conf);
+      Path hiveDumpDir = new Path(lastDumpPath, ReplUtils.REPL_HIVE_BASE_DIR);
+      Path failoverMetadataFile = new Path(hiveDumpDir, FailoverMetaData.FAILOVER_METADATA);
+      Path failoverReadyMarkerFile = new Path(hiveDumpDir, ReplAck.FAILOVER_READY_MARKER.toString());
+      if (fs.exists(failoverReadyMarkerFile)) {
+        //If failoverReadyMarkerFile exists, this means previous dump iteration failed while creating dump ACK file.
+        //So, just delete this file and proceed further.
+        LOG.info("Deleting failover ready marker file: {} created in previous dump iteration.", failoverReadyMarkerFile);
+        fs.delete(failoverReadyMarkerFile, true);
+      }
+      if (fs.exists(failoverMetadataFile)) {
+        //If failoverMetadata file exists, this means previous dump iteration failed after writing failover metadata info

Review comment: yes, that's the reason we've used a nested configuration check in this.

Issue Time Tracking
-------------------

Worklog Id: (was: 626379)
Time Spent: 7h 20m (was: 7h 10m)

> Handle failover case during Repl Dump
> -------------------------------------
>
> Key: HIVE-24918
> URL: https://issues.apache.org/jira/browse/HIVE-24918
> Project: Hive
> Issue Type: New Feature
> Reporter: Haymant Mangla
> Assignee: Haymant Mangla
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 20m
> Remaining Estimate: 0h
>
> To handle:
> a) Whenever user wants to go ahead with failover, during the next or
> subsequent repl dump operation, upon confirming that there are no pending
> open transaction events, it should create a _failover_ready marker file in
> the dump dir. This marker file would contain the scheduled query name that
> has generated this dump.
> b) Skip next repl dump instances once we have the marker file placed.
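The resume logic quoted in this diff boils down to: if a marker file survives from a failed previous dump iteration, delete it and carry on. A stand-alone sketch of that pattern using `java.nio.file` in place of Hadoop's `FileSystem` (the file name and method are illustrative, not Hive's):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResumeDumpSketch {
    // Delete a stale failover-ready marker left by a failed dump iteration;
    // returns whether one existed. Marker name is illustrative.
    static boolean cleanStaleMarker(Path dumpDir) throws IOException {
        return Files.deleteIfExists(dumpDir.resolve("_failover_ready"));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("dump");
        Files.createFile(dir.resolve("_failover_ready"));
        System.out.println(cleanStaleMarker(dir)); // true: stale marker removed
        System.out.println(cleanStaleMarker(dir)); // false: nothing left to clean
    }
}
```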
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626378 ]

ASF GitHub Bot logged work on HIVE-24918:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 00:02
Start Date: 22/Jul/21 00:02
Worklog Time Spent: 10m

Work Description: hmangla98 commented on a change in pull request #2121:
URL: https://github.com/apache/hive/pull/2121#discussion_r674421232

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/FailoverMetaData.java
## @@ -0,0 +1,207 @@
+package org.apache.hadoop.hive.ql.parse.repl.load;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.repl.dump.Utils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.List;
+
+@JsonIgnoreProperties(ignoreUnknown = true)
+public class FailoverMetaData {
+  public static final String FAILOVER_METADATA = "_failovermetadata";
+  private static final Logger LOG = LoggerFactory.getLogger(FailoverMetaData.class);
+
+  private static ObjectMapper JSON_OBJECT_MAPPER = new ObjectMapper();
+
+  @JsonProperty
+  private Long failoverEventId = null;
+  @JsonProperty
+  private Long cursorPoint = null;
+  @JsonProperty
+  private List<Long> abortedTxns;
+  @JsonProperty
+  private List<Long> openTxns;
+  @JsonProperty
+  private List<Long> txnsWithoutLock;
+
+  @JsonIgnore
+  private boolean initialized = false;
+  @JsonIgnore
+  private final Path dumpFile;
+  @JsonIgnore
+  private final HiveConf hiveConf;
+
+  public FailoverMetaData() {
+    //to be instantiated by JSON ObjectMapper.
+    dumpFile = null;
+    hiveConf = null;

Review comment: If we remove them, it won't compile saying that these two variables are not initialised.
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626377&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626377 ]

ASF GitHub Bot logged work on HIVE-24918:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 22/Jul/21 00:02
Start Date: 22/Jul/21 00:02
Worklog Time Spent: 10m

Work Description: hmangla98 commented on a change in pull request #2121:
URL: https://github.com/apache/hive/pull/2121#discussion_r674421147

## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/FailoverMetaData.java
## @@ -0,0 +1,207 @@
+package org.apache.hadoop.hive.ql.parse.repl.load;
+
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+import org.apache.hadoop.hive.ql.parse.repl.dump.Utils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.util.List;
+
+@JsonIgnoreProperties(ignoreUnknown = true)
+public class FailoverMetaData {
+  public static final String FAILOVER_METADATA = "_failovermetadata";
+  private static final Logger LOG = LoggerFactory.getLogger(FailoverMetaData.class);
+
+  private static ObjectMapper JSON_OBJECT_MAPPER = new ObjectMapper();
+
+  @JsonProperty
+  private Long failoverEventId = null;
+  @JsonProperty
+  private Long cursorPoint = null;
+  @JsonProperty
+  private List<Long> abortedTxns;
+  @JsonProperty
+  private List<Long> openTxns;
+  @JsonProperty
+  private List<Long> txnsWithoutLock;
+
+  @JsonIgnore
+  private boolean initialized = false;
+  @JsonIgnore
+  private final Path dumpFile;
+  @JsonIgnore
+  private final HiveConf hiveConf;
+
+  public FailoverMetaData() {
+    //to be instantiated by JSON ObjectMapper.
+    dumpFile = null;
+    hiveConf = null;

Review comment: If we remove them, it won't compile saying that these two variables are not initialised.

Issue Time Tracking
-------------------

Worklog Id: (was: 626377)
Time Spent: 7h (was: 6h 50m)

> Handle failover case during Repl Dump
> -------------------------------------
>
> Key: HIVE-24918
> URL: https://issues.apache.org/jira/browse/HIVE-24918
> Project: Hive
> Issue Type: New Feature
> Reporter: Haymant Mangla
> Assignee: Haymant Mangla
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h
> Remaining Estimate: 0h
>
> To handle:
> a) Whenever user wants to go ahead with failover, during the next or
> subsequent repl dump operation, upon confirming that there are no pending
> open transaction events, it should create a _failover_ready marker file in
> the dump dir. This marker file would contain the scheduled query name that
> has generated this dump.
> b) Skip next repl dump instances once we have the marker file placed.
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626373&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626373 ]

ASF GitHub Bot logged work on HIVE-24918:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 21/Jul/21 23:58
Start Date: 21/Jul/21 23:58
Worklog Time Spent: 10m

Work Description: hmangla98 commented on a change in pull request #2121:
URL: https://github.com/apache/hive/pull/2121#discussion_r674419823

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
## @@ -552,6 +648,31 @@ private boolean isTableSatifiesConfig(Table table) {
     return true;
   }

+  private void fetchFailoverMetadata(Hive hiveDb) throws HiveException, TException {
+    FailoverMetaData fmd = new FailoverMetaData(
+        new Path(work.getCurrentDumpPath(), ReplUtils.REPL_HIVE_BASE_DIR), conf);
+    List<Long> txnsForDb = getOpenTxns(getTxnMgr().getValidTxns(excludedTxns), work.dbNameOrPattern);
+    if (!txnsForDb.isEmpty()) {
+      hiveDb.abortTransactions(txnsForDb);
+    }
+    fmd.setAbortedTxns(txnsForDb);
+    fmd.setCursorPoint(currentNotificationId(hiveDb));
+    ValidTxnList failoverTxns = getTxnMgr().getValidTxns(excludedTxns);

Review comment: Done

Issue Time Tracking
-------------------

Worklog Id: (was: 626373)
Time Spent: 6h 50m (was: 6h 40m)

> Handle failover case during Repl Dump
> -------------------------------------
>
> Key: HIVE-24918
> URL: https://issues.apache.org/jira/browse/HIVE-24918
> Project: Hive
> Issue Type: New Feature
> Reporter: Haymant Mangla
> Assignee: Haymant Mangla
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 50m
> Remaining Estimate: 0h
>
> To handle:
> a) Whenever user wants to go ahead with failover, during the next or
> subsequent repl dump operation, upon confirming that there are no pending
> open transaction events, it should create a _failover_ready marker file in
> the dump dir. This marker file would contain the scheduled query name that
> has generated this dump.
> b) Skip next repl dump instances once we have the marker file placed.
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626372&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626372 ]

ASF GitHub Bot logged work on HIVE-24918:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 21/Jul/21 23:57
Start Date: 21/Jul/21 23:57
Worklog Time Spent: 10m

Work Description: hmangla98 commented on a change in pull request #2121:
URL: https://github.com/apache/hive/pull/2121#discussion_r674419599

## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
## @@ -242,6 +253,56 @@ public int execute() {
     return 0;
   }

+  private void rollbackFailover(Path failoverReadyMarker, Path failoverMetadataFile, Database db)
+      throws HiveException, IOException {
+    LOG.info("Rolling back failover initiated in previous dump iteration.");
+    FileSystem fs = failoverMetadataFile.getFileSystem(conf);
+    if (failoverMetadataFile != null) {

Review comment: Yes, this can be removed.

Issue Time Tracking
-------------------

Worklog Id: (was: 626372)
Time Spent: 6h 40m (was: 6.5h)

> Handle failover case during Repl Dump
> -------------------------------------
>
> Key: HIVE-24918
> URL: https://issues.apache.org/jira/browse/HIVE-24918
> Project: Hive
> Issue Type: New Feature
> Reporter: Haymant Mangla
> Assignee: Haymant Mangla
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> To handle:
> a) Whenever user wants to go ahead with failover, during the next or
> subsequent repl dump operation, upon confirming that there are no pending
> open transaction events, it should create a _failover_ready marker file in
> the dump dir. This marker file would contain the scheduled query name that
> has generated this dump.
> b) Skip next repl dump instances once we have the marker file placed.
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626371=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626371 ] ASF GitHub Bot logged work on HIVE-24918: - Author: ASF GitHub Bot Created on: 21/Jul/21 23:56 Start Date: 21/Jul/21 23:56 Worklog Time Spent: 10m Work Description: hmangla98 commented on a change in pull request #2121: URL: https://github.com/apache/hive/pull/2121#discussion_r674419121 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java ## @@ -200,6 +203,233 @@ private void testTargetDbReplIncompatible(boolean setReplIncompProp) throws Thro } } + @Test + public void testFailoverDuringDump() throws Throwable { +HiveConf primaryConf = primary.getConf(); +TxnStore txnHandler = TxnUtils.getTxnStore(primary.getConf()); +WarehouseInstance.Tuple dumpData = null; Review comment: Done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626371) Time Spent: 6.5h (was: 6h 20m) > Handle failover case during Repl Dump > - > > Key: HIVE-24918 > URL: https://issues.apache.org/jira/browse/HIVE-24918 > Project: Hive > Issue Type: New Feature >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 6.5h > Remaining Estimate: 0h > > To handle: > a) Whenever user wants to go ahead with failover, during the next or > subsequent repl dump operation upon confirming that there are no pending open > transaction events, It should create a _failover_ready marker file in the > dump dir. This marker file would contain scheduled query name > that has generated this dump. 
> b) Skip next repl dump instances once we have the marker file placed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626368 ] ASF GitHub Bot logged work on HIVE-24918: - Author: ASF GitHub Bot Created on: 21/Jul/21 23:54 Start Date: 21/Jul/21 23:54 Worklog Time Spent: 10m Work Description: hmangla98 commented on a change in pull request #2121: URL: https://github.com/apache/hive/pull/2121#discussion_r674418314 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -424,11 +518,13 @@ private boolean validDump(Path dumpDir) throws IOException { return false; } - private boolean shouldDump(Path previousDumpPath) throws IOException { + private boolean shouldDump(Path previousDumpPath, boolean isPrevDumpFailoverReady) throws IOException { //If no previous dump means bootstrap. So return true as there was no //previous dump to load if (previousDumpPath == null) { return true; +} else if (isPrevDumpFailoverReady) { + return false; Review comment: Yes, in that case, it won't be treated as valid dump and execution would go to getCurrentDumpPath to resume the previous failed dump. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626368) Time Spent: 6h 10m (was: 6h) > Handle failover case during Repl Dump > - > > Key: HIVE-24918 > URL: https://issues.apache.org/jira/browse/HIVE-24918 > Project: Hive > Issue Type: New Feature >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 6h 10m > Remaining Estimate: 0h > > To handle: > a) Whenever user wants to go ahead with failover, during the next or > subsequent repl dump operation upon confirming that there are no pending open > transaction events, It should create a _failover_ready marker file in the > dump dir. This marker file would contain scheduled query name > that has generated this dump. > b) Skip next repl dump instances once we have the marker file placed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
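The shouldDump decision discussed in this review thread can be sketched as a small stand-alone method. This is a simplified mimic, not Hive's actual ReplDumpTask code: the real method also consults the load ACK on the target before re-dumping, and takes a Hadoop Path rather than a String.

```java
// Simplified sketch of the shouldDump decision from the review above.
// Assumptions: previousDumpPath == null means bootstrap (no prior dump),
// and a failover-ready previous dump suppresses further dump iterations.
public class ReplDumpDecision {
    public static boolean shouldDump(String previousDumpPath, boolean isPrevDumpFailoverReady) {
        if (previousDumpPath == null) {
            // Bootstrap case: there is no previous dump to load, so dump now.
            return true;
        } else if (isPrevDumpFailoverReady) {
            // Previous dump already carries the _failover_ready marker:
            // skip this and subsequent repl dump instances.
            return false;
        }
        // The real implementation additionally checks whether the previous
        // dump has been loaded before proceeding; elided in this sketch.
        return true;
    }
}
```

As the reviewer notes, an invalid previous dump (one without the dump ACK) never reaches this decision; it is resumed via getCurrentDumpPath instead.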
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626370&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626370 ] ASF GitHub Bot logged work on HIVE-24918: - Author: ASF GitHub Bot Created on: 21/Jul/21 23:54 Start Date: 21/Jul/21 23:54 Worklog Time Spent: 10m Work Description: hmangla98 commented on a change in pull request #2121: URL: https://github.com/apache/hive/pull/2121#discussion_r674418439 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -242,6 +253,56 @@ public int execute() { return 0; } + private void rollbackFailover(Path failoverReadyMarker, Path failoverMetadataFile, Database db) + throws HiveException, IOException { +LOG.info("Rolling back failover initiated in previous dump iteration."); +FileSystem fs = failoverMetadataFile.getFileSystem(conf); +if (failoverMetadataFile != null) { + fs.delete(failoverMetadataFile, true); +} +if (failoverReadyMarker != null) { + fs.delete(failoverReadyMarker, true); +} +unsetReplFailoverEnabledIfSet(db); + } + + private boolean checkFailoverStatus(Path previousValidHiveDumpPath) throws HiveException, IOException { Review comment: Will refactor this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626370) Time Spent: 6h 20m (was: 6h 10m) > Handle failover case during Repl Dump > - > > Key: HIVE-24918 > URL: https://issues.apache.org/jira/browse/HIVE-24918 > Project: Hive > Issue Type: New Feature >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 6h 20m > Remaining Estimate: 0h > > To handle: > a) Whenever user wants to go ahead with failover, during the next or > subsequent repl dump operation upon confirming that there are no pending open > transaction events, It should create a _failover_ready marker file in the > dump dir. This marker file would contain scheduled query name > that has generated this dump. > b) Skip next repl dump instances once we have the marker file placed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626367=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626367 ] ASF GitHub Bot logged work on HIVE-24918: - Author: ASF GitHub Bot Created on: 21/Jul/21 23:53 Start Date: 21/Jul/21 23:53 Worklog Time Spent: 10m Work Description: hmangla98 commented on a change in pull request #2121: URL: https://github.com/apache/hive/pull/2121#discussion_r674417976 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -173,22 +178,24 @@ public int execute() { return ErrorMsg.REPL_FAILED_WITH_NON_RECOVERABLE_ERROR.getErrorCode(); } Path previousValidHiveDumpPath = getPreviousValidDumpMetadataPath(dumpRoot); -boolean isBootstrap = (previousValidHiveDumpPath == null); -work.setBootstrap(isBootstrap); +work.setBootstrap(previousValidHiveDumpPath == null); if (previousValidHiveDumpPath != null) { work.setOldReplScope(new DumpMetaData(previousValidHiveDumpPath, conf).getReplScope()); } -//If no previous dump is present or previous dump is already loaded, proceed with the dump operation. -if (shouldDump(previousValidHiveDumpPath)) { - Path currentDumpPath = getCurrentDumpPath(dumpRoot, isBootstrap); +boolean isPrevDumpFailoverReady = checkFailoverStatus(previousValidHiveDumpPath); Review comment: checkFailoverStatus is used to examine valid dump dir (containing dump ACK file) whereas in getCurrentDumpPath, we're probing the invalid dump dir (meaning without dump ACK). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626367) Time Spent: 6h (was: 5h 50m) > Handle failover case during Repl Dump > - > > Key: HIVE-24918 > URL: https://issues.apache.org/jira/browse/HIVE-24918 > Project: Hive > Issue Type: New Feature >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 6h > Remaining Estimate: 0h > > To handle: > a) Whenever user wants to go ahead with failover, during the next or > subsequent repl dump operation upon confirming that there are no pending open > transaction events, It should create a _failover_ready marker file in the > dump dir. This marker file would contain scheduled query name > that has generated this dump. > b) Skip next repl dump instances once we have the marker file placed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
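The distinction the author draws above — checkFailoverStatus probes a *valid* dump directory (one containing the dump ACK), while getCurrentDumpPath probes an *invalid* one (without the ACK) — can be illustrated with a small sketch. File-system access is mocked as a Set of file names, and the marker file names are assumptions for illustration (Hive keeps them in its ReplAck enum):

```java
import java.util.Set;

// Sketch of the valid-vs-invalid dump distinction from the review thread.
// A dump directory is considered "valid" only once the dump ACK file exists;
// the failover-ready marker is only meaningful inside a valid dump.
public class DumpDirState {
    static final String DUMP_ACK = "_finished_dump";        // assumed ACK file name
    static final String FAILOVER_READY = "_failover_ready"; // assumed marker name

    // checkFailoverStatus-style probe: valid dump that is failover ready.
    public static boolean isFailoverReady(Set<String> filesInDumpDir) {
        return filesInDumpDir.contains(DUMP_ACK) && filesInDumpDir.contains(FAILOVER_READY);
    }

    // getCurrentDumpPath-style probe: an invalid dump (no ACK) is resumed.
    public static boolean shouldResume(Set<String> filesInDumpDir) {
        return !filesInDumpDir.contains(DUMP_ACK);
    }
}
```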
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626366=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626366 ] ASF GitHub Bot logged work on HIVE-24918: - Author: ASF GitHub Bot Created on: 21/Jul/21 23:51 Start Date: 21/Jul/21 23:51 Worklog Time Spent: 10m Work Description: hmangla98 commented on a change in pull request #2121: URL: https://github.com/apache/hive/pull/2121#discussion_r674417311 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -268,11 +329,35 @@ private boolean shouldDumpAtlasMetadata() { return conf.getBoolVar(HiveConf.ConfVars.REPL_INCLUDE_ATLAS_METADATA); } - private Path getCurrentDumpPath(Path dumpRoot, boolean isBootstrap) throws IOException { + private Path getCurrentDumpPath(Path dumpRoot, boolean isBootstrap) throws IOException, HiveException { Path lastDumpPath = ReplUtils.getLatestDumpPath(dumpRoot, conf); if (lastDumpPath != null && shouldResumePreviousDump(lastDumpPath, isBootstrap)) { //Resume previous dump LOG.info("Resuming the dump with existing dump directory {}", lastDumpPath); + FileSystem fs = lastDumpPath.getFileSystem(conf); + Path hiveDumpDir = new Path(lastDumpPath, ReplUtils.REPL_HIVE_BASE_DIR); + Path failoverMetadataFile = new Path(hiveDumpDir, FailoverMetaData.FAILOVER_METADATA); + Path failoverReadyMarkerFile = new Path(hiveDumpDir, ReplAck.FAILOVER_READY_MARKER.toString()); + if (fs.exists(failoverReadyMarkerFile)) { +//If failoverReadyMarkerFile exists, this means previous dump iteration failed while creating dump ACK file. +//So, just delete this file and proceed further. 
+LOG.info("Deleting failover ready marker file: {} created in previous dump iteration.", failoverReadyMarkerFile); +fs.delete(failoverReadyMarkerFile, true); + } + if (fs.exists(failoverMetadataFile)) { +//If failoverMetadata file exists, this means previous dump iteration failed after writing failover metadata info +//Now, if the failover start config is enabled, then just use the same metadata in current iteration also. +//Else just rollback failover initiated in previous failed dump iteration. +if (conf.getBoolVar(HiveConf.ConfVars.HIVE_REPL_FAILOVER_START)) { + FailoverMetaData fmd = new FailoverMetaData(hiveDumpDir, conf); + if (fmd.isValidMetadata()) { Review comment: then, it would be recalculated in this dump iteration. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626366) Time Spent: 5h 50m (was: 5h 40m) > Handle failover case during Repl Dump > - > > Key: HIVE-24918 > URL: https://issues.apache.org/jira/browse/HIVE-24918 > Project: Hive > Issue Type: New Feature >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 5h 50m > Remaining Estimate: 0h > > To handle: > a) Whenever user wants to go ahead with failover, during the next or > subsequent repl dump operation upon confirming that there are no pending open > transaction events, It should create a _failover_ready marker file in the > dump dir. This marker file would contain scheduled query name > that has generated this dump. > b) Skip next repl dump instances once we have the marker file placed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
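The resume-path handling in the diff above reduces to a small decision: a leftover failover metadata file is reused only when the failover start config is still enabled and the metadata passes its validity check; otherwise it is discarded (rollback) or recalculated. This sketch flattens the file system and FailoverMetaData into plain booleans for illustration; it is not Hive's actual code:

```java
// Sketch of the resume-iteration decision described in the diff above.
public class FailoverResume {
    /**
     * @param failoverStartEnabled value of hive.repl.failover.start
     * @param metadataPresent      a failover metadata file was left behind
     * @param metadataValid        that metadata passes its validity check
     * @return whether the previous iteration's failover metadata is reused
     */
    public static boolean reusePreviousFailoverMetadata(boolean failoverStartEnabled,
                                                        boolean metadataPresent,
                                                        boolean metadataValid) {
        if (!metadataPresent) {
            return false; // nothing left behind to reuse
        }
        if (!failoverStartEnabled) {
            return false; // rollback the failover initiated by the failed dump
        }
        // Config still requests failover: reuse only valid metadata;
        // invalid metadata is recalculated in this dump iteration.
        return metadataValid;
    }
}
```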
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626365=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626365 ] ASF GitHub Bot logged work on HIVE-24918: - Author: ASF GitHub Bot Created on: 21/Jul/21 23:50 Start Date: 21/Jul/21 23:50 Worklog Time Spent: 10m Work Description: hmangla98 commented on a change in pull request #2121: URL: https://github.com/apache/hive/pull/2121#discussion_r674417074 ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -552,6 +648,31 @@ private boolean isTableSatifiesConfig(Table table) { return true; } + private void fetchFailoverMetadata(Hive hiveDb) throws HiveException, TException { +FailoverMetaData fmd = new FailoverMetaData( +new Path(work.getCurrentDumpPath(), ReplUtils.REPL_HIVE_BASE_DIR), conf); +List txnsForDb = getOpenTxns(getTxnMgr().getValidTxns(excludedTxns), work.dbNameOrPattern); +if (!txnsForDb.isEmpty()) { + hiveDb.abortTransactions(txnsForDb); Review comment: Done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626365) Time Spent: 5h 40m (was: 5.5h) > Handle failover case during Repl Dump > - > > Key: HIVE-24918 > URL: https://issues.apache.org/jira/browse/HIVE-24918 > Project: Hive > Issue Type: New Feature >Reporter: Haymant Mangla >Assignee: Haymant Mangla >Priority: Major > Labels: pull-request-available > Time Spent: 5h 40m > Remaining Estimate: 0h > > To handle: > a) Whenever user wants to go ahead with failover, during the next or > subsequent repl dump operation upon confirming that there are no pending open > transaction events, It should create a _failover_ready marker file in the > dump dir. 
This marker file would contain scheduled query name > that has generated this dump. > b) Skip next repl dump instances once we have the marker file placed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25307) Hive Server 2 crashes when Thrift library encounters particular security protocol issue
[ https://issues.apache.org/jira/browse/HIVE-25307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-25307: Attachment: hive-thrift-fix2-05-3_1.patch > Hive Server 2 crashes when Thrift library encounters particular security > protocol issue > --- > > Key: HIVE-25307 > URL: https://issues.apache.org/jira/browse/HIVE-25307 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Labels: pull-request-available > Attachments: hive-thrift-fix2-03-3_1.patch, > hive-thrift-fix2-04-3_1.patch, hive-thrift-fix2-05-3_1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > A RuntimeException is thrown by the Thrift library that causes Hive Server 2 > to crash on our customer's machine. If you Google this, the exception has been > reported a couple of times over the years but not fixed. A blog (see > references below) says it is an occasional security protocol issue between > Hive Server 2 and a proxy like a Gateway. > One challenge in the older 0.9.3 Thrift version was that the Thrift > TTransportFactory getTransport method declaration threw no checked Exceptions, > hence the likely choice of RuntimeException. But that Exception is fatal to > Hive Server 2. > The proposed fix is a workaround: we catch RuntimeException in the inner > class TUGIAssumingTransportFactory of the HadoopThriftAuthBridge class in > Hive Server 2, and rethrow the RuntimeException's (inner) cause (e.g. > TSaslTransportException) as a TTransportException. > Once the Thrift library stops throwing RuntimeException, or we catch fatal > Throwable exceptions in the Thrift library's TThreadPoolServer's inner class > WorkerProcess run method and display them, the RuntimeException try/catch > clause can be removed. 
> ExceptionClassName: > java.lang.RuntimeException > ExceptionStackTrace: > java.lang.RuntimeException: > org.apache.thrift.transport.TSaslTransportException: No data or no sasl data > in the stream > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:694) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:691) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:691) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no > sasl data in the stream > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:326) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 
10 more > > References: > [Hive server 2 thrift error - Cloudera Community - > 34293|https://community.cloudera.com/t5/Support-Questions/Hive-server-2-thrift-error/td-p/34293] > Eric Lin blog "“NO DATA OR NO SASL DATA IN THE STREAM” ERROR IN HIVESERVER2 > LOG" > HIVE-12754 AuthTypes.NONE cause exception after HS2 start - ASF JIRA > (apache.org) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
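The workaround described in the issue — catch the RuntimeException thrown by the Thrift transport factory and rethrow its cause as a checked transport exception so one bad handshake cannot kill the whole server — can be sketched as follows. This is a stand-alone mimic, not the actual HadoopThriftAuthBridge patch: TransportException here stands in for org.apache.thrift.transport.TTransportException, and TransportFactory for the Thrift factory interface.

```java
// Sketch of the proposed fix: unwrap the RuntimeException thrown by
// TSaslServerTransport.Factory.getTransport and rethrow its inner cause
// (e.g. "No data or no sasl data in the stream") as a checked exception
// that the serving loop can handle per-connection instead of crashing.
public class SaslTransportGuard {
    public static class TransportException extends Exception {
        public TransportException(Throwable cause) { super(cause); }
    }

    public interface TransportFactory {
        Object getTransport(); // stand-in for the Thrift factory method
    }

    public static Object getTransportSafely(TransportFactory inner) throws TransportException {
        try {
            return inner.getTransport();
        } catch (RuntimeException e) {
            // Thrift 0.9.3 wraps the SASL failure; surface the cause if present.
            throw new TransportException(e.getCause() != null ? e.getCause() : e);
        }
    }
}
```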
[jira] [Updated] (HIVE-25307) Hive Server 2 crashes when Thrift library encounters particular security protocol issue
[ https://issues.apache.org/jira/browse/HIVE-25307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-25307: Attachment: (was: hive-thrift-fix2-01-3_1.patch) > Hive Server 2 crashes when Thrift library encounters particular security > protocol issue > --- > > Key: HIVE-25307 > URL: https://issues.apache.org/jira/browse/HIVE-25307 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Labels: pull-request-available > Attachments: hive-thrift-fix2-03-3_1.patch, > hive-thrift-fix2-04-3_1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > A RuntimeException is thrown by the Thrift library that causes Hive Server 2 > to crash on our customer's machine. If you Google this, the exception has been > reported a couple of times over the years but not fixed. A blog (see > references below) says it is an occasional security protocol issue between > Hive Server 2 and a proxy like a Gateway. > One challenge in the older 0.9.3 Thrift version was that the Thrift > TTransportFactory getTransport method declaration threw no checked Exceptions, > hence the likely choice of RuntimeException. But that Exception is fatal to > Hive Server 2. > The proposed fix is a workaround: we catch RuntimeException in the inner > class TUGIAssumingTransportFactory of the HadoopThriftAuthBridge class in > Hive Server 2, and rethrow the RuntimeException's (inner) cause (e.g. > TSaslTransportException) as a TTransportException. > Once the Thrift library stops throwing RuntimeException, or we catch fatal > Throwable exceptions in the Thrift library's TThreadPoolServer's inner class > WorkerProcess run method and display them, the RuntimeException try/catch > clause can be removed. 
> ExceptionClassName: > java.lang.RuntimeException > ExceptionStackTrace: > java.lang.RuntimeException: > org.apache.thrift.transport.TSaslTransportException: No data or no sasl data > in the stream > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:694) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:691) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:691) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no > sasl data in the stream > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:326) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 
10 more > > References: > [Hive server 2 thrift error - Cloudera Community - > 34293|https://community.cloudera.com/t5/Support-Questions/Hive-server-2-thrift-error/td-p/34293] > Eric Lin blog "“NO DATA OR NO SASL DATA IN THE STREAM” ERROR IN HIVESERVER2 > LOG" > HIVE-12754 AuthTypes.NONE cause exception after HS2 start - ASF JIRA > (apache.org) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25307) Hive Server 2 crashes when Thrift library encounters particular security protocol issue
[ https://issues.apache.org/jira/browse/HIVE-25307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-25307: Attachment: (was: hive-thrift-fix2-02-3_1.patch) > Hive Server 2 crashes when Thrift library encounters particular security > protocol issue > --- > > Key: HIVE-25307 > URL: https://issues.apache.org/jira/browse/HIVE-25307 > Project: Hive > Issue Type: Bug >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Labels: pull-request-available > Attachments: hive-thrift-fix2-03-3_1.patch, > hive-thrift-fix2-04-3_1.patch > > Time Spent: 10m > Remaining Estimate: 0h > > A RuntimeException is thrown by the Thrift library that causes Hive Server 2 > to crash on our customer's machine. If you Google this, the exception has been > reported a couple of times over the years but not fixed. A blog (see > references below) says it is an occasional security protocol issue between > Hive Server 2 and a proxy like a Gateway. > One challenge in the older 0.9.3 Thrift version was that the Thrift > TTransportFactory getTransport method declaration threw no checked Exceptions, > hence the likely choice of RuntimeException. But that Exception is fatal to > Hive Server 2. > The proposed fix is a workaround: we catch RuntimeException in the inner > class TUGIAssumingTransportFactory of the HadoopThriftAuthBridge class in > Hive Server 2, and rethrow the RuntimeException's (inner) cause (e.g. > TSaslTransportException) as a TTransportException. > Once the Thrift library stops throwing RuntimeException, or we catch fatal > Throwable exceptions in the Thrift library's TThreadPoolServer's inner class > WorkerProcess run method and display them, the RuntimeException try/catch > clause can be removed. 
> ExceptionClassName: > java.lang.RuntimeException > ExceptionStackTrace: > java.lang.RuntimeException: > org.apache.thrift.transport.TSaslTransportException: No data or no sasl data > in the stream > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:694) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:691) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1710) > at > org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:691) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:269) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no > sasl data in the stream > at > org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:326) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > ... 
10 more > > References: > [Hive server 2 thrift error - Cloudera Community - > 34293|https://community.cloudera.com/t5/Support-Questions/Hive-server-2-thrift-error/td-p/34293] > Eric Lin blog "“NO DATA OR NO SASL DATA IN THE STREAM” ERROR IN HIVESERVER2 > LOG" > HIVE-12754 AuthTypes.NONE cause exception after HS2 start - ASF JIRA > (apache.org) > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25364) Null Pointer Exception while estimating row count in external tables.
[ https://issues.apache.org/jira/browse/HIVE-25364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Soumyakanti Das updated HIVE-25364: --- Description: Running the query below for external tables produces NPE because of missing APIs to handle JDBC Converter and JdbcHiveTableScan in [RelMdDistinctRowCount.java|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java] (calcite). The catch-all method [getDistinctRowCount|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java#L76] returns null for HiveJdbcConverter and JdbcHiveTableScan, which ultimately results in a null value for *computeInnerJoinSelectivity(j, mq, predicate)* method at [HiveRelMdSelectivity.java|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/stats/HiveRelMdSelectivity.java#L78] {code:java} double innerJoinSelectivity = computeInnerJoinSelectivity(j, mq, predicate); {code} Query: {code:java} explain cbo with t1 as (select fkey, ikey, sum(dkey) as dk_sum, sum(dkey2) as dk2_sum from ext_simple_derby_table1 left join ext_simple_derby_table3 on ikey = ikey2 where fkey2 is null group by fkey, ikey), t2 as (select datekey, fkey, ikey, sum(dkey) as dk_sum2, sum(dkey2) as dk2_sum2 from ext_simple_derby_table2 left join ext_simple_derby_table4 on ikey = ikey2 where fkey2 is null group by datekey, fkey, ikey) select t1.fkey, t2.ikey, sum(t1.ikey) from t1 left join t2 on t1.ikey = t2.ikey AND t1.fkey = t2.fkey where t2.fkey is null group by t2.datekey, t1.fkey, t2.ikey {code} The stacktrace: {code:java} java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdSelectivity.getSelectivity(HiveRelMdSelectivity.java:78) at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) at 
GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426) at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:765) at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:131) at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:175) at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRuntimeRowCount.getRowCount(HiveRelMdRuntimeRowCount.java:53) at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212) at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:205) at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212) at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:140) at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212) at org.apache.calcite.rel.metadata.RelMdUtil.getJoinRowCount(RelMdUtil.java:723) at 
org.apache.calcite.rel.core.Join.estimateRowCount(Join.java:205) at org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:113) at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212) at org.apache.hadoop.hive.ql.optimizer.calcite.stats.FilterSelectivityEstimator.(FilterSelectivityEstimator.java:62) at
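The NPE described above arises because Calcite's catch-all getDistinctRowCount returns null for nodes it has no handler for (HiveJdbcConverter, JdbcHiveTableScan), and the caller dereferences that null. A null-safe guard can be sketched as below; the fallback value and method shape are illustrative assumptions, not the actual HiveRelMdSelectivity fix:

```java
// Sketch of a null-safe guard for the selectivity computation: when the
// metadata provider cannot compute a distinct row count (null result from
// the catch-all handler), degrade to an assumed default selectivity instead
// of throwing NullPointerException at HiveRelMdSelectivity.getSelectivity.
public class SelectivityGuard {
    static final double DEFAULT_SELECTIVITY = 0.1; // assumed fallback, illustrative

    public static double selectivity(Double innerJoinSelectivity) {
        if (innerJoinSelectivity == null) {
            // Missing metadata handler for the JDBC rel nodes: fall back
            // gracefully rather than propagate the null.
            return DEFAULT_SELECTIVITY;
        }
        return innerJoinSelectivity;
    }
}
```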
[jira] [Assigned] (HIVE-25364) Null Pointer Exception while estimating row count in external tables.
[ https://issues.apache.org/jira/browse/HIVE-25364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Soumyakanti Das reassigned HIVE-25364: -- > Null Pointer Exception while estimating row count in external tables. > - > > Key: HIVE-25364 > URL: https://issues.apache.org/jira/browse/HIVE-25364 > Project: Hive > Issue Type: Bug >Reporter: Soumyakanti Das >Assignee: Soumyakanti Das >Priority: Major > > Running the query below for external tables produces NPE because of missing > APIs to handle JDBC Converter and JdbcHiveTableScan in > [RelMdDistinctRowCount.java|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java] > (calcite). The catch-all method > [getDistinctRowCount|https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/metadata/RelMdDistinctRowCount.java#L76] > returns null for HiveJdbcConverter and JdbcHiveTableScan, which ultimately > results in a null value for *computeInnerJoinSelectivity(j, mq, predicate)* > method shown below. 
> {code:java} > double innerJoinSelectivity = computeInnerJoinSelectivity(j, mq, predicate); > {code} > Query: > {code} > explain cbo > with t1 as (select fkey, ikey, sum(dkey) as dk_sum, sum(dkey2) as dk2_sum > from ext_simple_derby_table1 left join ext_simple_derby_table3 > on ikey = ikey2 > where fkey2 is null > group by fkey, ikey), > t2 as (select datekey, fkey, ikey, sum(dkey) as dk_sum2, sum(dkey2) as > dk2_sum2 >from ext_simple_derby_table2 left join ext_simple_derby_table4 >on ikey = ikey2 >where fkey2 is null >group by datekey, fkey, ikey) > select t1.fkey, t2.ikey, sum(t1.ikey) > from t1 left join t2 > on t1.ikey = t2.ikey AND t1.fkey = t2.fkey > where t2.fkey is null > group by t2.datekey, t1.fkey, t2.ikey > {code} > The stacktrace: > {code} > java.lang.NullPointerException > at > org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdSelectivity.getSelectivity(HiveRelMdSelectivity.java:78) > at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) > at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) > at GeneratedMetadataHandler_Selectivity.getSelectivity_$(Unknown Source) > at GeneratedMetadataHandler_Selectivity.getSelectivity(Unknown Source) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.getSelectivity(RelMetadataQuery.java:426) > at > org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:765) > at > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:131) > at > org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:175) > at > org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRuntimeRowCount.getRowCount(HiveRelMdRuntimeRowCount.java:53) > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown 
Source) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212) > at > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:205) > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212) > at > org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:140) > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) > at > org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:212) > at > org.apache.calcite.rel.metadata.RelMdUtil.getJoinRowCount(RelMdUtil.java:723) > at org.apache.calcite.rel.core.Join.estimateRowCount(Join.java:205) > at > org.apache.hadoop.hive.ql.optimizer.calcite.stats.HiveRelMdRowCount.getRowCount(HiveRelMdRowCount.java:113) > at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) > at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) >
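The unboxing NPE above comes from computeInnerJoinSelectivity(j, mq, predicate) returning a boxed null once Calcite's catch-all getDistinctRowCount gives up on the HiveJdbcConverter and JdbcHiveTableScan nodes. A minimal sketch of the kind of null guard that avoids the crash follows; the method name and the 0.1 fallback are illustrative assumptions, not Hive's actual patch:

```java
// Hedged sketch: guard a Calcite-style metadata estimate that may be absent.
// The fallback constant and method name are invented for illustration.
public class SelectivityGuard {

    static final double DEFAULT_SELECTIVITY = 0.1; // hypothetical fallback guess

    // Calcite metadata handlers return a boxed Double, using null for
    // "no estimate available"; unboxing that null straight into a double
    // is exactly the NullPointerException in the stack trace above.
    static double safeSelectivity(Double computed) {
        return computed != null ? computed : DEFAULT_SELECTIVITY;
    }

    public static void main(String[] args) {
        System.out.println(safeSelectivity(0.25)); // estimate present
        System.out.println(safeSelectivity(null)); // estimate missing: use the fallback
    }
}
```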
[jira] [Work logged] (HIVE-24918) Handle failover case during Repl Dump
[ https://issues.apache.org/jira/browse/HIVE-24918?focusedWorklogId=626319=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626319 ] ASF GitHub Bot logged work on HIVE-24918: - Author: ASF GitHub Bot Created on: 21/Jul/21 20:43 Start Date: 21/Jul/21 20:43 Worklog Time Spent: 10m Work Description: pkumarsinha commented on a change in pull request #2121: URL: https://github.com/apache/hive/pull/2121#discussion_r674235129 ## File path: itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationScenariosAcidTables.java ## @@ -200,6 +203,233 @@ private void testTargetDbReplIncompatible(boolean setReplIncompProp) throws Thro } } + @Test + public void testFailoverDuringDump() throws Throwable { +HiveConf primaryConf = primary.getConf(); +TxnStore txnHandler = TxnUtils.getTxnStore(primary.getConf()); +WarehouseInstance.Tuple dumpData = null; Review comment: Declare at line 212 only. ## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -552,6 +648,31 @@ private boolean isTableSatifiesConfig(Table table) { return true; } + private void fetchFailoverMetadata(Hive hiveDb) throws HiveException, TException { +FailoverMetaData fmd = new FailoverMetaData( +new Path(work.getCurrentDumpPath(), ReplUtils.REPL_HIVE_BASE_DIR), conf); +List txnsForDb = getOpenTxns(getTxnMgr().getValidTxns(excludedTxns), work.dbNameOrPattern); +if (!txnsForDb.isEmpty()) { + hiveDb.abortTransactions(txnsForDb); +} +fmd.setAbortedTxns(txnsForDb); +fmd.setCursorPoint(currentNotificationId(hiveDb)); +ValidTxnList failoverTxns = getTxnMgr().getValidTxns(excludedTxns); Review comment: failoverTxns --> allValidTxns ? 
## File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java ## @@ -552,6 +648,31 @@ private boolean isTableSatifiesConfig(Table table) { return true; } + private void fetchFailoverMetadata(Hive hiveDb) throws HiveException, TException { +FailoverMetaData fmd = new FailoverMetaData( +new Path(work.getCurrentDumpPath(), ReplUtils.REPL_HIVE_BASE_DIR), conf); +List txnsForDb = getOpenTxns(getTxnMgr().getValidTxns(excludedTxns), work.dbNameOrPattern); +if (!txnsForDb.isEmpty()) { + hiveDb.abortTransactions(txnsForDb); +} +fmd.setAbortedTxns(txnsForDb); +fmd.setCursorPoint(currentNotificationId(hiveDb)); +ValidTxnList failoverTxns = getTxnMgr().getValidTxns(excludedTxns); +List openTxns = getOpenTxns(failoverTxns); +fmd.setOpenTxns(openTxns); +fmd.setTxnsWithoutLock(getTxnsNotPresentInHiveLocksTable(openTxns)); +txnsForDb = getOpenTxns(failoverTxns, work.dbNameOrPattern); +if (!txnsForDb.isEmpty()) { + LOG.warn("Txns: " + txnsForDb + " initiated for database: " + + work.dbNameOrPattern + " while failover is in progress."); Review comment: You are also aborting these, add that too in the log ## File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/FailoverMetaData.java ## @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.hadoop.hive.ql.parse.repl.load; + +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.Path; +import com.fasterxml.jackson.annotation.JsonIgnore; +import com.fasterxml.jackson.annotation.JsonIgnoreProperties; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.core.JsonProcessingException; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.hadoop.hive.conf.HiveConf; +import org.apache.hadoop.hive.ql.parse.SemanticException; +import org.apache.hadoop.hive.ql.parse.repl.dump.Utils; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.BufferedReader; +import java.io.IOException; +import java.io.InputStreamReader; +import java.util.List; + +@JsonIgnoreProperties(ignoreUnknown = true) +public class FailoverMetaData { +public static final String FAILOVER_METADATA =
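The review comments above center on the ordering inside fetchFailoverMetadata: abort the database's open transactions, record the notification cursor, snapshot the remaining open transactions cluster-wide, then abort (and log) anything opened against the database in the meantime. A stub-only sketch of that ordering follows; every type and method name here is invented for illustration and does not match Hive's actual TxnStore or FailoverMetaData API:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative-only sketch of the capture ordering discussed in the review.
public class FailoverCaptureSketch {

    interface TxnStore {
        List<Long> openTxnsForDb(String db);
        List<Long> allOpenTxns();
        void abort(List<Long> txns);
        long currentNotificationId();
    }

    static class FailoverMetadata {
        final List<Long> abortedTxns = new ArrayList<>();
        List<Long> openTxns;
        long cursorPoint;
    }

    static FailoverMetadata capture(TxnStore store, String db) {
        FailoverMetadata fmd = new FailoverMetadata();
        // 1. Abort whatever is currently open against the database being failed over.
        List<Long> txnsForDb = store.openTxnsForDb(db);
        if (!txnsForDb.isEmpty()) {
            store.abort(txnsForDb);
        }
        fmd.abortedTxns.addAll(txnsForDb);
        // 2. Record the replication cursor after those aborts.
        fmd.cursorPoint = store.currentNotificationId();
        // 3. Snapshot the remaining open transactions cluster-wide.
        fmd.openTxns = store.allOpenTxns();
        // 4. Transactions opened against the db while steps 1-3 ran are aborted too
        //    (and, per the review, should be logged as aborted).
        List<Long> lateTxns = store.openTxnsForDb(db);
        if (!lateTxns.isEmpty()) {
            store.abort(lateTxns);
        }
        fmd.abortedTxns.addAll(lateTxns);
        return fmd;
    }
}
```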
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626222=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626222 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 16:26 Start Date: 21/Jul/21 16:26 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r674145711 ## File path: serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java ## @@ -1254,9 +1260,13 @@ public static Timestamp getTimestampFromString(String s) { s = s.trim(); s = trimNanoTimestamp(s); +if(StringUtils.isEmpty(s)) + return null; + try { return TimestampUtils.stringToTimestamp(s); -} catch (IllegalArgumentException e) { +} catch (IllegalArgumentException | DateTimeException e) { + LOG.info("cannot parse datetime : {}", s); Review comment: Removed ## File path: common/src/java/org/apache/hive/common/util/TimestampParser.java ## @@ -191,7 +190,13 @@ public Timestamp parseTimestamp(final String text) LOG.debug("Could not parse timestamp text: {}", text); } } -return Timestamp.valueOf(text); +Timestamp timestamp = null; +try { + timestamp = Timestamp.valueOf(text); +} catch (IllegalArgumentException e) { + LOG.info(e.getMessage()); Review comment: Removed ## File path: common/src/test/org/apache/hive/common/util/TestTimestampParser.java ## @@ -47,12 +47,18 @@ public void testDefault() { Assert.assertEquals(Timestamp.valueOf("1945-12-31T23:59:59"), tsp.parseTimestamp("1945-12-31 23:59:59")); + } - @Test(expected = IllegalArgumentException.class) + @Test public void testDefaultInvalid() { final TimestampParser tsp = new TimestampParser(); -tsp.parseTimestamp("12345"); +Assert.assertEquals(null, tsp.parseTimestamp("12345")); +Assert.assertEquals(null, tsp.parseTimestamp("1945-12-45 23:59:59")); +Assert.assertEquals(null, tsp.parseTimestamp("1945-15-20 23:59:59")); 
+Assert.assertEquals(null, tsp.parseTimestamp("-00-00 00:00:00")); +Assert.assertEquals(null, tsp.parseTimestamp("")); +Assert.assertEquals(null, tsp.parseTimestamp("null")); Review comment: Done ## File path: common/src/test/org/apache/hive/common/util/TestTimestampParser.java ## @@ -25,7 +25,7 @@ /** * Test suite for parsing timestamps. */ -public class TestTimestampParser { +public class TestTimestampParser { Review comment: Done -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626222) Time Spent: 2h 40m (was: 2.5h) > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timpstamp from int,string,char etc to Date or Timestamp. > Default DateTimeFormatter which use ResolverStyle.LENIENT which mean date > like "1992-13-12" is converted to "2000-01-12", > Moving DateTimeFormatter which use ResolverStyle.STRICT which mean date like > "1992-13-12" is not be converted instead NULL is return. > https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626221=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626221 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 16:26 Start Date: 21/Jul/21 16:26 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r674145571 ## File path: ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFAddMonths.java ## @@ -173,8 +173,8 @@ public void testWrongDateStr() throws HiveException { ObjectInspector[] arguments = { valueOI0, valueOI1 }; udf.initialize(arguments); -runAndVerify("2014-02-30", 1, "2014-04-02", udf); -runAndVerify("2014-02-32", 1, "2014-04-04", udf); +runAndVerify("2014-02-30", 1, null, udf); +runAndVerify("2014-02-32", 1, null, udf); Review comment: Sure. I will get access and then update the UDF. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626221) Time Spent: 2.5h (was: 2h 20m) > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 2.5h > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timpstamp from int,string,char etc to Date or Timestamp. 
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date > like "1992-13-12" is converted to "2000-01-12". > Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like > "1992-13-12" is not converted; instead NULL is returned. > https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
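The LENIENT-versus-STRICT difference described in HIVE-25306 can be demonstrated with plain java.time, outside Hive's own parsing paths; note that the exact date a lenient parse rolls to depends on the pattern in use (with "uuuu-MM-dd", month 13 rolls into the next year), so this is a sketch of the mechanism rather than of Hive's formatters:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

// Minimal demonstration: LENIENT rolls an out-of-range month forward,
// STRICT rejects the same input (which a caller can then map to NULL).
public class ResolverStyleDemo {

    private static final DateTimeFormatter LENIENT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.LENIENT);
    private static final DateTimeFormatter STRICT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    static String parseLenient(String s) {
        return LocalDate.parse(s, LENIENT).toString();
    }

    // Returns null for invalid input, mirroring the behavior the patch moves to.
    static String parseStrictOrNull(String s) {
        try {
            return LocalDate.parse(s, STRICT).toString();
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseLenient("1992-13-12"));      // month 13 rolls forward
        System.out.println(parseStrictOrNull("1992-13-12")); // null: rejected outright
    }
}
```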
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626220 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 16:25 Start Date: 21/Jul/21 16:25 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r674145304 ## File path: ql/src/test/results/clientpositive/llap/probedecode_mapjoin_stats.q.out ## @@ -416,4 +416,4 @@ POSTHOOK: Input: default@orders_fact POSTHOOK: Input: default@seller_dim A masked pattern was here 101101 12345 12345 Seller 1Item 1012001-01-30 00:00:00 -104104 23456 23456 Seller 2Item 1042002-03-02 00:00:00 +104104 23456 23456 Seller 2Item 104NULL Review comment: Because of the following query in probedecode_mapjoin_stats.q INSERT INTO orders_fact values(23456, 104, '2002-02-30 00:00:00'); Also timestamp format is "-MM-DD HH:MM:SS" due to that '2002-02-30 00:00:00' value get converted to null. Since probedecode_mapjoin_stats.q is to test happy flow of mapjoin_stats so lets make it some valid date to avoid confusion. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626220) Time Spent: 2h 20m (was: 2h 10m) > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 2h 20m > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timpstamp from int,string,char etc to Date or Timestamp. > Default DateTimeFormatter which use ResolverStyle.LENIENT which mean date > like "1992-13-12" is converted to "2000-01-12", > Moving DateTimeFormatter which use ResolverStyle.STRICT which mean date like > "1992-13-12" is not be converted instead NULL is return. > https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work started] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-25362 started by Panagiotis Garefalakis. - > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and reseting locality delay for a given tasks. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue leading sometimes to missed locality chances when all > LLap resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Component/s: llap > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task > Components: llap >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and reseting locality delay for a given tasks. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue leading sometimes to missed locality chances when all > LLap resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25362: -- Labels: pull-request-available (was: ) > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and reseting locality delay for a given tasks. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue leading sometimes to missed locality chances when all > LLap resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?focusedWorklogId=626184=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626184 ] ASF GitHub Bot logged work on HIVE-25362: - Author: ASF GitHub Bot Created on: 21/Jul/21 15:22 Start Date: 21/Jul/21 15:22 Worklog Time Spent: 10m Work Description: pgaref opened a new pull request #2513: URL: https://github.com/apache/hive/pull/2513 Change-Id: Ib2d91faf4ee6d1bd839953f97ef7f7d2a00ef65f ### What changes were proposed in this pull request? HIVE-24914 introduced a short-circuit optimization when all nodes are busy returning DELAYED_RESOURCES and reseting locality delay for a given tasks. However, this may prevent tasks from adjusting their locality delay and being added to the DelayQueue leading sometimes to missed locality chances when all LLap resources are fully utilized. Handle the two cases separately. ### Why are the changes needed? Improve locality in Llap scheduler ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? TestLlapTaskSchedulerService#testAdjustLocalityDelay -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626184) Remaining Estimate: 0h Time Spent: 10m > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and reseting locality delay for a given tasks. 
> However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue, sometimes leading to missed locality chances when all > LLAP resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
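The two-case split proposed for HIVE-25362 can be sketched as follows; the class, enum, and field names here are invented for illustration and do not match LlapTaskSchedulerService's actual code:

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative-only sketch of handling "all nodes busy" without discarding
// the locality delay of tasks that still have a node preference.
public class LocalityDelaySketch {

    enum ScheduleResult { SCHEDULED, DELAYED_RESOURCES, DELAYED_LOCALITY }

    static class Task {
        final boolean hasLocalityPreference;
        final long localityDelayMs;
        Task(boolean hasLocalityPreference, long localityDelayMs) {
            this.hasLocalityPreference = hasLocalityPreference;
            this.localityDelayMs = localityDelayMs;
        }
    }

    final Queue<Task> delayQueue = new ArrayDeque<>();

    ScheduleResult tryAllocate(Task task, boolean allNodesBusy) {
        if (allNodesBusy) {
            if (task.hasLocalityPreference && task.localityDelayMs > 0) {
                // The pre-HIVE-25362 short-circuit returned DELAYED_RESOURCES here
                // as well, so the task never entered the delay queue and lost its
                // chance to wait for its preferred node.
                delayQueue.add(task);
                return ScheduleResult.DELAYED_LOCALITY;
            }
            return ScheduleResult.DELAYED_RESOURCES;
        }
        return ScheduleResult.SCHEDULED;
    }
}
```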
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Description: HIVE-24914 introduced a short-circuit optimization when all nodes are busy returning DELAYED_RESOURCES and reseting locality delay for a given tasks. However, this may prevent tasks from adjusting their locality delay and being added to the DelayQueue leading sometimes to missed locality chances when all LLap resources are fully utilized. To address the issue we should handle the two cases separately. was: HIVE-24914 introduced a short-circuit optimization when all nodes are busy returning DELAYED_RESOURCES and reseting locality delay for a given tasks. However, this may prevent tasks from being added to the DelayQueue leading to worse locality when all LLap resources are fully utilized. To address the issue we should handle the two cases separately. > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and reseting locality delay for a given tasks. > However, this may prevent tasks from adjusting their locality delay and being > added to the DelayQueue leading sometimes to missed locality chances when all > LLap resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust delay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Summary: LLAP: ensure tasks with locality have a chance to adjust delay (was: LLAP: ensure tasks with locality have a chance to adjust localityDelay) > LLAP: ensure tasks with locality have a chance to adjust delay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and reseting locality delay for a given tasks. > However, this may prevent tasks from being added to the DelayQueue leading to > worse locality when all LLap resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality have a chance to adjust localityDelay
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Summary: LLAP: ensure tasks with locality have a chance to adjust localityDelay (was: LLAP: ensure tasks with locality are added to DelayQueue) > LLAP: ensure tasks with locality have a chance to adjust localityDelay > -- > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and reseting locality delay for a given tasks. > However, this may prevent tasks from being added to the DelayQueue leading to > worse locality when all LLap resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626181=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626181 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 15:13 Start Date: 21/Jul/21 15:13 Worklog Time Spent: 10m Work Description: zabetak commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r673845756 ## File path: common/src/test/org/apache/hive/common/util/TestTimestampParser.java ## @@ -47,12 +47,18 @@ public void testDefault() { Assert.assertEquals(Timestamp.valueOf("1945-12-31T23:59:59"), tsp.parseTimestamp("1945-12-31 23:59:59")); + } - @Test(expected = IllegalArgumentException.class) + @Test public void testDefaultInvalid() { final TimestampParser tsp = new TimestampParser(); -tsp.parseTimestamp("12345"); +Assert.assertEquals(null, tsp.parseTimestamp("12345")); +Assert.assertEquals(null, tsp.parseTimestamp("1945-12-45 23:59:59")); +Assert.assertEquals(null, tsp.parseTimestamp("1945-15-20 23:59:59")); +Assert.assertEquals(null, tsp.parseTimestamp("-00-00 00:00:00")); +Assert.assertEquals(null, tsp.parseTimestamp("")); +Assert.assertEquals(null, tsp.parseTimestamp("null")); Review comment: `assertEquals` -> `assertNull` ## File path: common/src/java/org/apache/hive/common/util/TimestampParser.java ## @@ -191,7 +190,13 @@ public Timestamp parseTimestamp(final String text) LOG.debug("Could not parse timestamp text: {}", text); } } -return Timestamp.valueOf(text); +Timestamp timestamp = null; +try { + timestamp = Timestamp.valueOf(text); +} catch (IllegalArgumentException e) { + LOG.info(e.getMessage()); Review comment: Do we need to LOG this exception? I have the impression that when this method returns null we will generate an log message anyways. Moreover, I think that logging at INFO level may be a bit too much for something that seems to be in the regular control flow. 
## File path: common/src/test/org/apache/hive/common/util/TestTimestampParser.java ## @@ -25,7 +25,7 @@ /** * Test suite for parsing timestamps. */ -public class TestTimestampParser { +public class TestTimestampParser { Review comment: Nit: Extra space ## File path: ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFAddMonths.java ## @@ -173,8 +173,8 @@ public void testWrongDateStr() throws HiveException { ObjectInspector[] arguments = { valueOI0, valueOI1 }; udf.initialize(arguments); -runAndVerify("2014-02-30", 1, "2014-04-02", udf); -runAndVerify("2014-02-32", 1, "2014-04-04", udf); +runAndVerify("2014-02-30", 1, null, udf); +runAndVerify("2014-02-32", 1, null, udf); Review comment: Worth adding a comment that this behavior is also compatible with MySQL: ``` SELECT DATE_ADD('2014-02-30',INTERVAL 1 MONTH); +-+ | DATE_ADD('2014-02-30',INTERVAL 1 MONTH) | +-+ | NULL| +-+ 1 row in set, 1 warning (0.00 sec) ``` Moreover it is good to mention in the JIRA which UDFs are impacted by the change in this PR and update the [wiki|https://cwiki.apache.org/confluence/display/hive/languagemanual+udf] accordingly. ## File path: serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java ## @@ -1254,9 +1260,13 @@ public static Timestamp getTimestampFromString(String s) { s = s.trim(); s = trimNanoTimestamp(s); +if(StringUtils.isEmpty(s)) + return null; + try { return TimestampUtils.stringToTimestamp(s); -} catch (IllegalArgumentException e) { +} catch (IllegalArgumentException | DateTimeException e) { + LOG.info("cannot parse datetime : {}", s); Review comment: Logging at INFO level may be too much. I would consider WARN, DEBUG, or remove the line altogether. 
## File path: ql/src/test/results/clientpositive/llap/probedecode_mapjoin_stats.q.out ## @@ -416,4 +416,4 @@ POSTHOOK: Input: default@orders_fact POSTHOOK: Input: default@seller_dim A masked pattern was here 101101 12345 12345 Seller 1Item 1012001-01-30 00:00:00 -104104 23456 23456 Seller 2Item 1042002-03-02 00:00:00 +104104 23456 23456 Seller 2Item 104NULL Review comment: Why do we get this NULL value? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
[jira] [Work logged] (HIVE-25344) Add a possibility to query Iceberg table snapshots based on the timestamp or the snapshot id
[ https://issues.apache.org/jira/browse/HIVE-25344?focusedWorklogId=626182=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626182 ] ASF GitHub Bot logged work on HIVE-25344: - Author: ASF GitHub Bot Created on: 21/Jul/21 15:13 Start Date: 21/Jul/21 15:13 Worklog Time Spent: 10m Work Description: pvary opened a new pull request #2512: URL: https://github.com/apache/hive/pull/2512 ### What changes were proposed in this pull request? Allow the following queries to work on Iceberg tables: ``` SELECT * FROM t FOR SYSTEM_TIME AS OF ; SELECT * FROM t FOR SYSTEM_VERSION AS OF ; ``` ### Why are the changes needed? We would like to have time travel available for Iceberg tables ### Does this PR introduce _any_ user-facing change? Enables the following queries: ``` SELECT * FROM t FOR SYSTEM_TIME AS OF ; SELECT * FROM t FOR SYSTEM_VERSION AS OF ; ``` ### How was this patch tested? Added unit tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626182) Remaining Estimate: 0h Time Spent: 10m > Add a possibility to query Iceberg table snapshots based on the timestamp or > the snapshot id > > > Key: HIVE-25344 > URL: https://issues.apache.org/jira/browse/HIVE-25344 > Project: Hive > Issue Type: New Feature >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Implement the following commands: > {code:java} > SELECT * FROM t FOR SYSTEM_TIME AS OF ; > SELECT * FROM t FOR SYSTEM_VERSION AS OF ;{code} > where SYSTEM_TIME is the Iceberg table state at the given timestamp (UTC), or > SYSTEM_VERSION is the Iceberg table snapshot id. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25344) Add a possibility to query Iceberg table snapshots based on the timestamp or the snapshot id
[ https://issues.apache.org/jira/browse/HIVE-25344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25344: -- Labels: pull-request-available (was: ) > Add a possibility to query Iceberg table snapshots based on the timestamp or > the snapshot id > > > Key: HIVE-25344 > URL: https://issues.apache.org/jira/browse/HIVE-25344 > Project: Hive > Issue Type: New Feature >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Implement the following commands: > {code:java} > SELECT * FROM t FOR SYSTEM_TIME AS OF ; > SELECT * FROM t FOR SYSTEM_VERSION AS OF ;{code} > where SYSTEM_TIME is the Iceberg table state at the given timestamp (UTC), or > SYSTEM_VERSION is the Iceberg table snapshot id. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25356) JDBCSplitFilterAboveJoinRule's onMatch method throws exception
[ https://issues.apache.org/jira/browse/HIVE-25356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Soumyakanti Das updated HIVE-25356: --- Description: The stack trace is produced by [JDBCAbstractSplitFilterRule.java#L181 |https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/jdbc/JDBCAbstractSplitFilterRule.java#L181]. In the onMatch method, a HiveFilter is being cast to HiveJdbcConverter. {code:java} java.lang.ClassCastException: org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveFilter cannot be cast to org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.jdbc.HiveJdbcConverter java.lang.ClassCastException: org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveFilter cannot be cast to org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.jdbc.HiveJdbcConverter at org.apache.hadoop.hive.ql.optimizer.calcite.rules.jdbc.JDBCAbstractSplitFilterRule$JDBCSplitFilterAboveJoinRule.onMatch(JDBCAbstractSplitFilterRule.java:181) at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333) at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542) at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407) at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:271) at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202) at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2440) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2406) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPostJoinOrderingTransform(CalcitePlanner.java:2326) at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1735) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1588) at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131) at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914) at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180) at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1340) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:559) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12512) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:452) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:316) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:175) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:316) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:500) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:453) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:411) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:353) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at
[jira] [Updated] (HIVE-25356) JDBCSplitFilterAboveJoinRule's onMatch method throws exception
[ https://issues.apache.org/jira/browse/HIVE-25356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Soumyakanti Das updated HIVE-25356: --- Description: The stacktrace: {{ java.lang.ClassCastException: org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveFilter cannot be cast to org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.jdbc.HiveJdbcConverter at org.apache.hadoop.hive.ql.optimizer.calcite.rules.jdbc.JDBCAbstractSplitFilterRule$JDBCSplitFilterAboveJoinRule.onMatch(JDBCAbstractSplitFilterRule.java:181) at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:333) at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:542) at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:407) at org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:271) at org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74) at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:202) at org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:189) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2440) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.executeProgram(CalcitePlanner.java:2406) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPostJoinOrderingTransform(CalcitePlanner.java:2326) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1735) at org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1588) at org.apache.calcite.tools.Frameworks.lambda$withPlanner$0(Frameworks.java:131) at org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:914) at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:180) at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:126) at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1340) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:559) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12512) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:452) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:316) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:175) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:316) at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:223) at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:105) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:500) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:453) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:417) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:411) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:229) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:256) at org.apache.hadoop.hive.cli.CliDriver.processCmd1(CliDriver.java:201) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:127) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:422) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:353) at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:744) at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:714) at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:170) at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:157) at 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver(TestMiniLlapLocalCliDriver.java:62) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at
[jira] [Commented] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384936#comment-17384936 ] Stamatis Zampetakis commented on HIVE-25306: Thanks for the comparison [~ashish-kumar-sharma]. I am adding below some observations from a few quick tests in MySQL. It seems that MySQL in some cases also returns an ERROR for invalid dates in the absence of explicit casts. {code:sql} CREATE TABLE person (id integer, birth date); SELECT * FROM person where birth > '1970-02-29'; {code} {noformat} ERROR 1525 (HY000): Incorrect DATE value: '1970-02-29' Warning (Code 1292): Incorrect date value: '1970-02-29' for column 'birth' at row 1 Error (Code 1525): Incorrect DATE value: '1970-02-29' {noformat} Without the explicit cast the query fails with an error due to the incorrect date. {code:sql} SELECT * FROM person where birth > CAST('1970-02-29' as DATE); {code} {noformat} Empty set, 1 warning (0.00 sec) Warning (Code 1292): Incorrect datetime value: '1970-02-29' {noformat} With the explicit cast the query does not fail but prints a warning and returns an empty result as expected. > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 2h > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timpstamp from int,string,char etc to Date or Timestamp. > Default DateTimeFormatter which use ResolverStyle.LENIENT which mean date > like "1992-13-12" is converted to "2000-01-12", > Moving DateTimeFormatter which use ResolverStyle.STRICT which mean date like > "1992-13-12" is not be converted instead NULL is return. 
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
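The LENIENT-vs-STRICT behavior described in HIVE-25306 can be reproduced directly with `java.time`. This is an illustration of the underlying JDK resolver semantics only; Hive's own Date/Timestamp wrappers may format and surface results differently.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class ResolverStyleDemo {
    public static final DateTimeFormatter LENIENT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.LENIENT);
    public static final DateTimeFormatter STRICT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    public static void main(String[] args) {
        // LENIENT silently rolls out-of-range fields forward:
        System.out.println(LocalDate.parse("1992-13-12", LENIENT)); // 1993-01-12
        System.out.println(LocalDate.parse("1970-02-29", LENIENT)); // 1970-03-01

        // STRICT rejects the same inputs; under the proposed change Hive
        // would surface this as NULL rather than a rolled-over date.
        try {
            LocalDate.parse("1970-02-29", STRICT);
        } catch (DateTimeParseException e) {
            System.out.println("strict parse rejected 1970-02-29");
        }
    }
}
```

Note the `uuuu` pattern letter: with `ResolverStyle.STRICT`, a `yyyy` (year-of-era) pattern would additionally require an era field to resolve.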
[jira] [Updated] (HIVE-25362) LLAP: ensure tasks with locality are added to DelayQueue
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis updated HIVE-25362: -- Parent: HIVE-24913 Issue Type: Sub-task (was: Bug) > LLAP: ensure tasks with locality are added to DelayQueue > > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Sub-task >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and resetting the locality delay for a given task. > However, this may prevent tasks from being added to the DelayQueue leading to > worse locality when all LLAP resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
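For context on the issue above: `java.util.concurrent.DelayQueue` only releases elements whose delay has expired, which is why a task that is never enqueued loses its locality wait entirely. A standalone illustration of that semantics — the `DelayedTask` class here is invented for the example and unrelated to LLAP's actual scheduler classes:

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

public class DelayQueueDemo {
    // Minimal Delayed element: becomes available 'delayMs' after creation.
    public static class DelayedTask implements Delayed {
        public final String name;
        final long readyAt;

        public DelayedTask(String name, long delayMs) {
            this.name = name;
            this.readyAt = System.currentTimeMillis() + delayMs;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(readyAt - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<DelayedTask> queue = new DelayQueue<>();
        queue.add(new DelayedTask("local-task", 50));

        System.out.println(queue.poll());      // null: the delay has not expired yet
        System.out.println(queue.take().name); // blocks ~50ms, then prints the task name
    }
}
```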
[jira] [Assigned] (HIVE-25362) LLAP: ensure tasks with locality are added to DelayQueue
[ https://issues.apache.org/jira/browse/HIVE-25362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Panagiotis Garefalakis reassigned HIVE-25362: - > LLAP: ensure tasks with locality are added to DelayQueue > > > Key: HIVE-25362 > URL: https://issues.apache.org/jira/browse/HIVE-25362 > Project: Hive > Issue Type: Bug >Reporter: Panagiotis Garefalakis >Assignee: Panagiotis Garefalakis >Priority: Major > > HIVE-24914 introduced a short-circuit optimization when all nodes are busy > returning DELAYED_RESOURCES and resetting the locality delay for a given task. > However, this may prevent tasks from being added to the DelayQueue leading to > worse locality when all LLAP resources are fully utilized. > To address the issue we should handle the two cases separately. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626161=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626161 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 14:24 Start Date: 21/Jul/21 14:24 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r674021874 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -295,11 +295,12 @@ "SELECT COUNT(*), MIN(\"TXN_ID\"), ({0} - MIN(\"TXN_STARTED\"))/1000 FROM \"TXNS\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "') \"A\" CROSS JOIN (" + "SELECT COUNT(*), ({0} - MIN(\"HL_ACQUIRED_AT\"))/1000 FROM \"HIVE_LOCKS\") \"HL\" CROSS JOIN (" + - "SELECT COUNT(*) FROM (SELECT COUNT(\"TXN_ID\"), \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + - "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "' " + - "GROUP BY \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" HAVING COUNT(\"TXN_ID\") > ?) \"L\") \"L\" CROSS JOIN (" + "SELECT ({0} - MIN(\"CQ_COMMIT_TIME\"))/1000 from \"COMPACTION_QUEUE\" WHERE " + "\"CQ_STATE\"=''" + Character.toString(READY_FOR_CLEANING) + "'') OLDEST_CLEAN"; + private static final String SELECT_TABLES_WITH_X_ABORTED_TXNS = + "SELECT \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + + "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\" = " + TxnStatus.ABORTED + Review comment: Ok, then never mind -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626161) Time Spent: 2.5h (was: 2h 20m) > Add logging based on new compaction metrics > --- > > Key: HIVE-25345 > URL: https://issues.apache.org/jira/browse/HIVE-25345 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626151=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626151 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 14:07 Start Date: 21/Jul/21 14:07 Worklog Time Spent: 10m Work Description: lcspinter commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r674005898 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -295,11 +295,12 @@ "SELECT COUNT(*), MIN(\"TXN_ID\"), ({0} - MIN(\"TXN_STARTED\"))/1000 FROM \"TXNS\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "') \"A\" CROSS JOIN (" + "SELECT COUNT(*), ({0} - MIN(\"HL_ACQUIRED_AT\"))/1000 FROM \"HIVE_LOCKS\") \"HL\" CROSS JOIN (" + - "SELECT COUNT(*) FROM (SELECT COUNT(\"TXN_ID\"), \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + - "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "' " + - "GROUP BY \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" HAVING COUNT(\"TXN_ID\") > ?) \"L\") \"L\" CROSS JOIN (" + "SELECT ({0} - MIN(\"CQ_COMMIT_TIME\"))/1000 from \"COMPACTION_QUEUE\" WHERE " + "\"CQ_STATE\"=''" + Character.toString(READY_FOR_CLEANING) + "'') OLDEST_CLEAN"; + private static final String SELECT_TABLES_WITH_X_ABORTED_TXNS = + "SELECT \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + + "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\" = " + TxnStatus.ABORTED + Review comment: I've seen examples with and without quotes. For instance, in `CompactionTxnHandler` we are using it without quotes, and it's working. (markCleaned, findPotentialCompactions, cleanTxnToWriteIdTable, cleanEmptyAbortedAndCommittedTxns) -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626151) Time Spent: 2h 20m (was: 2h 10m) > Add logging based on new compaction metrics > --- > > Key: HIVE-25345 > URL: https://issues.apache.org/jira/browse/HIVE-25345 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626147=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626147 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 14:01 Start Date: 21/Jul/21 14:01 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r674001249 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -295,11 +295,12 @@ "SELECT COUNT(*), MIN(\"TXN_ID\"), ({0} - MIN(\"TXN_STARTED\"))/1000 FROM \"TXNS\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "') \"A\" CROSS JOIN (" + "SELECT COUNT(*), ({0} - MIN(\"HL_ACQUIRED_AT\"))/1000 FROM \"HIVE_LOCKS\") \"HL\" CROSS JOIN (" + - "SELECT COUNT(*) FROM (SELECT COUNT(\"TXN_ID\"), \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + - "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "' " + - "GROUP BY \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" HAVING COUNT(\"TXN_ID\") > ?) \"L\") \"L\" CROSS JOIN (" + "SELECT ({0} - MIN(\"CQ_COMMIT_TIME\"))/1000 from \"COMPACTION_QUEUE\" WHERE " + "\"CQ_STATE\"=''" + Character.toString(READY_FOR_CLEANING) + "'') OLDEST_CLEAN"; + private static final String SELECT_TABLES_WITH_X_ABORTED_TXNS = + "SELECT \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + + "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\" = " + TxnStatus.ABORTED + Review comment: Uh.. that's strange. TXN_STATE is a char(1), the values should be surrounded with quotes, no? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626147) Time Spent: 2h 10m (was: 2h) > Add logging based on new compaction metrics > --- > > Key: HIVE-25345 > URL: https://issues.apache.org/jira/browse/HIVE-25345 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626145=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626145 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 13:59 Start Date: 21/Jul/21 13:59 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r673999555 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java ## @@ -212,11 +215,41 @@ public static void mergeDeltaFilesStats(AcidDirectory dir, long checkThresholdIn } } } + +logDeltaDirMetrics(dir, conf, numObsoleteDeltas, numDeltas, numSmallDeltas); + String path = getRelPath(dir); newDeltaFilesStats(numObsoleteDeltas, numDeltas, numSmallDeltas) .forEach((type, cnt) -> deltaFilesStats.computeIfAbsent(type, v -> new HashMap<>()).put(path, cnt)); } + private static void logDeltaDirMetrics(AcidDirectory dir, Configuration conf, int numObsoleteDeltas, int numDeltas, + int numSmallDeltas) { +long loggerFrequency = HiveConf +.getTimeVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_ACID_METRICS_LOGGER_FREQUENCY, TimeUnit.MILLISECONDS); +if (loggerFrequency <= 0) { + return; +} +long currentTime = System.currentTimeMillis(); +if (lastSuccessfulLoggingTime == 0 || currentTime >= lastSuccessfulLoggingTime + loggerFrequency) { Review comment: got it now:) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626145) Time Spent: 2h (was: 1h 50m) > Add logging based on new compaction metrics > --- > > Key: HIVE-25345 > URL: https://issues.apache.org/jira/browse/HIVE-25345 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
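The throttling pattern discussed in the `logDeltaDirMetrics` review above (log only when `loggerFrequency` has elapsed since the last successful log, with a disabled state for non-positive frequencies) can be isolated as a small helper. Names are illustrative, not Hive's; the clock is injected as a `LongSupplier` so the behavior is testable without sleeping:

```java
import java.util.function.LongSupplier;

public class LogThrottle {
    private final long frequencyMs;   // <= 0 disables throttled logging entirely
    private final LongSupplier clock; // injected so the behavior is testable
    private long lastSuccessfulLoggingTime = 0;

    public LogThrottle(long frequencyMs, LongSupplier clock) {
        this.frequencyMs = frequencyMs;
        this.clock = clock;
    }

    /** Returns true (and records the time) when a log line should be emitted now. */
    public synchronized boolean shouldLog() {
        if (frequencyMs <= 0) {
            return false; // zero or negative frequency means: never log
        }
        long now = clock.getAsLong();
        // lastSuccessfulLoggingTime == 0 marks the very first call, which always logs.
        if (lastSuccessfulLoggingTime == 0 || now >= lastSuccessfulLoggingTime + frequencyMs) {
            lastSuccessfulLoggingTime = now;
            return true;
        }
        return false;
    }
}
```

A production version would typically use `System::currentTimeMillis` as the supplier.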
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626139=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626139 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 13:44 Start Date: 21/Jul/21 13:44 Worklog Time Spent: 10m Work Description: lcspinter commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r673985827 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java ## @@ -212,11 +215,41 @@ public static void mergeDeltaFilesStats(AcidDirectory dir, long checkThresholdIn } } } + +logDeltaDirMetrics(dir, conf, numObsoleteDeltas, numDeltas, numSmallDeltas); + String path = getRelPath(dir); newDeltaFilesStats(numObsoleteDeltas, numDeltas, numSmallDeltas) .forEach((type, cnt) -> deltaFilesStats.computeIfAbsent(type, v -> new HashMap<>()).put(path, cnt)); } + private static void logDeltaDirMetrics(AcidDirectory dir, Configuration conf, int numObsoleteDeltas, int numDeltas, + int numSmallDeltas) { +long loggerFrequency = HiveConf +.getTimeVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_ACID_METRICS_LOGGER_FREQUENCY, TimeUnit.MILLISECONDS); +if (loggerFrequency <= 0) { + return; +} +long currentTime = System.currentTimeMillis(); +if (lastSuccessfulLoggingTime == 0 || currentTime >= lastSuccessfulLoggingTime + loggerFrequency) { Review comment: Could you please elaborate on this? If `loggerFrequency = 0` the method returns. `lastSuccessfulLoggingTime == 0` means that the `logDeltaDirMetrics` is called for the very first time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626139) Time Spent: 1h 50m (was: 1h 40m) > Add logging based on new compaction metrics > --- > > Key: HIVE-25345 > URL: https://issues.apache.org/jira/browse/HIVE-25345 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626138 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 13:42 Start Date: 21/Jul/21 13:42 Worklog Time Spent: 10m Work Description: lcspinter commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r673984019 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java ## @@ -84,6 +84,8 @@ public static final String OBJECT_NAME_PREFIX = "metrics:type=compaction,name="; + private static long lastSuccessfulLoggingTime = 0; Review comment: It is initialized with that value when the `logDeltaDirMetrics()` method is called for the first time. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626138) Time Spent: 1h 40m (was: 1.5h) > Add logging based on new compaction metrics > --- > > Key: HIVE-25345 > URL: https://issues.apache.org/jira/browse/HIVE-25345 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626137 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 13:40 Start Date: 21/Jul/21 13:40 Worklog Time Spent: 10m Work Description: lcspinter commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r673981938 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -295,11 +295,12 @@ "SELECT COUNT(*), MIN(\"TXN_ID\"), ({0} - MIN(\"TXN_STARTED\"))/1000 FROM \"TXNS\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "') \"A\" CROSS JOIN (" + "SELECT COUNT(*), ({0} - MIN(\"HL_ACQUIRED_AT\"))/1000 FROM \"HIVE_LOCKS\") \"HL\" CROSS JOIN (" + - "SELECT COUNT(*) FROM (SELECT COUNT(\"TXN_ID\"), \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + - "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "' " + - "GROUP BY \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" HAVING COUNT(\"TXN_ID\") > ?) \"L\") \"L\" CROSS JOIN (" + "SELECT ({0} - MIN(\"CQ_COMMIT_TIME\"))/1000 from \"COMPACTION_QUEUE\" WHERE " + "\"CQ_STATE\"=''" + Character.toString(READY_FOR_CLEANING) + "'') OLDEST_CLEAN"; + private static final String SELECT_TABLES_WITH_X_ABORTED_TXNS = + "SELECT \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + + "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\" = " + TxnStatus.ABORTED + Review comment: The tests started failing because the `TxnStatus.ABORTED` was surrounded by `'`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626137) Time Spent: 1.5h (was: 1h 20m) > Add logging based on new compaction metrics > --- > > Key: HIVE-25345 > URL: https://issues.apache.org/jira/browse/HIVE-25345 > Project: Hive > Issue Type: Improvement >Reporter: László Pintér >Assignee: László Pintér >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
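The quoting mix-up discussed in this thread is easy to reproduce in isolation: concatenating an enum into a SQL string without single quotes produces a bare token, not a string literal. A minimal sketch with a stand-in enum (not Hive's real `TxnStatus`, whose string form may differ):

```java
public class QuotingSketch {
    // Stand-in for Hive's TxnStatus; only illustrates the concatenation issue.
    enum TxnStatus { ABORTED }

    // Builds a WHERE clause with or without single quotes around the value.
    static String buildWhere(boolean quoteValue) {
        String v = TxnStatus.ABORTED.toString();
        return "WHERE \"TXN_STATE\" = " + (quoteValue ? "'" + v + "'" : v);
    }
}
```

Without the quotes the database receives `TXN_STATE = ABORTED` and tries to resolve `ABORTED` as an identifier, which is the kind of failure the test runs surfaced; binding the value as a `PreparedStatement` parameter (`WHERE "TXN_STATE" = ?`) sidesteps the quoting question entirely.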
[jira] [Work logged] (HIVE-25345) Add logging based on new compaction metrics
[ https://issues.apache.org/jira/browse/HIVE-25345?focusedWorklogId=626135=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626135 ] ASF GitHub Bot logged work on HIVE-25345: - Author: ASF GitHub Bot Created on: 21/Jul/21 13:33 Start Date: 21/Jul/21 13:33 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #2493: URL: https://github.com/apache/hive/pull/2493#discussion_r673965403 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/AcidMetricService.java ## @@ -38,7 +38,27 @@ import java.util.stream.Collectors; import static org.apache.hadoop.hive.metastore.HiveMetaStoreClient.MANUALLY_INITIATED_COMPACTION; -import static org.apache.hadoop.hive.metastore.metrics.MetricsConstants.*; Review comment: +1 ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java ## @@ -84,6 +84,8 @@ public static final String OBJECT_NAME_PREFIX = "metrics:type=compaction,name="; + private static long lastSuccessfulLoggingTime = 0; Review comment: This should probably be initialized to System.currentTimeMillis(); ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/AcidMetricService.java ## @@ -52,6 +72,7 @@ private Configuration conf; private TxnStore txnHandler; + private long lastSuccessfulLoggingTime = 0; Review comment: This should probably be initialized to System.currentTimeMillis(); ## File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java ## @@ -212,11 +215,41 @@ public static void mergeDeltaFilesStats(AcidDirectory dir, long checkThresholdIn } } } + +logDeltaDirMetrics(dir, conf, numObsoleteDeltas, numDeltas, numSmallDeltas); + String path = getRelPath(dir); newDeltaFilesStats(numObsoleteDeltas, numDeltas, numSmallDeltas) .forEach((type, cnt) -> deltaFilesStats.computeIfAbsent(type, v -> new HashMap<>()).put(path, cnt)); } 
+ private static void logDeltaDirMetrics(AcidDirectory dir, Configuration conf, int numObsoleteDeltas, int numDeltas, + int numSmallDeltas) { +long loggerFrequency = HiveConf +.getTimeVar(conf, HiveConf.ConfVars.HIVE_COMPACTOR_ACID_METRICS_LOGGER_FREQUENCY, TimeUnit.MILLISECONDS); +if (loggerFrequency <= 0) { + return; +} +long currentTime = System.currentTimeMillis(); +if (lastSuccessfulLoggingTime == 0 || currentTime >= lastSuccessfulLoggingTime + loggerFrequency) { Review comment: I think the first part should be: loggerFrequency == 0 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java ## @@ -295,11 +295,12 @@ "SELECT COUNT(*), MIN(\"TXN_ID\"), ({0} - MIN(\"TXN_STARTED\"))/1000 FROM \"TXNS\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "') \"A\" CROSS JOIN (" + "SELECT COUNT(*), ({0} - MIN(\"HL_ACQUIRED_AT\"))/1000 FROM \"HIVE_LOCKS\") \"HL\" CROSS JOIN (" + - "SELECT COUNT(*) FROM (SELECT COUNT(\"TXN_ID\"), \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + - "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\"='" + TxnStatus.ABORTED + "' " + - "GROUP BY \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" HAVING COUNT(\"TXN_ID\") > ?) 
\"L\") \"L\" CROSS JOIN (" + "SELECT ({0} - MIN(\"CQ_COMMIT_TIME\"))/1000 from \"COMPACTION_QUEUE\" WHERE " + "\"CQ_STATE\"=''" + Character.toString(READY_FOR_CLEANING) + "'') OLDEST_CLEAN"; + private static final String SELECT_TABLES_WITH_X_ABORTED_TXNS = + "SELECT \"TC_DATABASE\", \"TC_TABLE\", \"TC_PARTITION\" FROM \"TXN_COMPONENTS\" " + + "INNER JOIN \"TXNS\" ON \"TC_TXNID\" = \"TXN_ID\" WHERE \"TXN_STATE\" = " + TxnStatus.ABORTED + Review comment: Probably the issue here is that TxnStatus.ABORTED needs to be surrounded by single quotes ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/AcidMetricService.java ## @@ -85,36 +106,134 @@ public void run() { private void collectMetrics() throws MetaException { ShowCompactResponse currentCompactions = txnHandler.showCompact(new ShowCompactRequest()); -updateMetricsFromShowCompact(currentCompactions); +updateMetricsFromShowCompact(currentCompactions, conf); updateDBMetrics(); } private void updateDBMetrics() throws MetaException { MetricsInfo metrics = txnHandler.getMetricsInfo(); Metrics.getOrCreateGauge(NUM_TXN_TO_WRITEID).set(metrics.getTxnToWriteIdCount()); +logDbMetrics(metrics);
[jira] [Updated] (HIVE-25361) Allow Iceberg table update columns command
[ https://issues.apache.org/jira/browse/HIVE-25361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25361: -- Labels: pull-request-available (was: ) > Allow Iceberg table update columns command > -- > > Key: HIVE-25361 > URL: https://issues.apache.org/jira/browse/HIVE-25361 > Project: Hive > Issue Type: New Feature >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We should allow {{ALTER TABLE tableName UPDATE COLUMNS}} for iceberg tables > so non-HiveCatalog tables can refresh the columns -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25361) Allow Iceberg table update columns command
[ https://issues.apache.org/jira/browse/HIVE-25361?focusedWorklogId=626129=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626129 ] ASF GitHub Bot logged work on HIVE-25361: - Author: ASF GitHub Bot Created on: 21/Jul/21 13:21 Start Date: 21/Jul/21 13:21 Worklog Time Spent: 10m Work Description: pvary opened a new pull request #2511: URL: https://github.com/apache/hive/pull/2511 ### What changes were proposed in this pull request? Allows Iceberg table update columns command ### Why are the changes needed? non-HiveCatalog based Iceberg tables need to be able to refresh the column list in the HMS DB ### Does this PR introduce _any_ user-facing change? Allows Iceberg table update columns command ### How was this patch tested? Added the command to the relevant unit tests -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626129) Remaining Estimate: 0h Time Spent: 10m > Allow Iceberg table update columns command > -- > > Key: HIVE-25361 > URL: https://issues.apache.org/jira/browse/HIVE-25361 > Project: Hive > Issue Type: New Feature >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > We should allow {{ALTER TABLE tableName UPDATE COLUMNS}} for iceberg tables > so non-HiveCatalog tables can refresh the columns -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25361) Allow Iceberg table update columns command
[ https://issues.apache.org/jira/browse/HIVE-25361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary reassigned HIVE-25361: - > Allow Iceberg table update columns command > -- > > Key: HIVE-25361 > URL: https://issues.apache.org/jira/browse/HIVE-25361 > Project: Hive > Issue Type: New Feature >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > > We should allow {{ALTER TABLE tableName UPDATE COLUMNS}} for iceberg tables > so non-HiveCatalog tables can refresh the columns -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (HIVE-25276) Enable automatic statistics generation for Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary reopened HIVE-25276: --- > Enable automatic statistics generation for Iceberg tables > - > > Key: HIVE-25276 > URL: https://issues.apache.org/jira/browse/HIVE-25276 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5h 40m > Remaining Estimate: 0h > > During inserts we should calculate the column statistics -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25276) Enable automatic statistics generation for Iceberg tables
[ https://issues.apache.org/jira/browse/HIVE-25276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384888#comment-17384888 ] Peter Vary commented on HIVE-25276: --- Reverted because a concurrently tested commit broke the tests > Enable automatic statistics generation for Iceberg tables > - > > Key: HIVE-25276 > URL: https://issues.apache.org/jira/browse/HIVE-25276 > Project: Hive > Issue Type: Improvement >Reporter: Peter Vary >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 5h 40m > Remaining Estimate: 0h > > During inserts we should calculate the column statistics -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384886#comment-17384886 ] Karen Coppage commented on HIVE-25359: -- [~kgyrtkirk] sure thing. I'll add myself to the thrift watchers. > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Sub-task >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?focusedWorklogId=626114=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626114 ] ASF GitHub Bot logged work on HIVE-25359: - Author: ASF GitHub Bot Created on: 21/Jul/21 12:52 Start Date: 21/Jul/21 12:52 Worklog Time Spent: 10m Work Description: klcopp commented on a change in pull request #2507: URL: https://github.com/apache/hive/pull/2507#discussion_r673943888 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java ## @@ -204,7 +218,8 @@ public CompactionInfo findNextToCompact(String workerId, String workerVersion) t info.properties = rs.getString(6); // Now, update this record as being worked on by this worker. long now = getDbTime(dbConn); - s = "UPDATE \"COMPACTION_QUEUE\" SET \"CQ_WORKER_ID\" = '" + workerId + "', \"CQ_WORKER_VERSION\" = '" + workerVersion + "', " + + s = "UPDATE \"COMPACTION_QUEUE\" SET \"CQ_WORKER_ID\" = '" + rqst.getWorkerId() + "', " + +"\"CQ_WORKER_VERSION\" = '" + rqst.getWorkerVersion() + "', " + Review comment: Great question. It turns out we do handle this case, we filter for: Objects::nonNull when counting the number of versions. However this case wasn't covered in tests so I'll update the PR with coverage. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626114) Time Spent: 0.5h (was: 20m) > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Sub-task >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering
[ https://issues.apache.org/jira/browse/HIVE-25360?focusedWorklogId=626105=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626105 ] ASF GitHub Bot logged work on HIVE-25360: - Author: ASF GitHub Bot Created on: 21/Jul/21 12:31 Start Date: 21/Jul/21 12:31 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2508: URL: https://github.com/apache/hive/pull/2508#discussion_r673929062 ## File path: iceberg/iceberg-handler/src/test/java/org/apache/iceberg/mr/hive/TestHiveIcebergStorageHandlerWithEngine.java ## @@ -609,6 +606,121 @@ public void testAlterChangeColumn() throws IOException { Assert.assertArrayEquals(new Object[]{0L, "Brown"}, result.get(0)); Assert.assertArrayEquals(new Object[]{1L, "Green"}, result.get(1)); Assert.assertArrayEquals(new Object[]{2L, "Pink"}, result.get(2)); + + } + + // Tests CHANGE COLUMN feature similarly like above, but with a more complex schema, aimed to verify vectorized + // reads support the feature properly, also combining with other schema changes e.g. ADD COLUMN + @Test + public void testSchemaEvolutionOnVectorizedReads() throws Exception { +// Currently only ORC, but in the future this should run against each fileformat with vectorized read support. +Assume.assumeTrue("Vectorized reads only.", isVectorized); + +Schema orderSchema = new Schema( +optional(1, "order_id", Types.IntegerType.get()), Review comment: Do we handle complex types? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626105) Time Spent: 0.5h (was: 20m) > Iceberg vectorized ORC reads don't support column reordering > > > Key: HIVE-25360 > URL: https://issues.apache.org/jira/browse/HIVE-25360 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > HIVE-25256 added support for Iceberg backed tables to support CHANGE COLUMN > statement. These include type, name and order changes to the schema. Native > ORC tables only support renames, but with the help of Iceberg as an > intermediary table format layer, this can be achieved, and works well for > non-vectorized reads already. > We should adjust the vectorized read path to support the same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering
[ https://issues.apache.org/jira/browse/HIVE-25360?focusedWorklogId=626104=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626104 ] ASF GitHub Bot logged work on HIVE-25360: - Author: ASF GitHub Bot Created on: 21/Jul/21 12:29 Start Date: 21/Jul/21 12:29 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2508: URL: https://github.com/apache/hive/pull/2508#discussion_r673927523 ## File path: iceberg/iceberg-handler/src/main/java/org/apache/iceberg/orc/ExpressionToOrcSearchArgument.java ## @@ -0,0 +1,296 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.iceberg.orc; + +import java.math.BigDecimal; +import java.sql.Date; +import java.sql.Timestamp; +import java.time.Instant; +import java.time.LocalDate; +import java.util.Map; +import java.util.Set; +import org.apache.hadoop.hive.common.type.HiveDecimal; +import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf; +import org.apache.hadoop.hive.ql.io.sarg.SearchArgument; +import org.apache.hadoop.hive.ql.io.sarg.SearchArgument.TruthValue; +import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory; +import org.apache.hadoop.hive.serde2.io.HiveDecimalWritable; +import org.apache.hive.iceberg.org.apache.orc.TypeDescription; +import org.apache.iceberg.expressions.Bound; +import org.apache.iceberg.expressions.BoundPredicate; +import org.apache.iceberg.expressions.Expression; +import org.apache.iceberg.expressions.ExpressionVisitors; +import org.apache.iceberg.expressions.Literal; +import org.apache.iceberg.relocated.com.google.common.collect.ImmutableSet; +import org.apache.iceberg.types.Type; +import org.apache.iceberg.types.Type.TypeID; + +/** + * Copy of ExpressionOrcSearchArgument from iceberg/orc module to provide java type compatibility between: Review comment: Is this a full copy, or it is modified? If it is modified, then could you help highlight where? Thanks, Peter -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626104) Time Spent: 20m (was: 10m) > Iceberg vectorized ORC reads don't support column reordering > > > Key: HIVE-25360 > URL: https://issues.apache.org/jira/browse/HIVE-25360 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > HIVE-25256 added support for Iceberg backed tables to support CHANGE COLUMN > statement. These include type, name and order changes to the schema. Native > ORC tables only support renames, but with the help of Iceberg as an > intermediary table format layer, this can be achieved, and works well for > non-vectorized reads already. > We should adjust the vectorized read path to support the same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (HIVE-23889) Empty bucket files are inserted with invalid schema after HIVE-21784
[ https://issues.apache.org/jira/browse/HIVE-23889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161891#comment-17161891 ] László Bodor edited comment on HIVE-23889 at 7/21/21, 12:03 PM: this has been solved as part of HIVE-22538: https://github.com/apache/hive/commit/964f08ae733b037c6e58dfb4ed149ccad2d3ddc0#diff-bb969e858664d98848960a801fd58b5cR579 I'm closing this as duplicate, but we can use this ticket for "tracking" the fix of the schema issues: {code} -OrcFile.WriterOptions wo = OrcFile.writerOptions(this.options.getConfiguration()) -.inspector(rowInspector) -.callback(new OrcRecordUpdater.KeyIndexBuilder("testEmpty")); -OrcFile.createWriter(path, wo).close(); +OrcFile.createWriter(path, writerOptions).close(); {code} patch is attached as [^HIVE-23889.01.patch] was (Author: abstractdog): this has been solved as part of HIVE-22538: https://github.com/apache/hive/commit/964f08ae733b037c6e58dfb4ed149ccad2d3ddc0#diff-bb969e858664d98848960a801fd58b5cR579 I'm closing this as duplicate, but we can use this ticket for "tracking" the fix of the schema issues: {code} -OrcFile.WriterOptions wo = OrcFile.writerOptions(this.options.getConfiguration()) -.inspector(rowInspector) -.callback(new OrcRecordUpdater.KeyIndexBuilder("testEmpty")); -OrcFile.createWriter(path, wo).close(); +OrcFile.createWriter(path, writerOptions).close(); {code} > Empty bucket files are inserted with invalid schema after HIVE-21784 > > > Key: HIVE-23889 > URL: https://issues.apache.org/jira/browse/HIVE-23889 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-23889.01.patch > > > HIVE-21784 uses a new WriterOptions instead of the field in OrcRecordUpdater: > https://github.com/apache/hive/commit/f62379ba279f41b843fcd5f3d4a107b6fcd04dec#diff-bb969e858664d98848960a801fd58b5cR580-R583 > so in this scenario, the overwrite creates an empty bucket file, which is > fine as that was the intention of that patch, 
but it creates that with > invalid schema: > {code} > CREATE TABLE test_table ( >cda_id int, >cda_run_id varchar(255), >cda_load_ts timestamp, >global_party_id string) > PARTITIONED BY ( >cda_date int, >cda_job_name varchar(12)) > CLUSTERED BY (cda_id) > INTO 2 BUCKETS > STORED AS ORC; > INSERT OVERWRITE TABLE test_table PARTITION (cda_date = 20200601 , > cda_job_name = 'core_base') > SELECT 1 as cda_id,'cda_run_id' as cda_run_id, NULL as cda_load_ts, > 'global_party_id' global_party_id > UNION ALL > SELECT 2 as cda_id,'cda_run_id' as cda_run_id, NULL as cda_load_ts, > 'global_party_id' global_party_id; > ALTER TABLE test_table ADD COLUMNS (group_id string) CASCADE ; > INSERT OVERWRITE TABLE test_table PARTITION (cda_date = 20200601 , > cda_job_name = 'core_base') > SELECT 1 as cda_id,'cda_run_id' as cda_run_id, NULL as cda_load_ts, > 'global_party_id' global_party_id, 'group_id' as group_id; > {code} > because of HIVE-21784, the new empty bucket_0 shows this schema in orc > dump: > {code} > Type: > struct<_col0:int,_col1:varchar(255),_col2:timestamp,_col3:string,_col4:string> > {code} > instead of: > {code} > Type: > struct> > {code} > and this could lead to problems later, when hive tries to look into the file > during split generation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-23889) Empty bucket files are inserted with invalid schema after HIVE-21784
[ https://issues.apache.org/jira/browse/HIVE-23889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] László Bodor updated HIVE-23889: Attachment: HIVE-23889.01.patch > Empty bucket files are inserted with invalid schema after HIVE-21784 > > > Key: HIVE-23889 > URL: https://issues.apache.org/jira/browse/HIVE-23889 > Project: Hive > Issue Type: Bug >Reporter: László Bodor >Assignee: László Bodor >Priority: Major > Attachments: HIVE-23889.01.patch > > > HIVE-21784 uses a new WriterOptions instead of the field in OrcRecordUpdater: > https://github.com/apache/hive/commit/f62379ba279f41b843fcd5f3d4a107b6fcd04dec#diff-bb969e858664d98848960a801fd58b5cR580-R583 > so in this scenario, the overwrite creates an empty bucket file, which is > fine as that was the intention of that patch, but it creates that with > invalid schema: > {code} > CREATE TABLE test_table ( >cda_id int, >cda_run_id varchar(255), >cda_load_ts timestamp, >global_party_id string) > PARTITIONED BY ( >cda_date int, >cda_job_name varchar(12)) > CLUSTERED BY (cda_id) > INTO 2 BUCKETS > STORED AS ORC; > INSERT OVERWRITE TABLE test_table PARTITION (cda_date = 20200601 , > cda_job_name = 'core_base') > SELECT 1 as cda_id,'cda_run_id' as cda_run_id, NULL as cda_load_ts, > 'global_party_id' global_party_id > UNION ALL > SELECT 2 as cda_id,'cda_run_id' as cda_run_id, NULL as cda_load_ts, > 'global_party_id' global_party_id; > ALTER TABLE test_table ADD COLUMNS (group_id string) CASCADE ; > INSERT OVERWRITE TABLE test_table PARTITION (cda_date = 20200601 , > cda_job_name = 'core_base') > SELECT 1 as cda_id,'cda_run_id' as cda_run_id, NULL as cda_load_ts, > 'global_party_id' global_party_id, 'group_id' as group_id; > {code} > because of HIVE-21784, the new empty bucket_0 shows this schema in orc > dump: > {code} > Type: > struct<_col0:int,_col1:varchar(255),_col2:timestamp,_col3:string,_col4:string> > {code} > instead of: > {code} > Type: > struct> > {code} > and this could lead to problems later, when 
hive tries to look into the file > during split generation -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering
[ https://issues.apache.org/jira/browse/HIVE-25360?focusedWorklogId=626045=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626045 ] ASF GitHub Bot logged work on HIVE-25360: - Author: ASF GitHub Bot Created on: 21/Jul/21 10:15 Start Date: 21/Jul/21 10:15 Worklog Time Spent: 10m Work Description: szlta opened a new pull request #2508: URL: https://github.com/apache/hive/pull/2508 HIVE-25256 added support for Iceberg backed tables to support CHANGE COLUMN statement. These include type, name and order changes to the schema. Native ORC tables only support renames, but with the help of Iceberg as an intermediary table format layer, this can be achieved, and works well for non-vectorized reads already. We should adjust the vectorized read path to support the same. This change makes vectorized ORC reads rely on Iceberg to find out about column renames and reorders, so that the low-level ORC vectorized reader can be configured appropriately. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626045) Remaining Estimate: 0h Time Spent: 10m > Iceberg vectorized ORC reads don't support column reordering > > > Key: HIVE-25360 > URL: https://issues.apache.org/jira/browse/HIVE-25360 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-25256 added support for Iceberg backed tables to support CHANGE COLUMN > statement. These include type, name and order changes to the schema. Native > ORC tables only support renames, but with the help of Iceberg as an > intermediary table format layer, this can be achieved, and works well for > non-vectorized reads already. 
> We should adjust the vectorized read path to support the same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering
[ https://issues.apache.org/jira/browse/HIVE-25360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25360: -- Labels: pull-request-available (was: ) > Iceberg vectorized ORC reads don't support column reordering > > > Key: HIVE-25360 > URL: https://issues.apache.org/jira/browse/HIVE-25360 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HIVE-25256 added support for Iceberg backed tables to support CHANGE COLUMN > statement. These include type, name and order changes to the schema. Native > ORC tables only support renames, but with the help of Iceberg as an > intermediary table format layer, this can be achieved, and works well for > non-vectorized reads already. > We should adjust the vectorized read path to support the same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering
[ https://issues.apache.org/jira/browse/HIVE-25360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ádám Szita updated HIVE-25360: -- Description: HIVE-25256 added support for Iceberg backed tables to support CHANGE COLUMN statement. These include type, name and order changes to the schema. Native ORC tables only support renames, but with the help of Iceberg as an intermediary table format layer, this can be achieved, and works well for non-vectorized reads already. We should adjust the vectorized read path to support the same. > Iceberg vectorized ORC reads don't support column reordering > > > Key: HIVE-25360 > URL: https://issues.apache.org/jira/browse/HIVE-25360 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > > HIVE-25256 added support for Iceberg backed tables to support CHANGE COLUMN > statement. These include type, name and order changes to the schema. Native > ORC tables only support renames, but with the help of Iceberg as an > intermediary table format layer, this can be achieved, and works well for > non-vectorized reads already. > We should adjust the vectorized read path to support the same. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25360) Iceberg vectorized ORC reads don't support column reordering
[ https://issues.apache.org/jira/browse/HIVE-25360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ádám Szita reassigned HIVE-25360: - > Iceberg vectorized ORC reads don't support column reordering > > > Key: HIVE-25360 > URL: https://issues.apache.org/jira/browse/HIVE-25360 > Project: Hive > Issue Type: Bug >Reporter: Ádám Szita >Assignee: Ádám Szita >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?focusedWorklogId=626007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626007 ] ASF GitHub Bot logged work on HIVE-25359: - Author: ASF GitHub Bot Created on: 21/Jul/21 09:32 Start Date: 21/Jul/21 09:32 Worklog Time Spent: 10m Work Description: lcspinter commented on a change in pull request #2507: URL: https://github.com/apache/hive/pull/2507#discussion_r673815217 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java ## @@ -388,8 +388,17 @@ void onRename(String oldCatName, String oldDbName, String oldTabName, String old * @param workerVersion runtime version of the worker calling this * @return an info element for this compaction request, or null if there is no work to do now. */ + @Deprecated Review comment: Again, comment about what to use instead of this deprecated method. ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java ## @@ -165,16 +166,29 @@ public CompactionTxnHandler() { } } + /** * This will grab the next compaction request off of * the queue, and assign it to the worker. * @param workerId id of the worker calling this, will be recorded in the db - * @param workerVersion runtime version of the Worker calling this * @return an info element for this compaction request, or null if there is no work to do now. */ + @Deprecated Review comment: Could you please add some comments, what to use instead of this method? ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java ## @@ -204,7 +218,8 @@ public CompactionInfo findNextToCompact(String workerId, String workerVersion) t info.properties = rs.getString(6); // Now, update this record as being worked on by this worker. 
long now = getDbTime(dbConn); - s = "UPDATE \"COMPACTION_QUEUE\" SET \"CQ_WORKER_ID\" = '" + workerId + "', \"CQ_WORKER_VERSION\" = '" + workerVersion + "', " + + s = "UPDATE \"COMPACTION_QUEUE\" SET \"CQ_WORKER_ID\" = '" + rqst.getWorkerId() + "', " + +"\"CQ_WORKER_VERSION\" = '" + rqst.getWorkerVersion() + "', " + Review comment: I believe the `getWorkerVersion()` can be null, if this method is called from `findNextToCompact(String workerId)`. Is `CQ_WORKER_VERSION='null'` a valid scenario? Are we prepared to handle null values when processing the `CQ_WORKER_VERSION` column? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626007) Time Spent: 20m (was: 10m) > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Sub-task >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)
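The null-handling concern in the review above can be reproduced in isolation: concatenating a null Java String into a SQL statement embeds the four-character literal 'null', not SQL NULL. A minimal standalone sketch follows (the column name mirrors the diff, but this is not Hive code; a PreparedStatement with setString/setNull would sidestep the problem, along with the injection risk of string concatenation):

```java
public class NullConcatDemo {
    public static void main(String[] args) {
        // workerVersion is null when the legacy findNextToCompact(String workerId) path is taken
        String workerVersion = null;
        String s = "UPDATE \"COMPACTION_QUEUE\" SET \"CQ_WORKER_VERSION\" = '"
                + workerVersion + "'";
        // String concatenation converts the null reference to the text "null",
        // so the column would be set to the literal string 'null', not SQL NULL.
        System.out.println(s);
        // prints: UPDATE "COMPACTION_QUEUE" SET "CQ_WORKER_VERSION" = 'null'
    }
}
```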
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626004 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 09:30 Start Date: 21/Jul/21 09:30 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r673814570 ## File path: ql/src/test/queries/clientpositive/ambiguitycheck.q ## @@ -32,11 +32,9 @@ select int(1.2) from src limit 1; select bigint(1.34) from src limit 1; select binary('1') from src limit 1; select boolean(1) from src limit 1; -select date('1') from src limit 2; Review comment: reverted -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626004) Time Spent: 2h (was: 1h 50m) > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 2h > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timestamp from int, string, char etc. to Date or Timestamp. > The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date > like "1992-13-12" is converted to "2000-01-12". > Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like > "1992-13-12" is not converted; instead NULL is returned. 
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
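The LENIENT vs. STRICT behaviour the ticket describes can be seen with plain java.time. A standalone sketch, not Hive code; note that stock lenient resolution rolls month 13 of 1992 over into 1993-01-12, so the exact value Hive produced depends on its own formatter setup:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class ResolverStyleDemo {
    public static void main(String[] args) {
        // LENIENT resolution accepts out-of-range fields and rolls them over.
        DateTimeFormatter lenient = DateTimeFormatter.ofPattern("uuuu-MM-dd")
                .withResolverStyle(ResolverStyle.LENIENT);
        System.out.println(LocalDate.parse("1992-13-12", lenient)); // prints 1993-01-12

        // STRICT resolution rejects the same out-of-range month outright.
        DateTimeFormatter strict = DateTimeFormatter.ofPattern("uuuu-MM-dd")
                .withResolverStyle(ResolverStyle.STRICT);
        try {
            LocalDate.parse("1992-13-12", strict);
        } catch (DateTimeParseException e) {
            // Per the ticket, Hive maps this rejection to NULL instead of silently converting.
            System.out.println("STRICT rejects the input");
        }
    }
}
```

Note the `uuuu` pattern letter: with STRICT resolution, `yyyy` (year-of-era) would additionally require an era field, so `uuuu` (proleptic year) is the right choice here.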
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626002&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626002 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 09:30 Start Date: 21/Jul/21 09:30 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r673814327 ## File path: serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java ## @@ -1254,10 +1257,13 @@ public static Timestamp getTimestampFromString(String s) { s = s.trim(); s = trimNanoTimestamp(s); +if(StringUtils.isEmpty(s)) + return null; + try { return TimestampUtils.stringToTimestamp(s); -} catch (IllegalArgumentException e) { - return null; +} catch (IllegalArgumentException | DateTimeException e) { + throw new IllegalArgumentException("Cannot parse " + s); Review comment: reverted to null as discussed -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626002) Time Spent: 1h 40m (was: 1.5h) > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timestamp from int, string, char etc. to Date or Timestamp. 
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date > like "1992-13-12" is converted to "2000-01-12". > Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like > "1992-13-12" is not converted; instead NULL is returned. > https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626001&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626001 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 09:30 Start Date: 21/Jul/21 09:30 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r673814134 ## File path: ql/src/test/queries/clientpositive/type_conversions_1.q ## @@ -18,9 +18,3 @@ select cast(null as binary) from src limit 1; --- Invalid conversions, should all be null Review comment: reverted -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626001) Time Spent: 1.5h (was: 1h 20m) > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 1.5h > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timestamp from int, string, char etc. to Date or Timestamp. > The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date > like "1992-13-12" is converted to "2000-01-12". > Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like > "1992-13-12" is not converted; instead NULL is returned. 
> https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25306) Move Date and Timestamp parsing from ResolverStyle.LENIENT to ResolverStyle.STRICT
[ https://issues.apache.org/jira/browse/HIVE-25306?focusedWorklogId=626003&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-626003 ] ASF GitHub Bot logged work on HIVE-25306: - Author: ASF GitHub Bot Created on: 21/Jul/21 09:30 Start Date: 21/Jul/21 09:30 Worklog Time Spent: 10m Work Description: ashish-kumar-sharma commented on a change in pull request #2445: URL: https://github.com/apache/hive/pull/2445#discussion_r673814445 ## File path: ql/src/test/queries/clientpositive/materialized_view_rewrite_in_between.q ## @@ -12,7 +12,7 @@ create database expr2; use expr2; create table sales(prod_id int, cust_id int, store_id int, sale_date timestamp, qty int, amt double, descr string); insert into sales values -(11,1,101,'12/24/2013',1000,1234.00,'onedummytwo'); +(11,1,101,'2013-12-24',1000,1234.00,'onedummytwo'); Review comment: reverted changes -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 626003) Time Spent: 1h 50m (was: 1h 40m) > Move Date and Timestamp parsing from ResolverStyle.LENIENT to > ResolverStyle.STRICT > -- > > Key: HIVE-25306 > URL: https://issues.apache.org/jira/browse/HIVE-25306 > Project: Hive > Issue Type: Bug > Components: Query Planning, UDF >Reporter: Ashish Sharma >Assignee: Ashish Sharma >Priority: Major > Labels: pull-request-available > Attachments: DB_compare.JPG > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Description - > Currently Date.java and Timestamp.java use DateTimeFormatter for parsing to > convert the date/timestamp from int, string, char etc. to Date or Timestamp. 
> The default DateTimeFormatter uses ResolverStyle.LENIENT, which means a date > like "1992-13-12" is converted to "2000-01-12". > Moving to a DateTimeFormatter that uses ResolverStyle.STRICT means a date like > "1992-13-12" is not converted; instead NULL is returned. > https://docs.google.com/document/d/1YTTPlNq3qyzlKfYVkSl3EFhVQ6-wa9WFRdkdIeCoc1Y/edit?usp=sharing > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25249) Fix TestWorker
[ https://issues.apache.org/jira/browse/HIVE-25249?focusedWorklogId=625989&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625989 ] ASF GitHub Bot logged work on HIVE-25249: - Author: ASF GitHub Bot Created on: 21/Jul/21 08:59 Start Date: 21/Jul/21 08:59 Worklog Time Spent: 10m Work Description: deniskuzZ commented on pull request #2474: URL: https://github.com/apache/hive/pull/2474#issuecomment-884018526 > I remember when I was refactoring query-based compaction, I left out the drop temp table operations. I only realized my mistake because the visibility ids were hard-coded in the tests. > I think if we want these tests to run in parallel we should reset the DB before each test (instead of before each test class), no? Reverted the visibility ids check; there was another issue -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 625989) Time Spent: 1h (was: 50m) > Fix TestWorker > -- > > Key: HIVE-25249 > URL: https://issues.apache.org/jira/browse/HIVE-25249 > Project: Hive > Issue Type: Bug >Reporter: Zoltan Haindrich >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/ > http://ci.hive.apache.org/job/hive-flaky-check/236/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17384738#comment-17384738 ] Zoltan Haindrich commented on HIVE-25359: - [~klcopp]: thank you for keeping an eye on compatibility - I also wanted to keep a closer eye on thrift changes but recently I wasn't watching it that closely; but if you want you could also enable 'assign-by-files' to add you as a reviewer in case some important file changes https://github.com/apache/hive/blob/master/.github/assign-by-files.yml#L8 > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Sub-task >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (HIVE-25358) Remove reviewer pattern
[ https://issues.apache.org/jira/browse/HIVE-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zoltan Haindrich resolved HIVE-25358. - Resolution: Fixed merged into master. Thank you Jesus! > Remove reviewer pattern > --- > > Key: HIVE-25358 > URL: https://issues.apache.org/jira/browse/HIVE-25358 > Project: Hive > Issue Type: Sub-task >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25358) Remove reviewer pattern
[ https://issues.apache.org/jira/browse/HIVE-25358?focusedWorklogId=625977&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625977 ] ASF GitHub Bot logged work on HIVE-25358: - Author: ASF GitHub Bot Created on: 21/Jul/21 08:20 Start Date: 21/Jul/21 08:20 Worklog Time Spent: 10m Work Description: kgyrtkirk merged pull request #2506: URL: https://github.com/apache/hive/pull/2506 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 625977) Time Spent: 20m (was: 10m) > Remove reviewer pattern > --- > > Key: HIVE-25358 > URL: https://issues.apache.org/jira/browse/HIVE-25358 > Project: Hive > Issue Type: Sub-task >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Minor > Labels: pull-request-available > Fix For: 4.0.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25115) Compaction queue entries may accumulate in "ready for cleaning" state
[ https://issues.apache.org/jira/browse/HIVE-25115?focusedWorklogId=625976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625976 ] ASF GitHub Bot logged work on HIVE-25115: - Author: ASF GitHub Bot Created on: 21/Jul/21 08:18 Start Date: 21/Jul/21 08:18 Worklog Time Spent: 10m Work Description: klcopp closed pull request #2274: URL: https://github.com/apache/hive/pull/2274 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 625976) Time Spent: 2h 50m (was: 2h 40m) > Compaction queue entries may accumulate in "ready for cleaning" state > - > > Key: HIVE-25115 > URL: https://issues.apache.org/jira/browse/HIVE-25115 > Project: Hive > Issue Type: Improvement >Reporter: Karen Coppage >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > If the Cleaner does not delete any files, the compaction queue entry is > thrown back to the queue and remains in "ready for cleaning" state. > Problem: If 2 compactions run on the same table and enter "ready for > cleaning" state at the same time, only one "cleaning" will remove obsolete > files, the other entry will remain in the queue in "ready for cleaning" state. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-25359: -- Labels: pull-request-available (was: ) > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Sub-task >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?focusedWorklogId=625967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-625967 ] ASF GitHub Bot logged work on HIVE-25359: - Author: ASF GitHub Bot Created on: 21/Jul/21 07:51 Start Date: 21/Jul/21 07:51 Worklog Time Spent: 10m Work Description: klcopp opened a new pull request #2507: URL: https://github.com/apache/hive/pull/2507 See HIVE-25359. Testing: none -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 625967) Remaining Estimate: 0h Time Spent: 10m > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Sub-task >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)
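The migration path this ticket describes - deprecating the bare-argument method and routing all parameters through a request object - is a common pattern for evolving RPC/metastore APIs without breaking existing callers, since new optional fields can be added to the request struct later without changing any method signature. A minimal, hypothetical sketch (the names follow the ticket, but the bodies are illustrative stubs, not Hive's actual implementation):

```java
public class FindNextCompactSketch {
    // Request object: new optional fields can be added later without touching signatures.
    static class FindNextCompactRequest {
        String workerId;
        String workerVersion; // stays null for legacy callers

        FindNextCompactRequest withWorkerId(String id) { this.workerId = id; return this; }
        FindNextCompactRequest withWorkerVersion(String v) { this.workerVersion = v; return this; }
    }

    // Old entry point kept for backwards compatibility; delegates to the new one.
    @Deprecated
    static String findNextCompact(String workerId) {
        return findNextCompact2(new FindNextCompactRequest().withWorkerId(workerId));
    }

    // New entry point: all parameters travel inside the request object.
    static String findNextCompact2(FindNextCompactRequest rqst) {
        // Stand-in for the real queue lookup; just echoes what would be recorded in the db.
        return rqst.workerId + " / " + rqst.workerVersion;
    }

    public static void main(String[] args) {
        System.out.println(findNextCompact("worker-1"));   // legacy path, version stays null
        System.out.println(findNextCompact2(
                new FindNextCompactRequest().withWorkerId("worker-1").withWorkerVersion("4.0.0")));
    }
}
```

Note that the legacy path leaves workerVersion null, which is exactly why the review on this PR asks whether the UPDATE on CQ_WORKER_VERSION is prepared for null values.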
[jira] [Updated] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage updated HIVE-25359: - Parent: HIVE-24824 Issue Type: Sub-task (was: Bug) > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Sub-task >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (HIVE-25359) Changes to metastore API in HIVE-24880 are not backwards compatible
[ https://issues.apache.org/jira/browse/HIVE-25359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Coppage reassigned HIVE-25359: > Changes to metastore API in HIVE-24880 are not backwards compatible > --- > > Key: HIVE-25359 > URL: https://issues.apache.org/jira/browse/HIVE-25359 > Project: Hive > Issue Type: Bug >Reporter: Karen Coppage >Assignee: Karen Coppage >Priority: Major > > With HIVE-24880 find_next_compact(String workerId) was changed to > find_next_compact(String workerId, String workerVersion). This isn't > backwards compatible and could break other components > This commit reverts that change, deprecates find_next_compact, adds a new > method: find_next_compact2(FindNextCompactRequest rqst) where > FindNextCompactRequest has fields workerId and workerVersion, and makes Hive > use find_next_compact2. -- This message was sent by Atlassian Jira (v8.3.4#803005)