[jira] [Work logged] (HIVE-25791) Improve SFS exception messages

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?focusedWorklogId=696424&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696424
 ]

ASF GitHub Bot logged work on HIVE-25791:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:49
Start Date: 15/Dec/21 07:49
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #2859:
URL: https://github.com/apache/hive/pull/2859


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 696424)
Time Spent: 20m  (was: 10m)

> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Especially for cases when the path is already known to be invalid, for example: 
> {code}sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#{code}
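A minimal sketch of the idea behind the improvement, assuming the goal is to fail early with a message that names the offending path (this is not Hive's actual SFS code; the parsing below is a simplification of the `sfs+file://...#SINGLEFILE#` form shown above):

```java
public class SfsPathCheck {
  // Returns an error message for an unusable path, or null when it looks valid.
  // The "sfs+" prefix and trailing "#SINGLEFILE#" marker follow the example
  // from the issue; real SFS URI handling is more involved.
  public static String validate(String sfsUri) {
    String s = sfsUri.replaceFirst("^sfs\\+", "");
    int hash = s.indexOf('#');
    if (hash >= 0) {
      s = s.substring(0, hash); // drop the #SINGLEFILE# marker
    }
    if (!s.startsWith("file://")) {
      return "SFS: unsupported scheme in " + sfsUri;
    }
    java.io.File f = new java.io.File(s.substring("file://".length()));
    if (!f.exists()) {
      // The message names both the resolved path and the original URI.
      return "SFS: path does not exist: " + f + " (from " + sfsUri + ")";
    }
    return null; // path exists; defer further checks to the filesystem
  }
}
```

Checking existence eagerly lets the error point at the exact path instead of a generic failure surfacing much later in query execution.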



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-25791) Improve SFS exception messages

2021-12-14 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-25791.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

merged into master. Thank you [~kkasa] for reviewing the changes!

> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Especially for cases when the path is already known to be invalid, for example: 
> {code}sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#{code}





[jira] [Updated] (HIVE-24893) Download data from Thriftserver through JDBC

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24893:
--
Labels: pull-request-available  (was: )

> Download data from Thriftserver through JDBC
> 
>
> Key: HIVE-24893
> URL: https://issues.apache.org/jira/browse/HIVE-24893
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2, JDBC
>Affects Versions: 4.0.0
>Reporter: Yuming Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It is very useful to support downloading large amounts of data (such as more 
> than 50GB) through JDBC.
> Snowflake has similar support:
> https://docs.snowflake.com/en/user-guide/jdbc-using.html#label-jdbc-download-from-stage-to-stream
> https://github.com/snowflakedb/snowflake-jdbc/blob/95a7d8a03316093430dc3960df6635643208b6fd/src/main/java/net/snowflake/client/jdbc/SnowflakeConnectionV1.java#L886





[jira] [Work logged] (HIVE-24893) Download data from Thriftserver through JDBC

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24893?focusedWorklogId=696409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696409
 ]

ASF GitHub Bot logged work on HIVE-24893:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:22
Start Date: 15/Dec/21 07:22
Worklog Time Spent: 10m 
  Work Description: wangyum opened a new pull request #2878:
URL: https://github.com/apache/hive/pull/2878


   ### What changes were proposed in this pull request?
   
   Add `UploadData` and `DownloadData` to TCLIService.thrift.
   
   ### Why are the changes needed?
   
   It is very useful to support downloading large amounts of data (such as more 
than 50GB) through JDBC.
   
   Snowflake has similar support:
   
https://docs.snowflake.com/en/user-guide/jdbc-using.html#label-jdbc-download-from-stage-to-stream
   
https://github.com/snowflakedb/snowflake-jdbc/blob/95a7d8a03316093430dc3960df6635643208b6fd/src/main/java/net/snowflake/client/jdbc/SnowflakeConnectionV1.java#L886
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   // TODO




Issue Time Tracking
---

Worklog Id: (was: 696409)
Remaining Estimate: 0h
Time Spent: 10m

> Download data from Thriftserver through JDBC
> 
>
> Key: HIVE-24893
> URL: https://issues.apache.org/jira/browse/HIVE-24893
> Project: Hive
>  Issue Type: New Feature
>  Components: HiveServer2, JDBC
>Affects Versions: 4.0.0
>Reporter: Yuming Wang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> It is very useful to support downloading large amounts of data (such as more 
> than 50GB) through JDBC.
> Snowflake has similar support:
> https://docs.snowflake.com/en/user-guide/jdbc-using.html#label-jdbc-download-from-stage-to-stream
> https://github.com/snowflakedb/snowflake-jdbc/blob/95a7d8a03316093430dc3960df6635643208b6fd/src/main/java/net/snowflake/client/jdbc/SnowflakeConnectionV1.java#L886





[jira] [Work logged] (HIVE-25805) Wrong result when rebuilding MV with count(col) incrementally

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25805?focusedWorklogId=696408&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696408
 ]

ASF GitHub Bot logged work on HIVE-25805:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:22
Start Date: 15/Dec/21 07:22
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2872:
URL: https://github.com/apache/hive/pull/2872#discussion_r769317236



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/views/HiveAggregateInsertDeleteIncrementalRewritingRule.java
##
@@ -139,7 +139,14 @@ protected IncrementalComputePlanWithDeletedRows 
createJoinRightInput(RelOptRuleC
   switch (aggregateCall.getAggregation().getKind()) {
 case COUNT:
   aggFunction = SqlStdOperatorTable.SUM;
-  argument = relBuilder.literal(1);
+
+  // count(*)
+  if (aggregateCall.getArgList().isEmpty()) {
+argument = relBuilder.literal(1);
+  } else {
+// count(col)
+argument = genArgumentForCountColumn(relBuilder, rexBuilder, 
aggInput, aggregateCall);

Review comment:
   notes: I think you could access `rexBuilder` from `relBuilder`
   
   I think you could also push the enclosing `if` into this function; or inline 
the whole function - but right now you have some "count argument" related logic 
here and there too






Issue Time Tracking
---

Worklog Id: (was: 696408)
Time Spent: 20m  (was: 10m)

> Wrong result when rebuilding MV with count(col) incrementally
> -
>
> Key: HIVE-25805
> URL: https://issues.apache.org/jira/browse/HIVE-25805
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java}
> create table t1(a char(15), b int) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into t1(a, b) values ('old', 1);
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(t1.b), count(*) from t1 group by t1.a;
> delete from t1 where b = 1;
> insert into t1(a,b) values
> ('new', null);
> alter materialized view mat1 rebuild;
> select * from mat1;
> {code}
> returns
> {code:java}
> new   1   1
> {code}
> but, should be
> {code:java}
> new   0   1
> {code}
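The wrong result above can be illustrated in plain Java (this is not Hive code): an incremental rebuild rewrites `count(...)` as a SUM over per-row contributions. For `count(*)` every inserted row contributes 1, but for `count(col)` a NULL column value must contribute 0; using the constant 1 for both yields `1` instead of `0` for the NULL insert:

```java
import java.util.Arrays;
import java.util.List;

public class CountColIncrement {
  // Delta a batch of inserted values adds to count(*): one per row.
  public static long countStarDelta(List<Integer> inserted) {
    return inserted.size();
  }

  // Delta the same batch adds to count(col): NULLs are not counted.
  public static long countColDelta(List<Integer> inserted) {
    return inserted.stream().filter(v -> v != null).count();
  }
}
```

For the `insert into t1(a,b) values ('new', null)` step, `countStarDelta` of the batch is 1 while `countColDelta` is 0, matching the expected `new 0 1` row.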





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=696405&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696405
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:16
Start Date: 15/Dec/21 07:16
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r769314898



##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/optimizer/calcite/TestCBOReCompilation.java
##
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.optimizer.calcite;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.conf.HiveConf.ConfVars;
+import org.apache.hadoop.hive.ql.DriverFactory;
+import org.apache.hadoop.hive.ql.IDriver;
+import org.apache.hadoop.hive.ql.processors.CommandProcessorException;
+import org.apache.hadoop.hive.ql.session.SessionState;
+import org.apache.hive.testutils.HiveTestEnvSetup;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.ClassRule;
+import org.junit.Test;
+
+public class TestCBOReCompilation {
+
+  @ClassRule
+  public static HiveTestEnvSetup env_setup = new HiveTestEnvSetup();
+
+  @BeforeClass
+  public static void beforeClass() throws Exception {
+try (IDriver driver = createDriver()) {
+  dropTables(driver);
+  String[] cmds = {
+  // @formatter:off
+  "create table aa1 ( stf_id string)",
+  "create table bb1 ( stf_id string)",
+  "create table cc1 ( stf_id string)",
+  "create table ff1 ( x string)"
+  // @formatter:on
+  };
+  for (String cmd : cmds) {
+driver.run(cmd);
+  }
+}
+  }
+
+  @AfterClass
+  public static void afterClass() throws Exception {
+try (IDriver driver = createDriver()) {
+  dropTables(driver);
+}
+  }
+
+  public static void dropTables(IDriver driver) throws Exception {
+String[] tables = new String[] {"aa1", "bb1", "cc1", "ff1" };
+for (String t : tables) {
+  driver.run("drop table if exists " + t);
+}
+  }
+
+  @Test
+  public void testReExecutedOnError() throws Exception {
+try (IDriver driver = createDriver("ALWAYS")) {
+  String query = "explain from ff1 as a join cc1 as b " +
+  "insert overwrite table aa1 select   stf_id GROUP BY b.stf_id " +
+  "insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id";
+  driver.run(query);
+}
+  }
+
+  @Test
+  public void testFailOnError() throws Exception {
+try (IDriver driver = createDriver("TEST")) {
+  String query = "explain from ff1 as a join cc1 as b " +
+  "insert overwrite table aa1 select   stf_id GROUP BY b.stf_id " +
+  "insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id";
+  Assert.assertThrows("Plan not optimized by CBO", 
CommandProcessorException.class, () -> driver.run(query));

Review comment:
   Found other tests created by @zabetak, so no need to keep these






Issue Time Tracking
---

Worklog Id: (was: 696405)
Time Spent: 3.5h  (was: 3h 20m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> 

[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=696403&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696403
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:15
Start Date: 15/Dec/21 07:15
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r769314437



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/IReExecutionPlugin.java
##
@@ -42,24 +42,72 @@
   /**
* Called before executing the query.
*/
-  void beforeExecute(int executionIndex, boolean explainReOptimization);
+  default void beforeExecute(int executionIndex, boolean 
explainReOptimization) {
+// default noop
+  }
 
   /**
* The query has failed; does this plugin advise re-executing it?
*/
-  boolean shouldReExecute(int executionNum);
+  default boolean shouldReExecute(int executionNum) {
+// default no
+return false;
+  }
 
   /**
-   * The plugin should prepare for the re-compilaton of the query.
+   * The plugin should prepare for the re-compilation of the query.
*/
-  void prepareToReExecute();
+  default void prepareToReExecute() {
+// default noop
+  }
 
   /**
-   * The query have failed; and have been recompiled - does this plugin 
advises to re-execute it again?
+   * The query has failed and has been recompiled; does this plugin advise
+   * re-executing it?
*/
-  boolean shouldReExecute(int executionNum, PlanMapper oldPlanMapper, 
PlanMapper newPlanMapper);
+  default boolean shouldReExecute(int executionNum, PlanMapper oldPlanMapper, 
PlanMapper newPlanMapper) {

Review comment:
   Keeping as discussed
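The pattern the diff introduces can be sketched in isolation (the interface is trimmed down here; the real `IReExecutionPlugin` has more hooks): default no-op bodies mean a plugin implementation only overrides the hooks it actually needs.

```java
public class PluginDefaults {
  interface ReExecutionPlugin {
    default void beforeExecute(int executionIndex) { /* no-op by default */ }
    default boolean shouldReExecute(int executionNum) { return false; }
  }

  // Overrides only the decision hook; beforeExecute stays the inherited no-op.
  static class RetryOncePlugin implements ReExecutionPlugin {
    @Override
    public boolean shouldReExecute(int executionNum) {
      return executionNum < 1; // allow a single re-execution
    }
  }

  public static boolean decide(int executionNum) {
    ReExecutionPlugin p = new RetryOncePlugin();
    p.beforeExecute(executionNum); // inherited default, does nothing
    return p.shouldReExecute(executionNum);
  }
}
```

This is why the diff can add new hooks without breaking existing plugin implementations.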






Issue Time Tracking
---

Worklog Id: (was: 696403)
Time Spent: 3h 10m  (was: 3h)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=696404&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696404
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:15
Start Date: 15/Dec/21 07:15
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r769314666



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecutionCBOPlugin.java
##
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.ql.reexec;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.Driver;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHook;
+import org.apache.hadoop.hive.ql.hooks.QueryLifeTimeHookContext;
+import org.apache.hadoop.hive.ql.parse.CBOException;
+import org.apache.hadoop.hive.ql.parse.SemanticException;
+
+/**
+ * Re-compiles the query without CBO
+ */
+public class ReExecutionCBOPlugin implements IReExecutionPlugin {
+
+  private Driver driver;
+  private boolean retryPossible = false;
+  private CBOFallbackStrategy fallbackStrategy;
+
+  class LocalHook implements QueryLifeTimeHook {
+@Override
+public void beforeCompile(QueryLifeTimeHookContext ctx) {
+  // noop
+}
+
+@Override
+public void afterCompile(QueryLifeTimeHookContext ctx, boolean hasError) {
+  if (hasError) {
+Throwable throwable = ctx.getHookContext().getException();
+if (throwable != null) {
+  if (throwable instanceof CBOException) {
+// Determine if we should re-throw the exception OR if we retry 
planning with non-CBO.
+if (fallbackStrategy.isFatal((CBOException) throwable)) {
+  Throwable cause = throwable.getCause();
+  if (cause instanceof RuntimeException || cause instanceof 
SemanticException) {
+// These types of exceptions do not need to be wrapped
+retryPossible = false;
+return;
+  }
+  // Wrap all other errors (Should only hit in tests)
+  throw new RuntimeException(cause);
+} else {
+  // Only if the exception is a CBOException then we can retry
+  retryPossible = true;

Review comment:
   Done
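A simplified illustration of the decision in the hook under review (not the actual `ReExecutionCBOPlugin`; the class names below stand in for Hive's): retry planning without CBO only when the failure is a CBO error that the configured fallback strategy does not consider fatal.

```java
public class CboFallback {
  // Stand-in for Hive's CBOException wrapper around planning failures.
  public static class CBOException extends RuntimeException {
    public CBOException(Throwable cause) { super(cause); }
  }

  // true = re-plan without CBO; false = surface the error to the caller.
  public static boolean retryPossible(Throwable t, boolean fatalPerStrategy) {
    if (!(t instanceof CBOException)) {
      return false; // not a CBO planning failure, nothing to fall back from
    }
    return !fatalPerStrategy; // fatal per strategy: rethrow instead of retrying
  }
}
```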






Issue Time Tracking
---

Worklog Id: (was: 696404)
Time Spent: 3h 20m  (was: 3h 10m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=696402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696402
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:14
Start Date: 15/Dec/21 07:14
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r769314240



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecDriver.java
##
@@ -167,20 +201,25 @@ public CommandProcessorResponse run() throws 
CommandProcessorException {
   }
 
   PlanMapper oldPlanMapper = coreDriver.getPlanMapper();
-  afterExecute(oldPlanMapper, cpr != null);
+  final boolean success = cpr != null;
+  plugins.forEach(p -> p.afterExecute(oldPlanMapper, success));
+
+  // If the execution was successful return the result
+  if (success) {

Review comment:
   Reverted this change






Issue Time Tracking
---

Worklog Id: (was: 696402)
Time Spent: 3h  (was: 2h 50m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=696401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696401
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 07:14
Start Date: 15/Dec/21 07:14
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r769313969



##
File path: ql/src/java/org/apache/hadoop/hive/ql/HookRunner.java
##
@@ -121,19 +121,27 @@ void runBeforeCompileHook(String command) {
   }
 
   /**
-  * Dispatches {@link QueryLifeTimeHook#afterCompile(QueryLifeTimeHookContext, 
boolean)}.
-  *
-  * @param command the Hive command that is being run
-  * @param compileError true if there was an error while compiling the 
command, false otherwise
-  */
-  void runAfterCompilationHook(String command, boolean compileError) {
+   * Dispatches {@link 
QueryLifeTimeHook#afterCompile(QueryLifeTimeHookContext, boolean)}.
+   *
+   * @param driverContext the DriverContext used for generating the HookContext
+   * @param analyzerContext the SemanticAnalyzer context for this query
+   * @param compileException the exception if one was thrown during the 
compilation
+   */
+  void runAfterCompilationHook(DriverContext driverContext, Context 
analyzerContext, Throwable compileException) {

Review comment:
   Keeping as it is based on our discussion






Issue Time Tracking
---

Worklog Id: (was: 696401)
Time Spent: 2h 50m  (was: 2h 40m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Resolved] (HIVE-25744) Support backward compatibility of thrift struct CreationMetadata

2021-12-14 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa resolved HIVE-25744.
---
Resolution: Fixed

Pushed to master. Thanks [~kgyrtkirk] for review.

> Support backward compatibility of thrift struct CreationMetadata
> 
>
> Key: HIVE-25744
> URL: https://issues.apache.org/jira/browse/HIVE-25744
> Project: Hive
>  Issue Type: Task
>  Components: Materialized views, Thrift API
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-25656 introduced a breaking change in the HiveServer2 <-> Metastore 
> thrift api:
> Old
> {code}
> struct CreationMetadata {
> 1: required string catName
> 2: required string dbName,
> 3: required string tblName,
> 4: required set tablesUsed,
> 5: optional string validTxnList,
> 6: optional i64 materializationTime
> }
> {code}
> New
> {code}
> struct CreationMetadata {
> 1: required string catName
> 2: required string dbName,
> 3: required string tblName,
> 4: required set tablesUsed,
> 5: optional string validTxnList,
> 6: optional i64 materializationTime
> }
> {code}
> 4th field type changed





[jira] [Work logged] (HIVE-25744) Support backward compatibility of thrift struct CreationMetadata

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25744?focusedWorklogId=696360&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696360
 ]

ASF GitHub Bot logged work on HIVE-25744:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 05:27
Start Date: 15/Dec/21 05:27
Worklog Time Spent: 10m 
  Work Description: kasakrisz merged pull request #2821:
URL: https://github.com/apache/hive/pull/2821


   




Issue Time Tracking
---

Worklog Id: (was: 696360)
Time Spent: 20m  (was: 10m)

> Support backward compatibility of thrift struct CreationMetadata
> 
>
> Key: HIVE-25744
> URL: https://issues.apache.org/jira/browse/HIVE-25744
> Project: Hive
>  Issue Type: Task
>  Components: Materialized views, Thrift API
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-25656 introduced a breaking change in the HiveServer2 <-> Metastore 
> thrift api:
> Old
> {code}
> struct CreationMetadata {
> 1: required string catName
> 2: required string dbName,
> 3: required string tblName,
> 4: required set tablesUsed,
> 5: optional string validTxnList,
> 6: optional i64 materializationTime
> }
> {code}
> New
> {code}
> struct CreationMetadata {
> 1: required string catName
> 2: required string dbName,
> 3: required string tblName,
> 4: required set tablesUsed,
> 5: optional string validTxnList,
> 6: optional i64 materializationTime
> }
> {code}
> 4th field type changed





[jira] [Updated] (HIVE-25783) Enforce ASF headers on Metastore

2021-12-14 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng updated HIVE-25783:
---
Summary: Enforce ASF headers on Metastore  (was: Provide rat check to the 
CI)

> Enforce ASF headers on Metastore
> 
>
> Key: HIVE-25783
> URL: https://issues.apache.org/jira/browse/HIVE-25783
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>
> This Jira investigates whether we can add a RAT check to the CI, to make 
> sure that newly added source files contain the ASF license information. 





[jira] [Updated] (HIVE-25615) Hive on tez will generate at least one MapContainer per 0 length file

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25615:
--
Labels: pull-request-available  (was: )

> Hive on tez will generate at least one MapContainer per 0 length file
> -
>
> Key: HIVE-25615
> URL: https://issues.apache.org/jira/browse/HIVE-25615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor, Tez
>Affects Versions: 3.1.2
> Environment: hive-3.1.2
> tez-0.10.1
>Reporter: JackYan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When Tez reads a table with many partitions and those partitions contain only 
> zero-length files, ColumnarSplitSizeEstimator returns Integer.MAX_VALUE as the 
> length of every zero-length file. TezSplitGrouper then treats those files as 
> big files and generates at least one MapContainer per zero-length file to 
> handle them. This is incorrect and even wasteful.
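The guard the report implies can be sketched as follows (this is not the actual ORC/Tez estimator; `estimate` and its parameters are hypothetical names): report 0 for an empty file instead of the pessimistic Integer.MAX_VALUE, so the split grouper can coalesce empty partitions rather than giving each one its own map container.

```java
public class SplitSizeGuard {
  public static long estimate(long rawFileLength, long projectedColumnBytes) {
    if (rawFileLength == 0) {
      return 0; // nothing to read; let the grouper merge it away
    }
    // With no column statistics, fall back to a pessimistic estimate
    // (this mirrors the Integer.MAX_VALUE behaviour described above).
    return projectedColumnBytes > 0 ? projectedColumnBytes : Integer.MAX_VALUE;
  }
}
```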





[jira] [Work logged] (HIVE-25615) Hive on tez will generate at least one MapContainer per 0 length file

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25615?focusedWorklogId=696251&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696251
 ]

ASF GitHub Bot logged work on HIVE-25615:
-

Author: ASF GitHub Bot
Created on: 15/Dec/21 00:12
Start Date: 15/Dec/21 00:12
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2723:
URL: https://github.com/apache/hive/pull/2723#issuecomment-994158113


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 696251)
Remaining Estimate: 0h
Time Spent: 10m

> Hive on tez will generate at least one MapContainer per 0 length file
> -
>
> Key: HIVE-25615
> URL: https://issues.apache.org/jira/browse/HIVE-25615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning, Query Processor, Tez
>Affects Versions: 3.1.2
> Environment: hive-3.1.2
> tez-0.10.1
>Reporter: JackYan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When Tez reads a table with many partitions and those partitions contain only 
> 0-length files, ColumnarSplitSizeEstimator will return Integer.MAX_VALUE 
> bytes as the length of every 0-length file. TezSplitGrouper will then treat 
> those files as big files and generate at least one MapContainer per 0-length 
> file to handle them. This is incorrect and wasteful.





[jira] [Work logged] (HIVE-25800) loadDynamicPartitions in Hive.java should not load all partitions of a managed table

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25800?focusedWorklogId=696055&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696055
 ]

ASF GitHub Bot logged work on HIVE-25800:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 18:57
Start Date: 14/Dec/21 18:57
Worklog Time Spent: 10m 
  Work Description: sourabh912 commented on pull request #2868:
URL: https://github.com/apache/hive/pull/2868#issuecomment-993883393


   @kgyrtkirk : Please review and provide your feedback.  




Issue Time Tracking
---

Worklog Id: (was: 696055)
Time Spent: 0.5h  (was: 20m)

> loadDynamicPartitions in Hive.java should not load all partitions of a 
> managed table 
> -
>
> Key: HIVE-25800
> URL: https://issues.apache.org/jira/browse/HIVE-25800
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Sourabh Goyal
>Assignee: Sourabh Goyal
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-20661 added an improvement to the loadDynamicPartitions() API in 
> Hive.java so that partitions are not added to HMS one by one. As part of that 
> improvement, the following code was introduced: 
> {code:java}
> // fetch all the partitions matching the part spec using the partition iterable
> // this way the maximum batch size configuration parameter is considered
> PartitionIterable partitionIterable = new PartitionIterable(Hive.get(), tbl, partSpec,
>     conf.getInt(MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX.getVarname(), 300));
> Iterator<Partition> iterator = partitionIterable.iterator();
> // Match valid partition path to partitions
> while (iterator.hasNext()) {
>   Partition partition = iterator.next();
>   partitionDetailsMap.entrySet().stream()
>       .filter(entry -> entry.getValue().fullSpec.equals(partition.getSpec()))
>       .findAny().ifPresent(entry -> {
>         entry.getValue().partition = partition;
>         entry.getValue().hasOldPartition = true;
>       });
> }
> {code}
> The above code fetches all the existing partitions of the table from HMS and 
> compares them with the dynamic partitions list to decide which partitions are 
> old and which need to be added to HMS (in batches). The call to fetch all 
> partitions has introduced a performance regression for tables with a large 
> number of partitions (of the order of 100K). 
>  
> This is fixed for external tables in 
> https://issues.apache.org/jira/browse/HIVE-25178. However, for ACID tables 
> there is an open Jira (HIVE-25187). Until we have an appropriate fix in 
> HIVE-25187, we can apply the following: skip fetching all partitions and 
> instead, in the thread pool which loads each partition individually, call 
> get_partition() to check whether the partition already exists in HMS. 
> This introduces an additional getPartition() call for every partition to be 
> loaded dynamically, but removes fetching all existing partitions of the 
> table. I believe this is fine: for tables with a small number of existing 
> partitions in HMS, getPartitions() won't add much overhead, while for tables 
> with a large number of existing partitions it avoids fetching all of them 
> from HMS. 
> cc - [~lpinter] [~ngangam] 
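The proposed shape can be sketched as below. All names here are hypothetical (this is not the real Hive.java or metastore client API): each loader task performs one point lookup for its own partition spec instead of the whole table's partition list being fetched up front.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class DynamicPartitionLoadSketch {

    // Stand-in for a single metastore get_partition() existence check.
    interface PartitionLookup {
        boolean exists(Map<String, String> partSpec);
    }

    // For each dynamic partition spec, ask the metastore whether it already
    // exists: one cheap lookup per loader task, instead of one scan over all
    // existing partitions of the table.
    static Map<Map<String, String>, Boolean> classify(
            List<Map<String, String>> specsToLoad, PartitionLookup metastore) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Map<Map<String, String>, Future<Boolean>> futures = new LinkedHashMap<>();
            for (Map<String, String> spec : specsToLoad) {
                futures.put(spec, pool.submit(() -> metastore.exists(spec)));
            }
            Map<Map<String, String>, Boolean> result = new LinkedHashMap<>();
            futures.forEach((spec, f) -> {
                try {
                    result.put(spec, f.get());
                } catch (InterruptedException | ExecutionException e) {
                    throw new RuntimeException(e);
                }
            });
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Set<Map<String, String>> existing = Set.of(Map.of("ds", "2021-12-14"));
        System.out.println(classify(
                List.of(Map.of("ds", "2021-12-14"), Map.of("ds", "2021-12-15")),
                existing::contains));
    }
}
```

The trade-off matches the description: n point lookups for n partitions being loaded, regardless of how many partitions the table already has.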





[jira] [Updated] (HIVE-25809) Implement URI Mapping for KuduStorageHandler in Hive

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25809:
--
Labels: pull-request-available  (was: )

> Implement URI Mapping for KuduStorageHandler in Hive 
> -
>
> Key: HIVE-25809
> URL: https://issues.apache.org/jira/browse/HIVE-25809
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Security
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, there is no storage URI mapping for KuduStorageHandler based on 
> the feature HIVE-24705. The API getURIForAuth() needs to be implemented in 
> KuduStorageHandler.





[jira] [Work logged] (HIVE-25809) Implement URI Mapping for KuduStorageHandler in Hive

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25809?focusedWorklogId=696036&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696036
 ]

ASF GitHub Bot logged work on HIVE-25809:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 18:36
Start Date: 14/Dec/21 18:36
Worklog Time Spent: 10m 
  Work Description: saihemanth-cloudera opened a new pull request #2877:
URL: https://github.com/apache/hive/pull/2877


   
   
   ### What changes were proposed in this pull request?
   Implemented getURIForAuth() API in the kudu storage handler
   
   
   
   ### Why are the changes needed?
   To prevent a user from circumventing the authorization introduced in HIVE-24705
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   Local machine, remote cluster
   
   




Issue Time Tracking
---

Worklog Id: (was: 696036)
Remaining Estimate: 0h
Time Spent: 10m

> Implement URI Mapping for KuduStorageHandler in Hive 
> -
>
> Key: HIVE-25809
> URL: https://issues.apache.org/jira/browse/HIVE-25809
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Security
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, there is no storage URI mapping for KuduStorageHandler based on 
> the feature HIVE-24705. The API getURIForAuth() needs to be implemented in 
> KuduStorageHandler.





[jira] [Work logged] (HIVE-25795) [CVE-2021-44228] Update log4j2 version to 2.15.0

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25795?focusedWorklogId=696020&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696020
 ]

ASF GitHub Bot logged work on HIVE-25795:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 18:19
Start Date: 14/Dec/21 18:19
Worklog Time Spent: 10m 
  Work Description: kevinverhoeven commented on pull request #2863:
URL: https://github.com/apache/hive/pull/2863#issuecomment-993853334


   @guptanikhil007 thank you for this change. Are you planning a fix for 2.x 
and 3.x? This fix is applied to 4.x, which has not been released.




Issue Time Tracking
---

Worklog Id: (was: 696020)
Time Spent: 4h 20m  (was: 4h 10m)

> [CVE-2021-44228] Update log4j2 version to 2.15.0
> 
>
> Key: HIVE-25795
> URL: https://issues.apache.org/jira/browse/HIVE-25795
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> [Worst Apache Log4j RCE Zero day Dropped on Internet - Cyber 
> Kendra|https://www.cyberkendra.com/2021/12/worst-log4j-rce-zeroday-dropped-on.html]
> Vulnerability:
> https://github.com/apache/logging-log4j2/commit/7fe72d6





[jira] [Assigned] (HIVE-25809) Implement URI Mapping for KuduStorageHandler in Hive

2021-12-14 Thread Sai Hemanth Gantasala (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sai Hemanth Gantasala reassigned HIVE-25809:



> Implement URI Mapping for KuduStorageHandler in Hive 
> -
>
> Key: HIVE-25809
> URL: https://issues.apache.org/jira/browse/HIVE-25809
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Security
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>
> Currently, there is no storage URI mapping for KuduStorageHandler based on 
> the feature HIVE-24705. The API getURIForAuth() needs to be implemented in 
> KuduStorageHandler.





[jira] [Work logged] (HIVE-25795) [CVE-2021-44228] Update log4j2 version to 2.15.0

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25795?focusedWorklogId=696007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-696007
 ]

ASF GitHub Bot logged work on HIVE-25795:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 18:06
Start Date: 14/Dec/21 18:06
Worklog Time Spent: 10m 
  Work Description: nrg4878 commented on pull request #2876:
URL: https://github.com/apache/hive/pull/2876#issuecomment-993842290


   @yongzhi Could you please review? Thank you




Issue Time Tracking
---

Worklog Id: (was: 696007)
Time Spent: 4h 10m  (was: 4h)

> [CVE-2021-44228] Update log4j2 version to 2.15.0
> 
>
> Key: HIVE-25795
> URL: https://issues.apache.org/jira/browse/HIVE-25795
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> [Worst Apache Log4j RCE Zero day Dropped on Internet - Cyber 
> Kendra|https://www.cyberkendra.com/2021/12/worst-log4j-rce-zeroday-dropped-on.html]
> Vulnerability:
> https://github.com/apache/logging-log4j2/commit/7fe72d6





[jira] [Work logged] (HIVE-25808) Analyse table does not fail for non existing partitions

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25808?focusedWorklogId=695905&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695905
 ]

ASF GitHub Bot logged work on HIVE-25808:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 15:48
Start Date: 14/Dec/21 15:48
Worklog Time Spent: 10m 
  Work Description: maheshk114 opened a new pull request #2875:
URL: https://github.com/apache/hive/pull/2875


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 695905)
Remaining Estimate: 0h
Time Spent: 10m

> Analyse table does not fail for non existing partitions
> ---
>
> Key: HIVE-25808
> URL: https://issues.apache.org/jira/browse/HIVE-25808
> Project: Hive
>  Issue Type: Bug
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If all the partition column values are given in the analyze command, then the 
> query fails for a non-existing partition. But if not all the partition column 
> values are given, it does not fail.
> analyze table tbl partition *(fld1 = 2, fld2 = 3)* COMPUTE STATISTICS FOR 
> COLUMNS – this will fail with a SemanticException if the partition 
> corresponding to fld1 = 2, fld2 = 3 does not exist. But analyze table tbl 
> partition *(fld1 = 2)* COMPUTE STATISTICS FOR COLUMNS will not fail and will 
> compute stats for the whole table.
>  





[jira] [Updated] (HIVE-25808) Analyse table does not fail for non existing partitions

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25808:
--
Labels: pull-request-available  (was: )

> Analyse table does not fail for non existing partitions
> ---
>
> Key: HIVE-25808
> URL: https://issues.apache.org/jira/browse/HIVE-25808
> Project: Hive
>  Issue Type: Bug
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If all the partition column values are given in the analyze command, then the 
> query fails for a non-existing partition. But if not all the partition column 
> values are given, it does not fail.
> analyze table tbl partition *(fld1 = 2, fld2 = 3)* COMPUTE STATISTICS FOR 
> COLUMNS – this will fail with a SemanticException if the partition 
> corresponding to fld1 = 2, fld2 = 3 does not exist. But analyze table tbl 
> partition *(fld1 = 2)* COMPUTE STATISTICS FOR COLUMNS will not fail and will 
> compute stats for the whole table.
>  





[jira] [Assigned] (HIVE-25808) Analyse table does not fail for non existing partitions

2021-12-14 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-25808:
--

Assignee: mahesh kumar behera

> Analyse table does not fail for non existing partitions
> ---
>
> Key: HIVE-25808
> URL: https://issues.apache.org/jira/browse/HIVE-25808
> Project: Hive
>  Issue Type: Bug
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> If all the partition column values are given in the analyze command, then the 
> query fails for a non-existing partition. But if not all the partition column 
> values are given, it does not fail.
> analyze table tbl partition *(fld1 = 2, fld2 = 3)* COMPUTE STATISTICS FOR 
> COLUMNS – this will fail with a SemanticException if the partition 
> corresponding to fld1 = 2, fld2 = 3 does not exist. But analyze table tbl 
> partition *(fld1 = 2)* COMPUTE STATISTICS FOR COLUMNS will not fail and will 
> compute stats for the whole table.
>  





[jira] [Commented] (HIVE-25540) Enable batch update of column stats only for MySql and Postgres

2021-12-14 Thread mahesh kumar behera (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459241#comment-17459241
 ] 

mahesh kumar behera commented on HIVE-25540:


[~zabetak] 

The batch update uses direct SQL to optimize the number of backend database 
calls. Some of the SQL used is not supported by Oracle, so we need to add a 
check to go via DN (DataNucleus) if the backend DB is Oracle. Currently we 
have tested only on MySQL and Postgres. The batch update feature is not yet 
shipped.
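A minimal sketch of the kind of gate described, with a hypothetical helper name (this is not the actual MetaStoreDirectSql code): enable the direct-SQL batch path only on backends it has been tested on, and fall back to the ORM path everywhere else, including Oracle.

```java
import java.util.Locale;

public class BatchUpdateGate {

    // Hypothetical guard: true only for backends the direct-SQL batch
    // update of column stats has been tested on (MySQL and Postgres);
    // all other backends go through DataNucleus.
    static boolean useDirectSqlBatch(String dbProductName) {
        String p = dbProductName == null ? "" : dbProductName.toLowerCase(Locale.ROOT);
        return p.contains("mysql") || p.contains("postgres");
    }

    public static void main(String[] args) {
        System.out.println(useDirectSqlBatch("PostgreSQL")); // true
        System.out.println(useDirectSqlBatch("Oracle"));     // false
    }
}
```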

> Enable batch update of column stats only for MySql and Postgres 
> 
>
> Key: HIVE-25540
> URL: https://issues.apache.org/jira/browse/HIVE-25540
> Project: Hive
>  Issue Type: Sub-task
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The batch update of partition column stats using direct SQL is tested only 
> for MySQL and Postgres.





[jira] [Resolved] (HIVE-25778) Hive DB creation is failing when MANAGEDLOCATION is specified with existing location

2021-12-14 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera resolved HIVE-25778.

Resolution: Won't Fix

Closing this Jira as supporting this scenario may lead to data loss/corruption 
if the user is not very careful. 

> Hive DB creation is failing when MANAGEDLOCATION is specified with existing 
> location
> 
>
> Key: HIVE-25778
> URL: https://issues.apache.org/jira/browse/HIVE-25778
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As part of HIVE-23387 a check was added to restrict users from creating a 
> database with a managed table location if the location already exists. This 
> was not the case before. As this causes a backward compatibility issue, the 
> check needs to be removed.
>  
> {code:java}
> if (madeManagedDir) {
>   LOG.info("Created database path in managed directory " + dbMgdPath);
> } else {
>   throw new MetaException(
>       "Unable to create database managed directory " + dbMgdPath + ", failed to create database " + db.getName());
> }
> {code}
>  





[jira] [Work logged] (HIVE-25778) Hive DB creation is failing when MANAGEDLOCATION is specified with existing location

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25778?focusedWorklogId=695884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695884
 ]

ASF GitHub Bot logged work on HIVE-25778:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 15:21
Start Date: 14/Dec/21 15:21
Worklog Time Spent: 10m 
  Work Description: maheshk114 closed pull request #2846:
URL: https://github.com/apache/hive/pull/2846


   




Issue Time Tracking
---

Worklog Id: (was: 695884)
Time Spent: 0.5h  (was: 20m)

> Hive DB creation is failing when MANAGEDLOCATION is specified with existing 
> location
> 
>
> Key: HIVE-25778
> URL: https://issues.apache.org/jira/browse/HIVE-25778
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As part of HIVE-23387 a check was added to restrict users from creating a 
> database with a managed table location if the location already exists. This 
> was not the case before. As this causes a backward compatibility issue, the 
> check needs to be removed.
>  
> {code:java}
> if (madeManagedDir) {
>   LOG.info("Created database path in managed directory " + dbMgdPath);
> } else {
>   throw new MetaException(
>       "Unable to create database managed directory " + dbMgdPath + ", failed to create database " + db.getName());
> }
> {code}
>  





[jira] [Work logged] (HIVE-25778) Hive DB creation is failing when MANAGEDLOCATION is specified with existing location

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25778?focusedWorklogId=695883&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695883
 ]

ASF GitHub Bot logged work on HIVE-25778:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 15:20
Start Date: 14/Dec/21 15:20
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on pull request #2846:
URL: https://github.com/apache/hive/pull/2846#issuecomment-993650984


   @pgaref Thanks for the review. I am not committing this, as supporting this 
may lead to data loss/corruption if the user is not very careful. 




Issue Time Tracking
---

Worklog Id: (was: 695883)
Time Spent: 20m  (was: 10m)

> Hive DB creation is failing when MANAGEDLOCATION is specified with existing 
> location
> 
>
> Key: HIVE-25778
> URL: https://issues.apache.org/jira/browse/HIVE-25778
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2, Metastore
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As part of HIVE-23387 a check was added to restrict users from creating a 
> database with a managed table location if the location already exists. This 
> was not the case before. As this causes a backward compatibility issue, the 
> check needs to be removed.
>  
> {code:java}
> if (madeManagedDir) {
>   LOG.info("Created database path in managed directory " + dbMgdPath);
> } else {
>   throw new MetaException(
>       "Unable to create database managed directory " + dbMgdPath + ", failed to create database " + db.getName());
> }
> {code}
>  





[jira] [Commented] (HIVE-14261) Support set/unset partition parameters

2021-12-14 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-14261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459178#comment-17459178
 ] 

Stamatis Zampetakis commented on HIVE-14261:


[~xiepengjie] The motivation for the change, according to the example posted 
above, comes from other projects using HMS directly (not via HS2). 
Nevertheless, the syntax change seems to only affect HS2 thus I don't 
understand who is going to benefit from this in the end. Can you give some 
examples of which users/projects are going to use the new syntax and how?

> Support set/unset partition parameters
> --
>
> Key: HIVE-14261
> URL: https://issues.apache.org/jira/browse/HIVE-14261
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
>Priority: Major
> Attachments: HIVE-14261.01.patch
>
>






[jira] [Commented] (HIVE-25807) ok

2021-12-14 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-25807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459179#comment-17459179
 ] 

László Bodor commented on HIVE-25807:
-

[~Din1]: may I ask what is this jira for?


> ok
> --
>
> Key: HIVE-25807
> URL: https://issues.apache.org/jira/browse/HIVE-25807
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Pawar
>Assignee: Pravin Pawar
>Priority: Blocker
>






[jira] [Work logged] (HIVE-21172) DEFAULT keyword handling in MERGE UPDATE clause issues

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21172?focusedWorklogId=695795&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695795
 ]

ASF GitHub Bot logged work on HIVE-21172:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 13:59
Start Date: 14/Dec/21 13:59
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #2857:
URL: https://github.com/apache/hive/pull/2857#discussion_r768692003



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
##
@@ -1711,13 +1713,13 @@ public void 
testMajorCompactionAfterTwoMergeStatements() throws Exception {
 
 // Verify contents of bucket files.
 List expectedRsBucket0 = 
Arrays.asList("{\"writeid\":1,\"bucketid\":536870912,\"rowid\":3}\t4\tvalue_4",
-"{\"writeid\":2,\"bucketid\":536870912,\"rowid\":0}\t6\tvalue_6",
-"{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",
-"{\"writeid\":3,\"bucketid\":536870912,\"rowid\":0}\t8\tvalue_8",
-"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":0}\t5\tnewestvalue_5",
-"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":1}\t7\tnewestvalue_7",
-"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":2}\t1\tnewestvalue_1",
-
"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":3}\t2\tnewestvalue_2");
+"{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",

Review comment:
   iiuc this test was created to verify the order of records in the bucket 
files after major compaction.
   Based on the description of https://issues.apache.org/jira/browse/HIVE-25257,
   records should be ordered by originalTransactionId, bucketProperty, rowId.
   Unfortunately originalTransactionId cannot be queried, so I debugged the 
test, stopped the execution before this assert, and then dumped the orc file 
created on my local fs:
   ```
   java -jar orc-tools-1.6.5/orc-tools-1.6.5-uber.jar data ./itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-1639489721524_1338883398/warehouse/comp_and_merge_test/base_003_v014/bucket_0
   Processing data file itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-1639489721524_1338883398/warehouse/comp_and_merge_test/base_003_v014/bucket_0 [length: 808]
   {"operation":0,"originaltransaction":1,"bucket":536870912,"rowid":3,"currenttransaction":1,"row":{"id":4,"value":"value_4"}}
   {"operation":0,"originaltransaction":2,"bucket":536870913,"rowid":2,"currenttransaction":2,"row":{"id":3,"value":"newvalue_3"}}
   {"operation":0,"originaltransaction":2,"bucket":536870914,"rowid":0,"currenttransaction":2,"row":{"id":6,"value":"value_6"}}
   {"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":0,"currenttransaction":3,"row":{"id":1,"value":"newestvalue_1"}}
   {"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":1,"currenttransaction":3,"row":{"id":2,"value":"newestvalue_2"}}
   {"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":2,"currenttransaction":3,"row":{"id":5,"value":"newestvalue_5"}}
   {"operation":0,"originaltransaction":3,"bucket":536870913,"rowid":3,"currenttransaction":3,"row":{"id":7,"value":"newestvalue_7"}}
   {"operation":0,"originaltransaction":3,"bucket":536870914,"rowid":0,"currenttransaction":3,"row":{"id":8,"value":"value_8"}}
   ```
   Order seems to be valid.
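The manual inspection above can be automated. A small sketch (illustrative, using regex field extraction on the orc-tools JSON dump format shown above) that checks the rows are non-decreasing in (originaltransaction, bucket, rowid):

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AcidRowOrderCheck {

    // Extract the (originaltransaction, bucket, rowid) sort key from one
    // dumped JSON line.
    static long[] key(String jsonLine) {
        String[] fields = {"originaltransaction", "bucket", "rowid"};
        long[] k = new long[fields.length];
        for (int i = 0; i < fields.length; i++) {
            Matcher m = Pattern.compile("\"" + fields[i] + "\":(\\d+)").matcher(jsonLine);
            if (!m.find()) throw new IllegalArgumentException(jsonLine);
            k[i] = Long.parseLong(m.group(1));
        }
        return k;
    }

    // True if consecutive rows never decrease lexicographically in the key.
    static boolean isSorted(List<String> lines) {
        long[] prev = null;
        for (String line : lines) {
            long[] cur = key(line);
            if (prev != null && Arrays.compare(prev, cur) > 0) return false;
            prev = cur;
        }
        return true;
    }

    public static void main(String[] args) {
        // First rows of the dump above, reduced to the key fields.
        List<String> dump = List.of(
            "{\"originaltransaction\":1,\"bucket\":536870912,\"rowid\":3}",
            "{\"originaltransaction\":2,\"bucket\":536870913,\"rowid\":2}",
            "{\"originaltransaction\":2,\"bucket\":536870914,\"rowid\":0}",
            "{\"originaltransaction\":3,\"bucket\":536870913,\"rowid\":0}");
        System.out.println(isSorted(dump)); // prints true
    }
}
```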






Issue Time Tracking
---

Worklog Id: (was: 695795)
Time Spent: 40m  (was: 0.5h)

> DEFAULT keyword handling in MERGE UPDATE clause issues
> --
>
> Key: HIVE-21172
> URL: https://issues.apache.org/jira/browse/HIVE-21172
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> once HIVE-21159 lands, enable {{HiveConf.MERGE_SPLIT_UPDATE}} and run these 
> tests.
> TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_stats]
>  mvn test -Dtest=TestMiniLlapLocalCliDriver 
> -Dqfile=insert_into_default_keyword.q
> Merge is rewritten as a multi-insert. When Update clause has 

[jira] [Work logged] (HIVE-21172) DEFAULT keyword handling in MERGE UPDATE clause issues

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21172?focusedWorklogId=695748&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695748
 ]

ASF GitHub Bot logged work on HIVE-21172:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 13:16
Start Date: 14/Dec/21 13:16
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #2857:
URL: https://github.com/apache/hive/pull/2857#discussion_r768652584



##
File path: ql/src/test/results/clientpositive/llap/masking_acid_no_masking.q.out
##
@@ -54,8 +53,9 @@ POSTHOOK: Input: default@masking_acid_no_masking
 POSTHOOK: Input: default@nonacid_n0
 POSTHOOK: Output: default@masking_acid_no_masking
 POSTHOOK: Output: default@masking_acid_no_masking
-POSTHOOK: Output: default@masking_acid_no_masking
 POSTHOOK: Output: default@merge_tmp_table
 POSTHOOK: Lineage: masking_acid_no_masking.key SIMPLE 
[(nonacid_n0)s.FieldSchema(name:key, type:int, comment:null), ]
+POSTHOOK: Lineage: masking_acid_no_masking.key SIMPLE 
[(nonacid_n0)s.FieldSchema(name:key, type:int, comment:null), ]

Review comment:
   These lineages are generated by the MoveTask when inserting.
   
   By turning on `hive.merge.split.update` the update branch of merge statements 
is split into an insert and a delete branch.
   
   Originally this merge had only one insert branch, but now it has two to the 
same table and the same columns:
   - one for the insert branch
   - one for the update branch
   






Issue Time Tracking
---

Worklog Id: (was: 695748)
Time Spent: 0.5h  (was: 20m)

> DEFAULT keyword handling in MERGE UPDATE clause issues
> --
>
> Key: HIVE-21172
> URL: https://issues.apache.org/jira/browse/HIVE-21172
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL, Transactions
>Affects Versions: 4.0.0
>Reporter: Eugene Koifman
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> once HIVE-21159 lands, enable {{HiveConf.MERGE_SPLIT_UPDATE}} and run these 
> tests.
> TestMiniLlapLocalCliDriver.testCliDriver[sqlmerge_stats]
>  mvn test -Dtest=TestMiniLlapLocalCliDriver 
> -Dqfile=insert_into_default_keyword.q
> Merge is rewritten as a multi-insert. When the Update clause has DEFAULT, it's 
> not properly replaced with a value in the multi-insert - it's treated as a 
> literal
> {noformat}
> INSERT INTO `default`.`acidTable`-- update clause(insert part)
>  SELECT `t`.`key`, `DEFAULT`, `t`.`value`
>WHERE `t`.`key` = `s`.`key` AND `s`.`key` > 3 AND NOT(`s`.`key` < 3)
> {noformat}
> See {{LOG.info("Going to reparse <" + originalQuery + "> as \n<" + 
> rewrittenQueryStr.toString() + ">");}} in hive.log
> {{MergeSemanticAnalyzer.replaceDefaultKeywordForMerge()}} is only called in 
> {{handleInsert}} but not {{handleUpdate()}}. Why does the issue only show up with 
> {{MERGE_SPLIT_UPDATE}}?
> Once this is fixed, HiveConf.MERGE_SPLIT_UPDATE should be true by default.
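For illustration, a toy sketch of the substitution the issue asks for. This is not Hive's actual implementation (the real rewrite works on the AST, and the class and method names below are invented): it just shows a bare `DEFAULT` item in a rewritten select list being replaced by the corresponding column's default value instead of being left as a literal.

```java
import java.util.Arrays;
import java.util.List;

public class DefaultKeywordSketch {

    /**
     * Replaces each bare DEFAULT item in a comma-separated select list with the
     * default value of the column at the same position. Hypothetical helper for
     * illustration only; Hive rewrites the query at the AST level, not as a string.
     */
    static String replaceDefaultKeyword(String selectList, List<String> columnDefaults) {
        String[] items = selectList.split(",");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < items.length; i++) {
            String item = items[i].trim();
            if (item.equals("`DEFAULT`") || item.equalsIgnoreCase("DEFAULT")) {
                // Substitute the column's default constraint value.
                item = columnDefaults.get(i);
            }
            if (i > 0) {
                out.append(", ");
            }
            out.append(item);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Select list from the issue's example; assume the second column has DEFAULT 0.
        String rewritten = replaceDefaultKeyword("`t`.`key`, `DEFAULT`, `t`.`value`",
                Arrays.asList(null, "0", null));
        System.out.println(rewritten);  // `t`.`key`, 0, `t`.`value`
    }
}
```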



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25804) Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 hardening

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25804?focusedWorklogId=695723&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695723
 ]

ASF GitHub Bot logged work on HIVE-25804:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 12:55
Start Date: 14/Dec/21 12:55
Worklog Time Spent: 10m 
  Work Description: csjuhasz-c opened a new pull request #2874:
URL: https://github.com/apache/hive/pull/2874


   HIVE-25804: Update log4j2 version to 2.16.0 to incorporate further 
CVE-2021-44228 hardening
   
   
   
   ### What changes were proposed in this pull request?
   Update log4j version to 2.16.0
   
   
   
   ### Why are the changes needed?
   To incorporate further changes related to CVE-2021-44228.
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 695723)
Time Spent: 20m  (was: 10m)

> Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 
> hardening
> ---
>
> Key: HIVE-25804
> URL: https://issues.apache.org/jira/browse/HIVE-25804
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Csaba Juhász
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://lists.apache.org/thread/d6v4r6nosxysyq9rvnr779336yf0woz4
> https://logging.apache.org/log4j/2.x/changes-report.html#a2.16.0





[jira] [Updated] (HIVE-25804) Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 hardening

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25804:
--
Labels: pull-request-available  (was: )

> Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 
> hardening
> ---
>
> Key: HIVE-25804
> URL: https://issues.apache.org/jira/browse/HIVE-25804
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Csaba Juhász
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://lists.apache.org/thread/d6v4r6nosxysyq9rvnr779336yf0woz4
> https://logging.apache.org/log4j/2.x/changes-report.html#a2.16.0





[jira] [Work logged] (HIVE-25804) Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 hardening

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25804?focusedWorklogId=695722&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695722
 ]

ASF GitHub Bot logged work on HIVE-25804:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 12:53
Start Date: 14/Dec/21 12:53
Worklog Time Spent: 10m 
  Work Description: csjuhasz-c closed pull request #2871:
URL: https://github.com/apache/hive/pull/2871


   




Issue Time Tracking
---

Worklog Id: (was: 695722)
Remaining Estimate: 0h
Time Spent: 10m

> Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 
> hardening
> ---
>
> Key: HIVE-25804
> URL: https://issues.apache.org/jira/browse/HIVE-25804
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Csaba Juhász
>Assignee: Csaba Juhász
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://lists.apache.org/thread/d6v4r6nosxysyq9rvnr779336yf0woz4
> https://logging.apache.org/log4j/2.x/changes-report.html#a2.16.0





[jira] [Work started] (HIVE-25804) Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 hardening

2021-12-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25804 started by Csaba Juhász.
---
> Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 
> hardening
> ---
>
> Key: HIVE-25804
> URL: https://issues.apache.org/jira/browse/HIVE-25804
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Csaba Juhász
>Assignee: Csaba Juhász
>Priority: Major
>
> https://lists.apache.org/thread/d6v4r6nosxysyq9rvnr779336yf0woz4
> https://logging.apache.org/log4j/2.x/changes-report.html#a2.16.0





[jira] [Assigned] (HIVE-25804) Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 hardening

2021-12-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Csaba Juhász reassigned HIVE-25804:
---

Assignee: Csaba Juhász

> Update log4j2 version to 2.16.0 to incorporate further CVE-2021-44228 
> hardening
> ---
>
> Key: HIVE-25804
> URL: https://issues.apache.org/jira/browse/HIVE-25804
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Csaba Juhász
>Assignee: Csaba Juhász
>Priority: Major
>
> https://lists.apache.org/thread/d6v4r6nosxysyq9rvnr779336yf0woz4
> https://logging.apache.org/log4j/2.x/changes-report.html#a2.16.0





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695672
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 12:04
Start Date: 14/Dec/21 12:04
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768598009



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecDriver.java
##
@@ -190,52 +229,21 @@ public CommandProcessorResponse run() throws 
CommandProcessorException {
   }
 
   PlanMapper newPlanMapper = coreDriver.getPlanMapper();
-  if (!explainReOptimization && 
!shouldReExecuteAfterCompile(oldPlanMapper, newPlanMapper)) {
+  if (!explainReOptimization &&
+  !plugins.stream().anyMatch(p -> p.shouldReExecute(executionIndex, 
oldPlanMapper, newPlanMapper))) {
 LOG.info("re-running the query would probably not yield better 
results; returning with last error");
 // FIXME: retain old error; or create a new one?
 return cpr;
   }
 }
   }
 
-  private void afterExecute(PlanMapper planMapper, boolean success) {
-for (IReExecutionPlugin p : plugins) {
-  p.afterExecute(planMapper, success);
-}
-  }
-
-  private boolean shouldReExecuteAfterCompile(PlanMapper oldPlanMapper, 
PlanMapper newPlanMapper) {
-boolean ret = false;
-for (IReExecutionPlugin p : plugins) {
-  boolean shouldReExecute = p.shouldReExecute(executionIndex, 
oldPlanMapper, newPlanMapper);
-  LOG.debug("{}.shouldReExecuteAfterCompile = {}", p, shouldReExecute);
-  ret |= shouldReExecute;
-}
-return ret;
-  }
-
-  private boolean shouldReExecute() {
-boolean ret = false;
-for (IReExecutionPlugin p : plugins) {
-  boolean shouldReExecute = p.shouldReExecute(executionIndex);
-  LOG.debug("{}.shouldReExecute = {}", p, shouldReExecute);

Review comment:
   Same as above






Issue Time Tracking
---

Worklog Id: (was: 695672)
Time Spent: 2h 40m  (was: 2.5h)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695671
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 12:04
Start Date: 14/Dec/21 12:04
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768597811



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecDriver.java
##
@@ -190,52 +229,21 @@ public CommandProcessorResponse run() throws 
CommandProcessorException {
   }
 
   PlanMapper newPlanMapper = coreDriver.getPlanMapper();
-  if (!explainReOptimization && 
!shouldReExecuteAfterCompile(oldPlanMapper, newPlanMapper)) {
+  if (!explainReOptimization &&
+  !plugins.stream().anyMatch(p -> p.shouldReExecute(executionIndex, 
oldPlanMapper, newPlanMapper))) {
 LOG.info("re-running the query would probably not yield better 
results; returning with last error");
 // FIXME: retain old error; or create a new one?
 return cpr;
   }
 }
   }
 
-  private void afterExecute(PlanMapper planMapper, boolean success) {
-for (IReExecutionPlugin p : plugins) {
-  p.afterExecute(planMapper, success);
-}
-  }
-
-  private boolean shouldReExecuteAfterCompile(PlanMapper oldPlanMapper, 
PlanMapper newPlanMapper) {
-boolean ret = false;
-for (IReExecutionPlugin p : plugins) {
-  boolean shouldReExecute = p.shouldReExecute(executionIndex, 
oldPlanMapper, newPlanMapper);
-  LOG.debug("{}.shouldReExecuteAfterCompile = {}", p, shouldReExecute);

Review comment:
   TBH, I am unsure here. We can keep:
   - `shouldReExecuteAfterCompile`
   - `shouldReExecute`
   - `shouldReCompile`
   
   Or, we can replace with a stream version:
   ```
   plugins.stream()
 .peek(p -> LOG.debug("{}.shouldReCompile = {}", p))
 .anyMatch(p -> p.shouldReCompile(currentIndex))
   ```
   
   Or we can omit the logs, and use only:
   ```
   plugins.stream().anyMatch(p -> p.shouldReCompile(currentIndex))
   ```
   
   Your thoughts?
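One behavioral difference worth keeping in mind when choosing between the variants discussed above: the explicit loop calls every plugin (so each plugin's vote can be logged), while `anyMatch` short-circuits at the first plugin that returns true. A minimal self-contained sketch, with a hypothetical `Plugin` interface standing in for `IReExecutionPlugin`:

```java
import java.util.List;

public class ReExecVoteSketch {

    // Hypothetical stand-in for IReExecutionPlugin's shouldReExecute(int).
    interface Plugin {
        String name();
        boolean shouldReExecute(int executionIndex);
    }

    // Loop form: evaluates every plugin, so every vote can be logged.
    static boolean shouldReExecuteLoop(List<Plugin> plugins, int executionIndex) {
        boolean ret = false;
        for (Plugin p : plugins) {
            boolean vote = p.shouldReExecute(executionIndex);
            System.out.println(p.name() + ".shouldReExecute = " + vote);
            ret |= vote;
        }
        return ret;
    }

    // Stream form: anyMatch stops evaluating at the first plugin that votes true.
    static boolean shouldReExecuteStream(List<Plugin> plugins, int executionIndex) {
        return plugins.stream().anyMatch(p -> p.shouldReExecute(executionIndex));
    }

    // Small factory for fixed-vote plugins, used in the demo below.
    static Plugin plugin(String name, boolean vote) {
        return new Plugin() {
            public String name() { return name; }
            public boolean shouldReExecute(int executionIndex) { return vote; }
        };
    }

    public static void main(String[] args) {
        List<Plugin> plugins = List.of(plugin("overlay", true), plugin("reoptimize", false));
        // Both forms agree on the overall decision; only the side effects differ.
        System.out.println(shouldReExecuteLoop(plugins, 1));    // logs both votes, then prints true
        System.out.println(shouldReExecuteStream(plugins, 1));  // prints true without calling "reoptimize"
    }
}
```

If per-plugin debug logging matters, the loop (or an explicit `map` to the decision before matching) preserves it; a `peek` before `anyMatch` only observes the elements that the short-circuiting terminal operation actually consumes.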






Issue Time Tracking
---

Worklog Id: (was: 695671)
Time Spent: 2.5h  (was: 2h 20m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Resolved] (HIVE-25807) ok

2021-12-14 Thread Pravin Pawar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Pawar resolved HIVE-25807.
-
Release Note: ok
  Resolution: Fixed

> ok
> --
>
> Key: HIVE-25807
> URL: https://issues.apache.org/jira/browse/HIVE-25807
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Pawar
>Assignee: Pravin Pawar
>Priority: Blocker
>






[jira] [Commented] (HIVE-25807) ok

2021-12-14 Thread Pravin Pawar (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459109#comment-17459109
 ] 

Pravin Pawar commented on HIVE-25807:
-

ok

> ok
> --
>
> Key: HIVE-25807
> URL: https://issues.apache.org/jira/browse/HIVE-25807
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Pawar
>Assignee: Pravin Pawar
>Priority: Blocker
>






[jira] [Work started] (HIVE-25807) ok

2021-12-14 Thread Pravin Pawar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25807 started by Pravin Pawar.
---
> ok
> --
>
> Key: HIVE-25807
> URL: https://issues.apache.org/jira/browse/HIVE-25807
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Pawar
>Assignee: Pravin Pawar
>Priority: Blocker
>






[jira] [Assigned] (HIVE-25807) ok

2021-12-14 Thread Pravin Pawar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pravin Pawar reassigned HIVE-25807:
---

Assignee: Pravin Pawar

> ok
> --
>
> Key: HIVE-25807
> URL: https://issues.apache.org/jira/browse/HIVE-25807
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Pawar
>Assignee: Pravin Pawar
>Priority: Blocker
>






[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695667&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695667
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 11:57
Start Date: 14/Dec/21 11:57
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768592941



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecDriver.java
##
@@ -190,52 +229,21 @@ public CommandProcessorResponse run() throws 
CommandProcessorException {
   }
 
   PlanMapper newPlanMapper = coreDriver.getPlanMapper();
-  if (!explainReOptimization && 
!shouldReExecuteAfterCompile(oldPlanMapper, newPlanMapper)) {
+  if (!explainReOptimization &&
+  !plugins.stream().anyMatch(p -> p.shouldReExecute(executionIndex, 
oldPlanMapper, newPlanMapper))) {
 LOG.info("re-running the query would probably not yield better 
results; returning with last error");
 // FIXME: retain old error; or create a new one?
 return cpr;
   }
 }
   }
 
-  private void afterExecute(PlanMapper planMapper, boolean success) {
-for (IReExecutionPlugin p : plugins) {
-  p.afterExecute(planMapper, success);
-}
-  }
-
-  private boolean shouldReExecuteAfterCompile(PlanMapper oldPlanMapper, 
PlanMapper newPlanMapper) {
-boolean ret = false;
-for (IReExecutionPlugin p : plugins) {
-  boolean shouldReExecute = p.shouldReExecute(executionIndex, 
oldPlanMapper, newPlanMapper);

Review comment:
   Done






Issue Time Tracking
---

Worklog Id: (was: 695667)
Time Spent: 2h 20m  (was: 2h 10m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695666
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 11:57
Start Date: 14/Dec/21 11:57
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768592824



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecDriver.java
##
@@ -167,20 +201,25 @@ public CommandProcessorResponse run() throws 
CommandProcessorException {
   }
 
   PlanMapper oldPlanMapper = coreDriver.getPlanMapper();
-  afterExecute(oldPlanMapper, cpr != null);
+  final boolean success = cpr != null;
+  plugins.forEach(p -> p.afterExecute(oldPlanMapper, success));
+
+  // If the execution was successful return the result
+  if (success) {
+return cpr;
+  }
 
   boolean shouldReExecute = explainReOptimization && executionIndex==1;
-  shouldReExecute |= cpr == null && shouldReExecute();
+  shouldReExecute |= plugins.stream().anyMatch(p -> 
p.shouldReExecute(executionIndex));
 
-  if (executionIndex >= maxExecutuions || !shouldReExecute) {
-if (cpr != null) {
-  return cpr;
-} else {
-  throw cpe;
-}
+  if (executionIndex >= maxExecutions || !shouldReExecute) {
+// If we do not have to reexecute, return the last error
+throw cpe;
   }
+
   LOG.info("Preparing to re-execute query");
-  prepareToReExecute();
+  plugins.forEach(IReExecutionPlugin::prepareToReExecute);
+
   try {
 coreDriver.compileAndRespond(currentQuery);

Review comment:
   Yeah, that's ok






Issue Time Tracking
---

Worklog Id: (was: 695666)
Time Spent: 2h 10m  (was: 2h)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695664
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 11:55
Start Date: 14/Dec/21 11:55
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768591706



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecDriver.java
##
@@ -167,20 +201,25 @@ public CommandProcessorResponse run() throws 
CommandProcessorException {
   }
 
   PlanMapper oldPlanMapper = coreDriver.getPlanMapper();
-  afterExecute(oldPlanMapper, cpr != null);
+  final boolean success = cpr != null;
+  plugins.forEach(p -> p.afterExecute(oldPlanMapper, success));
+
+  // If the execution was successful return the result
+  if (success) {
+return cpr;
+  }
 
   boolean shouldReExecute = explainReOptimization && executionIndex==1;
-  shouldReExecute |= cpr == null && shouldReExecute();
+  shouldReExecute |= plugins.stream().anyMatch(p -> 
p.shouldReExecute(executionIndex));
 
-  if (executionIndex >= maxExecutuions || !shouldReExecute) {
-if (cpr != null) {
-  return cpr;
-} else {
-  throw cpe;
-}
+  if (executionIndex >= maxExecutions || !shouldReExecute) {
+// If we do not have to reexecute, return the last error
+throw cpe;
   }
+
   LOG.info("Preparing to re-execute query");
-  prepareToReExecute();
+  plugins.forEach(IReExecutionPlugin::prepareToReExecute);

Review comment:
   As discussed, most of them were just loops, so I would keep this instead 
of having 6 loop-only methods






Issue Time Tracking
---

Worklog Id: (was: 695664)
Time Spent: 2h  (was: 1h 50m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695663
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 11:53
Start Date: 14/Dec/21 11:53
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768590360



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/ReExecDriver.java
##
@@ -115,14 +115,48 @@ public ReExecDriver(QueryState queryState, QueryInfo 
queryInfo, ArrayList
> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695655&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695655
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 11:32
Start Date: 14/Dec/21 11:32
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768575045



##
File path: ql/src/java/org/apache/hadoop/hive/ql/reexec/IReExecutionPlugin.java
##
@@ -42,24 +42,72 @@
   /**
* Called before executing the query.
*/
-  void beforeExecute(int executionIndex, boolean explainReOptimization);
+  default void beforeExecute(int executionIndex, boolean 
explainReOptimization) {
+// default noop
+  }
 
   /**
   * The query has failed; does this plugin advise re-executing it?
*/
-  boolean shouldReExecute(int executionNum);
+  default boolean shouldReExecute(int executionNum) {

Review comment:
   We discussed, and renamed the other method






Issue Time Tracking
---

Worklog Id: (was: 695655)
Time Spent: 1h 40m  (was: 1.5h)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Work logged] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25792?focusedWorklogId=695652&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695652
 ]

ASF GitHub Bot logged work on HIVE-25792:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 11:26
Start Date: 14/Dec/21 11:26
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #2865:
URL: https://github.com/apache/hive/pull/2865#discussion_r768571234



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -5536,10 +5536,12 @@ private static void 
populateLlapDaemonVarsSet(Set<String> llapDaemonVarsSetLocal
 
 HIVE_QUERY_REEXECUTION_ENABLED("hive.query.reexecution.enabled", true,
 "Enable query reexecutions"),
-HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies", 
"overlay,reoptimize,reexecute_lost_am,dagsubmit",
+HIVE_QUERY_REEXECUTION_STRATEGIES("hive.query.reexecution.strategies",
+"overlay,reoptimize,reexecute_lost_am,dagsubmit,reexecute_cbo",
 "comma separated list of plugin can be used:\n"
 + "  overlay: hiveconf subtree 'reexec.overlay' is used as an 
overlay in case of an execution errors out\n"
 + "  reoptimize: collects operator statistics during execution and 
recompile the query after a failure\n"
++ "  reexecute_cbo: reexecutes query after a CBO failure\n"

Review comment:
   Renamed to `recompile_without_cbo`






Issue Time Tracking
---

Worklog Id: (was: 695652)
Time Spent: 1.5h  (was: 1h 20m)

> Multi Insert query fails on CBO path 
> -
>
> Key: HIVE-25792
> URL: https://issues.apache.org/jira/browse/HIVE-25792
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> {code}
> set hive.cbo.enable=true;
> drop table if exists aa1;
> drop table if exists bb1;
> drop table if exists cc1;
> drop table if exists dd1;
> drop table if exists ee1;
> drop table if exists ff1;
> create table aa1 ( stf_id string);
> create table bb1 ( stf_id string);
> create table cc1 ( stf_id string);
> create table ff1 ( x string);
> explain
> from ff1 as a join cc1 as b 
> insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
> insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
> ;
> {code}





[jira] [Updated] (HIVE-25806) Possible leak in LlapCacheAwareFs - Parquet, LLAP IO

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25806:
--
Labels: pull-request-available  (was: )

> Possible leak in LlapCacheAwareFs - Parquet, LLAP IO
> 
>
> Key: HIVE-25806
> URL: https://issues.apache.org/jira/browse/HIVE-25806
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> there is an inputstream there which is never closed:
> https://github.com/apache/hive/blob/9f9844dbc881e2a9267c259b8c04e7787f7fadc4/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java#L243
> my understanding is that in an InputStream chain, every InputStream is 
> responsible for closing its enclosed InputStream, here the chain is like:
> DelegatingSeekableInputStream -> io.DataInputStream -> 
> LlapCacheAwareFs$CacheAwareInputStream -> io.DataInputStream -> 
> crypto.CryptoInputStream -> hdfs.DFSInputStream
> {code}
>   at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:106)
>   at 
> sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
>   at java.nio.channels.SocketChannel.open(SocketChannel.java:145)
>   at 
> org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
>   at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2933)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:821)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:746)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:379)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:836)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:183)
>   at java.io.DataInputStream.readFully(DataInputStream.java:195)
>   at 
> org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:264)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:102)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
>   at 
> org.apache.parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(ParquetFileReader.java:1174)
>   at 
> org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:805)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:429)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:407)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:359)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:93)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:117)
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
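The close-responsibility rule described in this issue can be sketched outside of Hive: each wrapper's close() must propagate to the stream it encloses, otherwise the innermost stream (here a DFSInputStream holding a socket) leaks. A minimal illustrative sketch using only the JDK — not Hive code; the class names `ClosePropagation` and `TrackingInputStream` are hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Illustrative sketch (not Hive code): in a chain of wrapped streams, each
// wrapper's close() must close the stream it encloses; if one link skips
// this, the innermost stream (and any socket it holds) leaks.
public class ClosePropagation {

    // A FilterInputStream that records whether close() ever reached it.
    static class TrackingInputStream extends FilterInputStream {
        boolean closed = false;

        TrackingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close(); // propagate close to the enclosed stream
        }
    }

    // Wrap the tracking stream, then close only the outermost stream.
    public static boolean innerClosedAfterOuterClose() {
        TrackingInputStream inner =
            new TrackingInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        // FilterInputStream.close() closes its delegate, so the call chains down.
        InputStream outer = new FilterInputStream(inner) { };
        try {
            outer.close();
        } catch (IOException e) {
            return false;
        }
        return inner.closed;
    }

    public static void main(String[] args) {
        System.out.println(innerClosedAfterOuterClose()); // prints "true"
    }
}
```

The stack trace above shows the opposite situation: LlapCacheAwareFs$CacheAwareInputStream sits in the middle of such a chain, and if its enclosed stream is never closed, the socket opened in SocketChannelImpl stays open.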

[jira] [Work logged] (HIVE-25806) Possible leak in LlapCacheAwareFs - Parquet, LLAP IO

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25806?focusedWorklogId=695649&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695649
 ]

ASF GitHub Bot logged work on HIVE-25806:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 11:25
Start Date: 14/Dec/21 11:25
Worklog Time Spent: 10m 
  Work Description: abstractdog opened a new pull request #2873:
URL: https://github.com/apache/hive/pull/2873


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 695649)
Remaining Estimate: 0h
Time Spent: 10m

> Possible leak in LlapCacheAwareFs - Parquet, LLAP IO
> 
>
> Key: HIVE-25806
> URL: https://issues.apache.org/jira/browse/HIVE-25806
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> there is an inputstream there which is never closed:
> https://github.com/apache/hive/blob/9f9844dbc881e2a9267c259b8c04e7787f7fadc4/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java#L243
> my understanding is that in an InputStream chain, every InputStream is 
> responsible for closing its enclosed InputStream, here the chain is like:
> DelegatingSeekableInputStream -> io.DataInputStream -> 
> LlapCacheAwareFs$CacheAwareInputStream -> io.DataInputStream -> 
> crypto.CryptoInputStream -> hdfs.DFSInputStream
> {code}
>   at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:106)
>   at 
> sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
>   at java.nio.channels.SocketChannel.open(SocketChannel.java:145)
>   at 
> org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
>   at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2933)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:821)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:746)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:379)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:836)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:183)
>   at java.io.DataInputStream.readFully(DataInputStream.java:195)
>   at 
> org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:264)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:102)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
>   at 
> org.apache.parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(ParquetFileReader.java:1174)
>   at 
> org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:805)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:429)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:407)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:359)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:93)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
>   at 
> 

[jira] [Assigned] (HIVE-25783) Provide rat check to the CI

2021-12-14 Thread Zhihua Deng (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihua Deng reassigned HIVE-25783:
--

Assignee: Zhihua Deng

> Provide rat check to the CI
> ---
>
> Key: HIVE-25783
> URL: https://issues.apache.org/jira/browse/HIVE-25783
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Zhihua Deng
>Assignee: Zhihua Deng
>Priority: Major
>
> This Jira investigates whether we can add a rat check to the CI, to make 
> sure that newly added source files contain the ASF license information. 





[jira] [Updated] (HIVE-25806) Possible leak in LlapCacheAwareFs - Parquet, LLAP IO

2021-12-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-25806:

Summary: Possible leak in LlapCacheAwareFs - Parquet, LLAP IO  (was: 
Possible leak in LlapCacheAwareFs - parquet,llapio)

> Possible leak in LlapCacheAwareFs - Parquet, LLAP IO
> 
>
> Key: HIVE-25806
> URL: https://issues.apache.org/jira/browse/HIVE-25806
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>
> there is an inputstream there which is never closed:
> https://github.com/apache/hive/blob/9f9844dbc881e2a9267c259b8c04e7787f7fadc4/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java#L243
> my understanding is that in an InputStream chain, every InputStream is 
> responsible for closing its enclosed InputStream, here the chain is like:
> DelegatingSeekableInputStream -> io.DataInputStream -> 
> LlapCacheAwareFs$CacheAwareInputStream -> io.DataInputStream -> 
> crypto.CryptoInputStream -> hdfs.DFSInputStream
> {code}
>   at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:106)
>   at 
> sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
>   at java.nio.channels.SocketChannel.open(SocketChannel.java:145)
>   at 
> org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
>   at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2933)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:821)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:746)
>   at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:379)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
>   at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:836)
>   at 
> org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:183)
>   at java.io.DataInputStream.readFully(DataInputStream.java:195)
>   at 
> org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:264)
>   at java.io.DataInputStream.read(DataInputStream.java:149)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:102)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
>   at 
> org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
>   at 
> org.apache.parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(ParquetFileReader.java:1174)
>   at 
> org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:805)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:429)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:407)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:359)
>   at 
> org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:93)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
>   at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
>   at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:117)
>   at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
>   at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at 
> 

[jira] [Updated] (HIVE-25806) Possible leak in LlapCacheAwareFs - parquet,llapio

2021-12-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-25806:

Description: 
there is an inputstream there which is never closed:
https://github.com/apache/hive/blob/9f9844dbc881e2a9267c259b8c04e7787f7fadc4/ql/src/java/org/apache/hadoop/hive/llap/LlapCacheAwareFs.java#L243

my understanding is that in an InputStream chain, every InputStream is 
responsible for closing its enclosed InputStream, here the chain is like:
DelegatingSeekableInputStream -> io.DataInputStream -> 
LlapCacheAwareFs$CacheAwareInputStream -> io.DataInputStream -> 
crypto.CryptoInputStream -> hdfs.DFSInputStream

{code}
	at sun.nio.ch.SocketChannelImpl.<init>(SocketChannelImpl.java:106)
at 
sun.nio.ch.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:60)
at java.nio.channels.SocketChannel.open(SocketChannel.java:145)
at 
org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:62)
at 
org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2933)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:821)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:746)
at 
org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:379)
at 
org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
at 
org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
at 
org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:836)
at 
org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:183)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at 
org.apache.hadoop.hive.llap.LlapCacheAwareFs$CacheAwareInputStream.read(LlapCacheAwareFs.java:264)
at java.io.DataInputStream.read(DataInputStream.java:149)
at 
org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:102)
at 
org.apache.parquet.io.DelegatingSeekableInputStream.readFullyHeapBuffer(DelegatingSeekableInputStream.java:127)
at 
org.apache.parquet.io.DelegatingSeekableInputStream.readFully(DelegatingSeekableInputStream.java:91)
at 
org.apache.parquet.hadoop.ParquetFileReader$ConsecutiveChunkList.readAll(ParquetFileReader.java:1174)
at 
org.apache.parquet.hadoop.ParquetFileReader.readNextRowGroup(ParquetFileReader.java:805)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.checkEndOfRowGroup(VectorizedParquetRecordReader.java:429)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.nextBatch(VectorizedParquetRecordReader.java:407)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:359)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.next(VectorizedParquetRecordReader.java:93)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
at 
org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
at 
org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:117)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
at 

[jira] [Assigned] (HIVE-25806) Possible leak in LlapCacheAwareFs - parquet,llapio

2021-12-14 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-25806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-25806:
---

Assignee: László Bodor

> Possible leak in LlapCacheAwareFs - parquet,llapio
> --
>
> Key: HIVE-25806
> URL: https://issues.apache.org/jira/browse/HIVE-25806
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>






[jira] [Updated] (HIVE-25805) Wrong result when rebuilding MV with count(col) incrementally

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25805:
--
Labels: pull-request-available  (was: )

> Wrong result when rebuilding MV with count(col) incrementally
> -
>
> Key: HIVE-25805
> URL: https://issues.apache.org/jira/browse/HIVE-25805
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> create table t1(a char(15), b int) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into t1(a, b) values ('old', 1);
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(t1.b), count(*) from t1 group by t1.a;
> delete from t1 where b = 1;
> insert into t1(a,b) values
> ('new', null);
> alter materialized view mat1 rebuild;
> select * from mat1;
> {code}
> returns
> {code:java}
> new   1   1
> {code}
> but, should be
> {code:java}
> new   0   1
> {code}





[jira] [Work logged] (HIVE-25805) Wrong result when rebuilding MV with count(col) incrementally

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25805?focusedWorklogId=695631&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695631
 ]

ASF GitHub Bot logged work on HIVE-25805:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 10:47
Start Date: 14/Dec/21 10:47
Worklog Time Spent: 10m 
  Work Description: kasakrisz opened a new pull request #2872:
URL: https://github.com/apache/hive/pull/2872


   
   
   ### What changes were proposed in this pull request?
   When generating the incremental rebuild plan for MVs that have an aggregate and 
delete operations in any source table, check whether the view definition contains 
any `count` aggregate function that takes an argument. If it does, add an 
expression that checks whether that argument is `null`. 
   
   ### Why are the changes needed?
   Records with `null` values should not be counted in the final aggregation.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. This patch fixes a data correctness issue.
   
   ### How was this patch tested?
   ```
   mvn test -Dtest.output.overwrite -DskipSparkTests 
-Dtest=TestMiniLlapLocalCliDriver -Dqfile=materialized_view_create_rewrite_6.q 
-pl itests/qtest -Pitests
   ```
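The correctness issue hinges on standard SQL semantics: `count(col)` skips NULLs while `count(*)` counts every row, so the incremental rebuild must test the aggregated column for NULL before adding it to the running count. An illustrative JDK-only sketch — not Hive code; `CountSemantics` is a hypothetical name:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Illustrative sketch (not Hive code): SQL count(col) ignores NULLs while
// count(*) counts every row, which is why the incremental MV rebuild must
// check the aggregated column for NULL before counting it.
public class CountSemantics {

    // Behaves like count(col): only non-null values contribute.
    public static long countCol(List<Integer> rows) {
        return rows.stream().filter(Objects::nonNull).count();
    }

    // Behaves like count(*): every row contributes.
    public static long countStar(List<Integer> rows) {
        return rows.size();
    }

    public static void main(String[] args) {
        // The single surviving row ('new', NULL) from the repro in the issue.
        List<Integer> b = Arrays.asList((Integer) null);
        System.out.println(countCol(b) + " " + countStar(b)); // prints "0 1"
    }
}
```

This matches the expected result in the issue: after the delete and the insert of `('new', null)`, the rebuilt view should show `new 0 1`, not `new 1 1`.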




Issue Time Tracking
---

Worklog Id: (was: 695631)
Remaining Estimate: 0h
Time Spent: 10m

> Wrong result when rebuilding MV with count(col) incrementally
> -
>
> Key: HIVE-25805
> URL: https://issues.apache.org/jira/browse/HIVE-25805
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> create table t1(a char(15), b int) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into t1(a, b) values ('old', 1);
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(t1.b), count(*) from t1 group by t1.a;
> delete from t1 where b = 1;
> insert into t1(a,b) values
> ('new', null);
> alter materialized view mat1 rebuild;
> select * from mat1;
> {code}
> returns
> {code:java}
> new   1   1
> {code}
> but, should be
> {code:java}
> new   0   1
> {code}





[jira] [Updated] (HIVE-25805) Wrong result when rebuilding MV with count(col) incrementally

2021-12-14 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-25805:
--
Summary: Wrong result when rebuilding MV with count(col) incrementally  
(was: Wrong result when rebuilding MV with count(col) incremental)

> Wrong result when rebuilding MV with count(col) incrementally
> -
>
> Key: HIVE-25805
> URL: https://issues.apache.org/jira/browse/HIVE-25805
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> {code:java}
> create table t1(a char(15), b int) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into t1(a, b) values ('old', 1);
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(t1.b), count(*) from t1 group by t1.a;
> delete from t1 where b = 1;
> insert into t1(a,b) values
> ('new', null);
> alter materialized view mat1 rebuild;
> select * from mat1;
> {code}
> returns
> {code:java}
> new   1   1
> {code}
> but, should be
> {code:java}
> new   0   1
> {code}





[jira] [Assigned] (HIVE-25805) Wrong result when rebuilding MV with count(col) incremental

2021-12-14 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa reassigned HIVE-25805:
-


> Wrong result when rebuilding MV with count(col) incremental
> ---
>
> Key: HIVE-25805
> URL: https://issues.apache.org/jira/browse/HIVE-25805
> Project: Hive
>  Issue Type: Bug
>  Components: CBO, Materialized views
>Reporter: Krisztian Kasa
>Assignee: Krisztian Kasa
>Priority: Major
>
> {code:java}
> create table t1(a char(15), b int) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into t1(a, b) values ('old', 1);
> create materialized view mat1 stored as orc TBLPROPERTIES 
> ('transactional'='true') as
> select t1.a, count(t1.b), count(*) from t1 group by t1.a;
> delete from t1 where b = 1;
> insert into t1(a,b) values
> ('new', null);
> alter materialized view mat1 rebuild;
> select * from mat1;
> {code}
> returns
> {code:java}
> new   1   1
> {code}
> but, should be
> {code:java}
> new   0   1
> {code}





[jira] [Work logged] (HIVE-21172) DEFAULT keyword handling in MERGE UPDATE clause issues

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21172?focusedWorklogId=695610&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695610
 ]

ASF GitHub Bot logged work on HIVE-21172:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 09:56
Start Date: 14/Dec/21 09:56
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on a change in pull request #2857:
URL: https://github.com/apache/hive/pull/2857#discussion_r768476035



##
File path: ql/src/test/results/clientpositive/llap/masking_acid_no_masking.q.out
##
@@ -54,8 +53,9 @@ POSTHOOK: Input: default@masking_acid_no_masking
 POSTHOOK: Input: default@nonacid_n0
 POSTHOOK: Output: default@masking_acid_no_masking
 POSTHOOK: Output: default@masking_acid_no_masking
-POSTHOOK: Output: default@masking_acid_no_masking
 POSTHOOK: Output: default@merge_tmp_table
 POSTHOOK: Lineage: masking_acid_no_masking.key SIMPLE 
[(nonacid_n0)s.FieldSchema(name:key, type:int, comment:null), ]
+POSTHOOK: Lineage: masking_acid_no_masking.key SIMPLE 
[(nonacid_n0)s.FieldSchema(name:key, type:int, comment:null), ]

Review comment:
   is this a duplicate?

##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
##
@@ -1711,13 +1713,13 @@ public void 
testMajorCompactionAfterTwoMergeStatements() throws Exception {
 
 // Verify contents of bucket files.
 List expectedRsBucket0 = 
Arrays.asList("{\"writeid\":1,\"bucketid\":536870912,\"rowid\":3}\t4\tvalue_4",
-"{\"writeid\":2,\"bucketid\":536870912,\"rowid\":0}\t6\tvalue_6",
-"{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",
-"{\"writeid\":3,\"bucketid\":536870912,\"rowid\":0}\t8\tvalue_8",
-"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":0}\t5\tnewestvalue_5",
-"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":1}\t7\tnewestvalue_7",
-"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":2}\t1\tnewestvalue_1",
-
"{\"writeid\":3,\"bucketid\":536870913,\"rowid\":3}\t2\tnewestvalue_2");
+"{\"writeid\":2,\"bucketid\":536870913,\"rowid\":2}\t3\tnewvalue_3",

Review comment:
   seeing a change like this I wonder how much value this test adds...

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/MergeSemanticAnalyzer.java
##
@@ -441,6 +442,11 @@ private String handleUpdate(ASTNode 
whenMatchedUpdateClause, StringBuilder rewri
 default:
   //do nothing
 }
+
+if ("`default`".equalsIgnoreCase(rhsExp.trim())) {
+  rhsExp = MapUtils.getString(colNameToDefaultConstraint, name, 
"null");

Review comment:
   iiuc this changes the column value to the default if we see "default" in 
the query
   
   I see that this also works for a plain insert:
   ```
   create table q2(a string default 'asd');
   insert into q2 values(`default`)
   select * from q2;
   ```
   
   however I think the standard suggests using the `DEFAULT` keyword rather 
than a string literal; in Hive we seem to also "support" the "default" string 
literal, interpreting it as the default.
   
   I think, for various reasons, the default keyword becomes the string 
"default" at some point, and it works like that right now.
   
   could you open a follow-up to fix the `default` literal's handling?
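   The substitution discussed in this comment can be sketched with the JDK alone. Illustrative only — not Hive code; `DefaultSubstitution` is a hypothetical name, and `Map.getOrDefault` stands in for the commons-collections `MapUtils.getString(colNameToDefaultConstraint, name, "null")` call quoted above:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not Hive code): when the right-hand side of a SET
// clause is the literal `default`, substitute the column's default
// constraint, falling back to "null" for columns that have no default.
public class DefaultSubstitution {

    public static String resolveRhs(String rhsExp, String colName,
                                    Map<String, String> colNameToDefault) {
        if ("`default`".equalsIgnoreCase(rhsExp.trim())) {
            return colNameToDefault.getOrDefault(colName, "null");
        }
        return rhsExp; // any other expression is left untouched
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("a", "'asd'"); // e.g. a column declared: a string default 'asd'
        System.out.println(resolveRhs("`default`", "a", defaults)); // prints "'asd'"
        System.out.println(resolveRhs("`default`", "b", defaults)); // prints "null"
    }
}
```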
   

##
File path: ql/src/test/results/clientpositive/llap/explain_locks.q.out
##
@@ -233,20 +221,14 @@ POSTHOOK: Input: default@target@p=2/q=2
 POSTHOOK: Output: default@merge_tmp_table
 POSTHOOK: Output: default@target
 POSTHOOK: Output: default@target@p=1/q=2
-POSTHOOK: Output: default@target@p=1/q=2
-POSTHOOK: Output: default@target@p=1/q=3
 POSTHOOK: Output: default@target@p=1/q=3
 POSTHOOK: Output: default@target@p=2/q=2
-POSTHOOK: Output: default@target@p=2/q=2
 LOCK INFORMATION:
 default.source -> SHARED_READ
 default.target.p=1/q=2 -> SHARED_READ
 default.target.p=1/q=3 -> SHARED_READ
 default.target.p=2/q=2 -> SHARED_READ
 default.target.p=2/q=2 -> SHARED_WRITE
-default.target.p=2/q=2 -> SHARED_WRITE

Review comment:
   I wonder why these locks were duplicated? 

##
File path: 
ql/src/test/results/clientpositive/llap/acid_direct_update_delete_with_merge.q.out
##
@@ -112,11 +110,13 @@ POSTHOOK: Input: default@transactions@tran_date=20170413
 POSTHOOK: Output: default@merge_tmp_table
 POSTHOOK: Output: default@transactions
 POSTHOOK: Output: default@transactions@tran_date=20170410
-POSTHOOK: Output: default@transactions@tran_date=20170410
 POSTHOOK: Output: default@transactions@tran_date=20170413
 POSTHOOK: Output: default@transactions@tran_date=20170413
 POSTHOOK: Output: default@transactions@tran_date=20170415
 POSTHOOK: Lineage: merge_tmp_table.val EXPRESSION 
[(transactions)t.FieldSchema(name:ROW__ID, 
type:struct&lt;writeid:bigint,bucketid:int,rowid:bigint&gt;, comment:), 
(transactions)t.FieldSchema(name:tran_date, 

[jira] [Work logged] (HIVE-25576) Add config to parse date with older date format

2021-12-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25576?focusedWorklogId=695599=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-695599
 ]

ASF GitHub Bot logged work on HIVE-25576:
-

Author: ASF GitHub Bot
Created on: 14/Dec/21 09:34
Start Date: 14/Dec/21 09:34
Worklog Time Spent: 10m 
  Work Description: zabetak commented on pull request #2690:
URL: https://github.com/apache/hive/pull/2690#issuecomment-993349113


   Apologies for the delay @ashish-kumar-sharma, I will try to rearrange this on 
my TODO list. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 695599)
Time Spent: 2h 10m  (was: 2h)

> Add config to parse date with older date format
> ---
>
> Key: HIVE-25576
> URL: https://issues.apache.org/jira/browse/HIVE-25576
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 3.1.0, 3.0.0, 3.1.1, 3.1.2, 4.0.0
>Reporter: Ashish Sharma
>Assignee: Ashish Sharma
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> *History*
> *Hive 1.2* - 
> VM time zone set to Asia/Bangkok
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 07:00:00
> *Implementation details* - 
> SimpleDateFormat formatter = new SimpleDateFormat(pattern);
> Long unixtime = formatter.parse(textval).getTime() / 1000;
> Date date = new Date(unixtime * 1000L);
> https://docs.oracle.com/javase/8/docs/api/java/util/Date.html . The official 
> documentation mentions that "Unfortunately, the API for these 
> functions was not amenable to internationalization" and that "the corresponding 
> methods in Date are deprecated". Because of that, this produces the wrong result.
> *Master branch* - 
> set hive.local.time.zone=Asia/Bangkok;
> *Query* - SELECT FROM_UNIXTIME(UNIX_TIMESTAMP('1800-01-01 00:00:00 
> UTC','yyyy-MM-dd HH:mm:ss z'));
> *Result* - 1800-01-01 06:42:04
> *Implementation details* - 
> DateTimeFormatter dtformatter = new DateTimeFormatterBuilder()
> .parseCaseInsensitive()
> .appendPattern(pattern)
> .toFormatter();
> ZonedDateTime zonedDateTime = 
> ZonedDateTime.parse(textval,dtformatter).withZoneSameInstant(ZoneId.of(timezone));
> Long dttime = zonedDateTime.toInstant().getEpochSecond();
> *Problem* - 
> Now *SimpleDateFormat* has been replaced with *DateTimeFormatter*, which gives 
> the correct result but is not backward compatible. This causes issues when 
> migrating to the new version, because older data written using Hive 1.x or 2.x 
> is not compatible with *DateTimeFormatter*.
> *Solution*
> Introduce a config "hive.legacy.timeParserPolicy" with the following values -
> 1. *True* - use *SimpleDateFormat* 
> 2. *False* - use *DateTimeFormatter*
> Note: Apache Spark also faces the same issue 
> https://issues.apache.org/jira/browse/SPARK-30668
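The two code paths quoted in the description can be compared side by side in a 
standalone sketch. This assumes the (partially stripped) pattern above is 
`yyyy-MM-dd HH:mm:ss z`; it is not Hive's actual code, and the formatting 
difference noted in the comments depends on the JVM's time-zone data:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Date;
import java.util.TimeZone;

public class LegacyParserSketch {
    static final String TEXT = "1800-01-01 00:00:00 UTC";
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss z";

    public static void main(String[] args) throws ParseException {
        // Legacy path (Hive 1.x/2.x): java.text.SimpleDateFormat.
        SimpleDateFormat legacy = new SimpleDateFormat(PATTERN);
        long legacyEpoch = legacy.parse(TEXT).getTime() / 1000;

        // Current path: java.time.format.DateTimeFormatter.
        DateTimeFormatter modern = new DateTimeFormatterBuilder()
                .parseCaseInsensitive()
                .appendPattern(PATTERN)
                .toFormatter();
        long modernEpoch = ZonedDateTime.parse(TEXT, modern)
                .toInstant().getEpochSecond();

        // Both parse the UTC text to the same instant (1800 is after the
        // 1582 Gregorian cutover, so the hybrid and proleptic calendars agree).
        System.out.println(legacyEpoch + " " + modernEpoch);

        // The divergence appears when formatting that instant back in
        // Asia/Bangkok: java.time applies the historical local-mean-time
        // offset (+6:42:04, giving 06:42:04), while the legacy API typically
        // formats with the modern +07:00 offset (giving 07:00:00).
        SimpleDateFormat legacyFmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        legacyFmt.setTimeZone(TimeZone.getTimeZone("Asia/Bangkok"));
        System.out.println(legacyFmt.format(new Date(legacyEpoch * 1000L)));

        System.out.println(ZonedDateTime.parse(TEXT, modern)
                .withZoneSameInstant(ZoneId.of("Asia/Bangkok"))
                .format(DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")));
    }
}
```

So the proposed config would effectively pick which of these two parse/format 
paths `UNIX_TIMESTAMP`/`FROM_UNIXTIME` use.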



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-25783) Provide rat check to the CI

2021-12-14 Thread Zhihua Deng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459000#comment-17459000
 ] 

Zhihua Deng commented on HIVE-25783:


I will take a look. Thank you for the information.

> Provide rat check to the CI
> ---
>
> Key: HIVE-25783
> URL: https://issues.apache.org/jira/browse/HIVE-25783
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Zhihua Deng
>Priority: Major
>
> This Jira investigates whether we can add a rat check to the CI, to make 
> sure that newly added source files contain the ASF license information. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)