[jira] [Work logged] (HIVE-25670) Avoid getTable() calls for foreign key tables not used in a query

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25670?focusedWorklogId=676087&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676087
 ]

ASF GitHub Bot logged work on HIVE-25670:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:58
Start Date: 04/Nov/21 01:58
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera opened a new pull request #2763:
URL: https://github.com/apache/hive/pull/2763


   … a query
   
   RelOptHiveTable currently fetches the Table information for all
   referential constraint tables. However, it only needs to fetch the tables
   that are used in the query.
   
   
   
   ### What changes were proposed in this pull request?
   
   Changing RelOptHiveTable to only fetch referential constraint tables that 
are used in the query.
   
   ### Why are the changes needed?
   
   To improve compilation performance.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Regression testing (no new tests added)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 676087)
Time Spent: 0.5h  (was: 20m)

> Avoid getTable() calls for foreign key tables not used in a query
> -
>
> Key: HIVE-25670
> URL: https://issues.apache.org/jira/browse/HIVE-25670
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In RelOptHiveTable, we generate the referential constraints for the table. As 
> part of this process, we make a metastore call to fetch the referenced tables. This 
> information is used later on for potential gains on joins done over the key.
> However, there is no need to fetch these constraints if the referenced table is not 
> used in the query. If we can determine this up front, we can save a bit of 
> compilation time.
>  
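The deferred-fetch idea described above can be sketched with a stdlib-only memoized lookup. All names here are hypothetical; this is a plain-Java stand-in for the RelOptHiveTable/metastore interaction, not the actual patch:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Sketch: a table referenced by a foreign key is looked up only if the
 *  query actually uses it, and at most once. */
class LazyConstraintTables {
    private final Function<String, Object> metastoreFetch; // stand-in for the real getTable() call
    private final Map<String, Object> cache = new HashMap<>();

    LazyConstraintTables(Function<String, Object> metastoreFetch) {
        this.metastoreFetch = metastoreFetch;
    }

    /** Invoked only for tables that appear in the query being compiled. */
    Object getTable(String name) {
        return cache.computeIfAbsent(name, metastoreFetch);
    }

    int fetchedTableCount() {
        return cache.size();
    }
}
```

With this shape, a table carrying foreign keys to many other tables triggers no metastore round trips for the referenced tables unless they show up in the query, which is where the compilation-time saving comes from.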



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=676073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676073
 ]

ASF GitHub Bot logged work on HIVE-24590:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:56
Start Date: 04/Nov/21 01:56
Worklog Time Spent: 10m 
  Work Description: belugabehr edited a comment on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there are no changes in overall behavior, LGTM (+1) pending tests pass.
   
   Thank you for helping to document this behavior.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 676073)
Time Spent: 6h 10m  (was: 6h)

> Operation Logging still leaks the log4j Appenders
> -
>
> Key: HIVE-24590
> URL: https://issues.apache.org/jira/browse/HIVE-24590
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Eugene Chung
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot 
> 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen 
> Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, 
> Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> I'm using Hive 3.1.2 with the options below.
>  * hive.server2.logging.operation.enabled=true
>  * hive.server2.logging.operation.level=VERBOSE
>  * hive.async.log.enabled=false
> I already know about the ticket https://issues.apache.org/jira/browse/HIVE-17128, 
> but HS2 still leaks the log4j RandomAccessFileManager.
> !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197!
> I checked the operation log file which is not closed/deleted properly.
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272!
> Then there's the log,
> {code:java}
> client.TezClient: Shutting down Tez Session, sessionName= {code}
> !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26!
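The leak pattern being fixed, a per-operation appender that is registered but never closed, can be illustrated with a stdlib-only registry. Names are hypothetical; the real fix works against log4j's appender lifecycle, not this sketch:

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch: per-operation log resources must be both deregistered AND closed
 *  when the operation finishes; removing without closing leaks the file handle. */
class OperationLogRegistry {
    private final Map<String, Closeable> appenders = new ConcurrentHashMap<>();

    void register(String queryId, Closeable appender) {
        appenders.put(queryId, appender);
    }

    void closeOperationLog(String queryId) {
        Closeable appender = appenders.remove(queryId);
        if (appender != null) {
            try {
                appender.close(); // releases the underlying file manager
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    int openCount() {
        return appenders.size();
    }
}
```

The screenshots above show exactly the symptom this models: the appender map shrinks, but the RandomAccessFileManager behind each appender stays open because `close()` was never reached.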



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=676029&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-676029
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:52
Start Date: 04/Nov/21 01:52
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #2759:
URL: https://github.com/apache/hive/pull/2759#discussion_r741675549



##
File path: ql/src/test/queries/clientpositive/jdbc_table_with_schema_mssql.q
##
@@ -0,0 +1,54 @@
+--! qt:database:mssql:q_test_country_table_with_schema.mssql.sql
+-- Microsoft SQL Server allows multiple schemas per database, so to disambiguate between tables in
+-- different schemas it is necessary to set the hive.sql.schema property properly.
+
+-- Some JDBC APIs require the catalog, schema, and table names to be passed exactly as they are stored
+-- in the database. MSSQL stores unquoted identifiers by first converting them to lowercase; thus the
+-- hive.sql.schema and hive.sql.table properties below are specified in lowercase.
+
+CREATE EXTERNAL TABLE country_0 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "bob",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_0;
+SELECT COUNT(*) FROM country_0;
+
+CREATE EXTERNAL TABLE country_1 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "alice",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_1;
+SELECT COUNT(*) FROM country_1;
+
+-- Test that DML statements work when accessing a table in a non-default schema
+INSERT INTO country_1 VALUES (8, 'Hungary');

Review comment:
   I'm not familiar with how DML statements are treated when the target is a 
JDBC table. I guess a `HiveJdbcConverter` shows up in the plan.
   Could you please add an 
   ```
   explain cbo insert into country_1... 
   ```
   statement? Feel free to ignore my comment if we already have one.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 676029)
Time Spent: 2h 10m  (was: 2h)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Consider the following use case, where tables reside in some user-defined 
> schema in a JDBC-compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  
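The `relation "country" does not exist` error comes from querying the bare table name while the table actually lives in the `world` schema. A hypothetical helper (not the actual JdbcStorageHandler code) showing the schema qualification a generated query needs:

```java
/** Sketch: build the schema-qualified, quoted identifier a generated JDBC query needs. */
class JdbcIdentifiers {
    static String qualify(String schema, String table) {
        String quotedTable = "\"" + table + "\"";
        // No schema configured: fall back to the bare (quoted) table name.
        return (schema == null || schema.isEmpty()) ? quotedTable : "\"" + schema + "\"." + quotedTable;
    }
}
```

With `hive.sql.schema = "world"`, the generated statement should read `SELECT name FROM "world"."country"` rather than `FROM country`, which is exactly the relation Postgres failed to find above.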

[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=675955&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675955
 ]

ASF GitHub Bot logged work on HIVE-25596:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:44
Start Date: 04/Nov/21 01:44
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741594555



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/metric/MetricSink.java
##
@@ -116,14 +117,17 @@ public void run() {
   int totalMetricsSize = metrics.size();
   List replicationMetricsList = new 
ArrayList<>(totalMetricsSize);
   ObjectMapper mapper = new ObjectMapper();
+  MessageEncoder encoder = 
MessageFactory.getDefaultInstanceForReplMetrics(conf);
+  MessageSerializer serializer = encoder.getSerializer();
   for (int index = 0; index < totalMetricsSize; index++) {
 ReplicationMetric metric = metrics.removeFirst();
 ReplicationMetrics persistentMetric = new ReplicationMetrics();
 persistentMetric.setDumpExecutionId(metric.getDumpExecutionId());
 
persistentMetric.setScheduledExecutionId(metric.getScheduledExecutionId());
 persistentMetric.setPolicy(metric.getPolicy());
-
persistentMetric.setProgress(mapper.writeValueAsString(metric.getProgress()));
-
persistentMetric.setMetadata(mapper.writeValueAsString(metric.getMetadata()));
+
persistentMetric.setProgress(serializer.serialize(mapper.writeValueAsString(metric.getProgress(;
+
persistentMetric.setMetadata(serializer.serialize(mapper.writeValueAsString(metric.getMetadata(;

Review comment:
   How does this justify a need to compress the metadata field in that 
case? I think we should focus on the worst-case size and then see the change 
post compression. That way we can decide on:
   a) whether we really need compression for the metadata column
   b) if so, how large the column size should be. 
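The "measure the worst case, then compare post-compression" suggestion is easy to prototype with the JDK alone: GZIP a representative JSON payload and Base64-encode it so it fits in a varchar column. A sketch with a hypothetical class name (the actual patch goes through MessageEncoder/MessageSerializer):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

/** Sketch: estimate how much a JSON metrics payload shrinks under GZIP + Base64. */
class MetricsCompression {
    static String gzipBase64(String json) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Base64 re-inflates the compressed bytes by ~4/3, so compare against the
        // encoded length when sizing the varchar column.
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }
}
```

Running this over a worst-case replication progress payload gives the before/after numbers needed to answer both (a) and (b); repetitive JSON of the kind the metrics produce typically compresses very well even after the Base64 expansion.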

##
File path: 
ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFDeserialize.java
##
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.messaging.MessageEncoder;
+import org.apache.hadoop.hive.metastore.messaging.MessageFactory;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.io.Text;
+import org.junit.Test;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertEquals;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+
+/**
+ * TestGenericUDFDeserialize.
+ */
+public class TestGenericUDFDeserialize {
+
+@Test
+public void testOneArg() throws HiveException {
+GenericUDFDeserialize udf = new GenericUDFDeserialize();
+ObjectInspector valueOI1 = 
PrimitiveObjectInspectorFactory.writableStringObjectInspector;
+ObjectInspector valueOI2 = 
PrimitiveObjectInspectorFactory.writableStringObjectInspector;
+UDFArgumentException ex = null;
+try {
+udf.initialize(new ObjectInspector[]{valueOI1});
+} catch (UDFArgumentException e) {
+ex = e;
+}
+assertNotNull("The function deserialize() accepts 2 arguments.", ex);
+ex = null;
+try {
+udf.initialize(new ObjectInspector[]{valueOI2, valueOI1});
+} catch (UDFArgumentException e) {
+ex = e;
+}
+assertNull("The function deserialize() accepts 2 arguments.", ex);
+}
+
+@Test
+public void testGZIPJsonDeserializeString() throws HiveException {
+GenericUDFDeserialize udf = new 

[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=675885&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675885
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:38
Start Date: 04/Nov/21 01:38
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #2742:
URL: https://github.com/apache/hive/pull/2742#discussion_r741839431



##
File path: 
itests/util/src/main/java/org/apache/hadoop/hive/ql/externalDB/AbstractExternalDB.java
##
@@ -71,27 +59,8 @@ public ProcessResults(String stdout, String stderr, int rc) {
 }
 }
 
-public static AbstractExternalDB initalizeExternalDB(String 
externalDBType) throws IOException {
-AbstractExternalDB abstractExternalDB;
-switch (externalDBType) {
-case "mysql":
-abstractExternalDB = new MySQLExternalDB();
-break;
-case "postgres":
-abstractExternalDB = new PostgresExternalDB();
-break;
-default:
-throw new IOException("unsupported external database type " + 
externalDBType);
-}
-return abstractExternalDB;
-}
-
-public AbstractExternalDB(String externalDBType) {
-this.externalDBType = externalDBType;
-}
-
-protected String getDockerContainerName() {
-return String.format("qtestExternalDB-%s", externalDBType);
+private final String getDockerContainerName() {

Review comment:
   Logged https://issues.apache.org/jira/browse/HIVE-25667 for tracking 
this further.

##
File path: 
itests/util/src/main/java/org/apache/hadoop/hive/ql/qoption/QTestDatabaseHandler.java
##
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.qoption;
+
+import org.apache.hadoop.hive.ql.QTestUtil;
+import org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB;
+import org.apache.hadoop.hive.ql.externalDB.MSSQLServer;
+import org.apache.hadoop.hive.ql.externalDB.MariaDB;
+import org.apache.hadoop.hive.ql.externalDB.MySQLExternalDB;
+import org.apache.hadoop.hive.ql.externalDB.Oracle;
+import org.apache.hadoop.hive.ql.externalDB.PostgresExternalDB;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.Arrays;
+import java.util.EnumMap;
+import java.util.Map;
+
+/**
+ * An option handler for spinning up (resp. stopping) databases before (resp. after) running a test.
+ *
+ * Syntax: qt:database:DatabaseType:path_to_init_script
+ *
+ * The database type ({@link DatabaseType}) is an obligatory argument. The initialization script can be omitted.
+ *
+ * Current limitations:
+ * <ul>
+ *   <li>Only one test/file per run</li>
+ *   <li>Does not support parallel execution</li>
+ *   <li>Cannot instantiate more than one database of the same type per test</li>
+ * </ul>
+ */
+public class QTestDatabaseHandler implements QTestOptionHandler {
+  private static final Logger LOG = 
LoggerFactory.getLogger(QTestDatabaseHandler.class);
+
+  private enum DatabaseType {
+POSTGRES {
+  @Override
+  AbstractExternalDB create() {
+return new PostgresExternalDB();
+  }
+}, MYSQL {
+  @Override
+  AbstractExternalDB create() {
+return new MySQLExternalDB();
+  }
+}, MARIADB {
+  @Override
+  AbstractExternalDB create() {
+return new MariaDB();
+  }
+}, MSSQL {
+  @Override
+  AbstractExternalDB create() {
+return new MSSQLServer();
+  }
+}, ORACLE {
+  @Override
+  AbstractExternalDB create() {
+return new Oracle();
+  }
+};
+
+abstract AbstractExternalDB create();
+  }
+
+  private final Map databaseToScript = new 
EnumMap<>(DatabaseType.class);
+
+  @Override
+  public void processArguments(String arguments) {
+String[] args = arguments.split(":");
+if (args.length == 0) {
+  throw new IllegalArgumentException("No arguments provided");
+}
+if 

[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=675849&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675849
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:34
Start Date: 04/Nov/21 01:34
Worklog Time Spent: 10m 
  Work Description: guptanikhil007 commented on a change in pull request 
#2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741713387



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   If the `not in` query cannot fit in one single query, it should fail 
according to the function definition.

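A stdlib-only sketch of the splitting behavior the updated test asserts (batches capped at 2000 parameters, so 4321 ids split as 2000/2000/321), together with the `NOT IN` restriction discussed above. Names are hypothetical; this is not the TxnUtils code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: split an id list into IN-clause batches bounded by the database's parameter limit. */
class InClauseBatcher {
    static List<String> buildQueries(String prefix, String column, List<Long> ids,
                                     int maxParams, boolean notIn) {
        if (notIn && ids.size() > maxParams) {
            // Splitting a NOT IN predicate across several queries changes its meaning,
            // so an oversized NOT IN list has to be rejected outright.
            throw new IllegalArgumentException("NOT IN list does not fit in one query: " + ids.size());
        }
        List<String> queries = new ArrayList<>();
        String op = notIn ? " NOT IN (" : " IN (";
        for (int from = 0; from < ids.size(); from += maxParams) {
            List<Long> batch = ids.subList(from, Math.min(from + maxParams, ids.size()));
            String inList = batch.stream().map(String::valueOf).collect(Collectors.joining(","));
            queries.add(prefix + " WHERE " + column + op + inList + ")");
        }
        return queries;
    }
}
```

Each split IN query selects a disjoint id batch, so the union of their results equals the unsplit query; no such decomposition exists for NOT IN, which is why an IllegalArgumentException is the expected outcome in the new test.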




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 675849)
Time Spent: 2.5h  (was: 2h 20m)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with huge number of 

[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=675793&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675793
 ]

ASF GitHub Bot logged work on HIVE-24590:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:28
Start Date: 04/Nov/21 01:28
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there are no changes in overall behavior, LGTM (+1) pending tests pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 675793)
Time Spent: 6h  (was: 5h 50m)

> Operation Logging still leaks the log4j Appenders
> -
>
> Key: HIVE-24590
> URL: https://issues.apache.org/jira/browse/HIVE-24590
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Eugene Chung
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot 
> 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen 
> Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, 
> Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> I'm using Hive 3.1.2 with the options below.
>  * hive.server2.logging.operation.enabled=true
>  * hive.server2.logging.operation.level=VERBOSE
>  * hive.async.log.enabled=false
> I already know about the ticket https://issues.apache.org/jira/browse/HIVE-17128, 
> but HS2 still leaks the log4j RandomAccessFileManager.
> !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197!
> I checked the operation log file which is not closed/deleted properly.
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272!
> Then there's the log,
> {code:java}
> client.TezClient: Shutting down Tez Session, sessionName= {code}
> !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=675792=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675792
 ]

ASF GitHub Bot logged work on HIVE-25596:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:28
Start Date: 04/Nov/21 01:28
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741578189



##
File path: 
standalone-metastore/metastore-server/src/test/resources/sql/postgres/upgrade-3.1.3000-to-4.0.0.postgres.sql
##
@@ -51,6 +51,10 @@ CREATE TABLE "REPLICATION_METRICS" (
--Increase the size of RM_PROGRESS to accommodate the replication statistics
 ALTER TABLE "REPLICATION_METRICS" ALTER "RM_PROGRESS" TYPE varchar(24000);
 
+ALTER TABLE "REPLICATION_METRICS" ALTER "RM_PROGRESS" TYPE varchar(1);

Review comment:
   This is tested as part of ITestPostgres.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDeserialize.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.metastore.messaging.json.JSONMessageEncoder;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
+
+/**
+ * GenericUDFDeserialize.
+ *
+ */
+@Description(name = "deserialize",
+value="_FUNC_(message, encodingFormat) - Returns deserialized string 
of encoded message.",
+extended="Example:\n"
++ "  > SELECT _FUNC_('H4sI/ytJLS4BAAx+f9gE', 
'gzip(json-2.0)') FROM src LIMIT 1;\n"
++ "  test")
+public class GenericUDFDeserialize extends GenericUDF {
+
+private static final int ARG_COUNT = 2; // Number of arguments to this UDF
+private static final String FUNC_NAME = "deserialize"; // External Name
+
+private transient PrimitiveObjectInspector stringOI = null;
+private transient PrimitiveObjectInspector encodingFormat = null;
+
+@Override
+public ObjectInspector initialize(ObjectInspector[] arguments)
+throws UDFArgumentException {
+if (arguments.length != ARG_COUNT) {
+throw new UDFArgumentException("The function " + FUNC_NAME + " 
accepts " + ARG_COUNT + " arguments.");
+}
+for (ObjectInspector arg: arguments) {
+if (arg.getCategory() != ObjectInspector.Category.PRIMITIVE ||
+
PrimitiveObjectInspectorUtils.PrimitiveGrouping.STRING_GROUP != 
PrimitiveObjectInspectorUtils.getPrimitiveGrouping(
+
((PrimitiveObjectInspector)arg).getPrimitiveCategory())){
+throw new UDFArgumentTypeException(0, "The arguments to " + 
FUNC_NAME + " must be a string/varchar");
+}
+}
+stringOI = (PrimitiveObjectInspector) arguments[0];
+encodingFormat = (PrimitiveObjectInspector) arguments[1];
+return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
+}
+
+@Override
+public Object evaluate(DeferredObject[] arguments) throws HiveException {
+String value = 
PrimitiveObjectInspectorUtils.getString(arguments[0].get(), stringOI);
+String messageFormat = 
PrimitiveObjectInspectorUtils.getString(arguments[1].get(), encodingFormat);
+if (value == null) {
+return null;
+} else if (messageFormat == null || messageFormat.isEmpty() || 
JSONMessageEncoder.FORMAT.equalsIgnoreCase(messageFormat)) {
+ 

[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=675774&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675774
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:27
Start Date: 04/Nov/21 01:27
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #2742:
URL: https://github.com/apache/hive/pull/2742


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 675774)
Time Spent: 3.5h  (was: 3h 20m)

> Setup JDBC databases in tests via QT options
> 
>
> Key: HIVE-25594
> URL: https://issues.apache.org/jira/browse/HIVE-25594
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> The goal of this jira is to add a new QT option for setting up a JDBC DBMS and 
> using it in qtests which need a JDBC endpoint up and running. It can be used 
> in tests with external JDBC tables, connectors, etc.
> A sample file using the proposed option ({{qt:database}}) is shown below.
> {code:sql}
> --!qt:database:postgres:init_script_1234.sql:cleanup_script_1234.sql
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
> "hive.sql.dbcp.username" = "qtestuser",
> "hive.sql.dbcp.password" = "qtestpassword",
> "hive.sql.table" = "country");
> EXPLAIN CBO SELECT COUNT(*) from country;
> SELECT COUNT(*) from country;
> {code}
> This builds upon HIVE-25423 but proposes to use JDBC datasources without the 
> need for a specific CLI driver. Furthermore, the proposed QT option 
> syntax allows using customised init/cleanup scripts for the JDBC datasource 
> per test.
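> For illustration only, a hypothetical pair of per-test scripts for the option above might look like this (the file names and contents are assumptions matching the sample, not part of the patch):
> {code:sql}
```sql
-- init_script_1234.sql (hypothetical): runs against the Postgres qtestDB before the test
CREATE TABLE country (name varchar(80));
INSERT INTO country (name) VALUES ('India'), ('Russia'), ('USA');

-- cleanup_script_1234.sql (hypothetical): runs after the test
DROP TABLE country;
```
> {code}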



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=675778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675778
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:27
Start Date: 04/Nov/21 01:27
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #2759:
URL: https://github.com/apache/hive/pull/2759






Issue Time Tracking
---

Worklog Id: (was: 675778)
Time Spent: 2h  (was: 1h 50m)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:103)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:83)
>  [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:98) 
> [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:95)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:78)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:342)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:324) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:734) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:717) 
> 
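The root cause reads naturally from the stack trace above: the handler queries the bare table name without a schema qualifier, so Postgres cannot resolve `country` outside the default search path. The idea behind the fix can be sketched as follows (hypothetical illustrative code, not Hive's actual JdbcStorageHandler):

```java
// Hypothetical sketch: qualify the table name with hive.sql.schema (when set)
// before building metadata queries, so the database resolves "world.country"
// instead of the bare "country".
public class QualifiedTableNameSketch {
    public static String qualify(String schema, String table) {
        // No schema configured: keep the bare table name (default schema).
        if (schema == null || schema.isEmpty()) {
            return table;
        }
        return schema + "." + table;
    }

    public static void main(String[] args) {
        // With hive.sql.schema = "world"
        System.out.println("SELECT * FROM " + qualify("world", "country"));
        // Without hive.sql.schema
        System.out.println("SELECT * FROM " + qualify(null, "country"));
    }
}
```

With the `hive.sql.schema = "world"` property from the DDL above, the generated statement targets `world.country` and the `relation "country" does not exist` error goes away.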

[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=675771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675771
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:27
Start Date: 04/Nov/21 01:27
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741678476



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -680,6 +680,9 @@ public static ConfVars getMetaConf(String name) {
 
DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE("metastore.direct.sql.max.elements.values.clause",
 "hive.direct.sql.max.elements.values.clause",
 1000, "The maximum number of values in a VALUES clause for INSERT statement."),
+DIRECT_SQL_MAX_PARAMETERS("metastore.direct.sql.max.parameters",
+"hive.direct.sql.max.parameters", 1000, "The maximum parameters the\n" 
+

Review comment:
   Rephrase: "... maximum query parameters..."

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
##
@@ -316,7 +317,7 @@ public static String getFullTableName(String dbName, String tableName) {
   // Compute the size of a query when the 'nextValue' is added to the current query.
   int querySize = querySizeExpected(buf.length(), nextValue.length(), suffix.length(), addParens);
 
-  if (querySize > maxQueryLength * 1024) {
+  if (querySize > maxQueryLength * 1024 || currentCount >= maxParameters) {

Review comment:
   nit: Keep each condition within ()

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception {
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   Why would this test fail?






Issue Time Tracking
---

Worklog Id: (was: 675771)
Time Spent: 2h 20m  (was: 2h 10m)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with a huge number of parameters when 
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH have very 
> small values while generating the delete query for the completed_compactions table.
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}
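The splitting the ticket asks for can be sketched as follows (a hedged illustration of the idea, not the actual `TxnUtils.buildQueryWithINClause` code): chunk the ID list into IN clauses bounded by a maximum parameter count, so no single query exceeds what the backing SQL database allows.

```java
// Illustrative sketch for HIVE-25659: bound each generated IN clause by a
// maximum parameter count in addition to any query-length limit.
import java.util.ArrayList;
import java.util.List;

public class InClauseSplitSketch {
    public static List<String> buildInClauses(String prefix, List<Long> ids, int maxParameters) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += maxParameters) {
            // Take at most maxParameters IDs per generated query.
            List<Long> chunk = ids.subList(i, Math.min(i + maxParameters, ids.size()));
            StringBuilder sb = new StringBuilder(prefix).append(" IN (");
            for (int j = 0; j < chunk.size(); j++) {
                if (j > 0) {
                    sb.append(',');
                }
                sb.append(chunk.get(j));
            }
            queries.add(sb.append(')').toString());
        }
        return queries;
    }
}
```

With 4321 IDs and `maxParameters = 2000`, this yields three queries of 2000, 2000, and 321 parameters, mirroring the counts asserted in the updated TestTxnUtils in the review below.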





[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=675701&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675701
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:19
Start Date: 04/Nov/21 01:19
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #2759:
URL: https://github.com/apache/hive/pull/2759#discussion_r741798099



##
File path: ql/src/test/queries/clientpositive/jdbc_table_with_schema_mssql.q
##
@@ -0,0 +1,54 @@
+--! qt:database:mssql:q_test_country_table_with_schema.mssql.sql
+-- Microsoft SQL server allows multiple schemas per database so to disambiguate between tables in different schemas it
+-- is necessary to set the hive.sql.schema property properly.
+
+-- Some JDBC APIs require the catalog, schema, and table names to be passed exactly as they are stored in the database.
+-- MSSQL stores unquoted identifiers by first converting them to lowercase thus the hive.sql.schema and
+-- hive.sql.table properties below are specified in lowercase.
+
+CREATE EXTERNAL TABLE country_0 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "bob",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_0;
+SELECT COUNT(*) FROM country_0;
+
+CREATE EXTERNAL TABLE country_1 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "alice",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_1;
+SELECT COUNT(*) FROM country_1;
+
+-- Test DML statements are working fine when accessing table in non-default schema
+INSERT INTO country_1 VALUES (8, 'Hungary');

Review comment:
   I added some plans in 
https://github.com/apache/hive/pull/2759/commits/5aa343db3cb36a5aca6c4ef0cfb94263fba38585






[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=675660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675660
 ]

ASF GitHub Bot logged work on HIVE-24590:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:14
Start Date: 04/Nov/21 01:14
Worklog Time Spent: 10m 
  Work Description: belugabehr edited a comment on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there are no changes in overall behavior, LGTM (+1) pending tests pass.
   
   Thank you for helping to document this behavior.




Issue Time Tracking
---

Worklog Id: (was: 675660)
Time Spent: 5h 50m  (was: 5h 40m)

> Operation Logging still leaks the log4j Appenders
> -
>
> Key: HIVE-24590
> URL: https://issues.apache.org/jira/browse/HIVE-24590
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Eugene Chung
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot 
> 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen 
> Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, 
> Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> I'm using Hive 3.1.2 with options below.
>  * hive.server2.logging.operation.enabled=true
>  * hive.server2.logging.operation.level=VERBOSE
>  * hive.async.log.enabled=false
> I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 
> but HS2 still leaks log4j RandomAccessFileManager.
> !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197!
> I checked the operation log file which is not closed/deleted properly.
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272!
> Then there's the log,
> {code:java}
> client.TezClient: Shutting down Tez Session, sessionName= {code}
> !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26!





[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=675634&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675634
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:11
Start Date: 04/Nov/21 01:11
Worklog Time Spent: 10m 
  Work Description: guptanikhil007 commented on a change in pull request 
#2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741713387



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception {
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   If the `NOT IN` query cannot fit into a single query, it should fail according to the function definition







Issue Time Tracking
---

Worklog Id: (was: 675634)
Time Spent: 2h 10m  (was: 2h)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with huge number of 

[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=675621&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675621
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:09
Start Date: 04/Nov/21 01:09
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #2759:
URL: https://github.com/apache/hive/pull/2759#discussion_r741675549



##
File path: ql/src/test/queries/clientpositive/jdbc_table_with_schema_mssql.q
##
@@ -0,0 +1,54 @@
+--! qt:database:mssql:q_test_country_table_with_schema.mssql.sql
+-- Microsoft SQL server allows multiple schemas per database so to disambiguate between tables in different schemas it
+-- is necessary to set the hive.sql.schema property properly.
+
+-- Some JDBC APIs require the catalog, schema, and table names to be passed exactly as they are stored in the database.
+-- MSSQL stores unquoted identifiers by first converting them to lowercase thus the hive.sql.schema and
+-- hive.sql.table properties below are specified in lowercase.
+
+CREATE EXTERNAL TABLE country_0 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "bob",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_0;
+SELECT COUNT(*) FROM country_0;
+
+CREATE EXTERNAL TABLE country_1 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "alice",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_1;
+SELECT COUNT(*) FROM country_1;
+
+-- Test DML statements are working fine when accessing table in non-default schema
+INSERT INTO country_1 VALUES (8, 'Hungary');

Review comment:
   I'm not familiar with how DML statements are treated when the target is a JDBC table. I guess a `HiveJdbcConverter` shows up in the plan.
   Could you please add an 
   ```
   explain cbo insert into country_1... 
   ```
   statement. Feel free to ignore my comment if we already have one.






Issue Time Tracking
---

Worklog Id: (was: 675621)
Time Spent: 1h 40m  (was: 1.5h)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
> 

[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=675609&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675609
 ]

ASF GitHub Bot logged work on HIVE-25596:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:07
Start Date: 04/Nov/21 01:07
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741567981



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/metric/ReplicationMetricCollector.java
##
@@ -58,14 +59,15 @@ public void setMetricsMBean(ObjectName metricsMBean) {
 
   public ReplicationMetricCollector(String dbName, Metadata.ReplicationType replicationType,
  String stagingDir, long dumpExecutionId, HiveConf conf) {
+this.conf = conf;
 checkEnabledForTests(conf);
 String policy = conf.get(Constants.SCHEDULED_QUERY_SCHEDULENAME);
 long executionId = conf.getLong(Constants.SCHEDULED_QUERY_EXECUTIONID, 0L);
 if (!StringUtils.isEmpty(policy) && executionId > 0) {
   isEnabled = true;
   metricCollector = MetricCollector.getInstance().init(conf);
   MetricSink.getInstance().init(conf);
-  Metadata metadata = new Metadata(dbName, replicationType, stagingDir);
+  Metadata metadata = new Metadata(dbName, replicationType, testingModeEnabled() ? "dummyDir" :stagingDir);

Review comment:
   nit: shift this staging logic calculation to a separate method

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDeserialize.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.metastore.messaging.json.JSONMessageEncoder;
+import org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
+
+/**
+ * GenericUDFDeserializeString.
+ *
+ */
+@Description(name = "deserialize",
+value="_FUNC_(message, encodingFormat) - Returns deserialized string of encoded message.",
+extended="Example:\n"
++ "  > SELECT _FUNC_('H4sI/ytJLS4BAAx+f9gE', 'gzip(json-2.0)') FROM src LIMIT 1;\n"

Review comment:
   The base64 encoding is missed out in the description: even though the passed parameter is "gzip(json-2.0)", it won't work unless the passed content is base64 encoded.
   Moreover, the json part in "gzip(json-2.0)" is confusing at the UDF level: even if the underlying string is not JSON, the UDF will work just fine. Mandating that the user pass json even when the content is not doesn't sit well; at the UDF level, plain gzip would have been enough.
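   The behavior described here — the payload must be base64-encoded gzip before the UDF can decode it — can be sketched with a plain-JDK round trip (illustrative code assumed from the discussion, not the UDF's actual source):

```java
// Round-trip sketch: gzip-compress a message and base64-encode the binary
// output so it is safe to pass around as a string, then reverse both steps.
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipBase64Sketch {
    public static String encode(String message) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
                gzip.write(message.getBytes(StandardCharsets.UTF_8));
            }
            // Base64 makes the binary gzip output representable as text.
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static String decode(String payload) {
        try {
            byte[] raw = Base64.getDecoder().decode(payload);
            try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(raw))) {
                return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

   Note that the string content itself is opaque here: nothing in the decode step cares whether it is JSON, which is the crux of the naming concern above.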

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDeserialize.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, 

[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=675593&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675593
 ]

ASF GitHub Bot logged work on HIVE-24590:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:05
Start Date: 04/Nov/21 01:05
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there are no changes in overall behavior, LGTM (+1) pending tests passing


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 675593)
Time Spent: 5h 40m  (was: 5.5h)

> Operation Logging still leaks the log4j Appenders
> -
>
> Key: HIVE-24590
> URL: https://issues.apache.org/jira/browse/HIVE-24590
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Eugene Chung
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot 
> 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen 
> Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, 
> Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> I'm using Hive 3.1.2 with options below.
>  * hive.server2.logging.operation.enabled=true
>  * hive.server2.logging.operation.level=VERBOSE
>  * hive.async.log.enabled=false
> I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 
> but HS2 still leaks log4j RandomAccessFileManager.
> !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197!
> I checked the operation log file which is not closed/deleted properly.
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272!
> Then there's the log,
> {code:java}
> client.TezClient: Shutting down Tez Session, sessionName= {code}
> !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=675592=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675592
 ]

ASF GitHub Bot logged work on HIVE-25596:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:05
Start Date: 04/Nov/21 01:05
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741578189



##
File path: 
standalone-metastore/metastore-server/src/test/resources/sql/postgres/upgrade-3.1.3000-to-4.0.0.postgres.sql
##
@@ -51,6 +51,10 @@ CREATE TABLE "REPLICATION_METRICS" (
 --Increase the size of RM_PROGRESS to accomodate the replication statistics
 ALTER TABLE "REPLICATION_METRICS" ALTER "RM_PROGRESS" TYPE varchar(24000);
 
+ALTER TABLE "REPLICATION_METRICS" ALTER "RM_PROGRESS" TYPE varchar(1);

Review comment:
   This is tested as part of ITestPostgres.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDeserialize.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.metastore.messaging.json.JSONMessageEncoder;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
+
+/**
+ * GenericUDFDeserializeString.
+ *
+ */
+@Description(name = "deserialize",
+value="_FUNC_(message, encodingFormat) - Returns deserialized string of encoded message.",
+extended="Example:\n"
++ "  > SELECT _FUNC_('H4sI/ytJLS4BAAx+f9gE', 'gzip(json-2.0)') FROM src LIMIT 1;\n"
++ "  test")
+public class GenericUDFDeserialize extends GenericUDF {
+
+private static final int ARG_COUNT = 2; // Number of arguments to this UDF
+private static final String FUNC_NAME = "deserialize"; // External Name
+
+private transient PrimitiveObjectInspector stringOI = null;
+private transient PrimitiveObjectInspector encodingFormat = null;
+
+@Override
+public ObjectInspector initialize(ObjectInspector[] arguments)
+throws UDFArgumentException {
+if (arguments.length != ARG_COUNT) {
+throw new UDFArgumentException("The function " + FUNC_NAME + " accepts " + ARG_COUNT + " arguments.");
+}
+for (ObjectInspector arg: arguments) {
+if (arg.getCategory() != ObjectInspector.Category.PRIMITIVE ||
+PrimitiveObjectInspectorUtils.PrimitiveGrouping.STRING_GROUP != PrimitiveObjectInspectorUtils.getPrimitiveGrouping(
+((PrimitiveObjectInspector)arg).getPrimitiveCategory())){
+throw new UDFArgumentTypeException(0, "The arguments to " + FUNC_NAME + " must be a string/varchar");
+}
+}
+stringOI = (PrimitiveObjectInspector) arguments[0];
+encodingFormat = (PrimitiveObjectInspector) arguments[1];
+return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
+}
+
+@Override
+public Object evaluate(DeferredObject[] arguments) throws HiveException {
+String value = PrimitiveObjectInspectorUtils.getString(arguments[0].get(), stringOI);
+String messageFormat = PrimitiveObjectInspectorUtils.getString(arguments[1].get(), encodingFormat);
+if (value == null) {
+return null;
+} else if (messageFormat == null || messageFormat.isEmpty() || JSONMessageEncoder.FORMAT.equalsIgnoreCase(value)) {
+ 

[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=675580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675580
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:04
Start Date: 04/Nov/21 01:04
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #2742:
URL: https://github.com/apache/hive/pull/2742


   




Issue Time Tracking
---

Worklog Id: (was: 675580)
Time Spent: 3h 20m  (was: 3h 10m)

> Setup JDBC databases in tests via QT options
> 
>
> Key: HIVE-25594
> URL: https://issues.apache.org/jira/browse/HIVE-25594
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The goal of this jira is to add a new QT option for setting up JDBC DBMS and 
> using it in qtests which need a JDBC endpoint up and running. It can be used 
> in tests with external JDBC tables, connectors, etc.
> A sample file using the proposed option ({{qt:database}}) is shown below.
> {code:sql}
> --!qt:database:postgres:init_sript_1234.sql:cleanup_script_1234.sql
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
> "hive.sql.dbcp.username" = "qtestuser",
> "hive.sql.dbcp.password" = "qtestpassword",
> "hive.sql.table" = "country");
> EXPLAIN CBO SELECT COUNT(*) from country;
> SELECT COUNT(*) from country;
> {code}
> This builds upon HIVE-25423 but proposes to use JDBC datasources without the 
> need for using a specific CLI driver. Furthermore, the proposed QT option 
> syntax allows using customised init/cleanup scripts for the JDBC datasource 
> per test.
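
The patch replaces a string switch with an enum-per-database factory. A minimal, hypothetical sketch of that pattern (names invented; this is not the actual QTestDatabaseHandler code):

```java
import java.util.Locale;

public class Main {
    // Stand-in for AbstractExternalDB; the real class manages docker containers.
    interface ExternalDB {
        String label();
    }

    // Each constant supplies its own factory method, so supporting a new
    // database means adding a constant instead of extending a switch statement.
    enum DatabaseType {
        POSTGRES {
            ExternalDB create() { return () -> "postgres"; }
        },
        MYSQL {
            ExternalDB create() { return () -> "mysql"; }
        };

        abstract ExternalDB create();
    }

    public static void main(String[] args) {
        // Look up the qt:database argument case-insensitively, then instantiate.
        ExternalDB db = DatabaseType.valueOf("postgres".toUpperCase(Locale.ROOT)).create();
        System.out.println(db.label());
    }
}
```

An unknown database name makes valueOf() throw IllegalArgumentException, which replaces the switch's explicit default branch.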





[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=675577=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675577
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 01:04
Start Date: 04/Nov/21 01:04
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741678476



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -680,6 +680,9 @@ public static ConfVars getMetaConf(String name) {
 
DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE("metastore.direct.sql.max.elements.values.clause",
 "hive.direct.sql.max.elements.values.clause",
 1000, "The maximum number of values in a VALUES clause for INSERT 
statement."),
+DIRECT_SQL_MAX_PARAMETERS("metastore.direct.sql.max.parameters",
+"hive.direct.sql.max.parameters", 1000, "The maximum parameters the\n" 
+

Review comment:
   Rephrase: "... maximum query parameters..."

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
##
@@ -316,7 +317,7 @@ public static String getFullTableName(String dbName, String tableName) {
   // Compute the size of a query when the 'nextValue' is added to the current query.
   int querySize = querySizeExpected(buf.length(), nextValue.length(), suffix.length(), addParens);
 
-  if (querySize > maxQueryLength * 1024) {
+  if (querySize > maxQueryLength * 1024 || currentCount >= maxParameters) {

Review comment:
   nit: Keep each condition within ()

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   Why would this test fail?






Issue Time Tracking
---

Worklog Id: (was: 675577)
Time Spent: 2h  (was: 1h 50m)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with huge number of parameters with very small value of 
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while 
> generating delete query for completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}
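
The splitting this issue asks for can be illustrated with a simplified, hypothetical batcher (not the actual TxnUtils logic, which additionally caps the query length in KB): the ID list is cut into chunks so that no single IN clause exceeds the maximum parameter count:

```java
import java.util.ArrayList;
import java.util.List;

public class Main {
    // Split `ids` into IN clauses of at most `maxParams` parameters each.
    static List<String> buildInClauses(String column, List<Long> ids, int maxParams) {
        List<String> clauses = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += maxParams) {
            List<Long> batch = ids.subList(i, Math.min(i + maxParams, ids.size()));
            StringBuilder sb = new StringBuilder(column).append(" IN (");
            for (int j = 0; j < batch.size(); j++) {
                if (j > 0) {
                    sb.append(",");
                }
                sb.append(batch.get(j));
            }
            clauses.add(sb.append(")").toString());
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long v = 1; v <= 2500; v++) {
            ids.add(v);
        }
        // 2500 ids with a 1000-parameter cap yield batches of 1000, 1000, 500.
        System.out.println(buildInClauses("TXN_ID", ids, 1000).size());
    }
}
```

Capping by parameter count matters independently of query length because some databases (e.g. SQL Server's 2100-parameter limit) reject statements by parameter count even when the SQL text is short.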





[jira] [Work logged] (HIVE-25670) Avoid getTable() calls for foreign key tables not used in a query

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25670?focusedWorklogId=675448=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675448
 ]

ASF GitHub Bot logged work on HIVE-25670:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:51
Start Date: 04/Nov/21 00:51
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera opened a new pull request #2763:
URL: https://github.com/apache/hive/pull/2763








Issue Time Tracking
---

Worklog Id: (was: 675448)
Time Spent: 20m  (was: 10m)

> Avoid getTable() calls for foreign key tables not used in a query
> -
>
> Key: HIVE-25670
> URL: https://issues.apache.org/jira/browse/HIVE-25670
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In RelOptHiveTable, we generate the referential constraints for the table. In 
> this process, we make a metastore call to fetch these tables.  This is used 
> later on for potential gain on joins done on the key.
> However, there is no need to fetch these constraints if the table is not used 
> in the query. If we can get this information up front, we can save a bit on 
> compilation time.
>  





[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=675439=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675439
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:51
Start Date: 04/Nov/21 00:51
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #2742:
URL: https://github.com/apache/hive/pull/2742#discussion_r741839431



##
File path: 
itests/util/src/main/java/org/apache/hadoop/hive/ql/externalDB/AbstractExternalDB.java
##
@@ -71,27 +59,8 @@ public ProcessResults(String stdout, String stderr, int rc) {
 }
 }
 
-public static AbstractExternalDB initalizeExternalDB(String 
externalDBType) throws IOException {
-AbstractExternalDB abstractExternalDB;
-switch (externalDBType) {
-case "mysql":
-abstractExternalDB = new MySQLExternalDB();
-break;
-case "postgres":
-abstractExternalDB = new PostgresExternalDB();
-break;
-default:
-throw new IOException("unsupported external database type " + 
externalDBType);
-}
-return abstractExternalDB;
-}
-
-public AbstractExternalDB(String externalDBType) {
-this.externalDBType = externalDBType;
-}
-
-protected String getDockerContainerName() {
-return String.format("qtestExternalDB-%s", externalDBType);
+private final String getDockerContainerName() {

Review comment:
   Logged https://issues.apache.org/jira/browse/HIVE-25667 for tracking 
this further.

##
File path: 
itests/util/src/main/java/org/apache/hadoop/hive/ql/qoption/QTestDatabaseHandler.java
##
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.qoption;
+
+import org.apache.hadoop.hive.ql.QTestUtil;
+import org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB;
+import org.apache.hadoop.hive.ql.externalDB.MSSQLServer;
+import org.apache.hadoop.hive.ql.externalDB.MariaDB;
+import org.apache.hadoop.hive.ql.externalDB.MySQLExternalDB;
+import org.apache.hadoop.hive.ql.externalDB.Oracle;
+import org.apache.hadoop.hive.ql.externalDB.PostgresExternalDB;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.Arrays;
+import java.util.EnumMap;
+import java.util.Map;
+
+/**
+ * An option handler for spinning (resp. stopping) databases before (resp. 
after) running a test.
+ *
+ * Syntax: qt:database:DatabaseType:path_to_init_script
+ *
+ * The database type ({@link DatabaseType}) is an obligatory argument. The initialization script can be omitted.
+ *
+ * Current limitations:
+ * <ul>
+ *   <li>Only one test/file per run</li>
+ *   <li>Does not support parallel execution</li>
+ *   <li>Cannot instantiate more than one database of the same type per test</li>
+ * </ul>
+ */
+public class QTestDatabaseHandler implements QTestOptionHandler {
+  private static final Logger LOG = 
LoggerFactory.getLogger(QTestDatabaseHandler.class);
+
+  private enum DatabaseType {
+POSTGRES {
+  @Override
+  AbstractExternalDB create() {
+return new PostgresExternalDB();
+  }
+}, MYSQL {
+  @Override
+  AbstractExternalDB create() {
+return new MySQLExternalDB();
+  }
+}, MARIADB {
+  @Override
+  AbstractExternalDB create() {
+return new MariaDB();
+  }
+}, MSSQL {
+  @Override
+  AbstractExternalDB create() {
+return new MSSQLServer();
+  }
+}, ORACLE {
+  @Override
+  AbstractExternalDB create() {
+return new Oracle();
+  }
+};
+
+abstract AbstractExternalDB create();
+  }
+
+  private final Map databaseToScript = new 
EnumMap<>(DatabaseType.class);
+
+  @Override
+  public void processArguments(String arguments) {
+String[] args = arguments.split(":");
+if (args.length == 0) {
+  throw new IllegalArgumentException("No arguments provided");
+}
+if 

[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=675295=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675295
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:37
Start Date: 04/Nov/21 00:37
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #2759:
URL: https://github.com/apache/hive/pull/2759








Issue Time Tracking
---

Worklog Id: (was: 675295)
Time Spent: 1.5h  (was: 1h 20m)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:103)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:83)
>  [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:98) 
> [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:95)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:78)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:342)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:324) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:734) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:717) 
> 

[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=675196=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675196
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:23
Start Date: 04/Nov/21 00:23
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #2759:
URL: https://github.com/apache/hive/pull/2759#discussion_r741798099



##
File path: ql/src/test/queries/clientpositive/jdbc_table_with_schema_mssql.q
##
@@ -0,0 +1,54 @@
+--! qt:database:mssql:q_test_country_table_with_schema.mssql.sql
+-- Microsoft SQL server allows multiple schemas per database so to 
disambiguate between tables in different schemas it
+-- is necessary to set the hive.sql.schema property properly.
+
+-- Some JDBC APIs require the catalog, schema, and table names to be passed 
exactly as they are stored in the database.
+-- MSSQL stores unquoted identifiers by first converting them to lowercase 
thus the hive.sql.schema and
+-- hive.sql.table properties below are specified in lowercase.
+
+CREATE EXTERNAL TABLE country_0 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "bob",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_0;
+SELECT COUNT(*) FROM country_0;
+
+CREATE EXTERNAL TABLE country_1 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "alice",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_1;
+SELECT COUNT(*) FROM country_1;
+
+-- Test DML statements are working fine when accessing table in non-default 
schema
+INSERT INTO country_1 VALUES (8, 'Hungary');

Review comment:
   I added some plans in 
https://github.com/apache/hive/pull/2759/commits/5aa343db3cb36a5aca6c4ef0cfb94263fba38585






Issue Time Tracking
---

Worklog Id: (was: 675196)
Time Spent: 1h 20m  (was: 1h 10m)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267)
>  ~[postgresql-42.2.14.jar:42.2.14]
> 

[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=675153=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675153
 ]

ASF GitHub Bot logged work on HIVE-24590:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:17
Start Date: 04/Nov/21 00:17
Worklog Time Spent: 10m 
  Work Description: belugabehr edited a comment on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there are no changes in overall behavior, LGTM (+1) pending tests passing.
   
   Thank you for helping to document this behavior.




Issue Time Tracking
---

Worklog Id: (was: 675153)
Time Spent: 5.5h  (was: 5h 20m)

> Operation Logging still leaks the log4j Appenders
> -
>
> Key: HIVE-24590
> URL: https://issues.apache.org/jira/browse/HIVE-24590
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Eugene Chung
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot 
> 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen 
> Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, 
> Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> I'm using Hive 3.1.2 with options below.
>  * hive.server2.logging.operation.enabled=true
>  * hive.server2.logging.operation.level=VERBOSE
>  * hive.async.log.enabled=false
> I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 
> but HS2 still leaks log4j RandomAccessFileManager.
> !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197!
> I checked the operation log file which is not closed/deleted properly.
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272!
> Then there's the log,
> {code:java}
> client.TezClient: Shutting down Tez Session, sessionName= {code}
> !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26!





[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=675126=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675126
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:14
Start Date: 04/Nov/21 00:14
Worklog Time Spent: 10m 
  Work Description: guptanikhil007 commented on a change in pull request 
#2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741713387



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   If the `not in` query cannot fit in a single query, it should fail 
according to the function definition






Issue Time Tracking
---

Worklog Id: (was: 675126)
Time Spent: 1h 50m  (was: 1h 40m)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with huge number of 

[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=675114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675114
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:13
Start Date: 04/Nov/21 00:13
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #2759:
URL: https://github.com/apache/hive/pull/2759#discussion_r741675549



##
File path: ql/src/test/queries/clientpositive/jdbc_table_with_schema_mssql.q
##
@@ -0,0 +1,54 @@
+--! qt:database:mssql:q_test_country_table_with_schema.mssql.sql
+-- Microsoft SQL server allows multiple schemas per database so to 
disambiguate between tables in different schemas it
+-- is necessary to set the hive.sql.schema property properly.
+
+-- Some JDBC APIs require the catalog, schema, and table names to be passed 
exactly as they are stored in the database.
+-- MSSQL stores unquoted identifiers by first converting them to lowercase 
thus the hive.sql.schema and
+-- hive.sql.table properties below are specified in lowercase.
+
+CREATE EXTERNAL TABLE country_0 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "bob",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_0;
+SELECT COUNT(*) FROM country_0;
+
+CREATE EXTERNAL TABLE country_1 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "alice",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_1;
+SELECT COUNT(*) FROM country_1;
+
+-- Test DML statements are working fine when accessing table in non-default 
schema
+INSERT INTO country_1 VALUES (8, 'Hungary');

Review comment:
   I'm not familiar with how DML statements are treated when the target is a 
JDBC table. I guess a `HiveJdbcConverter` shows up in the plan.
   Could you please add an 
   ```
   explain cbo insert into country_1... 
   ```
   statement? Feel free to ignore my comment if we already have one.






Issue Time Tracking
---

Worklog Id: (was: 675114)
Time Spent: 1h 10m  (was: 1h)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  
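The trace above shows the JDBC handler probing column metadata with an unqualified table name, so Postgres resolves it against the default schema and fails. A minimal sketch of the fix direction, schema-qualifying the probed table when hive.sql.schema is set; class and method names here are hypothetical, not Hive's actual handler API:

```java
// Hypothetical sketch: qualify the probed table with hive.sql.schema when it
// is set, so the metadata query targets "world"."country" instead of the
// unqualified country.
class JdbcTableRef {
    static String qualified(String schema, String table) {
        String t = "\"" + table + "\"";
        return (schema == null || schema.isEmpty()) ? t : "\"" + schema + "\"." + t;
    }

    public static void main(String[] args) {
        // A metadata probe built from the table properties in the DDL above.
        String sql = "SELECT * FROM " + qualified("world", "country") + " WHERE 1 = 0";
        System.out.println(sql); // SELECT * FROM "world"."country" WHERE 1 = 0
    }
}
```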

[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=675098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675098
 ]

ASF GitHub Bot logged work on HIVE-25596:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:11
Start Date: 04/Nov/21 00:11
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741567981



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/metric/ReplicationMetricCollector.java
##
@@ -58,14 +59,15 @@ public void setMetricsMBean(ObjectName metricsMBean) {
 
   public ReplicationMetricCollector(String dbName, Metadata.ReplicationType 
replicationType,
  String stagingDir, long dumpExecutionId, HiveConf 
conf) {
+this.conf = conf;
 checkEnabledForTests(conf);
 String policy = conf.get(Constants.SCHEDULED_QUERY_SCHEDULENAME);
 long executionId = conf.getLong(Constants.SCHEDULED_QUERY_EXECUTIONID, 0L);
 if (!StringUtils.isEmpty(policy) && executionId > 0) {
   isEnabled = true;
   metricCollector = MetricCollector.getInstance().init(conf);
   MetricSink.getInstance().init(conf);
-  Metadata metadata = new Metadata(dbName, replicationType, stagingDir);
+  Metadata metadata = new Metadata(dbName, replicationType, 
testingModeEnabled() ? "dummyDir" :stagingDir);

Review comment:
   nit: shift this staging logic calculation to a separate method
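A minimal sketch of the helper the reviewer asks for, isolating the choice between the real staging dir and the test-mode dummy value; the values mirror the diff above, but the class itself is hypothetical:

```java
// Hypothetical helper extracting the staging-dir decision flagged in the
// review: in testing mode, substitute a fixed dummy directory.
class StagingDirs {
    static String resolve(boolean testingModeEnabled, String stagingDir) {
        return testingModeEnabled ? "dummyDir" : stagingDir;
    }

    public static void main(String[] args) {
        System.out.println(resolve(false, "/warehouse/staging")); // /warehouse/staging
        System.out.println(resolve(true, "/warehouse/staging"));  // dummyDir
    }
}
```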

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDeserialize.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.metastore.messaging.json.JSONMessageEncoder;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
+
+/**
+ * GenericUDFDeserializeString.
+ *
+ */
+@Description(name = "deserialize",
+value="_FUNC_(message, encodingFormat) - Returns deserialized string 
of encoded message.",
+extended="Example:\n"
++ "  > SELECT _FUNC_('H4sI/ytJLS4BAAx+f9gE', 
'gzip(json-2.0)') FROM src LIMIT 1;\n"

Review comment:
   The base64 encoding is missed out here in the description: even though 
the passed parameter is "gzip(json-2.0)", it won't work unless the passed 
content is base64-encoded. 
   Moreover, the json part in "gzip(json-2.0)" would be confusing at the UDF 
level. Even if the underlying string is not JSON, the UDF works just fine, so 
mandating that the user pass json when the content may not be doesn't sit 
well. At the UDF level, just gzip would have been fine?
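The comment above points out that the stored message is gzip-compressed and then base64-encoded, so a deserializer must base64-decode before gunzipping. A minimal standalone round-trip sketch of that layering; the class is illustrative, not Hive's GzipJSONMessageEncoder:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative encode/decode pair: gzip the payload, then base64 it so it can
// be stored and passed around as a plain string.
class GzipBase64Codec {
    static String encode(String plain) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
                gz.write(plain.getBytes(StandardCharsets.UTF_8));
            }
            return Base64.getEncoder().encodeToString(bos.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String decode(String encoded) {
        byte[] gzipped = Base64.getDecoder().decode(encoded); // base64 first
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            return new String(gz.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(encode("test"))); // prints "test"
    }
}
```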

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDeserialize.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, 

[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=675079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675079
 ]

ASF GitHub Bot logged work on HIVE-25596:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:08
Start Date: 04/Nov/21 00:08
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741578189



##
File path: 
standalone-metastore/metastore-server/src/test/resources/sql/postgres/upgrade-3.1.3000-to-4.0.0.postgres.sql
##
@@ -51,6 +51,10 @@ CREATE TABLE "REPLICATION_METRICS" (
--Increase the size of RM_PROGRESS to accommodate the replication statistics
 ALTER TABLE "REPLICATION_METRICS" ALTER "RM_PROGRESS" TYPE varchar(24000);
 
+ALTER TABLE "REPLICATION_METRICS" ALTER "RM_PROGRESS" TYPE varchar(1);

Review comment:
   This is tested as part of ITestPostgres.

##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFDeserialize.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.udf.generic;
+
+import org.apache.hadoop.hive.metastore.messaging.json.JSONMessageEncoder;
+import 
org.apache.hadoop.hive.metastore.messaging.json.gzip.GzipJSONMessageEncoder;
+import org.apache.hadoop.hive.ql.exec.Description;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
+import org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException;
+import org.apache.hadoop.hive.ql.metadata.HiveException;
+import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
+import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
+import 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils;
+
+/**
+ * GenericUDFDeserializeString.
+ *
+ */
+@Description(name = "deserialize",
+value="_FUNC_(message, encodingFormat) - Returns deserialized string 
of encoded message.",
+extended="Example:\n"
++ "  > SELECT _FUNC_('H4sI/ytJLS4BAAx+f9gE', 
'gzip(json-2.0)') FROM src LIMIT 1;\n"
++ "  test")
+public class GenericUDFDeserialize extends GenericUDF {
+
+private static final int ARG_COUNT = 2; // Number of arguments to this UDF
+private static final String FUNC_NAME = "deserialize"; // External Name
+
+private transient PrimitiveObjectInspector stringOI = null;
+private transient PrimitiveObjectInspector encodingFormat = null;
+
+@Override
+public ObjectInspector initialize(ObjectInspector[] arguments)
+throws UDFArgumentException {
+if (arguments.length != ARG_COUNT) {
+throw new UDFArgumentException("The function " + FUNC_NAME + " 
accepts " + ARG_COUNT + " arguments.");
+}
+for (ObjectInspector arg: arguments) {
+if (arg.getCategory() != ObjectInspector.Category.PRIMITIVE ||
+
PrimitiveObjectInspectorUtils.PrimitiveGrouping.STRING_GROUP != 
PrimitiveObjectInspectorUtils.getPrimitiveGrouping(
+
((PrimitiveObjectInspector)arg).getPrimitiveCategory())){
+throw new UDFArgumentTypeException(0, "The arguments to " + 
FUNC_NAME + " must be a string/varchar");
+}
+}
+stringOI = (PrimitiveObjectInspector) arguments[0];
+encodingFormat = (PrimitiveObjectInspector) arguments[1];
+return PrimitiveObjectInspectorFactory.javaStringObjectInspector;
+}
+
+@Override
+public Object evaluate(DeferredObject[] arguments) throws HiveException {
+String value = 
PrimitiveObjectInspectorUtils.getString(arguments[0].get(), stringOI);
+String messageFormat = 
PrimitiveObjectInspectorUtils.getString(arguments[1].get(), encodingFormat);
+if (value == null) {
+return null;
+} else if (messageFormat == null || messageFormat.isEmpty() || 
JSONMessageEncoder.FORMAT.equalsIgnoreCase(value)) {
+ 

[jira] [Work logged] (HIVE-24590) Operation Logging still leaks the log4j Appenders

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24590?focusedWorklogId=675080&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675080
 ]

ASF GitHub Bot logged work on HIVE-24590:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:08
Start Date: 04/Nov/21 00:08
Worklog Time Spent: 10m 
  Work Description: belugabehr commented on pull request #2432:
URL: https://github.com/apache/hive/pull/2432#issuecomment-958607011


   OK, if there are no changes in overall behavior, LGTM (+1) pending tests passing




Issue Time Tracking
---

Worklog Id: (was: 675080)
Time Spent: 5h 20m  (was: 5h 10m)

> Operation Logging still leaks the log4j Appenders
> -
>
> Key: HIVE-24590
> URL: https://issues.apache.org/jira/browse/HIVE-24590
> Project: Hive
>  Issue Type: Bug
>  Components: Logging
>Reporter: Eugene Chung
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: Screen Shot 2021-01-06 at 18.42.05.png, Screen Shot 
> 2021-01-06 at 18.42.24.png, Screen Shot 2021-01-06 at 18.42.55.png, Screen 
> Shot 2021-01-06 at 21.38.32.png, Screen Shot 2021-01-06 at 21.47.28.png, 
> Screen Shot 2021-01-08 at 21.01.40.png, add_debug_log_and_trace.patch
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> I'm using Hive 3.1.2 with options below.
>  * hive.server2.logging.operation.enabled=true
>  * hive.server2.logging.operation.level=VERBOSE
>  * hive.async.log.enabled=false
> I already know the ticket, https://issues.apache.org/jira/browse/HIVE-17128 
> but HS2 still leaks log4j RandomAccessFileManager.
> !Screen Shot 2021-01-06 at 18.42.05.png|width=756,height=197!
> I checked the operation log file which is not closed/deleted properly.
> !Screen Shot 2021-01-06 at 18.42.24.png|width=603,height=272!
> Then there's the log,
> {code:java}
> client.TezClient: Shutting down Tez Session, sessionName= {code}
> !Screen Shot 2021-01-06 at 18.42.55.png|width=1372,height=26!





[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=675067&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675067
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:07
Start Date: 04/Nov/21 00:07
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #2742:
URL: https://github.com/apache/hive/pull/2742


   




Issue Time Tracking
---

Worklog Id: (was: 675067)
Time Spent: 3h  (was: 2h 50m)

> Setup JDBC databases in tests via QT options
> 
>
> Key: HIVE-25594
> URL: https://issues.apache.org/jira/browse/HIVE-25594
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The goal of this jira is to add a new QT option for setting up JDBC DBMS and 
> using it in qtests which need a JDBC endpoint up and running. It can be used 
> in tests with external JDBC tables, connectors, etc.
> A sample file using the proposed option ({{qt:database}}) is shown below.
> {code:sql}
> --!qt:database:postgres:init_sript_1234.sql:cleanup_script_1234.sql
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
> "hive.sql.dbcp.username" = "qtestuser",
> "hive.sql.dbcp.password" = "qtestpassword",
> "hive.sql.table" = "country");
> EXPLAIN CBO SELECT COUNT(*) from country;
> SELECT COUNT(*) from country;
> {code}
> This builds upon HIVE-25423 but proposes to use JDBC datasources without the 
> need for a using a specific CLI driver. Furthermore, the proposed QT option 
> syntax allows using customised init/cleanup scripts for the JDBC datasource 
> per test.
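The proposed option line has the shape `--!qt:database:<dbms>[:<init-script>[:<cleanup-script>]]`, as in the sample above. A minimal sketch of parsing it; the option name comes from the description, but the class and field names are hypothetical, not Hive's test-infrastructure code:

```java
// Hypothetical parser for the proposed qt:database option line.
class QtDatabaseOption {
    final String dbms;
    final String initScript;    // may be null when omitted
    final String cleanupScript; // may be null when omitted

    QtDatabaseOption(String dbms, String init, String cleanup) {
        this.dbms = dbms;
        this.initScript = init;
        this.cleanupScript = cleanup;
    }

    static QtDatabaseOption parse(String line) {
        String prefix = "--!qt:database:";
        if (!line.startsWith(prefix)) {
            throw new IllegalArgumentException("not a qt:database option: " + line);
        }
        String[] parts = line.substring(prefix.length()).split(":");
        return new QtDatabaseOption(parts[0],
                parts.length > 1 ? parts[1] : null,
                parts.length > 2 ? parts[2] : null);
    }

    public static void main(String[] args) {
        QtDatabaseOption o =
            parse("--!qt:database:postgres:init_sript_1234.sql:cleanup_script_1234.sql");
        System.out.println(o.dbms + " " + o.initScript + " " + o.cleanupScript);
    }
}
```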





[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=675064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-675064
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 04/Nov/21 00:07
Start Date: 04/Nov/21 00:07
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741678476



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -680,6 +680,9 @@ public static ConfVars getMetaConf(String name) {
 
DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE("metastore.direct.sql.max.elements.values.clause",
 "hive.direct.sql.max.elements.values.clause",
 1000, "The maximum number of values in a VALUES clause for INSERT 
statement."),
+DIRECT_SQL_MAX_PARAMETERS("metastore.direct.sql.max.parameters",
+"hive.direct.sql.max.parameters", 1000, "The maximum parameters the\n" 
+

Review comment:
   Rephrase: "... maximum query parameters..."

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
##
@@ -316,7 +317,7 @@ public static String getFullTableName(String dbName, String 
tableName) {
   // Compute the size of a query when the 'nextValue' is added to the 
current query.
   int querySize = querySizeExpected(buf.length(), nextValue.length(), 
suffix.length(), addParens);
 
-  if (querySize > maxQueryLength * 1024) {
+  if (querySize > maxQueryLength * 1024 || currentCount >= maxParameters) {

Review comment:
   nit: Keep each condition within ()

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   Why would this test fail?






Issue Time Tracking
---

Worklog Id: (was: 675064)
Time Spent: 1h 40m  (was: 1.5h)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with huge number of parameters with very small value of 
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH while 
> generating delete query for completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}
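The fix being reviewed bounds each generated query by both the maximum query length and a maximum parameter count, so neither cap alone can produce an oversized IN list. A minimal sketch of that splitting rule with hypothetical names; Hive's actual TxnUtils.buildQueryWithINClauseStrings differs:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: start a new query once either the length cap or the
// parameter-count cap would be exceeded by adding the next id.
class InClauseSplitter {
    static List<String> split(String prefix, String suffix, List<Long> ids,
                              int maxParameters, int maxLenChars) {
        List<String> queries = new ArrayList<>();
        StringBuilder in = new StringBuilder();
        int count = 0; // parameters in the query being built
        for (long id : ids) {
            String next = (count == 0 ? "" : ",") + id;
            boolean tooLong =
                prefix.length() + in.length() + next.length() + suffix.length() > maxLenChars;
            if (count > 0 && (count >= maxParameters || tooLong)) {
                queries.add(prefix + in + suffix); // close out the current query
                in.setLength(0);
                count = 0;
                next = String.valueOf(id);
            }
            in.append(next);
            count++;
        }
        if (count > 0) {
            queries.add(prefix + in + suffix);
        }
        return queries;
    }

    public static void main(String[] args) {
        // With a cap of 2 parameters, five ids split into three queries.
        List<String> qs = split("DELETE FROM COMPLETED_COMPACTIONS WHERE CC_ID IN (", ")",
                Arrays.asList(1L, 2L, 3L, 4L, 5L), 2, 10_000);
        qs.forEach(System.out::println);
    }
}
```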





[jira] [Resolved] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-25591.

Fix Version/s: 4.0.0
   Resolution: Fixed

Fixed in 
[0616bcaa2436ccbf388b635bfea160b47849553c|https://github.com/apache/hive/commit/0616bcaa2436ccbf388b635bfea160b47849553c].
 Thanks [~chiran54321]  for the quick prototype and [~kkasa] for the review!

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:103)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:83)
>  [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:98) 
> [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:95)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:78)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:342)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:324) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:734) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:717) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableDesc.toTable(CreateTableDesc.java:933)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.ddl.table.create.CreateTableOperation.execute(CreateTableOperation.java:59)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.ddl.DDLTask.execute(DDLTask.java:84) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 

[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=674991&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674991
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 22:24
Start Date: 03/Nov/21 22:24
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #2759:
URL: https://github.com/apache/hive/pull/2759


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674991)
Time Spent: 1h  (was: 50m)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC-compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:312) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369) 
> ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:153)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:103)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.commons.dbcp2.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:122)
>  ~[commons-dbcp2-2.7.0.jar:2.7.0]
>   at 
> org.apache.hive.storage.jdbc.dao.GenericJdbcDatabaseAccessor.getColumnNames(GenericJdbcDatabaseAccessor.java:83)
>  [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hive.storage.jdbc.JdbcSerDe.initialize(JdbcSerDe.java:98) 
> [hive-jdbc-handler-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:95)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreUtils.getDeserializer(HiveMetaStoreUtils.java:78)
>  [hive-metastore-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:342)
>  [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:324) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getColsInternal(Table.java:734) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:717) 
> [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
>   at 
> 
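The trace above points at GenericJdbcDatabaseAccessor.getColumnNames, which by all appearances probes the bare table name and ignores the configured "hive.sql.schema", so Postgres cannot resolve the relation. A minimal standalone sketch of schema-qualified probe construction (class, method names, and quoting are illustrative assumptions, not the actual Hive code):

```java
// Hypothetical sketch (not the actual Hive code): build the column-probe query
// with the configured schema, so Postgres resolves "world"."country" instead
// of the bare "country" that triggers the error above.
public class QualifiedTableName {

  // Quote and qualify: "schema"."table", or just "table" when no schema is set.
  static String qualify(String schema, String table) {
    if (schema == null || schema.isEmpty()) {
      return "\"" + table + "\"";
    }
    return "\"" + schema + "\".\"" + table + "\"";
  }

  // A zero-row probe used only to read column metadata.
  static String metadataQuery(String schema, String table) {
    return "SELECT * FROM " + qualify(schema, table) + " WHERE 1 = 0";
  }

  public static void main(String[] args) {
    System.out.println(metadataQuery("world", "country"));
    System.out.println(metadataQuery(null, "country"));
  }
}
```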

[jira] [Work logged] (HIVE-25670) Avoid getTable() calls for foreign key tables not used in a query

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25670?focusedWorklogId=674987&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674987
 ]

ASF GitHub Bot logged work on HIVE-25670:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 22:09
Start Date: 03/Nov/21 22:09
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera opened a new pull request #2763:
URL: https://github.com/apache/hive/pull/2763


   … a query
   
   RelOptHiveTable currently fetches the Table information for all
   referential constraint tables. However, it only needs to fetch the tables
   that are used in the query.
   
   
   
   ### What changes were proposed in this pull request?
   
   Changing RelOptHiveTable to only fetch referential constraint tables that 
are used in the query
   
   ### Why are the changes needed?
   
   To improve compilation performance.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   Regression testing (no new tests added)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674987)
Remaining Estimate: 0h
Time Spent: 10m

> Avoid getTable() calls for foreign key tables not used in a query
> -
>
> Key: HIVE-25670
> URL: https://issues.apache.org/jira/browse/HIVE-25670
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In RelOptHiveTable, we generate the referential constraints for the table. In 
> this process, we make a metastore call to fetch these tables.  This is used 
> later on for potential gains in joins done on the key.
> However, there is no need to fetch these constraints if the table is not used 
> in the query. If we can get this information up front, we can save a bit on 
> compilation time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
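The change described in HIVE-25670 can be illustrated with a small standalone sketch: referential-constraint tables are fetched only when they also appear in the query's table set, so the metastore call is skipped for the rest. All names here (ConstraintPruner, fetchConstraintTables, the fetchTable callback) are hypothetical, not Hive's RelOptHiveTable API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Illustrative sketch only: fetch constraint tables lazily, restricted to the
// tables actually referenced by the query being compiled.
public class ConstraintPruner {

  static Map<String, String> fetchConstraintTables(
      Set<String> constraintTables,          // foreign-key targets of this table
      Set<String> tablesUsedInQuery,         // tables referenced by the query
      Function<String, String> fetchTable) { // stand-in for a metastore getTable()
    Map<String, String> fetched = new HashMap<>();
    for (String t : constraintTables) {
      if (tablesUsedInQuery.contains(t)) {   // skip the metastore call otherwise
        fetched.put(t, fetchTable.apply(t));
      }
    }
    return fetched;
  }

  public static void main(String[] args) {
    Map<String, String> r = fetchConstraintTables(
        Set.of("dim_date", "dim_customer"),
        Set.of("fact_sales", "dim_date"),
        t -> "table-meta:" + t);
    System.out.println(r.keySet()); // only dim_date triggered a fetch
  }
}
```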


[jira] [Updated] (HIVE-25670) Avoid getTable() calls for foreign key tables not used in a query

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25670:
--
Labels: pull-request-available  (was: )

> Avoid getTable() calls for foreign key tables not used in a query
> -
>
> Key: HIVE-25670
> URL: https://issues.apache.org/jira/browse/HIVE-25670
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Steve Carlin
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In RelOptHiveTable, we generate the referential constraints for the table. In 
> this process, we make a metastore call to fetch these tables.  This is used 
> later on for potential gains in joins done on the key.
> However, there is no need to fetch these constraints if the table is not used 
> in the query. If we can get this information up front, we can save a bit on 
> compilation time.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25669) After Insert overwrite (managed table), the previous data of the table is not deleted

2021-11-03 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17438138#comment-17438138
 ] 

Gopal Vijayaraghavan commented on HIVE-25669:
-

bq. why aren't my old folders(base_009, base_0010) deleted?

https://cwiki.apache.org/confluence/display/hive/hive+transactions#HiveTransactions-BaseandDeltaDirectories

> After Insert overwrite (managed table), the previous data of the table is not 
> deleted
> -
>
> Key: HIVE-25669
> URL: https://issues.apache.org/jira/browse/HIVE-25669
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.1.0
> Environment: 1. hadoop eco versions
>   - hive : 3.1.0
>   - Tez : 0.9.1
>   - hdfs : 3.1.1
> 2. Table info
>   - table name : test_t1    (*sample name)
>   - table : Managed table
>   - partitioning : X (non partition)
> 3. Table properties
>   - transactional = true
>   - transactional_properties = insert_only
>   - bucketing_version = 2
>   - auto.purge =  true / false  (*apply both)
>  
>Reporter: Jihoon Lee
>Priority: Minor
>
> When running an INSERT OVERWRITE on the table, 'auto.purge' does not seem to work.
> h2. Step1. Create table
> create table test_t1 (
> col1 string,
> col2 string,
> col3 string,
> col4 string
> )
>  
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
> STORED AS INPUTFORMAT
> 'org.apache.hadoop.mapred.TextInputFormat'
> OUTPUTFORMAT
> 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
> LOCATION
> 'hdfs://nameservice1/user/hive/warehouse/st.db/test_ljh5'
> TBLPROPERTIES (
> 'auto.purge'='{color:#de350b}*false*{color}',
> 'bucketing_version'='2',
> 'transactional'='true',
> 'transactional_properties'='insert_only')
>  
> h2. 2. Insert overwrite
>  2-1)
> insert overwrite table test_t1 
> select * from origin_t1 limit 1;
>  2-2)
> insert overwrite table test_t1 
> select * from origin_t1 limit 2;
>   2-3)
> insert overwrite table test_t1 
> select * from origin_t1 limit 3;
>  
> h2. 3. Check HDFS files
>  - Hue file browser 
>   
> (screenshot: Hue file browser listing of the table directory, with the old base 
> directories still present)
> Why aren't my old folders (base_009, base_0010) deleted?
> It's the same whether I set the property to '*auto.purge=true*' or to 
> '*auto.purge=false*'.
>  
> And I have referenced here. 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML]
>  * INSERT OVERWRITE will overwrite any existing data in the table or partition
>  ** unless {{IF NOT EXISTS}} is provided for a partition (as of Hive 0.9.0).
>  ** As of Hive 2.3.0 (HIVE-15880), if the table has 
> [TBLPROPERTIES|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-listTableProperties]
>  ("auto.purge"="true") the previous data of the table is not moved to Trash 
> when INSERT OVERWRITE query is run against the table. This functionality is 
> applicable only for managed tables (see [managed 
> tables|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ManagedandExternalTables])
>  and is turned off when "auto.purge" property is unset or set to false.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
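The wiki page linked in the comment explains the behavior: on a transactional table each INSERT OVERWRITE writes a new base_N directory, readers only consult the highest committed base, and older bases are removed later by the compaction cleaner rather than synchronously at insert time. A standalone sketch of that reader-side selection (illustrative only, not Hive's actual file-picking code):

```java
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: readers select the highest committed base_N directory,
// so stale base_9/base_10 directories are invisible to queries until the
// compaction cleaner physically deletes them.
public class BasePicker {

  static long writeId(String dir) {
    return Long.parseLong(dir.substring("base_".length()));
  }

  static String pickBase(List<String> dirs) {
    return dirs.stream()
        .filter(d -> d.startsWith("base_"))
        .max(Comparator.comparingLong(BasePicker::writeId))
        .orElse(null);
  }

  public static void main(String[] args) {
    // Three INSERT OVERWRITEs left three bases on disk; only the newest is read.
    System.out.println(pickBase(List.of("base_9", "base_10", "base_11", "delta_12_12")));
  }
}
```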


[jira] [Commented] (HIVE-24528) Wrong implicit type conversion when comparing decimals and strings

2021-11-03 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437988#comment-17437988
 ] 

Stamatis Zampetakis commented on HIVE-24528:


Hi [~ibenny], as I also noted in the summary, the behavior changed in Hive 
2.3.0. Every version from 2.3.0 onwards now casts to doubles (instead of 
casting to decimals). If, say, in 4.0.0 we decide to cast to decimals again, 
then people may once more see different results.

> Wrong implicit type conversion when comparing decimals and strings
> --
>
> Key: HIVE-24528
> URL: https://issues.apache.org/jira/browse/HIVE-24528
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.3.0
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In many cases when comparing decimals and strings (literals/columns) the 
> comparison is done using doubles which can create some quite unexpected 
> results in the answers of queries.
> {code:sql}
> create table t_str (str_col string);
> insert into t_str values ('1208925742523269458163819');
> select * from t_str where str_col=1208925742523269479013976;
> {code}
> The SELECT query returns one row although the filtering value is not the same 
> as the one present in the string column of the table. The problem is that 
> both types are converted to doubles and due to loss of precision the values 
> are deemed equal.
> The same happens during the join of a decimal with a string type.
> {code:sql}
> create table t_dec (dec_col decimal(25,0));
> insert into t_dec values (1208925742523269479013976);
> select * from t_dec inner join t_str on dec_col=str_col;
> {code}
> The join result contains one row although the values are not equal.
> Implicit type conversions are working differently in every DBMS and for some 
> of them (e.g., mysql) the above behavior is normal or not allowed at all 
> (e.g. Postgres). 
> In the past, Hive used to compare decimal with string columns by converting 
> to decimals but this behavior changed in 2.3.0 (with HIVE-13380). It seems 
> that this behavior change was not intentional since following jiras (e.g., 
> HIVE-18434) imply that comparison of decimals with strings should be done 
> using decimals. Since decimal is an exact numeric it appears a more 
> appropriate type for comparing decimals and strings.
> The goal of this issue is to change the implicit conversion of decimals with 
> strings from doubles to decimals.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
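The precision loss described in HIVE-24528 can be reproduced outside Hive with a plain BigDecimal-to-double conversion, using the two 25-digit values from the report:

```java
import java.math.BigDecimal;

// Demonstrates the report's precision loss: the two 25-digit values differ,
// yet compare equal once both sides are converted to double, because a double
// carries only about 16 significant decimal digits.
public class DecimalVsDouble {

  static boolean equalAsDoubles(String x, String y) {
    return new BigDecimal(x).doubleValue() == new BigDecimal(y).doubleValue();
  }

  static boolean equalAsDecimals(String x, String y) {
    return new BigDecimal(x).compareTo(new BigDecimal(y)) == 0;
  }

  public static void main(String[] args) {
    String stored = "1208925742523269458163819";  // value inserted into t_str
    String filter = "1208925742523269479013976";  // literal in the WHERE clause
    System.out.println(equalAsDoubles(stored, filter));   // true: spurious match
    System.out.println(equalAsDecimals(stored, filter));  // false: exact compare
  }
}
```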


[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=674555&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674555
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 11:41
Start Date: 03/Nov/21 11:41
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #2742:
URL: https://github.com/apache/hive/pull/2742#discussion_r741854852



##
File path: 
itests/util/src/main/java/org/apache/hadoop/hive/ql/qoption/QTestDatabaseHandler.java
##
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hive.ql.qoption;
+
+import org.apache.hadoop.hive.ql.QTestUtil;
+import org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB;
+import org.apache.hadoop.hive.ql.externalDB.MSSQLServer;
+import org.apache.hadoop.hive.ql.externalDB.MariaDB;
+import org.apache.hadoop.hive.ql.externalDB.MySQLExternalDB;
+import org.apache.hadoop.hive.ql.externalDB.Oracle;
+import org.apache.hadoop.hive.ql.externalDB.PostgresExternalDB;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.nio.file.Paths;
+import java.util.Arrays;
+import java.util.EnumMap;
+import java.util.Map;
+
+/**
+ * An option handler for spinning (resp. stopping) databases before (resp. 
after) running a test.
+ *
+ * Syntax: qt:database:DatabaseType:path_to_init_script
+ *
+ * The database type ({@link DatabaseType}) is an obligatory argument. The 
initialization script can be omitted.
+ *
+ * Current limitations:
+ * 
+ * <ul>
+ *   <li>Only one test/file per run</li>
+ *   <li>Does not support parallel execution</li>
+ *   <li>Cannot instantiate more than one database of the same type per 
test</li>
+ * </ul>
+public class QTestDatabaseHandler implements QTestOptionHandler {
+  private static final Logger LOG = 
LoggerFactory.getLogger(QTestDatabaseHandler.class);
+
+  private enum DatabaseType {
+POSTGRES {
+  @Override
+  AbstractExternalDB create() {
+return new PostgresExternalDB();
+  }
+}, MYSQL {
+  @Override
+  AbstractExternalDB create() {
+return new MySQLExternalDB();
+  }
+}, MARIADB {
+  @Override
+  AbstractExternalDB create() {
+return new MariaDB();
+  }
+}, MSSQL {
+  @Override
+  AbstractExternalDB create() {
+return new MSSQLServer();
+  }
+}, ORACLE {
+  @Override
+  AbstractExternalDB create() {
+return new Oracle();
+  }
+};
+
+abstract AbstractExternalDB create();
+  }
+
+  private final Map<DatabaseType, String> databaseToScript = new 
EnumMap<>(DatabaseType.class);
+
+  @Override
+  public void processArguments(String arguments) {
+String[] args = arguments.split(":");
+if (args.length == 0) {
+  throw new IllegalArgumentException("No arguments provided");
+}
+if (args.length > 2) {
+  throw new IllegalArgumentException(
+  "Too many arguments. Expected {dbtype:script}. Found: " + 
Arrays.toString(args));
+}
+DatabaseType dbType = DatabaseType.valueOf(args[0].toUpperCase());
+String initScript = args.length == 2 ? args[1] : "";
+if (databaseToScript.containsKey(dbType)) {
+  throw new IllegalArgumentException(dbType + " database is already 
defined in this file.");
+}
+databaseToScript.put(dbType, initScript);
+  }
+
+  @Override
+  public void beforeTest(QTestUtil qt) throws Exception {
+if (databaseToScript.isEmpty()) {
+  return;
+}
+for (Map.Entry<DatabaseType, String> dbEntry : 
databaseToScript.entrySet()) {
+  String scriptsDir = QTestUtil.getScriptsDir(qt.getConf());
+  Path dbScript = Paths.get(scriptsDir, dbEntry.getValue());
+  AbstractExternalDB db = dbEntry.getKey().create();

Review comment:
   This suggestion would be definitely very useful and necessary if we want 
to support multiple databases of the same type running side by side or enabling 
database reuse among qfiles. Leaving this for 
https://issues.apache.org/jira/browse/HIVE-25668 where we are going to have 
more concrete 

[jira] [Commented] (HIVE-25668) Support database reuse when using qt:database option

2021-11-03 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437968#comment-17437968
 ] 

Stamatis Zampetakis commented on HIVE-25668:


To support reuse it would be necessary to extend the base functionality 
introduced in HIVE-25594 to allow more databases of the same type to run side 
by side.

> Support database reuse when using qt:database option
> 
>
> Key: HIVE-25668
> URL: https://issues.apache.org/jira/browse/HIVE-25668
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Priority: Major
>
> With HIVE-25594 it is possible to initialize and use various types of 
> databases in tests. At the moment all the supported databases rely on docker 
> containers which are initialized/destroyed on a per-test basis. This is good in 
> terms of test isolation but it brings a certain performance overhead, slowing 
> down tests. At the moment this is fine since the feature is not widely used, 
> but it would be good to have a way to reuse a database in multiple qfiles. 
> The developer could specify in the qfile if they want to reuse a container 
> (if it is possible) by passing certain additional options. The declaration 
> could look like below:
> {noformat}
> --!qt:database:type=mysql;script=q_test_country_table.sql;reuse=true{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
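The reuse idea sketched in these comments amounts to keying live containers by (database type, init script) and handing an existing container back when a later qfile requests the same pair with reuse enabled. A minimal hypothetical sketch (class and method names are invented for illustration, not part of the actual test infrastructure):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Illustrative sketch of database-container reuse across qfiles: a container is
// kept alive after a test and returned on a later matching (type, script)
// request with reuse=true, instead of spinning up a new one.
public class DatabaseReuseCache {

  static final class Key {
    final String type, script;
    Key(String type, String script) { this.type = type; this.script = script; }
    @Override public boolean equals(Object o) {
      return o instanceof Key && ((Key) o).type.equals(type) && ((Key) o).script.equals(script);
    }
    @Override public int hashCode() { return Objects.hash(type, script); }
  }

  private final Map<Key, String> running = new HashMap<>();
  private int started = 0;

  /** Returns a container handle, starting a new container only on a cache miss. */
  String acquire(String type, String script, boolean reuse) {
    Key k = new Key(type, script);
    if (reuse && running.containsKey(k)) {
      return running.get(k);               // reuse the live container
    }
    String handle = type + "-container-" + (++started);
    if (reuse) {
      running.put(k, handle);              // keep it alive for the next qfile
    }
    return handle;
  }

  int containersStarted() { return started; }

  public static void main(String[] args) {
    DatabaseReuseCache cache = new DatabaseReuseCache();
    cache.acquire("mysql", "q_test_country_table.sql", true);
    cache.acquire("mysql", "q_test_country_table.sql", true); // cache hit
    System.out.println(cache.containersStarted());
  }
}
```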


[jira] [Commented] (HIVE-25668) Support database reuse when using qt:database option

2021-11-03 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437962#comment-17437962
 ] 

Stamatis Zampetakis commented on HIVE-25668:


The reuse reasoning could be the following: the database is not destroyed at 
the end of the test, and if the same type of database with the same 
initialization script and reuse enabled appears in another test, then we use 
the existing one.

> Support database reuse when using qt:database option
> 
>
> Key: HIVE-25668
> URL: https://issues.apache.org/jira/browse/HIVE-25668
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Priority: Major
>
> With HIVE-25594 it is possible to initialize and use various types of 
> databases in tests. At the moment all the supported databases rely on docker 
> containers which are initialized/destroyed on a per-test basis. This is good in 
> terms of test isolation but it brings a certain performance overhead, slowing 
> down tests. At the moment this is fine since the feature is not widely used, 
> but it would be good to have a way to reuse a database in multiple qfiles. 
> The developer could specify in the qfile if they want to reuse a container 
> (if it is possible) by passing certain additional options. The declaration 
> could look like below:
> {noformat}
> --!qt:database:type=mysql;script=q_test_country_table.sql;reuse=true{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=674540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674540
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 11:17
Start Date: 03/Nov/21 11:17
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #2742:
URL: https://github.com/apache/hive/pull/2742#discussion_r741839431



##
File path: 
itests/util/src/main/java/org/apache/hadoop/hive/ql/externalDB/AbstractExternalDB.java
##
@@ -71,27 +59,8 @@ public ProcessResults(String stdout, String stderr, int rc) {
 }
 }
 
-public static AbstractExternalDB initalizeExternalDB(String 
externalDBType) throws IOException {
-AbstractExternalDB abstractExternalDB;
-switch (externalDBType) {
-case "mysql":
-abstractExternalDB = new MySQLExternalDB();
-break;
-case "postgres":
-abstractExternalDB = new PostgresExternalDB();
-break;
-default:
-throw new IOException("unsupported external database type " + 
externalDBType);
-}
-return abstractExternalDB;
-}
-
-public AbstractExternalDB(String externalDBType) {
-this.externalDBType = externalDBType;
-}
-
-protected String getDockerContainerName() {
-return String.format("qtestExternalDB-%s", externalDBType);
+private final String getDockerContainerName() {

Review comment:
   Logged https://issues.apache.org/jira/browse/HIVE-25667 for tracking 
this further.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674540)
Time Spent: 2h 40m  (was: 2.5h)

> Setup JDBC databases in tests via QT options
> 
>
> Key: HIVE-25594
> URL: https://issues.apache.org/jira/browse/HIVE-25594
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The goal of this jira is to add a new QT option for setting up JDBC DBMS and 
> using it in qtests which need a JDBC endpoint up and running. It can be used 
> in tests with external JDBC tables, connectors, etc.
> A sample file using the proposed option ({{qt:database}}) is shown below.
> {code:sql}
> --!qt:database:postgres:init_script_1234.sql:cleanup_script_1234.sql
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
> "hive.sql.dbcp.username" = "qtestuser",
> "hive.sql.dbcp.password" = "qtestpassword",
> "hive.sql.table" = "country");
> EXPLAIN CBO SELECT COUNT(*) from country;
> SELECT COUNT(*) from country;
> {code}
> This builds upon HIVE-25423 but proposes to use JDBC datasources without the 
> need for a specific CLI driver. Furthermore, the proposed QT option 
> syntax allows using customised init/cleanup scripts for the JDBC datasource 
> per test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
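The qt:database option syntax from the description ({{dbtype[:script]}}) can be mirrored by a tiny standalone parser that follows the processArguments logic shown in the reviewed QTestDatabaseHandler. This re-implementation is illustrative only; the real handler also maps each type to a docker-backed AbstractExternalDB:

```java
// Minimal standalone re-implementation of the qt:database argument parsing
// (sketch: mirrors the reviewed processArguments(), without the docker setup).
public class QtDatabaseOption {

  /** Parses "dbtype[:script]" into {TYPE, script}; script defaults to "". */
  static String[] parse(String arguments) {
    String[] args = arguments.split(":");
    if (args.length == 0) {
      throw new IllegalArgumentException("No arguments provided");
    }
    if (args.length > 2) {
      throw new IllegalArgumentException("Too many arguments. Expected {dbtype:script}");
    }
    String dbType = args[0].toUpperCase();          // matches DatabaseType.valueOf(...)
    String initScript = args.length == 2 ? args[1] : "";
    return new String[] {dbType, initScript};
  }

  public static void main(String[] args) {
    String[] r = parse("postgres:q_test_country_table.sql");
    System.out.println(r[0] + " " + r[1]); // POSTGRES q_test_country_table.sql
  }
}
```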


[jira] [Updated] (HIVE-25667) Unify code managing JDBC databases in tests

2021-11-03 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-25667:
---
Description: 
Currently there are two class hierarchies managing JDBC databases in tests, 
[DatabaseRule|https://github.com/apache/hive/blob/d35de014dd49fdcfe0aacb68e6c587beff6d1dea/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/DatabaseRule.java]
 and 
[AbstractExternalDB|https://github.com/apache/hive/blob/d35de014dd49fdcfe0aacb68e6c587beff6d1dea/itests/util/src/main/java/org/apache/hadoop/hive/ql/externalDB/AbstractExternalDB.java].
 There are many similarities between these hierarchies and certain parts are 
duplicated. 

The goal of this JIRA is to refactor the aforementioned hierarchies to reduce 
code duplication and improve extensibility.



> Unify code managing JDBC databases in tests
> ---
>
> Key: HIVE-25667
> URL: https://issues.apache.org/jira/browse/HIVE-25667
> Project: Hive
>  Issue Type: Task
>  Components: Testing Infrastructure
>Affects Versions: 4.0.0
>Reporter: Stamatis Zampetakis
>Priority: Major
>
> Currently there are two class hierarchies managing JDBC databases in tests, 
> [DatabaseRule|https://github.com/apache/hive/blob/d35de014dd49fdcfe0aacb68e6c587beff6d1dea/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/DatabaseRule.java]
>  and 
> [AbstractExternalDB|https://github.com/apache/hive/blob/d35de014dd49fdcfe0aacb68e6c587beff6d1dea/itests/util/src/main/java/org/apache/hadoop/hive/ql/externalDB/AbstractExternalDB.java].
>  There are many similarities between these hierarchies and certain parts are 
> duplicated. 
> The goal of this JIRA is to refactor the aforementioned hierarchies to reduce 
> code duplication and improve extensibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25626) CBO fails when JDBC table specifies password via dbcp.password.uri

2021-11-03 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437929#comment-17437929
 ] 

Stamatis Zampetakis commented on HIVE-25626:


Hey [~harishjp], I am in favor of unifying the logic; I made a similar comment 
in the PR. Given that this issue has already been merged, I would suggest 
creating a new JIRA and following up with the refactoring there. 

> CBO fails when JDBC table specifies password via dbcp.password.uri
> --
>
> Key: HIVE-25626
> URL: https://issues.apache.org/jira/browse/HIVE-25626
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, JDBC storage handler
>Affects Versions: 3.1.2, 4.0.0
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When a table is created with JDBCStorageHandler and JDBC_PASSWORD_URI is used 
> as the password mechanism, CBO fails, causing all the data to be fetched from 
> the DB and then processed in Hive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25661) Cover the test case for HIVE-25626

2021-11-03 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437928#comment-17437928
 ] 

Stamatis Zampetakis commented on HIVE-25661:


[~chiran54321] Small heads up. HIVE-25594 has been merged so when you have time 
you can follow up on this one.

> Cover the test case for HIVE-25626
> --
>
> Key: HIVE-25661
> URL: https://issues.apache.org/jira/browse/HIVE-25661
> Project: Hive
>  Issue Type: Test
>  Components: Hive
>Reporter: Chiran Ravani
>Assignee: Chiran Ravani
>Priority: Major
>
> Add a test case for HIVE-25626 so the fix is not broken in the future.
> HIVE-25594 introduces multiple JDBCStorageHandler test cases; once it is in 
> upstream it will be easy to add the case.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-25594.

Fix Version/s: 4.0.0
   Resolution: Fixed

Fixed in 
[d35de014dd49fdcfe0aacb68e6c587beff6d1dea|https://github.com/apache/hive/commit/d35de014dd49fdcfe0aacb68e6c587beff6d1dea].
 Thanks for the review [~kgyrtkirk], [~asolimando]!

> Setup JDBC databases in tests via QT options
> 
>
> Key: HIVE-25594
> URL: https://issues.apache.org/jira/browse/HIVE-25594
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The goal of this jira is to add a new QT option for setting up JDBC DBMS and 
> using it in qtests which need a JDBC endpoint up and running. It can be used 
> in tests with external JDBC tables, connectors, etc.
> A sample file using the proposed option ({{qt:database}}) is shown below.
> {code:sql}
> --!qt:database:postgres:init_script_1234.sql:cleanup_script_1234.sql
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
> "hive.sql.dbcp.username" = "qtestuser",
> "hive.sql.dbcp.password" = "qtestpassword",
> "hive.sql.table" = "country");
> EXPLAIN CBO SELECT COUNT(*) from country;
> SELECT COUNT(*) from country;
> {code}
> This builds upon HIVE-25423 but proposes to use JDBC datasources without the 
> need for a specific CLI driver. Furthermore, the proposed QT option 
> syntax allows using customised init/cleanup scripts for the JDBC datasource 
> per test.
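The proposed {{qt:database}} header line can be parsed along these lines. This is an illustrative sketch only; the class and field names below are hypothetical and do not reflect Hive's actual QTest option parser:

```java
import java.util.Arrays;

public class QtDatabaseOption {
    // Holds the parsed pieces of a "--!qt:database:<dbms>[:init[:cleanup]]"
    // header line (names are illustrative, not Hive's real parser).
    record DatabaseOption(String dbms, String initScript, String cleanupScript) {}

    static DatabaseOption parse(String line) {
        // Strip the "--!qt:database:" prefix (tolerating optional whitespace
        // after "--!", as seen in some sample files), then split the rest.
        String body = line.replaceFirst("^--!\\s*qt:database:", "");
        String[] parts = body.split(":");
        return new DatabaseOption(
            parts[0],
            parts.length > 1 ? parts[1] : null,
            parts.length > 2 ? parts[2] : null);
    }

    public static void main(String[] args) {
        DatabaseOption opt =
            parse("--!qt:database:postgres:init_a.sql:cleanup_a.sql");
        System.out.println(opt.dbms());
        System.out.println(opt.initScript());
    }
}
```

Per-test init/cleanup scripts then simply become the second and third colon-separated fields of the header.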



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25594) Setup JDBC databases in tests via QT options

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25594?focusedWorklogId=674526&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674526
 ]

ASF GitHub Bot logged work on HIVE-25594:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 10:46
Start Date: 03/Nov/21 10:46
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #2742:
URL: https://github.com/apache/hive/pull/2742


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674526)
Time Spent: 2.5h  (was: 2h 20m)

> Setup JDBC databases in tests via QT options
> 
>
> Key: HIVE-25594
> URL: https://issues.apache.org/jira/browse/HIVE-25594
> Project: Hive
>  Issue Type: Improvement
>  Components: Testing Infrastructure
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> The goal of this jira is to add a new QT option for setting up JDBC DBMS and 
> using it in qtests which need a JDBC endpoint up and running. It can be used 
> in tests with external JDBC tables, connectors, etc.
> A sample file using the proposed option ({{qt:database}}) is shown below.
> {code:sql}
> --!qt:database:postgres:init_script_1234.sql:cleanup_script_1234.sql
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/qtestDB",
> "hive.sql.dbcp.username" = "qtestuser",
> "hive.sql.dbcp.password" = "qtestpassword",
> "hive.sql.table" = "country");
> EXPLAIN CBO SELECT COUNT(*) from country;
> SELECT COUNT(*) from country;
> {code}
> This builds upon HIVE-25423 but proposes to use JDBC datasources without the 
> need for a specific CLI driver. Furthermore, the proposed QT option 
> syntax allows using customised init/cleanup scripts for the JDBC datasource 
> per test.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=674520&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674520
 ]

ASF GitHub Bot logged work on HIVE-25596:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 10:42
Start Date: 03/Nov/21 10:42
Worklog Time Spent: 10m 
  Work Description: hmangla98 commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741814503



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplStatsTracker.java
##
@@ -36,7 +36,7 @@
 public class ReplStatsTracker {
 
   // Maintains the length of the RM_Progress column in the RDBMS, which stores 
the ReplStats
-  public static int RM_PROGRESS_LENGTH = 24000;
+  public static int RM_PROGRESS_LENGTH = 1;

Review comment:
   Attached the sample outputs and size enhancements in comment section.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674520)
Time Spent: 3h 50m  (was: 3h 40m)

> Compress Hive Replication Metrics while storing
> ---
>
> Key: HIVE-25596
> URL: https://issues.apache.org/jira/browse/HIVE-25596
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
> Attachments: CompressedRM_Progress(k=10), CompressedRM_Progress(k=5), 
> PlainTextRM_Progress(k=10), PlainTextRM_Progress(k=5)
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Compress the json fields of sys.replication_metrics table to optimise RDBMS 
> space usage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=674505&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674505
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 10:27
Start Date: 03/Nov/21 10:27
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #2759:
URL: https://github.com/apache/hive/pull/2759#discussion_r741798099



##
File path: ql/src/test/queries/clientpositive/jdbc_table_with_schema_mssql.q
##
@@ -0,0 +1,54 @@
+--! qt:database:mssql:q_test_country_table_with_schema.mssql.sql
+-- Microsoft SQL Server allows multiple schemas per database, so to 
disambiguate between tables in different schemas it
+-- is necessary to set the hive.sql.schema property properly.
+
+-- Some JDBC APIs require the catalog, schema, and table names to be passed 
exactly as they are stored in the database.
+-- MSSQL stores unquoted identifiers by first converting them to lowercase, 
thus the hive.sql.schema and
+-- hive.sql.table properties below are specified in lowercase.
+
+CREATE EXTERNAL TABLE country_0 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "bob",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_0;
+SELECT COUNT(*) FROM country_0;
+
+CREATE EXTERNAL TABLE country_1 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "alice",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_1;
+SELECT COUNT(*) FROM country_1;
+
+-- Test DML statements are working fine when accessing table in non-default 
schema
+INSERT INTO country_1 VALUES (8, 'Hungary');

Review comment:
   I added some plans in 
https://github.com/apache/hive/pull/2759/commits/5aa343db3cb36a5aca6c4ef0cfb94263fba38585




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674505)
Time Spent: 50m  (was: 40m)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2267)
>  ~[postgresql-42.2.14.jar:42.2.14]
>   at 
> 

[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=674469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674469
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 09:00
Start Date: 03/Nov/21 09:00
Worklog Time Spent: 10m 
  Work Description: guptanikhil007 commented on a change in pull request 
#2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741713387



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   If the `not in` clause cannot fit in a single query, it should fail 
according to the function definition




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674469)
Time Spent: 1.5h  (was: 1h 20m)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with a huge number of parameters even with very small 
> values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH 
> while generating the delete query for the completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}
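The splitting behavior this issue asks for can be sketched as follows: batches are flushed when either the accumulated query length or the parameter count reaches its cap. This is a simplified, hypothetical illustration, not Hive's actual TxnUtils#buildQueryWithINClauseStrings implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class InClauseSplitter {
    // Split ids into several "<prefix> IN (...)" queries, bounded by BOTH a
    // max character budget and a max parameter count (illustrative limits).
    static List<String> buildInQueries(String prefix, List<Long> ids,
                                       int maxQueryChars, int maxParams) {
        List<String> queries = new ArrayList<>();
        StringBuilder buf = new StringBuilder(prefix).append(" IN (");
        int count = 0;  // parameters accumulated in the current query
        for (long id : ids) {
            String next = Long.toString(id);
            // +2 accounts for a separating comma and the closing paren.
            boolean tooLong = buf.length() + next.length() + 2 > maxQueryChars;
            if (count > 0 && (tooLong || count >= maxParams)) {
                buf.append(")");
                queries.add(buf.toString());
                buf = new StringBuilder(prefix).append(" IN (");
                count = 0;
            }
            if (count > 0) buf.append(",");
            buf.append(next);
            count++;
        }
        buf.append(")");
        queries.add(buf.toString());
        return queries;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 0; i < 25; i++) ids.add(i);
        // With maxParams=10, 25 ids split into batches of 10, 10, and 5.
        List<String> qs = buildInQueries(
            "DELETE FROM COMPLETED_COMPACTIONS WHERE CC_ID", ids, 10_000, 10);
        System.out.println(qs.size());
    }
}
```

The extra parameter-count cap is what prevents the pathological case above, where a generous length budget alone admitted 4759 parameters in one statement.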



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=674468&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674468
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 08:59
Start Date: 03/Nov/21 08:59
Worklog Time Spent: 10m 
  Work Description: guptanikhil007 commented on a change in pull request 
#2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741713387



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   If the `not in` clause cannot fit in a single query, it should fail 
according to the function definition




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674468)
Time Spent: 1h 20m  (was: 1h 10m)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with a huge number of parameters even with very small 
> values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH 
> while generating the delete query for the completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=674458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674458
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 08:47
Start Date: 03/Nov/21 08:47
Worklog Time Spent: 10m 
  Work Description: guptanikhil007 commented on a change in pull request 
#2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741713387



##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   If the `not in` clause cannot fit in a single query, it should fail 
according to the function definition




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674458)
Time Spent: 1h 10m  (was: 1h)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with a huge number of parameters even with very small 
> values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH 
> while generating the delete query for the completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters supported by SQL DB

2021-11-03 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-25659:

Summary: Metastore direct sql queries with IN/(NOT IN) should be split 
based on max parameters supported by SQL DB  (was: Divide IN/(NOT IN) queries 
based on number of max parameters SQL engine can support)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters supported by SQL DB
> -
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with a huge number of parameters even with very small 
> values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH 
> while generating the delete query for the completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25659) Divide IN/(NOT IN) queries based on number of max parameters SQL engine can support

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?focusedWorklogId=674425&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674425
 ]

ASF GitHub Bot logged work on HIVE-25659:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 07:51
Start Date: 03/Nov/21 07:51
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #2758:
URL: https://github.com/apache/hive/pull/2758#discussion_r741678476



##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -680,6 +680,9 @@ public static ConfVars getMetaConf(String name) {
 
DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE("metastore.direct.sql.max.elements.values.clause",
 "hive.direct.sql.max.elements.values.clause",
 1000, "The maximum number of values in a VALUES clause for INSERT 
statement."),
+DIRECT_SQL_MAX_PARAMETERS("metastore.direct.sql.max.parameters",
+"hive.direct.sql.max.parameters", 1000, "The maximum parameters the\n" 
+

Review comment:
   Rephrase: "... maximum query parameters..."

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
##
@@ -316,7 +317,7 @@ public static String getFullTableName(String dbName, String 
tableName) {
   // Compute the size of a query when the 'nextValue' is added to the 
current query.
   int querySize = querySizeExpected(buf.length(), nextValue.length(), 
suffix.length(), addParens);
 
-  if (querySize > maxQueryLength * 1024) {
+  if (querySize > maxQueryLength * 1024 || currentCount >= maxParameters) {

Review comment:
   nit: Keep each condition within ()

##
File path: 
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestTxnUtils.java
##
@@ -150,12 +151,34 @@ public void testBuildQueryWithINClause() throws Exception 
{
 ret = TxnUtils.buildQueryWithINClause(conf, queries, prefix, suffix, 
inList, "TXN_ID", false, false);
 Assert.assertEquals(3, queries.size());
 Assert.assertEquals(queries.size(), ret.size());
-Assert.assertEquals(2255L, ret.get(0).longValue());
-Assert.assertEquals(2033L, ret.get(1).longValue());
-Assert.assertEquals(33L, ret.get(2).longValue());
+Assert.assertEquals(2000L, ret.get(0).longValue());
+Assert.assertEquals(2000L, ret.get(1).longValue());
+Assert.assertEquals(321L, ret.get(2).longValue());
 runAgainstDerby(queries);
   }
 
+  @Test(expected = IllegalArgumentException.class)
+  public void testBuildQueryWithNOTINClauseFailure() throws Exception {

Review comment:
   Why would this test fail?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 674425)
Time Spent: 1h  (was: 50m)

> Divide IN/(NOT IN) queries based on number of max parameters SQL engine can 
> support
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with a huge number of parameters even with very small 
> values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH 
> while generating the delete query for the completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25659) Metastore direct sql queries with IN/(NOT IN) should be split based on max parameters allowed by SQL DB

2021-11-03 Thread Sankar Hariappan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-25659:

Summary: Metastore direct sql queries with IN/(NOT IN) should be split 
based on max parameters allowed by SQL DB  (was: Metastore direct sql queries 
with IN/(NOT IN) should be split based on max parameters supported by SQL DB)

> Metastore direct sql queries with IN/(NOT IN) should be split based on max 
> parameters allowed by SQL DB
> ---
>
> Key: HIVE-25659
> URL: https://issues.apache.org/jira/browse/HIVE-25659
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 3.1.0, 4.0.0
>Reporter: Nikhil Gupta
>Assignee: Nikhil Gupta
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Function 
> org.apache.hadoop.hive.metastore.txn.TxnUtils#buildQueryWithINClauseStrings 
> can generate queries with a huge number of parameters even with very small 
> values of DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE and DIRECT_SQL_MAX_QUERY_LENGTH 
> while generating the delete query for the completed_compactions table
> Example:
> {code:java}
> DIRECT_SQL_MAX_ELEMENTS_IN_CLAUSE = 100
> DIRECT_SQL_MAX_QUERY_LENGTH = 10 (10 KB)
> Number of parameters in a single query = 4759
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread Haymant Mangla (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17437796#comment-17437796
 ] 

Haymant Mangla commented on HIVE-25596:
---

Attached two sample RM_Progress fields, before and after compression.
 # For k=5: original string 14276, compressed 5564
 # For k=10: original string 16559, compressed 7176
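As a rough illustration of how repetitive JSON metrics shrink under compression, here is a Deflater-based round trip. This is a sketch only; the actual Hive patch may use a different codec, level, or string encoding for the stored column:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class MetricsCompression {
    // Deflate-compress a JSON string into raw bytes.
    static byte[] compress(String json) throws Exception {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(json.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate the bytes back into the original JSON string.
    static String decompress(byte[] data) throws Exception {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!inflater.finished()) {
            out.write(buf, 0, inflater.inflate(buf));
        }
        inflater.end();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Repetitive JSON, like replication metrics blobs, compresses well.
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < 100; i++) {
            sb.append("{\"status\":\"SUCCESS\",\"events\":10},");
        }
        sb.setLength(sb.length() - 1);
        sb.append("]");
        String json = sb.toString();
        byte[] packed = compress(json);
        System.out.println(packed.length < json.length());
        System.out.println(decompress(packed).equals(json));
    }
}
```

Because the blob must fit a fixed-length RM_Progress column, the compressed ratio directly determines how large a metrics payload can be stored.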

> Compress Hive Replication Metrics while storing
> ---
>
> Key: HIVE-25596
> URL: https://issues.apache.org/jira/browse/HIVE-25596
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
> Attachments: CompressedRM_Progress(k=10), CompressedRM_Progress(k=5), 
> PlainTextRM_Progress(k=10), PlainTextRM_Progress(k=5)
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Compress the json fields of sys.replication_metrics table to optimise RDBMS 
> space usage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25596) Compress Hive Replication Metrics while storing

2021-11-03 Thread Haymant Mangla (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haymant Mangla updated HIVE-25596:
--
Attachment: PlainTextRM_Progress(k=10)
PlainTextRM_Progress(k=5)
CompressedRM_Progress(k=10)
CompressedRM_Progress(k=5)

> Compress Hive Replication Metrics while storing
> ---
>
> Key: HIVE-25596
> URL: https://issues.apache.org/jira/browse/HIVE-25596
> Project: Hive
>  Issue Type: Improvement
>Reporter: Haymant Mangla
>Assignee: Haymant Mangla
>Priority: Major
>  Labels: pull-request-available
> Attachments: CompressedRM_Progress(k=10), CompressedRM_Progress(k=5), 
> PlainTextRM_Progress(k=10), PlainTextRM_Progress(k=5)
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Compress the json fields of sys.replication_metrics table to optimise RDBMS 
> space usage.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25591) CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema

2021-11-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25591?focusedWorklogId=674424&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674424
 ]

ASF GitHub Bot logged work on HIVE-25591:
-

Author: ASF GitHub Bot
Created on: 03/Nov/21 07:48
Start Date: 03/Nov/21 07:48
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #2759:
URL: https://github.com/apache/hive/pull/2759#discussion_r741675549



##
File path: ql/src/test/queries/clientpositive/jdbc_table_with_schema_mssql.q
##
@@ -0,0 +1,54 @@
+--! qt:database:mssql:q_test_country_table_with_schema.mssql.sql
+-- Microsoft SQL Server allows multiple schemas per database, so to 
disambiguate between tables in different schemas it
+-- is necessary to set the hive.sql.schema property properly.
+
+-- Some JDBC APIs require the catalog, schema, and table names to be passed 
exactly as they are stored in the database.
+-- MSSQL stores unquoted identifiers by first converting them to lowercase, 
thus the hive.sql.schema and
+-- hive.sql.table properties below are specified in lowercase.
+
+CREATE EXTERNAL TABLE country_0 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "bob",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_0;
+SELECT COUNT(*) FROM country_0;
+
+CREATE EXTERNAL TABLE country_1 (id int, name varchar(20))
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "MSSQL",
+"hive.sql.jdbc.driver" = "com.microsoft.sqlserver.jdbc.SQLServerDriver",
+"hive.sql.jdbc.url" = 
"jdbc:sqlserver://localhost:1433;DatabaseName=world;",
+"hive.sql.dbcp.username" = "sa",
+"hive.sql.dbcp.password" = "Its-a-s3cret",
+"hive.sql.schema" = "alice",
+"hive.sql.table" = "country");
+
+EXPLAIN CBO SELECT COUNT(*) FROM country_1;
+SELECT COUNT(*) FROM country_1;
+
+-- Test DML statements are working fine when accessing table in non-default 
schema
+INSERT INTO country_1 VALUES (8, 'Hungary');

Review comment:
   I'm not familiar with how DML statements are treated when the target is a 
JDBC table. I guess a `HiveJdbcConverter` shows up in the plan.
   Could you please add an 
   ```
   explain cbo insert into country_1... 
   ```
   statement. Feel free to ignore my comment if we already have one.
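
   A minimal sketch of what the requested qtest addition might look like (hypothetical; the extra INSERT values and the follow-up SELECT are illustrative, and the actual plan output would only appear once the .q.out file is regenerated):
   ```
   -- Verify the CBO plan when the DML target is a JDBC table in a non-default schema;
   -- per the comment above, a HiveJdbcConverter is expected to show up in the plan.
   EXPLAIN CBO INSERT INTO country_1 VALUES (9, 'Poland');
   INSERT INTO country_1 VALUES (9, 'Poland');
   SELECT COUNT(*) FROM country_1;
   ```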






Issue Time Tracking
---

Worklog Id: (was: 674424)
Time Spent: 40m  (was: 0.5h)

> CREATE EXTERNAL TABLE fails for JDBC tables stored in non-default schema
> 
>
> Key: HIVE-25591
> URL: https://issues.apache.org/jira/browse/HIVE-25591
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Consider the following use case where tables reside in some user-defined 
> schema in some JDBC compliant database:
> +Postgres+
> {code:sql}
> create schema world;
> create table if not exists world.country (name varchar(80) not null);
> insert into world.country (name) values ('India');
> insert into world.country (name) values ('Russia');
> insert into world.country (name) values ('USA');
> {code}
> The following DDL statement in Hive fails:
> +Hive+
> {code:sql}
> CREATE EXTERNAL TABLE country (name varchar(80))
> STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
> TBLPROPERTIES (
> "hive.sql.database.type" = "POSTGRES",
> "hive.sql.jdbc.driver" = "org.postgresql.Driver",
> "hive.sql.jdbc.url" = "jdbc:postgresql://localhost:5432/test",
> "hive.sql.dbcp.username" = "user",
> "hive.sql.dbcp.password" = "pwd",
> "hive.sql.schema" = "world",
> "hive.sql.table" = "country");
> {code}
> The exception is the following:
> {noformat}
> org.postgresql.util.PSQLException: ERROR: relation "country" does not exist
>   Position: 15
>   at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2532)
>  ~[postgresql-42.2.14.jar:42.2.14]
>