[jira] [Work logged] (HIVE-24861) Hive JDBC driver doesn't consider the value of 'hasMoreRows'

2021-06-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24861?focusedWorklogId=610514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-610514
 ]

ASF GitHub Bot logged work on HIVE-24861:
-

Author: ASF GitHub Bot
Created on: 14/Jun/21 08:00
Start Date: 14/Jun/21 08:00
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #2070:
URL: https://github.com/apache/hive/pull/2070


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 610514)
Time Spent: 1h  (was: 50m)

> Hive JDBC driver doesn't consider the value of 'hasMoreRows'
> 
>
>                 Key: HIVE-24861
>                 URL: https://issues.apache.org/jira/browse/HIVE-24861
>             Project: Hive
>          Issue Type: Bug
>          Components: JDBC
>            Reporter: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> TCLIService's FetchResults might return an empty result set with
> hasMoreRows=true. In that case the driver ignores the hasMoreRows flag and
> treats the empty batch as the end of the result stream, causing data loss.
> I've seen this when the Hive JDBC driver was used to connect to Impala.
> IMPALA-7312 introduced a timeout on FetchResults(). If Impala cannot produce
> rows within the given timeout, it returns an empty result set with
> hasMoreRows=true. However, the Hive JDBC driver interprets this as the end of
> the result stream and closes the operation.
> I think that if hasMoreRows=true, the Hive JDBC driver should issue
> FetchResults() again.
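
The behaviour requested in the description above can be sketched roughly as follows. This is a minimal illustration, not the driver's actual code: `Batch`, `Fetcher`, `readAll` and the `maxEmptyFetches` guard are hypothetical names that stand in for the Thrift response, the FetchResults call, and the result-set read loop. The sketch treats an empty batch as the end of the stream only when hasMoreRows is false, and otherwise fetches again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

class FetchLoopSketch {
  /** Hypothetical stand-in for one FetchResults response: a batch of rows plus the hasMoreRows flag. */
  static final class Batch {
    final List<String> rows;
    final boolean hasMoreRows;

    Batch(List<String> rows, boolean hasMoreRows) {
      this.rows = rows;
      this.hasMoreRows = hasMoreRows;
    }
  }

  /** Hypothetical stand-in for issuing one FetchResults call. */
  interface Fetcher {
    Batch fetch();
  }

  /**
   * Drains the stream. An empty batch ends the stream only if hasMoreRows is false;
   * an empty batch with hasMoreRows=true triggers another fetch. The retry cap is an
   * assumption of this sketch so that a misbehaving server cannot cause an endless loop.
   */
  static List<String> readAll(Fetcher fetcher, int maxEmptyFetches) {
    List<String> out = new ArrayList<>();
    int consecutiveEmpty = 0;
    while (true) {
      Batch batch = fetcher.fetch();
      if (!batch.rows.isEmpty()) {
        out.addAll(batch.rows);
        consecutiveEmpty = 0;
        continue; // keep fetching until an empty batch arrives
      }
      if (!batch.hasMoreRows) {
        return out; // empty batch and no more rows: real end of stream
      }
      if (++consecutiveEmpty > maxEmptyFetches) {
        throw new IllegalStateException(
            "server kept returning empty batches with hasMoreRows=true");
      }
    }
  }

  public static void main(String[] args) {
    // Simulated server: the second response is empty but announces more rows.
    Iterator<Batch> responses = Arrays.asList(
        new Batch(Arrays.asList("a", "b"), true),
        new Batch(Collections.<String>emptyList(), true),
        new Batch(Arrays.asList("c"), true),
        new Batch(Collections.<String>emptyList(), false)).iterator();
    System.out.println(readAll(responses::next, 3)); // prints [a, b, c]
  }
}
```

A loop that stopped at the first empty batch would return only the first two rows here, which is the data loss described in the report. The sketch deliberately ignores the flag on non-empty batches and keeps fetching until an empty one arrives; that is a conservative choice for the illustration, not a statement about what the fix in the pull request does.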



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24861) Hive JDBC driver doesn't consider the value of 'hasMoreRows'

2021-06-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24861?focusedWorklogId=607467&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-607467
 ]

ASF GitHub Bot logged work on HIVE-24861:
-

Author: ASF GitHub Bot
Created on: 06/Jun/21 00:18
Start Date: 06/Jun/21 00:18
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2070:
URL: https://github.com/apache/hive/pull/2070#issuecomment-855314871


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



Issue Time Tracking
---

Worklog Id: (was: 607467)
Time Spent: 50m  (was: 40m)



[jira] [Work logged] (HIVE-24861) Hive JDBC driver doesn't consider the value of 'hasMoreRows'

2021-03-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24861?focusedWorklogId=571611&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-571611
 ]

ASF GitHub Bot logged work on HIVE-24861:
-

Author: ASF GitHub Bot
Created on: 25/Mar/21 02:44
Start Date: 25/Mar/21 02:44
Worklog Time Spent: 10m 
  Work Description: ellieMayVelasquez commented on a change in pull request #2070:
URL: https://github.com/apache/hive/pull/2070#discussion_r601003329



##
File path: 
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestHiveQueryResultSet.java
##
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hive.jdbc;
+
+import java.sql.ResultSet;
+import java.sql.Statement;
+import java.util.Properties;
+
+import org.junit.Test;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.FieldSchema;
+import org.apache.hadoop.hive.metastore.api.Schema;
+import org.apache.hive.service.cli.RowSet;
+import org.apache.hive.service.cli.RowSetFactory;
+import org.apache.hive.service.cli.TableSchema;
+import org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService;
+import org.apache.hive.service.rpc.thrift.TFetchResultsReq;
+import org.apache.hive.service.rpc.thrift.TFetchResultsResp;
+import org.apache.hive.service.rpc.thrift.TRowSet;
+import org.apache.hive.service.rpc.thrift.TStatus;
+import org.apache.hive.service.rpc.thrift.TStatusCode;
+import org.apache.thrift.TException;
+
+import static org.apache.hive.jdbc.EmbeddedCLIServicePortal.EMBEDDED_CLISERVICE;
+import static org.apache.hive.jdbc.Utils.URL_PREFIX;
+import static org.apache.hive.service.rpc.thrift.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10;
+import static org.junit.Assert.assertEquals;
+
+public class TestHiveQueryResultSet {
+  private static final String EMIT_EMPTY_ROWS = "hive.test.emit.empty.rows";
+  private static final String EMIT_NUM_ROWS = "hive.test.emit.num.rows";
+
+  // Create subclass of EmbeddedThriftBinaryCLIService
+  public static class MyThriftBinaryCLIService extends EmbeddedThriftBinaryCLIService {
+    private TStatus success = new TStatus(TStatusCode.SUCCESS_STATUS);
+    private boolean emitEmptyRows;
+    private int numRows;
+    private int position;
+    private int emptyRowPos;
+
+    @Override
+    public synchronized void init(HiveConf hiveConf) {
+      hiveConf.setBoolean("hive.support.concurrency", false);
+      this.emitEmptyRows = hiveConf.getBoolean(EMIT_EMPTY_ROWS, false);
+      this.numRows = hiveConf.getInt(EMIT_NUM_ROWS, 10);
+      this.emptyRowPos = (numRows / 2);
+      this.position = 0;
+      super.init(hiveConf);
+    }
+
+    @Override
+    public TFetchResultsResp FetchResults(TFetchResultsReq req) throws TException {

Review comment:
   noticed.


[jira] [Work logged] (HIVE-24861) Hive JDBC driver doesn't consider the value of 'hasMoreRows'

2021-03-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24861?focusedWorklogId=571596&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-571596
 ]

ASF GitHub Bot logged work on HIVE-24861:
-

Author: ASF GitHub Bot
Created on: 25/Mar/21 02:11
Start Date: 25/Mar/21 02:11
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on a change in pull request #2070:
URL: https://github.com/apache/hive/pull/2070#discussion_r600992887



##
File path: 
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestHiveQueryResultSet.java
##
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hive.jdbc;
+
+import java.sql.ResultSet;
+import java.sql.Statement;
+import java.util.Properties;
+
+import org.junit.Test;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.FieldSchema;
+import org.apache.hadoop.hive.metastore.api.Schema;
+import org.apache.hive.service.cli.RowSet;
+import org.apache.hive.service.cli.RowSetFactory;
+import org.apache.hive.service.cli.TableSchema;
+import org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService;
+import org.apache.hive.service.rpc.thrift.TFetchResultsReq;
+import org.apache.hive.service.rpc.thrift.TFetchResultsResp;
+import org.apache.hive.service.rpc.thrift.TRowSet;
+import org.apache.hive.service.rpc.thrift.TStatus;
+import org.apache.hive.service.rpc.thrift.TStatusCode;
+import org.apache.thrift.TException;
+
+import static org.apache.hive.jdbc.EmbeddedCLIServicePortal.EMBEDDED_CLISERVICE;
+import static org.apache.hive.jdbc.Utils.URL_PREFIX;
+import static org.apache.hive.service.rpc.thrift.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10;
+import static org.junit.Assert.assertEquals;
+
+public class TestHiveQueryResultSet {
+  private static final String EMIT_EMPTY_ROWS = "hive.test.emit.empty.rows";
+  private static final String EMIT_NUM_ROWS = "hive.test.emit.num.rows";
+
+  // Create subclass of EmbeddedThriftBinaryCLIService
+  public static class MyThriftBinaryCLIService extends EmbeddedThriftBinaryCLIService {
+    private TStatus success = new TStatus(TStatusCode.SUCCESS_STATUS);
+    private boolean emitEmptyRows;
+    private int numRows;
+    private int position;
+    private int emptyRowPos;
+
+    @Override
+    public synchronized void init(HiveConf hiveConf) {
+      hiveConf.setBoolean("hive.support.concurrency", false);
+      this.emitEmptyRows = hiveConf.getBoolean(EMIT_EMPTY_ROWS, false);
+      this.numRows = hiveConf.getInt(EMIT_NUM_ROWS, 10);
+      this.emptyRowPos = (numRows / 2);
+      this.position = 0;
+      super.init(hiveConf);
+    }
+
+    @Override
+    public TFetchResultsResp FetchResults(TFetchResultsReq req) throws TException {

Review comment:
   Thanks for the comments, @ellieMayVelasquez! The method is defined in the 
Thrift interface and is used from many other languages: PHP, Python, etc. It 
may be difficult to change the method name.
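
To illustrate the point about the inherited name, here is a minimal sketch. It assumes only that the generated Java interface copies the method name from the Thrift definition verbatim; `Iface` and `EmptyBatchService` below are simplified, hypothetical stand-ins that use plain strings instead of the real request and response types.

```java
class ThriftNamingSketch {
  /** Hypothetical stand-in for a Thrift-generated service interface. */
  interface Iface {
    // The capitalised name comes from the interface definition, not from the test.
    String FetchResults(String req);
  }

  /** Test double: it can change the behaviour, but it must keep the inherited name. */
  static class EmptyBatchService implements Iface {
    @Override
    public String FetchResults(String req) {
      return "empty batch for " + req;
    }
    // Declaring "@Override public String fetchResults(String req)" instead would not
    // compile, because the supertype has no method with that name.
  }

  public static void main(String[] args) {
    Iface service = new EmptyBatchService();
    System.out.println(service.FetchResults("op-1"));
  }
}
```

Renaming the method to satisfy the naming check would therefore mean changing the shared interface definition and every client generated from it, not just the test class.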






Issue Time Tracking
---

Worklog Id: (was: 571596)
Time Spent: 0.5h  (was: 20m)


[jira] [Work logged] (HIVE-24861) Hive JDBC driver doesn't consider the value of 'hasMoreRows'

2021-03-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24861?focusedWorklogId=569445&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-569445
 ]

ASF GitHub Bot logged work on HIVE-24861:
-

Author: ASF GitHub Bot
Created on: 21/Mar/21 18:29
Start Date: 21/Mar/21 18:29
Worklog Time Spent: 10m 
  Work Description: ellieMayVelasquez commented on a change in pull request #2070:
URL: https://github.com/apache/hive/pull/2070#discussion_r598319032



##
File path: 
itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestHiveQueryResultSet.java
##
@@ -0,0 +1,143 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hive.jdbc;
+
+import java.sql.ResultSet;
+import java.sql.Statement;
+import java.util.Properties;
+
+import org.junit.Test;
+
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.metastore.api.FieldSchema;
+import org.apache.hadoop.hive.metastore.api.Schema;
+import org.apache.hive.service.cli.RowSet;
+import org.apache.hive.service.cli.RowSetFactory;
+import org.apache.hive.service.cli.TableSchema;
+import org.apache.hive.service.cli.thrift.EmbeddedThriftBinaryCLIService;
+import org.apache.hive.service.rpc.thrift.TFetchResultsReq;
+import org.apache.hive.service.rpc.thrift.TFetchResultsResp;
+import org.apache.hive.service.rpc.thrift.TRowSet;
+import org.apache.hive.service.rpc.thrift.TStatus;
+import org.apache.hive.service.rpc.thrift.TStatusCode;
+import org.apache.thrift.TException;
+
+import static org.apache.hive.jdbc.EmbeddedCLIServicePortal.EMBEDDED_CLISERVICE;
+import static org.apache.hive.jdbc.Utils.URL_PREFIX;
+import static org.apache.hive.service.rpc.thrift.TProtocolVersion.HIVE_CLI_SERVICE_PROTOCOL_V10;
+import static org.junit.Assert.assertEquals;
+
+public class TestHiveQueryResultSet {
+  private static final String EMIT_EMPTY_ROWS = "hive.test.emit.empty.rows";
+  private static final String EMIT_NUM_ROWS = "hive.test.emit.num.rows";
+
+  // Create subclass of EmbeddedThriftBinaryCLIService
+  public static class MyThriftBinaryCLIService extends EmbeddedThriftBinaryCLIService {
+    private TStatus success = new TStatus(TStatusCode.SUCCESS_STATUS);
+    private boolean emitEmptyRows;
+    private int numRows;
+    private int position;
+    private int emptyRowPos;
+
+    @Override
+    public synchronized void init(HiveConf hiveConf) {
+      hiveConf.setBoolean("hive.support.concurrency", false);
+      this.emitEmptyRows = hiveConf.getBoolean(EMIT_EMPTY_ROWS, false);
+      this.numRows = hiveConf.getInt(EMIT_NUM_ROWS, 10);
+      this.emptyRowPos = (numRows / 2);
+      this.position = 0;
+      super.init(hiveConf);
+    }
+
+    @Override
+    public TFetchResultsResp FetchResults(TFetchResultsReq req) throws TException {

Review comment:
   This code looks problematic. It matches the SpotBugs [Bad practice 
(BAD_PRACTICE)](https://spotbugs.readthedocs.io/en/stable/bugDescriptions.html#bad-practice-bad-practice) 
pattern [Nm: Method names should start with a lower case letter 
(NM_METHOD_NAMING_CONVENTION)](https://spotbugs.readthedocs.io/en/stable/bugDescriptions.html#nm-method-names-should-start-with-a-lower-case-letter-nm-method-naming-convention).
   Method names should be verbs, in mixed case with the first letter lowercase 
and the first letter of each internal word capitalized.
   






Issue Time Tracking
---

Worklog Id: (was: 569445)
Time Spent: 20m  (was: 10m)


[jira] [Work logged] (HIVE-24861) Hive JDBC driver doesn't consider the value of 'hasMoreRows'

2021-03-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24861?focusedWorklogId=565602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-565602
 ]

ASF GitHub Bot logged work on HIVE-24861:
-

Author: ASF GitHub Bot
Created on: 13/Mar/21 03:01
Start Date: 13/Mar/21 03:01
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 opened a new pull request #2070:
URL: https://github.com/apache/hive/pull/2070


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   





Issue Time Tracking
---

Worklog Id: (was: 565602)
Remaining Estimate: 0h
Time Spent: 10m
