[jira] [Commented] (HIVE-6852) JDBC client connections hang at TSaslTransport

2020-06-28 Thread jamesqjiang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147320#comment-17147320
 ] 

jamesqjiang commented on HIVE-6852:
---

It worked after I upgraded the JDK version.
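
For anyone hitting the same hang, a minimal client sketch that at least fails fast instead of blocking forever. This assumes the standard org.apache.hive.jdbc.HiveDriver is on the classpath; the URL, credentials, and the 30-second value are placeholders, and whether the driver actually honors DriverManager.setLoginTimeout depends on the driver version.

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;

public class Hive2ConnectSketch {
  public static void main(String[] args) throws Exception {
    // Load the Hive JDBC driver explicitly; newer drivers also register via the service loader.
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    // Ask DriverManager to give up after 30 seconds instead of hanging at the SASL handshake.
    // (Illustrative value; honoring this timeout depends on the driver version.)
    DriverManager.setLoginTimeout(30);
    // Placeholder URL and credentials; adjust host, port and database for your HiveServer2.
    try (Connection conn = DriverManager.getConnection(
        "jdbc:hive2://localhost:10000/default", "hive", "password")) {
      System.out.println("Connected: " + !conn.isClosed());
    }
  }
}
{code}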

> JDBC client connections hang at TSaslTransport
> --
>
> Key: HIVE-6852
> URL: https://issues.apache.org/jira/browse/HIVE-6852
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: jay vyas
>Priority: Major
>
> I've noticed that when there is an underlying issue in connecting a client to 
> the JDBC interface of the HiveServer2 to run queries, you get a hang after 
> the thrift portion, at least in certain scenarios: 
> Turning log4j to DEBUG, you can see the following when trying to get a 
> connection using:
> {noformat}
> // this.con = "jdbc:hive2://localhost:1/default"
> Connection jdbc = DriverManager.getConnection(this.con, "hive", "password");
> {noformat}
> The logs get to this point before the hang:
> {noformat}
> 0 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - opening transport org.apache.thrift.transport.TSaslClientTransport@219ba640
> 3 [main] DEBUG org.apache.thrift.transport.TSaslClientTransport  - Sending mechanism name PLAIN and initial response of length 14
> 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Writing message with status START and payload length 5
> 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Writing message with status COMPLETE and payload length 14
> 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Start message handled
> 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Main negotiation loop complete
> 6 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: SASL Client receiving last message
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-23770) Druid filter translation unable to handle inverted between

2020-06-28 Thread Nishant Bangarwa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa reassigned HIVE-23770:
---


> Druid filter translation unable to handle inverted between
> --
>
> Key: HIVE-23770
> URL: https://issues.apache.org/jira/browse/HIVE-23770
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
>
> Druid filter translation happens in Calcite and does not use the HiveBetween 
> inverted flag for translation; this misses a negation in the planned query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23770) Druid filter translation unable to handle inverted between

2020-06-28 Thread Nishant Bangarwa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-23770:

Status: Patch Available  (was: Open)

> Druid filter translation unable to handle inverted between
> --
>
> Key: HIVE-23770
> URL: https://issues.apache.org/jira/browse/HIVE-23770
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-23770.patch
>
>
> Druid filter translation happens in Calcite and does not use the HiveBetween 
> inverted flag for translation; this misses a negation in the planned query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-23770) Druid filter translation unable to handle inverted between

2020-06-28 Thread Nishant Bangarwa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nishant Bangarwa updated HIVE-23770:

Attachment: HIVE-23770.patch

> Druid filter translation unable to handle inverted between
> --
>
> Key: HIVE-23770
> URL: https://issues.apache.org/jira/browse/HIVE-23770
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-23770.patch
>
>
> Druid filter translation happens in Calcite and does not use the HiveBetween 
> inverted flag for translation; this misses a negation in the planned query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23770) Druid filter translation unable to handle inverted between

2020-06-28 Thread Nishant Bangarwa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147468#comment-17147468
 ] 

Nishant Bangarwa commented on HIVE-23770:
-

[~jcamachorodriguez] Can you please help review this one?
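
A quick way to check whether the negation survives pushdown, as a hedged sketch: the table and column names below are hypothetical, and this assumes the EXPLAIN plan for a Druid-backed table surfaces the generated Druid query (druid.query.json), which is where a dropped NOT would show up.

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class NotBetweenExplainSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder connection string; druid_tbl and metric_col are hypothetical names.
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default", "hive", "password");
         Statement stmt = conn.createStatement();
         // NOT BETWEEN is planned as HiveBetween with the inverted flag set;
         // if the Druid translation ignores that flag, the pushed-down filter loses the NOT.
         ResultSet rs = stmt.executeQuery(
             "EXPLAIN SELECT COUNT(*) FROM druid_tbl WHERE metric_col NOT BETWEEN 10 AND 20")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
{code}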

> Druid filter translation unable to handle inverted between
> --
>
> Key: HIVE-23770
> URL: https://issues.apache.org/jira/browse/HIVE-23770
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-23770.patch
>
>
> Druid filter translation happens in Calcite and does not use the HiveBetween 
> inverted flag for translation; this misses a negation in the planned query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-23764) Remove unnecessary getLastFlushLength when checking delete delta files

2020-06-28 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147553#comment-17147553
 ] 

Rajesh Balamohan commented on HIVE-23764:
-

Related ticket: https://issues.apache.org/jira/browse/HIVE-23597

> Remove unnecessary getLastFlushLength when checking delete delta files
> --
>
> Key: HIVE-23764
> URL: https://issues.apache.org/jira/browse/HIVE-23764
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> VectorizedOrcAcidRowBatchReader$ColumnizedDeleteEventRegistry calls 
> OrcAcidUtils.getLastFlushLength for every delete delta file.
> Even the comment says:
> {code}
>   // NOTE: Calling last flush length below is more for 
> future-proofing when we have
>   // streaming deletes. But currently we don't support streaming 
> deletes, and this can
>   // be removed if this becomes a performance issue.
> {code}
> If we have a table with 5 updates (1 base + 5 delta + 5 delete_delta), then 
> for every base + delta dir we will check all of the delete_delta directories 
> and call the getLastFlushLength method, which results in 6*5=30 
> unnecessary NN/S3 calls.
> We should remove the check as already proposed in the comment.
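
To make the arithmetic above concrete, a toy calculation; the class name and loop framing are illustrative only and not taken from the patch.

{code:java}
public class DeleteDeltaCallCountSketch {
  public static void main(String[] args) {
    // Numbers from the description: 1 base + 5 deltas, and 5 delete_delta directories.
    int readerDirs = 1 + 5;      // every base/delta split gets its own reader
    int deleteDeltaDirs = 5;     // each reader re-checks all delete_delta directories
    // One getLastFlushLength call per (reader, delete_delta) pair -> extra NN/S3 round trips.
    System.out.println("Unnecessary calls: " + readerDirs * deleteDeltaDirs); // prints 30
  }
}
{code}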



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23597) VectorizedOrcAcidRowBatchReader::ColumnizedDeleteEventRegistry reads delete delta directories multiple times

2020-06-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23597?focusedWorklogId=452175&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-452175
 ]

ASF GitHub Bot logged work on HIVE-23597:
-

Author: ASF GitHub Bot
Created on: 29/Jun/20 05:43
Start Date: 29/Jun/20 05:43
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #1081:
URL: https://github.com/apache/hive/pull/1081#discussion_r446784234



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
##
@@ -1605,6 +1618,46 @@ public int compareTo(CompressedOwid other) {
 throw e; // rethrow the exception so that the caller can handle.
   }
 }
+
+/**
+ * Create delete delta reader. Caching orc tail to avoid FS lookup/reads for repeated scans.
+ *
+ * @param deleteDeltaFile
+ * @param conf
+ * @param fs FileSystem
+ * @return delete file reader
+ * @throws IOException
+ */
+private Reader getDeleteDeltaReader(Path deleteDeltaFile, JobConf conf, FileSystem fs) throws IOException {
+  OrcTail deleteDeltaTail = deleteDeltaOrcTailCache.getIfPresent(deleteDeltaFile);

Review comment:
   Is the OrcTail thread safe? If I understand correctly, the tail will be 
read by multiple LLAP threads concurrently.
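
One common way to sidestep the cache-population race, as a sketch rather than the patch itself: it assumes a Guava cache (which the getIfPresent call above suggests) and a hypothetical readOrcTail helper, and whether OrcTail itself is safe to share across LLAP threads still needs to be confirmed.

{code:java}
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.orc.impl.OrcTail;

public class OrcTailCacheSketch {
  // Guava caches are safe for concurrent access; get(key, loader) loads each key at most once.
  private final Cache<Path, OrcTail> tailCache =
      CacheBuilder.newBuilder().maximumSize(1_000).build();

  OrcTail getOrLoadTail(FileSystem fs, Path deleteDeltaFile) throws IOException {
    try {
      // The loader runs under the cache's per-key lock, so concurrent LLAP threads
      // asking for the same file share a single read instead of racing.
      return tailCache.get(deleteDeltaFile, () -> readOrcTail(fs, deleteDeltaFile));
    } catch (ExecutionException e) {
      throw new IOException("Failed to read ORC tail for " + deleteDeltaFile, e.getCause());
    }
  }

  // Hypothetical helper standing in for however the patch actually extracts the tail.
  private OrcTail readOrcTail(FileSystem fs, Path file) throws IOException {
    throw new UnsupportedOperationException("placeholder for the real tail-reading logic");
  }
}
{code}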





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 452175)
Time Spent: 1h 10m  (was: 1h)

> VectorizedOrcAcidRowBatchReader::ColumnizedDeleteEventRegistry reads delete 
> delta directories multiple times
> 
>
> Key: HIVE-23597
> URL: https://issues.apache.org/jira/browse/HIVE-23597
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java#L1562]
> {code:java}
> try {
> final Path[] deleteDeltaDirs = getDeleteDeltaDirsFromSplit(orcSplit);
> if (deleteDeltaDirs.length > 0) {
>   int totalDeleteEventCount = 0;
>   for (Path deleteDeltaDir : deleteDeltaDirs) {
> {code}
>  
> Consider a directory layout like the following. This was created by having 
> simple set of "insert --> update --> select" queries.
>  
> {noformat}
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/base_001
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/base_002
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_003_003_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_004_004_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_005_005_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_006_006_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_007_007_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_008_008_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_009_009_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_010_010_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_011_011_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_012_012_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delete_delta_013_013_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_003_003_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_004_004_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_005_005_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_006_006_
> /warehouse-1591131255-hl5z/warehouse/tablespace/managed/hive/sequential_update_4/delta_007_

[jira] [Commented] (HIVE-23764) Remove unnecessary getLastFlushLength when checking delete delta files

2020-06-28 Thread Peter Vary (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-23764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17147563#comment-17147563
 ] 

Peter Vary commented on HIVE-23764:
---

Yeah, I somehow forgot about that jira. Sorry :(
Do you still plan to push that?

Thanks, Peter 

> Remove unnecessary getLastFlushLength when checking delete delta files
> --
>
> Key: HIVE-23764
> URL: https://issues.apache.org/jira/browse/HIVE-23764
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> VectorizedOrcAcidRowBatchReader$ColumnizedDeleteEventRegistry calls 
> OrcAcidUtils.getLastFlushLength for every delete delta file.
> Even the comment says:
> {code}
>   // NOTE: Calling last flush length below is more for 
> future-proofing when we have
>   // streaming deletes. But currently we don't support streaming 
> deletes, and this can
>   // be removed if this becomes a performance issue.
> {code}
> If we have a table with 5 updates (1 base + 5 delta + 5 delete_delta), then 
> for every base + delta dir we will check all of the delete_delta directories 
> and call the getLastFlushLength method, which results in 6*5=30 
> unnecessary NN/S3 calls.
> We should remove the check as already proposed in the comment.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23755) Fix Ranger Url extra slash

2020-06-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23755?focusedWorklogId=452182&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-452182
 ]

ASF GitHub Bot logged work on HIVE-23755:
-

Author: ASF GitHub Bot
Created on: 29/Jun/20 06:38
Start Date: 29/Jun/20 06:38
Worklog Time Spent: 10m 
  Work Description: pkumarsinha commented on a change in pull request #1173:
URL: https://github.com/apache/hive/pull/1173#discussion_r446802228



##
File path: ql/src/test/org/apache/hadoop/hive/ql/exec/repl/TestRangerLoadTask.java
##
@@ -263,4 +267,56 @@ public void testSuccessDisableDenyRangerPolicies() throws Exception {
 //Deny policy is added
 Assert.assertEquals(1, actualPolicyList.getListSize());
   }
+
+  @Test
+  public void testRangerEndpointCreation() throws Exception {
+URIBuilder uriBuilder = new URIBuilder("http://ranger.apache.org:6080");

Review comment:
   It actually tests URIBuilder, not the Ranger endpoint creation flow. It would 
have been good if we could have had this test at the RangerRestClient level. 
Otherwise the change looks good to me.
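
For context, a sketch of the kind of normalization the patch is after: joining a user-supplied Ranger base URL and a REST path without producing a double slash. It uses httpclient's URIBuilder, which the test above already relies on; the helper name and the policy API path are illustrative, not the actual RangerRestClient code.

{code:java}
import java.net.URI;
import org.apache.http.client.utils.URIBuilder;

public class RangerUrlSketch {
  // Hypothetical helper: append a REST path to a user-supplied base URL without a double slash.
  static URI buildPolicyUrl(String rangerEndpoint, String apiPath) throws Exception {
    URIBuilder builder = new URIBuilder(rangerEndpoint);
    String basePath = builder.getPath() == null ? "" : builder.getPath();
    // Trim a trailing slash from the base path and a leading slash from the API path.
    String joined = basePath.replaceAll("/+$", "") + "/" + apiPath.replaceAll("^/+", "");
    return builder.setPath(joined).build();
  }

  public static void main(String[] args) throws Exception {
    // Both forms should yield the same URL, with exactly one slash between host and path.
    System.out.println(buildPolicyUrl("http://ranger.apache.org:6080", "service/public/v2/api/policy"));
    System.out.println(buildPolicyUrl("http://ranger.apache.org:6080/", "/service/public/v2/api/policy"));
  }
}
{code}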





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 452182)
Time Spent: 40m  (was: 0.5h)

> Fix Ranger Url extra slash
> --
>
> Key: HIVE-23755
> URL: https://issues.apache.org/jira/browse/HIVE-23755
> Project: Hive
>  Issue Type: Task
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-23755.01.patch, HIVE-23755.02.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)