[jira] [Resolved] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-09-14 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan resolved HIVE-24070.
-
Resolution: Duplicate

> ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events
> --
>
> Key: HIVE-24070
> URL: https://issues.apache.org/jira/browse/HIVE-24070
> Project: Hive
> Issue Type: Bug
> Components: repl
> Affects Versions: 4.0.0
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Fix For: 4.0.0
>
> If a large number of events have not been cleaned up for some reason,
> ObjectStore.cleanWriteNotificationEvents() can run out of memory while it
> loads all the events to be deleted.
> It should fetch events in batches.
> Similar to https://issues.apache.org/jira/browse/HIVE-19430
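
The batching idea in the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual ObjectStore code: the EventStore interface, its fetchExpiredIds and deleteByIds methods, and the batch size are hypothetical names introduced for the example, and an in-memory list stands in for the metastore backend.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the metastore backend; not a Hive API.
interface EventStore {
  // Returns at most `limit` ids of events older than `cutoff`.
  List<Long> fetchExpiredIds(long cutoff, int limit);

  // Deletes the given events.
  void deleteByIds(List<Long> ids);
}

// In-memory implementation used only to make the sketch runnable.
class InMemoryEventStore implements EventStore {
  final List<Long> eventTimes = new ArrayList<>(); // an event's id is its timestamp here
  int largestBatchSeen = 0;

  public List<Long> fetchExpiredIds(long cutoff, int limit) {
    List<Long> out = new ArrayList<>();
    for (Long t : eventTimes) {
      if (t < cutoff && out.size() < limit) {
        out.add(t);
      }
    }
    largestBatchSeen = Math.max(largestBatchSeen, out.size());
    return out;
  }

  public void deleteByIds(List<Long> ids) {
    eventTimes.removeAll(ids);
  }
}

class BatchedCleaner {
  // Deletes expired events batch by batch, so that no more than
  // `batchSize` event ids are loaded at any one time.
  static int clean(EventStore store, long cutoff, int batchSize) {
    int deleted = 0;
    while (true) {
      List<Long> batch = store.fetchExpiredIds(cutoff, batchSize);
      if (batch.isEmpty()) {
        return deleted;
      }
      store.deleteByIds(batch);
      deleted += batch.size();
    }
  }

  public static void main(String[] args) {
    InMemoryEventStore store = new InMemoryEventStore();
    for (long t = 0; t < 10; t++) {
      store.eventTimes.add(t);
    }
    int deleted = clean(store, 7, 3);
    System.out.println(deleted + " deleted, " + store.eventTimes.size() + " left");
  }
}
```

The loop ends when a fetch returns nothing, so memory use is bounded by the batch size rather than by the number of pending events.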



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-09-14 Thread Ramesh Kumar Thangarajan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195269#comment-17195269
 ] 

Ramesh Kumar Thangarajan commented on HIVE-24070:
-

Yes, I will close this Jira; let's work on HIVE-22290.



[jira] [Comment Edited] (HIVE-24070) ObjectStore.cleanWriteNotificationEvents OutOfMemory on large number of pending events

2020-09-14 Thread Riju Trivedi (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195184#comment-17195184
 ] 

Riju Trivedi edited comment on HIVE-24070 at 9/14/20, 8:54 AM:
---

[~rameshkumar] [~nareshpr] I think we are trying to address the same issue in both 
of these Jiras: this one and HIVE-22290.


was (Author: rtrivedi12):
[~rameshkumar] [~nareshpr] I think we are trying address same issues in both of 
these jiras [HIVE-22290|https://issues.apache.org/jira/browse/HIVE-22290]



[jira] [Updated] (HIVE-18537) [Calcite-CBO] Queries with a nested distinct clause and a windowing function seem to fail with calcite Assertion error

2020-09-14 Thread Nemon Lou (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-18537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-18537:
-
Affects Version/s: 3.1.2

> [Calcite-CBO] Queries with a nested distinct clause and a windowing function seem to fail with calcite Assertion error
> --
>
> Key: HIVE-18537
> URL: https://issues.apache.org/jira/browse/HIVE-18537
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 2.1.0, 2.3.2, 3.1.2
> Reporter: Amruth Sampath
> Priority: Critical
>
> Sample test case to reproduce the issue. The issue does not occur if 
> *hive.cbo.enable=false*.
> {code:sql}
> CREATE TABLE test_cbo (
>   `a` BIGINT,
>   `b` STRING,
>   `c` TIMESTAMP,
>   `d` STRING
> );
> SELECT 1
> FROM (
>   SELECT DISTINCT
>     a AS a_,
>     b AS b_,
>     rank() OVER (PARTITION BY a ORDER BY c DESC) AS c_,
>     d AS d_
>   FROM test_cbo
>   WHERE b = 'some_filter'
> ) n
> WHERE c_ = 1;
> {code}
> Fails with:
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Internal error: Cannot 
> add expression of different type to set:
> set type is RecordType(BIGINT a_, INTEGER c_, VARCHAR(2147483647) CHARACTER 
> SET "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" d_) NOT NULL
> expression type is RecordType(BIGINT a_, VARCHAR(2147483647) CHARACTER SET 
> "UTF-16LE" COLLATE "ISO-8859-1$en_US$primary" c_, INTEGER d_) NOT NULL
> set is rel#112:HiveAggregate.HIVE.[](input=HepRelVertex#121,group={0, 2, 3})
> expression is HiveProject#123{code}
> This might be related to https://issues.apache.org/jira/browse/CALCITE-1868.
>   





[jira] [Commented] (HIVE-24138) Llap external client flow is broken due to netty shading

2020-09-14 Thread László Bodor (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195318#comment-17195318
 ] 

László Bodor commented on HIVE-24138:
-

[~ayushtkn]: I can see in the pull request that you almost achieved a green run 
without upgrading hadoop/guava, which is promising, but I'm a bit confused 
by the contradictory contents of the consecutive commits. Is it possible 
to squash and force-push them, so that the overall picture is easy to see on the PR? That 
would result in a smaller commit and may help us decide how to finally solve this 
issue.

> Llap external client flow is broken due to netty shading
> --
>
> Key: HIVE-24138
> URL: https://issues.apache.org/jira/browse/HIVE-24138
> Project: Hive
> Issue Type: Bug
> Components: llap
> Reporter: Shubham Chaurasia
> Assignee: Ayush Saxena
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> We shaded netty in hive-exec in https://issues.apache.org/jira/browse/HIVE-23073.
> This breaks the LLAP external client flow on the LLAP daemon side.
> LLAP daemon stack trace:
> {code}
> 2020-09-09T18:22:13,413  INFO [TezTR-222977_4_0_0_0_0 (497418324441977_0004_0_00_00_0)] llap.LlapOutputFormat: Returning writer for: attempt_497418324441977_0004_0_00_00_0
> 2020-09-09T18:22:13,419 ERROR [TezTR-222977_4_0_0_0_0 (497418324441977_0004_0_00_00_0)] tez.MapRecordSource: java.lang.NoSuchMethodError: org.apache.arrow.memory.BufferAllocator.buffer(I)Lorg/apache/hive/io/netty/buffer/ArrowBuf;
>   at org.apache.hadoop.hive.llap.WritableByteChannelAdapter.write(WritableByteChannelAdapter.java:96)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:74)
>   at org.apache.arrow.vector.ipc.WriteChannel.write(WriteChannel.java:57)
>   at org.apache.arrow.vector.ipc.WriteChannel.writeIntLittleEndian(WriteChannel.java:89)
>   at org.apache.arrow.vector.ipc.message.MessageSerializer.serialize(MessageSerializer.java:88)
>   at org.apache.arrow.vector.ipc.ArrowWriter.ensureStarted(ArrowWriter.java:130)
>   at org.apache.arrow.vector.ipc.ArrowWriter.writeBatch(ArrowWriter.java:102)
>   at org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:85)
>   at org.apache.hadoop.hive.llap.LlapArrowRecordWriter.write(LlapArrowRecordWriter.java:46)
>   at org.apache.hadoop.hive.ql.exec.vector.filesink.VectorFileSinkArrowOperator.process(VectorFileSinkArrowOperator.java:137)
>   at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
>   at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:969)
>   at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:172)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809)
>   at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:842)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:76)
>   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
>   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
>   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
>   at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
>   at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38)
>   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>   at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolE

[jira] [Commented] (HIVE-24138) Llap external client flow is broken due to netty shading

2020-09-14 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195339#comment-17195339
 ] 

Ayush Saxena commented on HIVE-24138:
-

[~abstractdog] 
I have squashed the commits into one:
https://github.com/apache/hive/pull/1491/commits/432443f1c14f1032a4fefba9a5ba712e122fea87

I will explain the approach as well:
The netty versions in hadoop and hive conflicted, which led to the shading, 
so first I tried upgrading hadoop (3.1.4/3.2.1/3.3.0), but these releases also 
upgrade other dependencies such as {{Guava}}, which is hard to upgrade in 
Hive for a bunch of reasons.
So I went with another approach: excluding netty from the hadoop 
dependency. This works because the netty upgrade from the 4.0 line to the 4.1 line in 
hadoop didn't involve any code change, which means the hadoop jars can happily work 
with this higher netty version. That got me just 2 failures, which look 
unrelated.

--> The code is at the WIP stage. I excluded {{netty-all}} from all hadoop 
sub-components; if the approach gets agreement, I will restrict that to just 
the dependencies that actually have {{netty-all}}, probably the ones having 
{{hdfs}} and {{hdfs-client}} as a dependency.


[jira] [Assigned] (HIVE-24158) Cleanup isn't complete in OrcFileMergeOperator#closeOp

2020-09-14 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage reassigned HIVE-24158:



> Cleanup isn't complete in OrcFileMergeOperator#closeOp
> --
>
> Key: HIVE-24158
> URL: https://issues.apache.org/jira/browse/HIVE-24158
> Project: Hive
> Issue Type: Bug
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Major
>
> Field Map outWriters isn't cleared during operation close:
> {code:java}
> if (outWriters != null) {
>   for (Map.Entry outWriterEntry : outWriters.entrySet()) {
>     Writer outWriter = outWriterEntry.getValue();
>     outWriter.close();
>     outWriter = null;
>   }
> }
> {code}
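
For context, assigning null to the loop-local variable in the snippet above does not remove anything from the map, so the closed writers stay referenced. One possible way to complete the cleanup is sketched below. It is an illustration under assumptions, not the change from the pull request: java.io.Closeable stands in for the ORC Writer, the Integer key type is assumed, and WriterCleanup and closeAll are hypothetical names.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class WriterCleanup {
  // Closes the writers and always empties the map afterwards.
  // If a close() call throws, the remaining writers are not closed,
  // but the map is still cleared in the finally block.
  static void closeAll(Map<Integer, ? extends Closeable> outWriters) throws IOException {
    if (outWriters == null) {
      return;
    }
    try {
      for (Closeable outWriter : outWriters.values()) {
        outWriter.close();
      }
    } finally {
      // Drop the references so closed writers are not kept alive or reused.
      outWriters.clear();
    }
  }

  public static void main(String[] args) throws IOException {
    Map<Integer, Closeable> outWriters = new HashMap<>();
    outWriters.put(1, () -> System.out.println("closed writer 1"));
    closeAll(outWriters);
    System.out.println("entries left: " + outWriters.size());
  }
}
```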





[jira] [Work logged] (HIVE-24158) Cleanup isn't complete in OrcFileMergeOperator#closeOp

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24158?focusedWorklogId=483895&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483895
 ]

ASF GitHub Bot logged work on HIVE-24158:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 09:42
Start Date: 14/Sep/20 09:42
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #1494:
URL: https://github.com/apache/hive/pull/1494


   Field Map outWriters isn't cleared during operation close.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483895)
Remaining Estimate: 0h
Time Spent: 10m



[jira] [Updated] (HIVE-24158) Cleanup isn't complete in OrcFileMergeOperator#closeOp

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24158:
--
Labels: pull-request-available  (was: )



[jira] [Assigned] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread László Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-24159:
---

Assignee: László Bodor

> Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment
> --
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
>






[jira] [Updated] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread László Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24159:

Description: As kafka_storage_handler.q was disabled by HIVE-23985, I 
didn't realize upstream that the Kafka qtest fails. Instead of setting up a 
kerberized environment in qtest (which doesn't seem to be a usual use case; e.g. 
I haven't seen hive.server2.authentication.kerberos.principal used in *.q files), 
I managed to make the test pass with a simple 
UserGroupInformation.isSecurityEnabled() check, which can also be useful for 
every non-secure environment.
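
The short-circuit described above can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: KafkaTokenHelper, getDelegationToken and fetchToken are hypothetical names, and a BooleanSupplier stands in for UserGroupInformation.isSecurityEnabled() so that the example runs without Hadoop or Kafka on the classpath.

```java
import java.util.Optional;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

class KafkaTokenHelper {
  // Returns a delegation token only when security is enabled;
  // otherwise the (potentially failing) token fetch is skipped entirely.
  static Optional<String> getDelegationToken(BooleanSupplier securityEnabled,
                                             Supplier<String> fetchToken) {
    if (!securityEnabled.getAsBoolean()) {
      return Optional.empty(); // non-secure environment: nothing to fetch
    }
    return Optional.of(fetchToken.get());
  }

  public static void main(String[] args) {
    // In a real deployment the first argument could be
    // UserGroupInformation::isSecurityEnabled (an assumption for this sketch).
    Optional<String> token = getDelegationToken(
        () -> false,
        () -> { throw new IllegalStateException("admin client must not be created"); });
    System.out.println("token present: " + token.isPresent());
  }
}
```

Because the check comes first, the code path that creates the Kafka admin client is never reached when security is disabled.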



[jira] [Updated] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread László Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24159:

Description: 
As kafka_storage_handler.q was disabled by HIVE-23985, I didn't realize 
upstream that the Kafka qtest fails. Instead of setting up a kerberized 
environment in qtest (which doesn't seem to be a usual use case; e.g. I haven't 
seen hive.server2.authentication.kerberos.principal used in *.q files), I 
managed to make the test pass with a simple UserGroupInformation.isSecurityEnabled() 
check, which can also be useful for every non-secure environment.

For reference, the exception was:
{code}

{code}

  was: As kafka_storage_handler.q was disabled by HIVE-23985, I didn't realize 
upstream that the Kafka qtest fails. Instead of setting up a kerberized 
environment in qtest (which doesn't seem to be a usual use case; e.g. I haven't 
seen hive.server2.authentication.kerberos.principal used in *.q files), I 
managed to make the test pass with a simple UserGroupInformation.isSecurityEnabled() 
check, which can also be useful for every non-secure environment.




[jira] [Updated] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24159:
--
Labels: pull-request-available  (was: )



[jira] [Work logged] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?focusedWorklogId=483912&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483912
 ]

ASF GitHub Bot logged work on HIVE-24159:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 10:30
Start Date: 14/Sep/20 10:30
Worklog Time Spent: 10m 
  Work Description: abstractdog opened a new pull request #1495:
URL: https://github.com/apache/hive/pull/1495


   
   
   
   ### What changes were proposed in this pull request?
   Check whether the environment is secure before handling delegation tokens.
   
   ### Why are the changes needed?
   Broken kafka_storage_handler.q test after HIVE-23408
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Tested with temporarily enabled qtest.





Issue Time Tracking
---

Worklog Id: (was: 483912)
Remaining Estimate: 0h
Time Spent: 10m



[jira] [Work logged] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?focusedWorklogId=483913&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483913
 ]

ASF GitHub Bot logged work on HIVE-24159:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 10:31
Start Date: 14/Sep/20 10:31
Worklog Time Spent: 10m 
  Work Description: abstractdog commented on pull request #1495:
URL: https://github.com/apache/hive/pull/1495#issuecomment-691967429


   @ashutoshc: could you please take a quick look? Thanks.





Issue Time Tracking
---

Worklog Id: (was: 483913)
Time Spent: 20m  (was: 10m)



[jira] [Updated] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread László Bodor (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24159:

Description: 
As kafka_storage_handler.q was disabled by HIVE-23985, I didn't realize 
upstream that the Kafka qtest fails. Instead of setting up a kerberized 
environment in qtest (which doesn't seem to be a usual use case; e.g. I haven't 
seen hive.server2.authentication.kerberos.principal used in *.q files), I 
managed to make the test pass with a simple UserGroupInformation.isSecurityEnabled() 
check, which can also be useful for every non-secure environment.

For reference, the exception was:
{code}
2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] tez.TezTask: Failed to execute tez graph.
org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
    at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451) ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247) [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:193) [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412) [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:343) [hive-cli-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
    at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1465) [classes/:?]
    at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1438) [classes/:?]
    at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:194) [classes/:?]
    at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [classes/:?]
    at org.apache.hadoop.hive.cli.TestMiniHiveKafkaCliDriver.testCliDriver(TestMiniHiveKafkaCliDriver.java:60) [test-classes/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_151]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_151]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
    at org.junit.runners.model.FrameworkMethod

[jira] [Commented] (HIVE-24131) Use original src location always when data copy runs on target

2020-09-14 Thread Anishek Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195366#comment-17195366
 ] 

Anishek Agarwal commented on HIVE-24131:


Committed to master. Thanks for the patch, [~pkumarsinha], and for the review, [~aasha].

> Use original src location always when data copy runs on target 
> ---
>
> Key: HIVE-24131
> URL: https://issues.apache.org/jira/browse/HIVE-24131
> Project: Hive
>  Issue Type: Bug
>Reporter: Pravin Sinha
>Assignee: Pravin Sinha
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24131.01.patch, HIVE-24131.02.patch
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-24094) cast type mismatch and use is not null, the results are error if cbo is true

2020-09-14 Thread zhaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhaolong resolved HIVE-24094.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

> cast type mismatch and use is not null, the results are error if cbo is true
> 
>
> Key: HIVE-24094
> URL: https://issues.apache.org/jira/browse/HIVE-24094
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 3.1.0
>Reporter: zhaolong
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: image-2020-08-31-10-01-26-250.png, 
> image-2020-08-31-10-02-39-154.png, image-2020-09-04-10-54-43-141.png, 
> image-2020-09-04-10-56-00-764.png, image-2020-09-04-10-56-07-286.png, 
> image-2020-09-04-10-59-36-780.png, image-2020-09-04-11-02-07-917.png, 
> image-2020-09-04-11-02-18-008.png, image-2020-09-07-15-20-44-201.png, 
> image-2020-09-07-15-21-35-566.png, image-2020-09-07-15-24-59-015.png, 
> image-2020-09-07-15-25-18-785.png, image-2020-09-08-16-42-54-728.png, 
> image-2020-09-08-16-43-00-848.png
>
>
> 1.CREATE TABLE IF NOT EXISTS testa
> ( 
>  SEARCHWORD STRING, 
>  COUNT_NUM BIGINT, 
>  WORDS STRING 
> ) 
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\27' 
> STORED AS TEXTFILE; 
> 2.insert into testa values('searchword', 1, 'a');
> 3.set hive.cbo.enable=false;
> 4.SELECT 
> CASE 
>  WHEN CAST(searchword as bigint) IS NOT NULL THEN CAST(CAST(searchword as 
> bigint) as String) 
>  ELSE searchword 
> END AS WORDS, 
> searchword FROM testa;
> !image-2020-08-31-10-01-26-250.png!
> 5.set hive.cbo.enable=true;
> 6.SELECT 
> CASE 
>  WHEN CAST(searchword as bigint) IS NOT NULL THEN CAST(CAST(searchword as 
> bigint) as String) 
>  ELSE searchword 
> END AS WORDS, 
> searchword FROM testa;
> !image-2020-08-31-10-02-39-154.png!
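
For reference, steps 4 and 6 run the same query, so both would be expected to return the same rows. The following is only a minimal Java sketch of the CASE expression's intended semantics, assuming that a CAST of a non-numeric string to BIGINT yields NULL, so the ELSE branch applies. The class and method names are hypothetical, and Long.valueOf only approximates Hive's string-to-bigint cast.

```java
class CastCaseSketch {
    // Approximates CAST(s AS BIGINT): returns null when s is not an integer literal.
    static Long castToBigint(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Long.valueOf(s.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Mirrors the CASE expression from the report:
    // WHEN CAST(searchword AS BIGINT) IS NOT NULL THEN CAST(CAST(...) AS STRING) ELSE searchword
    static String words(String searchword) {
        Long asBigint = castToBigint(searchword);
        return asBigint != null ? String.valueOf(asBigint) : searchword;
    }

    public static void main(String[] args) {
        System.out.println(words("searchword"));
        System.out.println(words("42"));
    }
}
```

Under that assumption, the row inserted in step 2 falls through to the ELSE branch and WORDS equals the original searchword value, whatever hive.cbo.enable is set to.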





[jira] [Assigned] (HIVE-24160) Scheduled executions must allow state transitions to TIMED_OUT from any state

2020-09-14 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-24160:
---


> Scheduled executions must allow state transitions to TIMED_OUT from any state
> -
>
> Key: HIVE-24160
> URL: https://issues.apache.org/jira/browse/HIVE-24160
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>






[jira] [Assigned] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeric

2020-09-14 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-24157:
---

Assignee: Zoltan Haindrich

> Strict mode to fail on CAST timestamp <-> numeric
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Assignee: Zoltan Haindrich
>Priority: Major
>
> There is some interest in enforcing that CAST numeric <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeric <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeric to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions





[jira] [Work logged] (HIVE-24160) Scheduled executions must allow state transitions to TIMED_OUT from any state

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24160?focusedWorklogId=483976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483976
 ]

ASF GitHub Bot logged work on HIVE-24160:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 13:09
Start Date: 14/Sep/20 13:09
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #1496:
URL: https://github.com/apache/hive/pull/1496


   
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 483976)
Remaining Estimate: 0h
Time Spent: 10m

> Scheduled executions must allow state transitions to TIMED_OUT from any state
> -
>
> Key: HIVE-24160
> URL: https://issues.apache.org/jira/browse/HIVE-24160
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24160) Scheduled executions must allow state transitions to TIMED_OUT from any state

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24160:
--
Labels: pull-request-available  (was: )

> Scheduled executions must allow state transitions to TIMED_OUT from any state
> -
>
> Key: HIVE-24160
> URL: https://issues.apache.org/jira/browse/HIVE-24160
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Updated] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeric

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24157:
--
Labels: pull-request-available  (was: )

> Strict mode to fail on CAST timestamp <-> numeric
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is some interest in enforcing that CAST numeric <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeric <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeric to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions





[jira] [Work logged] (HIVE-24157) Strict mode to fail on CAST timestamp <-> numeric

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24157?focusedWorklogId=483981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-483981
 ]

ASF GitHub Bot logged work on HIVE-24157:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 13:17
Start Date: 14/Sep/20 13:17
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #1497:
URL: https://github.com/apache/hive/pull/1497


   
   
   
   





Issue Time Tracking
---

Worklog Id: (was: 483981)
Remaining Estimate: 0h
Time Spent: 10m

> Strict mode to fail on CAST timestamp <-> numeric
> -
>
> Key: HIVE-24157
> URL: https://issues.apache.org/jira/browse/HIVE-24157
> Project: Hive
>  Issue Type: Improvement
>  Components: SQL
>Reporter: Jesus Camacho Rodriguez
>Assignee: Zoltan Haindrich
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is some interest in enforcing that CAST numeric <\-> timestamp is 
> disallowed to avoid confusion among users, e.g., SQL standard does not allow 
> numeric <\-> timestamp casting, timestamp type is timezone agnostic, etc.
> We should introduce a strict config for timestamp (similar to others before): 
> If the config is true, we shall fail while compiling the query with a 
> meaningful message.
> To provide similar behavior, Hive has multiple functions that provide clearer 
> semantics for numeric to timestamp conversion (and vice versa):
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions





[jira] [Assigned] (HIVE-24161) Support Oracle CLOB type in beeline

2020-09-14 Thread Robbie Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robbie Zhang reassigned HIVE-24161:
---

Assignee: Robbie Zhang

> Support Oracle CLOB type in beeline
> ---
>
> Key: HIVE-24161
> URL: https://issues.apache.org/jira/browse/HIVE-24161
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline
>Reporter: Robbie Zhang
>Assignee: Robbie Zhang
>Priority: Major
>
> We can use beeline as a JDBC client to access an RDBMS such as Oracle. Sometimes 
> the Oracle JDBC driver returns a CLOB object instead of a String object if 
> the string is too long. Beeline used to work well with the CLOB type, but this 
> was broken by HIVE-14786:
> [https://github.com/apache/hive/blob/2a760dd607e206d7f1061c01075767ecfff40d0c/beeline/src/java/org/apache/hive/beeline/Rows.java#L169]
> In the above line, when the Oracle JDBC driver returns a CLOB object, calling 
> toString() on it yields a string like "oracle.sql.CLOB@2f7c7260" rather than 
> the content. In this case, we should use 
> ResultSet.getString() rather than ResultSet.getObject().toString().
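
To illustrate why toString() is not enough for a CLOB value, here is a minimal Java sketch. It is not the Beeline code and not the proposed patch: the class and the render helper are hypothetical, and javax.sql.rowset.serial.SerialClob stands in for the Oracle driver's CLOB class so that the example runs without a database. The fix suggested in the issue is simply to call ResultSet.getString().

```java
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

class ClobToStringSketch {
    // Hypothetical helper: turn a JDBC column value into display text.
    static String render(Object value) throws SQLException {
        if (value == null) {
            return null;
        }
        if (value instanceof Clob) {
            Clob clob = (Clob) value;
            long length = clob.length();
            if (length == 0) {
                return "";
            }
            // Clob positions are 1-based; assumes the content fits in a single String.
            return clob.getSubString(1, (int) length);
        }
        return value.toString();
    }

    public static void main(String[] args) throws SQLException {
        Clob clob = new SerialClob("a very long string".toCharArray());
        // Prints an object identity string, not the content.
        System.out.println(clob.toString());
        // Prints the characters stored in the CLOB.
        System.out.println(render(clob));
    }
}
```

Reading the characters through the Clob interface, or letting the driver convert via ResultSet.getString(), returns the stored text, whereas toString() only identifies the object.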





[jira] [Resolved] (HIVE-24031) Infinite planning time on syntactically big queries

2020-09-14 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-24031.

Resolution: Fixed

Fixed in 
[587b402aa6f5357cf6bd16893606b48b3406e9e5|https://github.com/apache/hive/commit/587b402aa6f5357cf6bd16893606b48b3406e9e5].

> Infinite planning time on syntactically big queries
> ---
>
> Key: HIVE-24031
> URL: https://issues.apache.org/jira/browse/HIVE-24031
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: ASTNode_getChildren_cost.png, 
> query_big_array_constructor.nps
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Syntactically big queries (~1 million tokens), such as the query shown below, 
> lead to very big (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}





[jira] [Work logged] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?focusedWorklogId=484089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484089
 ]

ASF GitHub Bot logged work on HIVE-24154:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 16:28
Start Date: 14/Sep/20 16:28
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on pull request #1492:
URL: https://github.com/apache/hive/pull/1492#issuecomment-692169075


   @kgyrtkirk , could you take another look? Thanks





Issue Time Tracking
---

Worklog Id: (was: 484089)
Time Spent: 1h 10m  (was: 1h)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.
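
The simplification in question can be sketched in Java as follows. This is only an illustration under stated assumptions, not the Calcite or Hive rule: the class and method names are hypothetical, the column is modeled as a plain integer, and NULL semantics are ignored. For AND(=(x, c), IN(x, v1, ..., vn)), the conjunction reduces to =(x, c) when c is one of the IN values, and is unsatisfiable otherwise.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class InEqualsSimplifySketch {
    /**
     * Simplifies AND(=(x, eq), IN(x, candidates)) for a single column x.
     * Returns the one value x must equal, or empty if no value can satisfy both.
     */
    static Optional<Integer> simplify(int eq, List<Integer> candidates) {
        return candidates.contains(eq) ? Optional.of(eq) : Optional.empty();
    }

    public static void main(String[] args) {
        // The filter from the issue: AND(=($1, 1999), IN($1, 1998, 1999))
        System.out.println(simplify(1999, Arrays.asList(1998, 1999)));
        // A contradictory variant: AND(=($1, 2000), IN($1, 1998, 1999))
        System.out.println(simplify(2000, Arrays.asList(1998, 1999)));
    }
}
```

For the filter quoted above, the result is the single equality on 1999, which is the form a cost-based optimizer can estimate more accurately.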





[jira] [Assigned] (HIVE-24162) Query based compaction looses bloom filter

2020-09-14 Thread Peter Varga (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Varga reassigned HIVE-24162:
--


> Query based compaction looses bloom filter
> --
>
> Key: HIVE-24162
> URL: https://issues.apache.org/jira/browse/HIVE-24162
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>
> *Steps to reproduce:*
>   
> {noformat}
> ++
> |   createtab_stmt   |
> ++
> | CREATE TABLE `bloomTest`(  |
> |   `msisdn` string, |
> |   `imsi` varchar(20),  |
> |   `imei` bigint,   |
> |   `cell_id` bigint)|
> | ROW FORMAT SERDE   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
> | STORED AS INPUTFORMAT  |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
> | OUTPUTFORMAT   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
> | LOCATION   |
> |   
> 's3a://dwxtpcds30-wwgq-dwx-managed/clusters/env-6cwwgq/warehouse-1580338415-7dph/warehouse/tablespace/managed/hive/del_db.db/bloomtest'
>  |
> | TBLPROPERTIES (|
> |   'bucketing_version'='2', |
> |   'orc.bloom.filter.columns'='msisdn,cell_id,imsi',  |
> |   'orc.bloom.filter.fpp'='0.02',   |
> |   'transactional'='true',  |
> |   'transactional_properties'='default',|
> |   'transient_lastDdlTime'='1597222946')|
> ++
> insert into  bloomTest values ("a", "b", 10, 20);
> insert into  bloomTest values ("aa", "bb", 100, 200);
> insert into  bloomTest values ("aaa", "bbb", 1000, 2000);
> select * from bloomTest;
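> +-------------------+-----------------+-----------------+--------------------+
> | bloomtest.msisdn  | bloomtest.imsi  | bloomtest.imei  | bloomtest.cell_id  |
> +-------------------+-----------------+-----------------+--------------------+
> | a                 | b               | 10              | 20                 |
> | aa                | bb              | 100             | 200                |
> | aaa               | bbb             | 1000            | 2000               |
> +-------------------+-----------------+-----------------+--------------------+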
> {noformat}
>  - Compact the table
> {code:java}
> alter table bloomTest compact 'MAJOR';
> {code}
>  - Wait for the compaction to finish and check for bloom filters in the dataset.
>   
>  - The delta would have them, but the base dataset would not.





[jira] [Assigned] (HIVE-22167) TIMESTAMP - Backwards incompatible change: Hive 3.1 reads back binary RCFILE timestamps written by Hive 2.x incorrectly

2020-09-14 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez reassigned HIVE-22167:
--

Assignee: Jesus Camacho Rodriguez

> TIMESTAMP - Backwards incompatible change: Hive 3.1 reads back binary RCFILE 
> timestamps written by Hive 2.x incorrectly
> ---
>
> Key: HIVE-22167
> URL: https://issues.apache.org/jira/browse/HIVE-22167
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: Piotr Findeisen
>Assignee: Jesus Camacho Rodriguez
>Priority: Major
>
> Same as HIVE-21002 but for binary RCFILE ({{ROW FORMAT SERDE 
> 'org.apache.hadoop.hive.serde2.columnar.LazyBinaryColumnarSerDe' STORED AS 
> RCFILE;}})





[jira] [Work logged] (HIVE-24162) Query based compaction looses bloom filter

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24162?focusedWorklogId=484104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484104
 ]

ASF GitHub Bot logged work on HIVE-24162:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 16:51
Start Date: 14/Sep/20 16:51
Worklog Time Spent: 10m 
  Work Description: pvargacl opened a new pull request #1498:
URL: https://github.com/apache/hive/pull/1498


   
   ### What changes were proposed in this pull request?
   Keep the orc.bloom.filter during Query based compaction.
   
   ### Why are the changes needed?
   Fix the bug
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Test added
   





Issue Time Tracking
---

Worklog Id: (was: 484104)
Remaining Estimate: 0h
Time Spent: 10m

> Query based compaction looses bloom filter
> --
>
> Key: HIVE-24162
> URL: https://issues.apache.org/jira/browse/HIVE-24162
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
>   
> {noformat}
> ++
> |   createtab_stmt   |
> ++
> | CREATE TABLE `bloomTest`(  |
> |   `msisdn` string, |
> |   `imsi` varchar(20),  |
> |   `imei` bigint,   |
> |   `cell_id` bigint)|
> | ROW FORMAT SERDE   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
> | STORED AS INPUTFORMAT  |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
> | OUTPUTFORMAT   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
> | LOCATION   |
> |   
> 's3a://dwxtpcds30-wwgq-dwx-managed/clusters/env-6cwwgq/warehouse-1580338415-7dph/warehouse/tablespace/managed/hive/del_db.db/bloomtest'
>  |
> | TBLPROPERTIES (|
> |   'bucketing_version'='2', |
> |   'orc.bloom.filter.columns'='msisdn,cell_id,imsi',  |
> |   'orc.bloom.filter.fpp'='0.02',   |
> |   'transactional'='true',  |
> |   'transactional_properties'='default',|
> |   'transient_lastDdlTime'='1597222946')|
> ++
> insert into  bloomTest values ("a", "b", 10, 20);
> insert into  bloomTest values ("aa", "bb", 100, 200);
> insert into  bloomTest values ("aaa", "bbb", 1000, 2000);
> select * from bloomTest;
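> +-------------------+-----------------+-----------------+--------------------+
> | bloomtest.msisdn  | bloomtest.imsi  | bloomtest.imei  | bloomtest.cell_id  |
> +-------------------+-----------------+-----------------+--------------------+
> | a                 | b               | 10              | 20                 |
> | aa                | bb              | 100             | 200                |
> | aaa               | bbb             | 1000            | 2000               |
> +-------------------+-----------------+-----------------+--------------------+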
> {noformat}
>  - Compact the table
> {code:java}
> alter table bloomTest compact 'MAJOR';
> {code}
>  - Wait for the compaction to finish and check for bloom filters in the dataset.
>   
>  - The delta would have them, but the base dataset would not.





[jira] [Updated] (HIVE-24162) Query based compaction looses bloom filter

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24162:
--
Labels: pull-request-available  (was: )

> Query based compaction looses bloom filter
> --
>
> Key: HIVE-24162
> URL: https://issues.apache.org/jira/browse/HIVE-24162
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
>   
> {noformat}
> ++
> |   createtab_stmt   |
> ++
> | CREATE TABLE `bloomTest`(  |
> |   `msisdn` string, |
> |   `imsi` varchar(20),  |
> |   `imei` bigint,   |
> |   `cell_id` bigint)|
> | ROW FORMAT SERDE   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
> | STORED AS INPUTFORMAT  |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
> | OUTPUTFORMAT   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
> | LOCATION   |
> |   
> 's3a://dwxtpcds30-wwgq-dwx-managed/clusters/env-6cwwgq/warehouse-1580338415-7dph/warehouse/tablespace/managed/hive/del_db.db/bloomtest'
>  |
> | TBLPROPERTIES (|
> |   'bucketing_version'='2', |
> |   'orc.bloom.filter.columns'='msisdn,cell_id,imsi',  |
> |   'orc.bloom.filter.fpp'='0.02',   |
> |   'transactional'='true',  |
> |   'transactional_properties'='default',|
> |   'transient_lastDdlTime'='1597222946')|
> ++
> insert into  bloomTest values ("a", "b", 10, 20);
> insert into  bloomTest values ("aa", "bb", 100, 200);
> insert into  bloomTest values ("aaa", "bbb", 1000, 2000);
> select * from bloomTest;
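> +-------------------+-----------------+-----------------+--------------------+
> | bloomtest.msisdn  | bloomtest.imsi  | bloomtest.imei  | bloomtest.cell_id  |
> +-------------------+-----------------+-----------------+--------------------+
> | a                 | b               | 10              | 20                 |
> | aa                | bb              | 100             | 200                |
> | aaa               | bbb             | 1000            | 2000               |
> +-------------------+-----------------+-----------------+--------------------+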
> {noformat}
>  - Compact the table
> {code:java}
> alter table bloomTest compact 'MAJOR';
> {code}
>  - Wait for the compaction to finish and check for bloom filters in the dataset.
>   
>  - The delta would have them, but the base dataset would not.





[jira] [Updated] (HIVE-24154) Missing simplification opportunity with IN and EQUALS clauses

2020-09-14 Thread Jesus Camacho Rodriguez (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-24154:
---
Status: Patch Available  (was: Open)

> Missing simplification opportunity with IN and EQUALS clauses
> -
>
> Key: HIVE-24154
> URL: https://issues.apache.org/jira/browse/HIVE-24154
> Project: Hive
>  Issue Type: Improvement
>  Components: CBO
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> For instance, in perf driver CBO query 74, there are several filters that 
> could be simplified further:
> {code}
> HiveFilter(condition=[AND(=($1, 1999), IN($1, 1998, 1999))])
> {code}
> This may lead to incorrect estimates and unnecessary execution time.





[jira] [Updated] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-14 Thread Vineet Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-24009:
---
Summary: Support partition pruning and other physical transformations for 
EXECUTE statement   (was: Support partition pruning for EXECUTE statement)

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> Currently, compile-time partition pruning does not kick in for EXECUTE 
> statements.





[jira] [Assigned] (HIVE-24164) Throw error for parameterized query containing parameters in group by

2020-09-14 Thread Vineet Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg reassigned HIVE-24164:
--


> Throw error for parameterized query containing parameters in group by
> -
>
> Key: HIVE-24164
> URL: https://issues.apache.org/jira/browse/HIVE-24164
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Planning
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>
> E.g., the following query should throw a useful error message, since 
> parameters aren't supported in GROUP BY:
> {code:sql}
> prepare query1 from select count(*) from src where key > ? and value < ? 
> group by ?;  
>  execute query1 using 1,100,1;
> {code}





[jira] [Work logged] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?focusedWorklogId=484202&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484202
 ]

ASF GitHub Bot logged work on HIVE-24009:
-

Author: ASF GitHub Bot
Created on: 14/Sep/20 20:41
Start Date: 14/Sep/20 20:41
Worklog Time Spent: 10m 
  Work Description: vineetgarg02 commented on a change in pull request 
#1472:
URL: https://github.com/apache/hive/pull/1472#discussion_r488207028



##
File path: ql/src/test/queries/clientnegative/prepare_execute_1.q
##
@@ -1,3 +0,0 @@
---! qt:dataset:src
-prepare query1 from select count(*) from src where key > ? and value < ? group by ?;
-execute query1 using 1,100,1;

Review comment:
   This query no longer fails. I have opened a follow-up to fix this: 
https://issues.apache.org/jira/browse/HIVE-24164







Issue Time Tracking
---

Worklog Id: (was: 484202)
Remaining Estimate: 0h
Time Spent: 10m

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, compile-time partition pruning does not kick in for EXECUTE 
> statements.





[jira] [Updated] (HIVE-24009) Support partition pruning and other physical transformations for EXECUTE statement

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24009:
--
Labels: pull-request-available  (was: )

> Support partition pruning and other physical transformations for EXECUTE 
> statement 
> ---
>
> Key: HIVE-24009
> URL: https://issues.apache.org/jira/browse/HIVE-24009
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, compile-time partition pruning does not kick in for EXECUTE 
> statements.





[jira] [Commented] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail while Move Operation

2020-09-14 Thread Rajkumar Singh (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195741#comment-17195741
 ] 

Rajkumar Singh commented on HIVE-24163:
---

This seems to be a regression of https://issues.apache.org/jira/browse/HIVE-21164

> Dynamic Partitioning Insert fail for MM table fail while Move Operation
> ---
>
> Key: HIVE-24163
> URL: https://issues.apache.org/jira/browse/HIVE-24163
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajkumar Singh
>Priority: Major
> Fix For: 3.1.2
>
>
> -- create MM table 
> {code:java}
> CREATE TABLE `part1`(  |
> |   `id` double, |
> |   `n` double,  |
> |   `name` varchar(8),   |
> |   `sex` varchar(1))|
> | PARTITIONED BY (   |
> |   `weight` string, |
> |   `age` string,|
> |   `height` string) |
> | ROW FORMAT SERDE   |
> |   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  |
> | WITH SERDEPROPERTIES ( |
> |   'field.delim'='\u0001',  |
> |   'line.delim'='\n',   |
> |   'serialization.format'='\u0001') |
> | STORED AS INPUTFORMAT  |
> |   'org.apache.hadoop.mapred.TextInputFormat'   |
> | OUTPUTFORMAT   |
> |   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
> | LOCATION   |
> |   'hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1' |
> | TBLPROPERTIES (|
> |   'bucketing_version'='2', |
> |   'transactional'='true',  |
> |   'transactional_properties'='insert_only',|
> |   'transient_lastDdlTime'='1599053368')
> {code}
> -- create managed table 
> {code:java}
> CREATE TABLE `class`(  |
> |   `name` varchar(8),   |
> |   `sex` varchar(1),|
> |   `age` double,|
> |   `height` double, |
> |   `weight` double) |
> | ROW FORMAT SERDE   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'  |
> | STORED AS INPUTFORMAT  |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
> | OUTPUTFORMAT   |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
> | LOCATION   |
> |   'hdfs://hostname:8020/warehouse/tablespace/managed/hive/class' |
> | TBLPROPERTIES (|
> |   'bucketing_version'='2', |
> |   'transactional'='true',  |
> |   'transactional_properties'='default',|
> |   'transient_lastDdlTime'='1599053345')  
> {code}
> -- Run Insert query
> {code:java}
> INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`)  SELECT 0, 0, 
> `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
> {code}
> It fails during the MoveTask execution:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition 
> hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3
>  is not a directory!
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769)
>  ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.

[jira] [Updated] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail while Move Operation

2020-09-14 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-24163:
--
Description: 
-- DDLs and Query
{code:java}
create table `class` (name varchar(8), sex varchar(1), age double precision, 
height double precision, weight double precision);

insert into table class values ('RAJ','MALE',28,12,12);
CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` 
VARCHAR(1)) PARTITIONED BY(Weight string, Age
string, Height string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES 
TERMINATED BY '\012' STORED AS TEXTFILE;

INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`)  SELECT 0, 0, 
`Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
{code}



It fails during the MoveTask execution:

{code:java}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition 
hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3
 is not a directory!
at 
org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769)
 ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
 ~[hive-service-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]

{code}

The reason is that the task writes the fsstat file when the FileSinkOperator 
closes. HS2 then runs the MoveTask to move data into the destination partition 
directory; while resolving the partition locations, Hive checks whether each 
path is a directory, and that check fails.

-- Hive sets the stats location in 
https://github.com/apache/hive/blob/d700ea54ec5da5364d92a9faaa58f89ea03181e0/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L8135

which is relative to the hive-staging directory:

https://github.com/apache/hive/blob/fecad5b0f72c535ed1c53f2cc62b0d6649b651ae/ql/src/java/org/apache/hadoop/hive/ql/Context.java#L617
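
To illustrate the kind of check described above, here is a minimal sketch. It is not Hive code: it uses java.nio.file on the local filesystem as a stand-in for HDFS, and the class name StagingDirScan, both method names, and the directory layout are hypothetical. The strict variant fails as soon as it meets a plain file (such as a stats temp file) under the staging directory; the lenient variant is shown only for contrast and is not a proposed fix.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; this does not reproduce Hive's implementation. */
class StagingDirScan {

    /** Strict scan: every child of the staging directory must itself be a directory. */
    static List<String> strictPartitionDirs(Path staging) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(staging)) {
            for (Path child : children) {
                if (!Files.isDirectory(child)) {
                    // Same shape of failure as the "is not a directory!" message in the trace.
                    throw new IOException("partition " + child + " is not a directory!");
                }
                names.add(child.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Lenient scan: plain files are skipped instead of raising an error. */
    static List<String> lenientPartitionDirs(Path staging) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> children = Files.newDirectoryStream(staging)) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    names.add(child.getFileName().toString());
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: one partition directory plus one stats temp file.
        Path staging = Files.createTempDirectory("hive-staging-demo");
        Files.createDirectory(staging.resolve("weight=12"));
        Files.createFile(staging.resolve("tmpstats-0_FS_3"));

        System.out.println(lenientPartitionDirs(staging));
        try {
            strictPartitionDirs(staging);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```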






  was:
-- create MM table 
{code:java}
CREATE TABLE `part1`(  |
|   `id` double, |
|   `n` double,  |
|   `name` varchar(8),   |
|   `sex` varchar(1))|
| PARTITIONED BY (   |
|   `weight` string, |
|   `age` string,|
|   `height` string) |
| ROW FORMAT SERDE   |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  |
| WITH SERDEPROPERTIES ( |
|   'field.delim'='\u0001',  |
|   'line.delim'='\n',   |
|   'serialization.format'='\u0001') |
| STORED AS INPUTFORMAT  |
|   'org.apache.hadoop.mapred.TextInputFormat'   |
| OUTPUTFORMAT   |
|   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' |
| LOCATION   |
|

[jira] [Updated] (HIVE-24163) Dynamic Partitioning Insert fail for MM table fail during MoveTask

2020-09-14 Thread Rajkumar Singh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajkumar Singh updated HIVE-24163:
--
Summary: Dynamic Partitioning Insert fail for MM table fail during MoveTask 
 (was: Dynamic Partitioning Insert fail for MM table fail while Move Operation)

> Dynamic Partitioning Insert fail for MM table fail during MoveTask
> --
>
> Key: HIVE-24163
> URL: https://issues.apache.org/jira/browse/HIVE-24163
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Rajkumar Singh
>Priority: Major
> Fix For: 3.1.2
>
>
> -- DDLs and Query
> {code:java}
> create table `class` (name varchar(8), sex varchar(1), age double precision, 
> height double precision, weight double precision);
> insert into table class values ('RAJ','MALE',28,12,12);
> CREATE TABLE `PART1` (`id` DOUBLE,`N` DOUBLE,`Name` VARCHAR(8),`Sex` 
> VARCHAR(1)) PARTITIONED BY(Weight string, Age
> string, Height string)  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' 
> LINES TERMINATED BY '\012' STORED AS TEXTFILE;
> INSERT INTO TABLE `part1` PARTITION (`Weight`,`Age`,`Height`)  SELECT 0, 0, 
> `Name`,`Sex`,`Weight`,`Age`,`Height` FROM `class`;
> {code}
> It fails during the MoveTask execution:
> {code:java}
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: partition 
> hdfs://hostname:8020/warehouse/tablespace/managed/hive/part1/.hive-staging_hive_2020-09-02_13-29-58_765_4475282758764123921-1/-ext-1/tmpstats-0_FS_3
>  is not a directory!
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getValidPartitionsInPath(Hive.java:2769)
>  ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadDynamicPartitions(Hive.java:2837) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.MoveTask.handleDynParts(MoveTask.java:562) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:440) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:359) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> ~[hive-exec-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:225)
>  ~[hive-service-3.1.3000.7.2.0.0-237.jar:3.1.3000.7.2.0.0-237]
> {code}
> The reason is that the task writes the fsstat file when the FileSinkOperator 
> closes. HS2 then runs the MoveTask to move data into the destination partition 
> directory; while resolving the partition locations, Hive checks whether each 
> path is a directory, and that check fails.
> -- Hive sets the stats location in 
> https://github.com/apache/hive/blob/d700ea54ec5da5364d92a9faaa58f89ea03181e0/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L8135
> which is relative to the hive-staging directory:
> https://github.com/apache/hive/blob/fecad5b0f72c535ed1c53f2cc62b0d6649b651ae/ql/src/java/org/apache/hadoop/hive/ql/Context.java#L617





[jira] [Work logged] (HIVE-23754) LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23754?focusedWorklogId=484252&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484252
 ]

ASF GitHub Bot logged work on HIVE-23754:
-

Author: ASF GitHub Bot
Created on: 15/Sep/20 00:46
Start Date: 15/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1172:
URL: https://github.com/apache/hive/pull/1172


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 484252)
Time Spent: 0.5h  (was: 20m)

> LLAP: Add LoggingHandler in ShuffleHandler pipeline for better debuggability
> 
>
> Key: HIVE-23754
> URL: https://issues.apache.org/jira/browse/HIVE-23754
> Project: Hive
>  Issue Type: Improvement
> Environment:  
>  
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> [https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/shufflehandler/ShuffleHandler.java#L616]
>  
> For corner case debugging, it would be helpful to understand when netty 
> processed OPEN/BOUND/CLOSE/RECEIVED/CONNECTED events along with payload 
> details.
> Adding "LoggingHandler" in ChannelPipeline mode can help in debugging.





[jira] [Work logged] (HIVE-23772) Relocate calcite-core to prevent NoSuchFiledError

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23772?focusedWorklogId=484254&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484254
 ]

ASF GitHub Bot logged work on HIVE-23772:
-

Author: ASF GitHub Bot
Created on: 15/Sep/20 00:46
Start Date: 15/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #1187:
URL: https://github.com/apache/hive/pull/1187


   





Issue Time Tracking
---

Worklog Id: (was: 484254)
Time Spent: 2h  (was: 1h 50m)

> Relocate calcite-core to prevent NoSuchFiledError
> -
>
> Key: HIVE-23772
> URL: https://issues.apache.org/jira/browse/HIVE-23772
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Exception trace due to conflict with {{calcite-core}}
> {noformat}
> Caused by: java.lang.NoSuchFieldError: operands
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitCall(ASTConverter.java:785)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$RexVisitor.visitCall(ASTConverter.java:509)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.calcite.rex.RexCall.accept(RexCall.java:191) 
> ~[calcite-core-1.21.0.jar:1.21.0]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:239)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:437)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:124)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:112)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1620)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:555)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12456)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:433)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:290)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Compiler.analyze(Compiler.java:220) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Compiler.compile(Compiler.java:104) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:184) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:602) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:548) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:542) 
> ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:125)
>  ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199)
>  ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
> {noformat}





[jira] [Work logged] (HIVE-23074) SchemaTool sql script execution errors when updating the metadata's schema

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23074?focusedWorklogId=484253&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484253
 ]

ASF GitHub Bot logged work on HIVE-23074:
-

Author: ASF GitHub Bot
Created on: 15/Sep/20 00:46
Start Date: 15/Sep/20 00:46
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] closed pull request #967:
URL: https://github.com/apache/hive/pull/967


   





Issue Time Tracking
---

Worklog Id: (was: 484253)
Time Spent: 1.5h  (was: 1h 20m)

> SchemaTool sql script execution errors when updating the metadata's schema
> --
>
> Key: HIVE-23074
> URL: https://issues.apache.org/jira/browse/HIVE-23074
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 3.1.2
> Environment: running machine: centos7.2 
> metadata db: PostgreSQL 11.3 on x86_64-pc-linux-gnu
> hive version: upgrade from version 3.0.0 to 3.1.2
>Reporter: John1Tang
>Assignee: John1Tang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.1.2
>
>   Original Estimate: 1h
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The SchemaTool SQL script runs into conflicts on existing indices and columns, 
> and is missing the double quotes (") needed to avoid keyword clashes, when 
> updating the metadata's schema
> {code:java}
> bin/schematool -dbType postgres -upgradeSchemaFrom 3.0.0{code}
> went like this:
> {code:java}
> ALTER TABLE "GLOBAL_PRIVS" ADD COLUMN "AUTHORIZER" character varying(128) 
> DEFAULT NULL::character varying
> Error: ERROR: column "AUTHORIZER" of relation "GLOBAL_PRIVS" already exists 
> (state=42701,code=0){code}
> {code:java}
> ALTER TABLE COMPLETED_TXN_COMPONENTS ADD COLUMN IF NOT EXISTS 
> CTC_UPDATE_DELETE char(1) NULL
> Error: ERROR: relation "completed_txn_components" does not exist 
> (state=42P01,code=0)
> {code}
> I've already come up with a solution and created a pull request for this 
> issue.
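
For context on the second error: PostgreSQL folds unquoted identifiers to lowercase, so an unquoted COMPLETED_TXN_COMPONENTS is looked up as completed_txn_components. The sketch below shows one way to build the statement with quoted identifiers. It assumes the table was created under a quoted uppercase name; the class PgIdent and its methods are hypothetical helpers, not part of SchemaTool or of the pull request.

```java
/** Hypothetical helper; not part of Hive's SchemaTool. */
class PgIdent {

    /** Wraps an identifier in double quotes, doubling any embedded double quote. */
    static String quote(String ident) {
        return "\"" + ident.replace("\"", "\"\"") + "\"";
    }

    /** Builds an ADD COLUMN statement that keeps the identifiers' original case. */
    static String addColumnIfMissing(String table, String column, String type) {
        return "ALTER TABLE " + quote(table)
            + " ADD COLUMN IF NOT EXISTS " + quote(column) + " " + type;
    }

    public static void main(String[] args) {
        System.out.println(
            addColumnIfMissing("COMPLETED_TXN_COMPONENTS", "CTC_UPDATE_DELETE", "char(1) NULL"));
    }
}
```

IF NOT EXISTS also sidesteps the "already exists" failure from the first statement when the script is re-run against a partially upgraded schema.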





[jira] [Updated] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-09-14 Thread Nemon Lou (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-24165:
-
Description: 
One way to reproduce:
 
{code:sql}

 CREATE TABLE test(
 `device_id` string, 
 `level` string, 
 `site_id` string, 
 `user_id` string, 
 `first_date` string, 
 `last_date` string,
 `dt` string) ;

 set hive.execution.engine=tez;
 set hive.optimize.distinct.rewrite=true;
 set hive.cli.print.header=true;

 select 
 dt,
 site_id,
 count(DISTINCT t1.device_id) as device_tol_cnt,
 count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else 
null end) as device_add_cnt 
 from test t1 where dt='2020-09-15' 
 group by
 dt,
 site_id
 ;
{code}
 

Error log:  

```
Exception in thread "main" java.lang.AssertionError: Cannot add expression of 
different type to set:
set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT $f3_0) 
NOT NULL
set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 
3},agg#0=count($0),agg#1=count($1))
expression is HiveProject#95
at 
org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
at 
org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
at 
org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
at 
org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
at 
org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
at 
org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
at 
org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
at 
org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
at 
org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
at 
org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.Dele

[jira] [Updated] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-09-14 Thread Nemon Lou (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-24165:
-
Description: 
One way to reproduce:
 
{code:sql}

 CREATE TABLE test(
 `device_id` string, 
 `level` string, 
 `site_id` string, 
 `user_id` string, 
 `first_date` string, 
 `last_date` string,
 `dt` string) ;

 set hive.execution.engine=tez;
 set hive.optimize.distinct.rewrite=true;
 set hive.cli.print.header=true;

 select 
 dt,
 site_id,
 count(DISTINCT t1.device_id) as device_tol_cnt,
 count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else 
null end) as device_add_cnt 
 from test t1 where dt='2020-09-15' 
 group by
 dt,
 site_id
 ;
{code}
 

Error log:  

{code:java}
Exception in thread "main" java.lang.AssertionError: Cannot add expression of 
different type to set:
set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
"ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT $f3_0) 
NOT NULL
set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 
3},agg#0=count($0),agg#1=count($1))
expression is HiveProject#95
at 
org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
at 
org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
at 
org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
at 
org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
at 
org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
at 
org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
at 
org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
at 
org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
at 
org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
at 
org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
at 
org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
at 
org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
at 
org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.refl

[jira] [Commented] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-09-14 Thread Nemon Lou (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195807#comment-17195807
 ] 

Nemon Lou commented on HIVE-24165:
--

In fact, I reproduced this issue by applying HIVE-22448 back to Hive branch 3.1.2. 
The master branch should have the same issue.

AggregateProjectPullUpConstantsRule expects the groupSet in Aggregate to be ordered 
and to start with 0, like \{0,1,2}, but after the multiple distinct rewrite the 
groupSet is \{3,4,5}.
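
A minimal sketch of the mismatch described above, using java.util.BitSet as a stand-in for the group set. This is not Calcite's code; the class GroupSetOrdinals and its methods are hypothetical. It only shows that a group key's input column index equals its position among the group keys when the set is \{0,1,2}, and stops doing so for a set like \{3,4,5}.

```java
import java.util.BitSet;

/** Hypothetical illustration; not taken from Calcite or Hive. */
class GroupSetOrdinals {

    /** Builds a bit set from the given input column indexes. */
    static BitSet of(int... cols) {
        BitSet set = new BitSet();
        for (int col : cols) {
            set.set(col);
        }
        return set;
    }

    /** Position of input column col among the group keys, i.e. its rank in the set. */
    static int outputPosition(BitSet groupSet, int col) {
        if (!groupSet.get(col)) {
            throw new IllegalArgumentException("column " + col + " is not a group key");
        }
        // Count the group keys with a smaller column index.
        return groupSet.get(0, col).cardinality();
    }

    /** True when the set is exactly {0, 1, ..., n-1}. */
    static boolean isDensePrefix(BitSet groupSet) {
        return groupSet.nextClearBit(0) == groupSet.cardinality();
    }

    public static void main(String[] args) {
        for (BitSet groupSet : new BitSet[] {of(0, 1, 2), of(3, 4, 5)}) {
            System.out.println(groupSet + " densePrefix=" + isDensePrefix(groupSet));
            for (int col = groupSet.nextSetBit(0); col >= 0; col = groupSet.nextSetBit(col + 1)) {
                System.out.println("  input column " + col
                    + " -> output position " + outputPosition(groupSet, col));
            }
        }
    }
}
```

Code that uses the input column index directly as the output position would be correct for the first set and wrong for the second.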

 

> CBO: Query fails after multiple count distinct rewrite 
> ---
>
> Key: HIVE-24165
> URL: https://issues.apache.org/jira/browse/HIVE-24165
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Nemon Lou
>Priority: Major
>
> One way to reproduce:
>  
> {code:sql}
>  CREATE TABLE test(
>  `device_id` string, 
>  `level` string, 
>  `site_id` string, 
>  `user_id` string, 
>  `first_date` string, 
>  `last_date` string,
>  `dt` string) ;
>  set hive.execution.engine=tez;
>  set hive.optimize.distinct.rewrite=true;
>  set hive.cli.print.header=true;
>  select 
>  dt,
>  site_id,
>  count(DISTINCT t1.device_id) as device_tol_cnt,
>  count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else 
> null end) as device_add_cnt 
>  from test t1 where dt='2020-09-15' 
>  group by
>  dt,
>  site_id
>  ;
> {code}
>  
> Error log:  
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Cannot add expression of 
> different type to set:
> set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
> "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
> expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT 
> $f3_0) NOT NULL
> set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 
> 3},agg#0=count($0),agg#1=count($1))
> expression is HiveProject#95
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
>   at 
> org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespon

[jira] [Updated] (HIVE-24165) CBO: Query fails after multiple count distinct rewrite

2020-09-14 Thread Nemon Lou (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-24165:
-
Attachment: HIVE-24165.patch

> CBO: Query fails after multiple count distinct rewrite 
> ---
>
> Key: HIVE-24165
> URL: https://issues.apache.org/jira/browse/HIVE-24165
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Affects Versions: 4.0.0
>Reporter: Nemon Lou
>Priority: Major
> Attachments: HIVE-24165.patch
>
>
> One way to reproduce:
>  
> {code:sql}
>  CREATE TABLE test(
>  `device_id` string, 
>  `level` string, 
>  `site_id` string, 
>  `user_id` string, 
>  `first_date` string, 
>  `last_date` string,
>  `dt` string) ;
>  set hive.execution.engine=tez;
>  set hive.optimize.distinct.rewrite=true;
>  set hive.cli.print.header=true;
>  select 
>  dt,
>  site_id,
>  count(DISTINCT t1.device_id) as device_tol_cnt,
>  count(DISTINCT case when t1.first_date='2020-09-15' then t1.device_id else 
> null end) as device_add_cnt 
>  from test t1 where dt='2020-09-15' 
>  group by
>  dt,
>  site_id
>  ;
> {code}
>  
> Error log:  
> {code:java}
> Exception in thread "main" java.lang.AssertionError: Cannot add expression of 
> different type to set:
> set type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" COLLATE 
> "ISO-8859-1$en_US$primary" $f2, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f3, BIGINT $f2_0, BIGINT $f3_0) NOT NULL
> expression type is RecordType(VARCHAR(2147483647) CHARACTER SET "UTF-16LE" 
> COLLATE "ISO-8859-1$en_US$primary" $f2, BIGINT $f3, BIGINT $f2_0, BIGINT 
> $f3_0) NOT NULL
> set is rel#85:HiveAggregate.HIVE.[](input=HepRelVertex#84,group={2, 
> 3},agg#0=count($0),agg#1=count($1))
> expression is HiveProject#95
>   at 
> org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelOptUtil.java:411)
>   at 
> org.apache.calcite.plan.hep.HepRuleCall.transformTo(HepRuleCall.java:57)
>   at 
> org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234)
>   at 
> org.apache.calcite.rel.rules.AggregateProjectPullUpConstantsRule.onMatch(AggregateProjectPullUpConstantsRule.java:186)
>   at 
> org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:317)
>   at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:415)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeInstruction(HepPlanner.java:280)
>   at 
> org.apache.calcite.plan.hep.HepInstruction$RuleCollection.execute(HepInstruction.java:74)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:211)
>   at 
> org.apache.calcite.plan.hep.HepPlanner.findBestExp(HepPlanner.java:198)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.hepPlan(CalcitePlanner.java:2273)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.applyPreJoinOrderingTransforms(CalcitePlanner.java:2002)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1709)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1609)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:118)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:1052)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:154)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:111)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.logicalPlan(CalcitePlanner.java:1414)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:1430)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:450)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12164)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:330)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:659)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDrive

[jira] [Updated] (HIVE-24155) Upgrade Arrow version to 1.0.1

2020-09-14 Thread Igor Dvorzhak (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Dvorzhak updated HIVE-24155:
-
Attachment: HIVE-24155-branch-3.1.patch
Status: Patch Available  (was: Open)

> Upgrade Arrow version to 1.0.1
> --
>
> Key: HIVE-24155
> URL: https://issues.apache.org/jira/browse/HIVE-24155
> Project: Hive
>  Issue Type: Task
>Affects Versions: 3.1.2
>Reporter: Igor Dvorzhak
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-24155-branch-3.1.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-24159) Kafka storage handler broken in secure environment pt2: short-circuit on non-secure environment

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24159?focusedWorklogId=484293&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484293
 ]

ASF GitHub Bot logged work on HIVE-24159:
-

Author: ASF GitHub Bot
Created on: 15/Sep/20 04:52
Start Date: 15/Sep/20 04:52
Worklog Time Spent: 10m 
  Work Description: ashutoshc commented on pull request #1495:
URL: https://github.com/apache/hive/pull/1495#issuecomment-692462659


   +1 LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 484293)
Time Spent: 0.5h  (was: 20m)

> Kafka storage handler broken in secure environment pt2: short-circuit on 
> non-secure environment
> ---
>
> Key: HIVE-24159
> URL: https://issues.apache.org/jira/browse/HIVE-24159
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> As kafka_storage_handler.q was disabled by HIVE-23985, I didn't notice 
> upstream that the kafka qtest fails. Instead of setting up a kerberized 
> environment in qtest (which doesn't seem to be a usual use case; e.g. I haven't 
> seen hive.server2.authentication.kerberos.principal used in *.q files), I 
> made the test pass with a simple 
> UserGroupInformation.isSecurityEnabled() check, which is also useful for 
> any non-secure environment.
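
A minimal sketch of the short-circuit described above, assuming the guard sits in front of the Kafka delegation-token lookup. KafkaTokenGuard, tokenIfSecure and the two injected suppliers are hypothetical names for illustration; the BooleanSupplier stands in for UserGroupInformation.isSecurityEnabled() so the sketch runs without Hadoop on the classpath. This is not the code from the pull request.

```java
import java.util.Optional;
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

public class KafkaTokenGuard {
  // Stand-in for UserGroupInformation.isSecurityEnabled() (assumption: injected for testability).
  private final BooleanSupplier securityEnabled;
  // Stand-in for the call that contacts the brokers to obtain a delegation token.
  private final Supplier<String> fetchToken;

  public KafkaTokenGuard(BooleanSupplier securityEnabled, Supplier<String> fetchToken) {
    this.securityEnabled = securityEnabled;
    this.fetchToken = fetchToken;
  }

  // Returns a token only when security is enabled; otherwise the fetch is never attempted.
  public Optional<String> tokenIfSecure() {
    if (!securityEnabled.getAsBoolean()) {
      return Optional.empty();
    }
    return Optional.of(fetchToken.get());
  }

  public static void main(String[] args) {
    // Non-secure environment: the failing fetch is skipped entirely.
    KafkaTokenGuard insecure = new KafkaTokenGuard(
        () -> false,
        () -> { throw new IllegalStateException("admin client must not be created"); });
    System.out.println("non-secure, token present: " + insecure.tokenIfSecure().isPresent());

    // Secure environment: the fetch runs as before.
    KafkaTokenGuard secure = new KafkaTokenGuard(() -> true, () -> "token");
    System.out.println("secure, token present: " + secure.tokenIfSecure().isPresent());
  }
}
```

With a guard of this shape, a non-secure qtest run never reaches the admin-client creation that fails in the stack trace quoted in this message.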
> For reference, the exception was:
> {code}
> 2020-09-14T03:30:01,217 ERROR [a42ef4c6-190c-47a6-86ad-8bf13b8a2dc1 main] 
> tez.TezTask: Failed to execute tez graph.
> org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
>   at 
> org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:451)
>  ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.kafka.clients.admin.Admin.create(Admin.java:59) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:39) 
> ~[kafka-clients-2.4.1.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaDelegationTokenForBrokers(DagUtils.java:333)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.getKafkaCredentials(DagUtils.java:301)
>  ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addCredentials(DagUtils.java:282) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.build(TezTask.java:516) 
> ~[hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:223) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:213) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:357) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:330) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.runTasks(Executor.java:246) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Executor.execute(Executor.java:109) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:721) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:488) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:482) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:?]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:166) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:232) 
> [hive-exec-3.1.3000.7.1.4.0-SNAPSHOT.jar:3.1.3000.7.1.4.0-SNAPSHOT]
>   at 
> org.apache.hadoop.hive.cli.CliDriver.proce

[jira] [Work logged] (HIVE-24162) Query based compaction looses bloom filter

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24162?focusedWorklogId=484319&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484319
 ]

ASF GitHub Bot logged work on HIVE-24162:
-

Author: ASF GitHub Bot
Created on: 15/Sep/20 06:32
Start Date: 15/Sep/20 06:32
Worklog Time Spent: 10m 
  Work Description: pvargacl commented on pull request #1498:
URL: https://github.com/apache/hive/pull/1498#issuecomment-692497872


   @klcopp @laszlopinter86  could you take a look please?





Issue Time Tracking
---

Worklog Id: (was: 484319)
Time Spent: 20m  (was: 10m)

> Query based compaction looses bloom filter
> --
>
> Key: HIVE-24162
> URL: https://issues.apache.org/jira/browse/HIVE-24162
> Project: Hive
>  Issue Type: Bug
>Reporter: Peter Varga
>Assignee: Peter Varga
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
>   
> {noformat}
> +------------------------------------------------------+
> |                    createtab_stmt                    |
> +------------------------------------------------------+
> | CREATE TABLE `bloomTest`(                            |
> |   `msisdn` string,                                   |
> |   `imsi` varchar(20),                                |
> |   `imei` bigint,                                     |
> |   `cell_id` bigint)                                  |
> | ROW FORMAT SERDE                                     |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'        |
> | STORED AS INPUTFORMAT                                |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'  |
> | OUTPUTFORMAT                                         |
> |   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' |
> | LOCATION                                             |
> |   's3a://dwxtpcds30-wwgq-dwx-managed/clusters/env-6cwwgq/warehouse-1580338415-7dph/warehouse/tablespace/managed/hive/del_db.db/bloomtest' |
> | TBLPROPERTIES (                                      |
> |   'bucketing_version'='2',                           |
> |   'orc.bloom.filter.columns'='msisdn,cell_id,imsi',  |
> |   'orc.bloom.filter.fpp'='0.02',                     |
> |   'transactional'='true',                            |
> |   'transactional_properties'='default',              |
> |   'transient_lastDdlTime'='1597222946')              |
> +------------------------------------------------------+
> insert into  bloomTest values ("a", "b", 10, 20);
> insert into  bloomTest values ("aa", "bb", 100, 200);
> insert into  bloomTest values ("aaa", "bbb", 1000, 2000);
> select * from bloomTest;
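> +-------------------+-----------------+-----------------+--------------------+
> | bloomtest.msisdn  | bloomtest.imsi  | bloomtest.imei  | bloomtest.cell_id  |
> +-------------------+-----------------+-----------------+--------------------+
> | a                 | b               | 10              | 20                 |
> | aa                | bb              | 100             | 200                |
> | aaa               | bbb             | 1000            | 2000               |
> +-------------------+-----------------+-----------------+--------------------+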
> {noformat}
>  - Compact the table
> {code:java}
> alter table bloomTest compact 'MAJOR';
> {code}
>  - Wait for the compaction to finish, then check the dataset for bloom filters.
>   
>  - The delta files have bloom filters, but the base files do not.
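
The reproduction steps above could be scripted over JDBC roughly as follows. This is a sketch under assumptions: the class name BloomFilterCompactionRepro is hypothetical, the DDL is a simplified form of the SHOW CREATE TABLE output quoted above (no LOCATION, fewer table properties), and the JDBC URL has to be supplied by the caller. Only java.sql calls are used; the Hive JDBC driver must be on the classpath for run() to work.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

public class BloomFilterCompactionRepro {

  // Statements from the report, in order. The DDL is simplified (assumption).
  static List<String> steps() {
    return List.of(
        "CREATE TABLE bloomTest (msisdn string, imsi varchar(20), imei bigint, cell_id bigint) "
            + "STORED AS ORC TBLPROPERTIES ("
            + "'orc.bloom.filter.columns'='msisdn,cell_id,imsi', "
            + "'orc.bloom.filter.fpp'='0.02', "
            + "'transactional'='true')",
        "INSERT INTO bloomTest VALUES ('a', 'b', 10, 20)",
        "INSERT INTO bloomTest VALUES ('aa', 'bb', 100, 200)",
        "INSERT INTO bloomTest VALUES ('aaa', 'bbb', 1000, 2000)",
        // This only requests the compaction; the caller still has to wait for it to finish.
        "ALTER TABLE bloomTest COMPACT 'MAJOR'");
  }

  // Runs every step against the given HiveServer2 JDBC URL.
  static void run(String jdbcUrl) throws SQLException {
    try (Connection conn = DriverManager.getConnection(jdbcUrl);
         Statement stmt = conn.createStatement()) {
      for (String sql : steps()) {
        stmt.execute(sql);
      }
    }
  }

  public static void main(String[] args) throws SQLException {
    if (args.length == 0) {
      // No URL given: print the statements instead of executing them.
      steps().forEach(System.out::println);
      return;
    }
    run(args[0]);
  }
}
```

Checking whether the base files written by the compaction still carry bloom filters remains a manual step after the compaction has finished; the sketch does not cover it.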





[jira] [Work logged] (HIVE-23618) NotificationLog should also contain events for default/check constraints

2020-09-14 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23618?focusedWorklogId=484325&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-484325
 ]

ASF GitHub Bot logged work on HIVE-23618:
-

Author: ASF GitHub Bot
Created on: 15/Sep/20 06:56
Start Date: 15/Sep/20 06:56
Worklog Time Spent: 10m 
  Work Description: sankarh commented on a change in pull request #1237:
URL: https://github.com/apache/hive/pull/1237#discussion_r488391678



##
File path: 
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
##
@@ -703,6 +709,49 @@ public void 
onAddNotNullConstraint(AddNotNullConstraintEvent addNotNullConstrain
 }
   }
 
+  /***
+   * @param addDefaultConstraintEvent add default constraint event
+   * @throws MetaException
+   */
+  @Override
+  public void onAddDefaultConstraint(AddDefaultConstraintEvent 
addDefaultConstraintEvent) throws MetaException {
+List<SQLDefaultConstraint> cols = 
addDefaultConstraintEvent.getDefaultConstraintCols();
+if (cols.size() > 0) {
+  AddDefaultConstraintMessage msg = MessageBuilder.getInstance()
+
.buildAddDefaultConstraintMessage(addDefaultConstraintEvent.getDefaultConstraintCols());

Review comment:
   Shall use "cols" here.

##
File path: 
hcatalog/server-extensions/src/main/java/org/apache/hive/hcatalog/listener/DbNotificationListener.java
##
@@ -703,6 +709,49 @@ public void 
onAddNotNullConstraint(AddNotNullConstraintEvent addNotNullConstrain
 }
   }
 
+  /***
+   * @param addDefaultConstraintEvent add default constraint event
+   * @throws MetaException
+   */
+  @Override
+  public void onAddDefaultConstraint(AddDefaultConstraintEvent 
addDefaultConstraintEvent) throws MetaException {
+List<SQLDefaultConstraint> cols = 
addDefaultConstraintEvent.getDefaultConstraintCols();
+if (cols.size() > 0) {
+  AddDefaultConstraintMessage msg = MessageBuilder.getInstance()
+
.buildAddDefaultConstraintMessage(addDefaultConstraintEvent.getDefaultConstraintCols());
+  NotificationEvent event =
+new NotificationEvent(0, now(), 
EventType.ADD_DEFAULTCONSTRAINT.toString(),
+  msgEncoder.getSerializer().serialize(msg)
+);
+  event.setCatName(cols.get(0).isSetCatName() ? cols.get(0).getCatName() : 
DEFAULT_CATALOG_NAME);
+  event.setDbName(cols.get(0).getTable_db());
+  event.setTableName(cols.get(0).getTable_name());
+  process(event, addDefaultConstraintEvent);
+}
+  }
+
+  /***
+   * @param addCheckConstraintEvent add check constraint event
+   * @throws MetaException
+   */
+  @Override
+  public void onAddCheckConstraint(AddCheckConstraintEvent 
addCheckConstraintEvent) throws MetaException {
+LOG.info("Inside DBNotification listener for check constraint.");
+List<SQLCheckConstraint> cols = 
addCheckConstraintEvent.getCheckConstraintCols();
+if (cols.size() > 0) {
+  AddCheckConstraintMessage msg = MessageBuilder.getInstance()
+
.buildAddCheckConstraintMessage(addCheckConstraintEvent.getCheckConstraintCols());

Review comment:
   Use "cols".

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/AddCheckConstraintEvent.java
##
@@ -0,0 +1,42 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.hive.metastore.events;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.hive.metastore.IHMSHandler;
+import org.apache.hadoop.hive.metastore.api.SQLCheckConstraint;
+
+import java.util.List;
+
+@InterfaceAudience.Public
+@InterfaceStability.Stable
+public class AddCheckConstraintEvent extends ListenerEvent {
+  private final List<SQLCheckConstraint> ds;

Review comment:
   nit: Use "cc" instead of "ds"

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/messaging/MessageBuilder.java
##
@@ -241,6 +247,16 @@ public AddNotNullConstraintMessage 
buildAddNotNullConstraintMessage(
 return new JSONAddNotNullConstraintMessage(MS_SERVER_URL, 
MS