[jira] [Created] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite

2019-03-28 Thread anishek (JIRA)
anishek created HIVE-21539:
--

 Summary: GroupBy + where clause on same column results in 
incorrect query rewrite
 Key: HIVE-21539
 URL: https://issues.apache.org/jira/browse/HIVE-21539
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: anishek


{code}

create table a (i int, j string);
insert into a values ( 1, 'a'),(2,'b');
explain extended select min(j) from a where j='a' group by j;
++
|  Explain   |
++
| OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`|
| FROM `default`.`a` |
| WHERE `j` = 'a'|
| GROUP BY TRUE  |
| STAGE DEPENDENCIES:|
|   Stage-1 is a root stage  |
|   Stage-0 depends on stages: Stage-1   |
||
| STAGE PLANS:   |
|   Stage: Stage-1   |
| Tez|
|   DagId: anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 
|
|   Edges:   |
| Reducer 2 <- Map 1 (SIMPLE_EDGE)   |
|   DagName: 
anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 |
|   Vertices:|
| Map 1  |
| Map Operator Tree: |
| TableScan  |
|   alias: a |
|   filterExpr: (j = 'a') (type: boolean) |
|   Statistics: Num rows: 2 Data size: 170 Basic stats: 
COMPLETE Column stats: COMPLETE |
|   GatherStats: false   |
|   Filter Operator  |
| isSamplingPred: false  |
| predicate: (j = 'a') (type: boolean) |
| Statistics: Num rows: 1 Data size: 85 Basic stats: 
COMPLETE Column stats: COMPLETE |
| Select Operator|
|   Statistics: Num rows: 1 Data size: 85 Basic stats: 
COMPLETE Column stats: COMPLETE |
|   Group By Operator|
| aggregations: min(true)|
| keys: true (type: boolean) |
| mode: hash |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 1 Data size: 8 Basic stats: 
COMPLETE Column stats: COMPLETE |
| Reduce Output Operator |
|   key expressions: _col0 (type: boolean) |
|   null sort order: a   |
|   sort order: +|
|   Map-reduce partition columns: _col0 (type: boolean) 
|
|   Statistics: Num rows: 1 Data size: 8 Basic stats: 
COMPLETE Column stats: COMPLETE |
|   tag: -1  |
|   value expressions: _col1 (type: boolean) |
|   auto parallelism: true   |
| Path -> Alias: |
|   hdfs://localhost:9000/tmp/hive/warehouse/a [a] |
| Path -> Partition: |
|   hdfs://localhost:9000/tmp/hive/warehouse/a  |
| Partition  |
|   base file name: a|
|   input format: org.apache.hadoop.mapred.TextInputFormat |
|   output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |
|   properties:  |
| COLUMN_STATS_ACCURATE 
{"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} |
| bucket_count -1|
| bucketing_version 2|
| column.name.delimiter ,|
| columns i,j|
| columns.comments   |
| columns.types int:string   |
| file.inputformat org.apache.hadoop.mapred.TextInputFormat 
|
| file.outputformat 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat |
| location hdfs://localhost:9000/tmp/hive/warehouse/a |
| name default.a |
| numFiles 3 |
| numRows 2  |
|   

[jira] [Created] (HIVE-21538) Beeline: password source though the console reader did not pass to connection param

2019-03-28 Thread Rajkumar Singh (JIRA)
Rajkumar Singh created HIVE-21538:
-

 Summary: Beeline: password source though the console reader did 
not pass to connection param
 Key: HIVE-21538
 URL: https://issues.apache.org/jira/browse/HIVE-21538
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.0
 Environment: Hive-3.1 auth set to LDAP
Reporter: Rajkumar Singh


Beeline: a password read through the console reader is not passed to the 
connection parameters, which results in an authentication failure when LDAP 
authentication is used.
{code}
beeline -n USER -u 
"jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
 -p

Connecting to 
jdbc:hive2://host:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;user=USER
Enter password for jdbc:hive2://host:2181/: 
19/03/26 19:49:44 [main]: WARN jdbc.HiveConnection: Failed to connect to 
host:1
19/03/26 19:49:44 [main]: ERROR jdbc.Utils: Unable to read HiveServer2 configs 
from ZooKeeper
Unknown HS2 problem when communicating with Thrift server.
Error: Could not open client transport for any of the Server URI's in 
ZooKeeper: Peer indicated failure: PLAIN auth failed: 
javax.security.sasl.AuthenticationException: Error validating LDAP user [Caused 
by javax.naming.AuthenticationException: [LDAP: error code 49 - 80090308: 
LdapErr: DSID-0C0903C8, comment: AcceptSecurityContext error, data 52e, v2580]] 
(state=08S01,code=0)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21537) Scalar query rewrite could be improved to not generate an extra join if subquery is guaranteed to produce atmost one row

2019-03-28 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-21537:
--

 Summary: Scalar query rewrite could be improved to not generate an 
extra join if subquery is guaranteed to produce atmost one row
 Key: HIVE-21537
 URL: https://issues.apache.org/jira/browse/HIVE-21537
 Project: Hive
  Issue Type: Improvement
  Components: Query Planning
Affects Versions: 4.0.0
Reporter: Vineet Garg
Assignee: Vineet Garg


Currently the Hive planner introduces an extra join branch and later runs a 
rule that removes this branch when it can. 
The subquery remove rule itself could check whether the subquery produces at 
most one row (using the rel metadata's getMaxRowCount) and avoid introducing 
the branch in the first place.
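
The proposed check can be sketched in plain Java. This is only an illustration under stated assumptions: `Plan`, `maxRowCount` and `needsRowCountGuard` are hypothetical names, not Hive or Calcite APIs, and treating an unknown bound as "may return many rows" is a choice made for this sketch.

```java
public class ScalarSubquerySketch {
    /** Hypothetical stand-in for a plan node that reports an upper bound on its row count. */
    interface Plan {
        Double maxRowCount(); // null means "no bound known" (an assumption of this sketch)
    }

    /** True when the rewrite still needs the extra branch that enforces "at most one row". */
    static boolean needsRowCountGuard(Plan subquery) {
        Double max = subquery.maxRowCount();
        return max == null || max > 1.0;
    }

    public static void main(String[] args) {
        Plan globalAggregate = () -> 1.0;                // e.g. an aggregate without GROUP BY
        Plan plainScan = () -> Double.POSITIVE_INFINITY; // no useful bound
        System.out.println(needsRowCountGuard(globalAggregate)); // false
        System.out.println(needsRowCountGuard(plainScan));       // true
    }
}
```

With a check like this, the extra branch would be generated only for subqueries whose row count bound is unknown or greater than one.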





[jira] [Created] (HIVE-21536) Backport HIVE-17764 to branch-2.3

2019-03-28 Thread Yuming Wang (JIRA)
Yuming Wang created HIVE-21536:
--

 Summary: Backport HIVE-17764 to branch-2.3
 Key: HIVE-21536
 URL: https://issues.apache.org/jira/browse/HIVE-21536
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 2.3.4
Reporter: Yuming Wang








Re: Declare myself emeritus

2019-03-28 Thread Lefty Leverenz
Thanks for all your contributions, Edward.

-- Lefty


On Tue, Mar 19, 2019 at 1:44 PM Edward Capriolo 
wrote:

> Hello,
>
> I would like to declare myself emeritus.
>
> Thank you,
> Edward Capriolo
>
>
> --
> Sorry this was sent from mobile. Will do less grammar and spell check than
> usual.
>


[jira] [Created] (HIVE-21535) Re-enable TestCliDriver#vector_groupby_reduce

2019-03-28 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21535:
--

 Summary: Re-enable TestCliDriver#vector_groupby_reduce
 Key: HIVE-21535
 URL: https://issues.apache.org/jira/browse/HIVE-21535
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Vihang Karajgaonkar


The test was disabled since it was flaky in HIVE-21396. Creating this JIRA to 
re-enable the test by fixing the rounding logic.





[jira] [Created] (HIVE-21534) Flaky test : TestActivePassiveHA

2019-03-28 Thread Vihang Karajgaonkar (JIRA)
Vihang Karajgaonkar created HIVE-21534:
--

 Summary: Flaky test : TestActivePassiveHA
 Key: HIVE-21534
 URL: https://issues.apache.org/jira/browse/HIVE-21534
 Project: Hive
  Issue Type: Test
Reporter: Vihang Karajgaonkar


Failed in 
https://issues.apache.org/jira/browse/HIVE-21484?focusedCommentId=16798031&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16798031

The test works locally, and it also passed in a subsequent precommit run later 
on the same patch.





Re: Review Request 69918: HIVE-21001 Update to Calcite 1.19

2019-03-28 Thread Zoltan Haindrich


> On March 27, 2019, 9:06 p.m., Jesús Camacho Rodríguez wrote:
> > accumulo-handler/src/test/results/positive/accumulo_queries.q.out
> > Line 150 (original), 150 (patched)
> > 
> >
> > Is this a known issue?

removing the cast might have suppressed problematic cases;

In Hive: `select cast('xxx' as double) is null` should be true; however `select 
'xxx' is null` is false

see: CALCITE-2929
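
A minimal sketch of the distinction above, in plain Java rather than Hive code. `castToDouble` is a hypothetical stand-in for the string-to-double cast, assuming only what is stated above: an unparseable string casts to NULL. Java's number parsing is not identical to Hive's, so this illustrates the null semantics only.

```java
public class CastNullSketch {
    /** Hypothetical stand-in for cast(s as double): null when s is not a number. */
    static Double castToDouble(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Double.valueOf(s.trim());
        } catch (NumberFormatException e) {
            return null; // models the cast producing NULL
        }
    }

    /** Models the SQL "IS NULL" test. */
    static boolean isNull(Object value) {
        return value == null;
    }

    public static void main(String[] args) {
        System.out.println(isNull(castToDouble("xxx"))); // true: the cast yields NULL
        System.out.println(isNull("xxx"));               // false: the string itself is not NULL
    }
}
```

Dropping the cast from `cast(x as double) is null` therefore changes the result for any non-numeric, non-null string.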


> On March 27, 2019, 9:06 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/allcolref_in_udf.q.out
> > Line 107 (original), 107 (patched)
> > 
> >
> > Known issue?

CALCITE-2929


> On March 27, 2019, 9:06 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/auto_join2.q.out
> > Line 42 (original), 42 (patched)
> > 
> >
> > I guess same as previous one, but this time it is preventing folding in 
> > more complex expression from happening.

CALCITE-2929 was a late comer to 1.19 release;

this change was happening in many q.out-s; not sure how much impact this has - 
but from `UDFToDouble(key) is not null` we could deduce that `key is not 
null`...
however, I think keeping `key is not null` is also good, because I don't think 
that the readers would be able to evaluate the `cast` as well.


> On March 27, 2019, 9:06 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/filter_cond_pushdown.q.out
> > Line 344 (original), 344 (patched)
> > 
> >
> > Is the literal transformation here expected (int -> double?)?

the query expression is `(int1 + int2) > 2`, so an integer comparison would be 
fine.
I've checked that even earlier, if the literal was 2.1 or something similar, it 
worked correctly and the comparison happened in decimal.
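
To illustrate the point about the literal type, here is a small plain-Java sketch. The class and method names are hypothetical and this is not how Hive evaluates the predicate; it only shows that an integer comparison suffices for an integral literal, while a fractional literal such as 2.1 needs a decimal comparison.

```java
import java.math.BigDecimal;

public class LiteralComparisonSketch {
    /** Integer comparison: enough when the literal is integral, e.g. (int1 + int2) > 2. */
    static boolean greaterThanInt(int int1, int int2, int literal) {
        return int1 + int2 > literal;
    }

    /** Decimal comparison: needed when the literal has a fractional part, e.g. 2.1. */
    static boolean greaterThanDecimal(int int1, int int2, BigDecimal literal) {
        return BigDecimal.valueOf((long) int1 + int2).compareTo(literal) > 0;
    }

    public static void main(String[] args) {
        // With an integral literal both comparisons agree.
        System.out.println(greaterThanInt(1, 2, 2));                         // true
        System.out.println(greaterThanDecimal(1, 2, new BigDecimal("2")));   // true
        // With 2.1 the sum 2 is not greater, while the sum 3 is.
        System.out.println(greaterThanDecimal(1, 1, new BigDecimal("2.1"))); // false
        System.out.println(greaterThanDecimal(1, 2, new BigDecimal("2.1"))); // true
    }
}
```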


> On March 27, 2019, 9:06 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/llap/explainuser_1.q.out
> > Line 588 (original), 588 (patched)
> > 
> >
> > Constant literal transformation. Is it expected?

`((UDFToFloat(c_int) + c_float) >= 0.0)`

I think in this case it is ok to change to floating point; as left hand side is 
all float as well


> On March 27, 2019, 9:06 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/llap/orc_ppd_varchar.q.out
> > Line 29 (original), 29 (patched)
> > 
> >
> > Expected?

this issue has its own ticket: the issue is that a literal varchar value has 
changed type to string
https://issues.apache.org/jira/browse/HIVE-21316


> On March 27, 2019, 9:06 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/orc_ppd_char.q.out
> > Line 29 (original), 29 (patched)
> > 
> >
> > Expected?

this issue has its own ticket: the issue is that a literal varchar value has 
changed type to string
https://issues.apache.org/jira/browse/HIVE-21316


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69918/#review214127
---


On March 27, 2019, 4:40 p.m., Zoltan Haindrich wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69918/
> ---
> 
> (Updated March 27, 2019, 4:40 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Jesús Camacho Rodríguez.
> 
> 
> Bugs: HIVE-21001
> https://issues.apache.org/jira/browse/HIVE-21001
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> patch#1 here is #23 on jira
> patch#2 here is #48 on jira
> 
> 
> Diffs
> -
> 
>   
> accumulo-handler/src/test/results/positive/accumulo_predicate_pushdown.q.out 
> fb8fca9324 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out 
> 80a7dc9717 
>   hbase-handler/src/test/results/positive/hbase_ppd_key_range.q.out 
> b80738b263 
>   hbase-handler/src/test/results/positive/hbase_pushdown.q.out f37460c6d3 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out cfcfaf3274 
>   pom.xml 7b45d84de8 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/HiveRelBuilder.java 
> e85a99e846 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveRelDecorrelator.java
>  238ae4ef4e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveSubQueryRemoveRule.java
>  50ed8eda89 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translat

[jira] [Created] (HIVE-21533) Nested CTE's with join does not return any data.

2019-03-28 Thread Naveen Gangam (JIRA)
Naveen Gangam created HIVE-21533:


 Summary: Nested CTE's with join does not return any data.
 Key: HIVE-21533
 URL: https://issues.apache.org/jira/browse/HIVE-21533
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 2.1.0
Reporter: Naveen Gangam
 Attachments: testcase.sql

Attached is the testcase to reproduce the issue. The join on CTE6 is causing 
the problem.





[jira] [Created] (HIVE-21532) RuntimeException due to AccessControlException during creating hive-staging-dir

2019-03-28 Thread Oleksandr Polishchuk (JIRA)
Oleksandr Polishchuk created HIVE-21532:
---

 Summary: RuntimeException due to AccessControlException during 
creating hive-staging-dir
 Key: HIVE-21532
 URL: https://issues.apache.org/jira/browse/HIVE-21532
 Project: Hive
  Issue Type: Bug
Reporter: Oleksandr Polishchuk


The bug was found in a Hive-2.3 environment.

Steps leading to the exception:
1) Create a user without root permissions on your node.
2) The {{hive-site.xml}} file must contain the following properties:
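{code:xml}
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.plugin.fallback.FallbackHiveAuthorizerFactory</value>
</property>
{code}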
3) Open the Hive CLI and run the following query:
{code:java}
 insert overwrite local directory '/tmp/test_dir' row format delimited fields 
terminated by ',' select * from temp.test;
{code}
The query above fails with the following exception:
{code:java}
FAILED: RuntimeException Cannot create staging directory 
'hdfs:///tmp/test_dir/.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1':
 User testuser(user id 3456)  has been denied access to create 
.hive-staging_hive_2019-03-28_11-51-05_319_5882446299335967521-1


{code}
Investigation shows that everything works if the properties mentioned above are 
removed from {{hive-site.xml}} and {{`queryTmpdir`}} is passed instead of 
{{`dest_path`}} to {{org.apache.hadoop.hive.ql.Context#getTempDirForPath()}}, 
as was done in Hive-2.1. The method is currently called in 
{{org.apache.hadoop.hive.ql.parse.SemanticAnalyzer}} as 
{{String statsTmpLoc = ctx.getTempDirForPath(dest_path).toString();}}





[GitHub] [hive] sankarh commented on a change in pull request #579: HIVE-21109 : Support stats replication for ACID tables.

2019-03-28 Thread GitBox
sankarh commented on a change in pull request #579: HIVE-21109 : Support stats 
replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r269880708
 
 

 ##
 File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnCommonUtils.java
 ##
 @@ -84,6 +86,73 @@ public static ValidTxnList 
createValidReadTxnList(GetOpenTxnsResponse txns, long
 return new ValidReadTxnList(exceptions, outAbortedBits, highWaterMark, 
minOpenTxnId);
   }
 
+  /**
+   * Transform a {@link 
org.apache.hadoop.hive.metastore.api.GetOpenTxnsResponse} to a
+   * {@link org.apache.hadoop.hive.common.ValidTxnList}.  This assumes that 
the caller intends to
+   * read the files, and thus treats both open and aborted transactions as 
invalid.
+   *
+   * This API is used by Hive replication which may have multiple transactions 
open at a time.
+   *
+   * @param txns open txn list from the metastore
+   * @param currentTxns Current transactions that the replication has opened.  
If any of the
+   *transactions is greater than 0 it will be removed from 
the exceptions
+   *list so that the replication sees its own transaction 
as valid.
+   * @return a valid txn list.
+   */
+  public static ValidTxnList createValidReadTxnList(GetOpenTxnsResponse txns,
 
 Review comment:
   Yes, I also think that for REPL LOAD we should always hardcode the 
ValidWriteIdList using the current writeId, so that stats are always valid 
while applying the current event. Even if they are invalid, the subsequent 
alterTable/partition event would mark them so in the table/partition parameters.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [hive] sankarh commented on a change in pull request #579: HIVE-21109 : Support stats replication for ACID tables.

2019-03-28 Thread GitBox
sankarh commented on a change in pull request #579: HIVE-21109 : Support stats 
replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r269880083
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
 ##
 @@ -2950,21 +2956,33 @@ public Partition createPartition(Table tbl, 
Map partSpec) throws
 int size = addPartitionDesc.getPartitionCount();
 List in =
 new ArrayList(size);
-AcidUtils.TableSnapshot tableSnapshot = AcidUtils.getTableSnapshot(conf, 
tbl, true);
 long writeId;
 String validWriteIdList;
-if (tableSnapshot != null && tableSnapshot.getWriteId() > 0) {
-  writeId = tableSnapshot.getWriteId();
-  validWriteIdList = tableSnapshot.getValidWriteIdList();
+
+// In case of replication, get the writeId from the source and use valid 
write Id list
+// for replication.
+if (addPartitionDesc.getReplicationSpec() != null &&
+addPartitionDesc.getReplicationSpec().isInReplicationScope() &&
+addPartitionDesc.getPartition(0).getWriteId() > 0) {
+  writeId = addPartitionDesc.getPartition(0).getWriteId();
+  validWriteIdList =
 
 Review comment:
   Even the logic that creates the ValidWriteIdList based on all txns opened by 
repl isn't right, as it says that stats are valid for all these open txns when 
they aren't. Also, it sets the high water mark based on the 0th index in the 
replTxnsList map, which might point to the wrong writeId compared to the 
current txn's writeId. So I suspect this logic should be removed anyway.




[GitHub] [hive] ashutosh-bapat commented on a change in pull request #579: HIVE-21109 : Support stats replication for ACID tables.

2019-03-28 Thread GitBox
ashutosh-bapat commented on a change in pull request #579: HIVE-21109 : Support 
stats replication for ACID tables.
URL: https://github.com/apache/hive/pull/579#discussion_r269879748
 
 

 ##
 File path: ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
 ##
 @@ -2689,7 +2689,19 @@ private int alterTable(Hive db, AlterTableDesc 
alterTbl) throws HiveException {
   } else {
 // Note: this is necessary for UPDATE_STATISTICS command, that 
operates via ADDPROPS (why?).
 //   For any other updates, we don't want to do txn check on 
partitions when altering table.
-boolean isTxn = alterTbl.getPartSpec() != null && alterTbl.getOp() == 
AlterTableTypes.ADDPROPS;
+boolean isTxn = false;
+if (alterTbl.getPartSpec() != null && alterTbl.getOp() == 
AlterTableTypes.ADDPROPS) {
+  // ADDPROPS is used to add repl.last.id during replication. That's 
not a transactional
+  // change.
+  Map props = alterTbl.getProps();
+  if (props.size() <= 1 && 
props.get(ReplicationSpec.KEY.CURR_STATE_ID.toString()) != null) {
 
 Review comment:
   Done. Instead of last.repl.id, I am now explicitly checking whether the 
property is related to stats, and setting isTxn only in the replication case.

