[jira] [Created] (HIVE-21781) Vectorization: Incomplete vectorization of "or false" with CBO off

2019-05-22 Thread Gopal V (JIRA)
Gopal V created HIVE-21781:
--

 Summary: Vectorization: Incomplete vectorization of "or false" 
with CBO off
 Key: HIVE-21781
 URL: https://issues.apache.org/jira/browse/HIVE-21781
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Reporter: Gopal V


{code}
create temporary table foo (x int) ;
insert into foo values(1),(2),(3),(4),(5);
set hive.explain.user=false;

explain vectorization detail select count(case when (x=1 or false) then 1 else 0 end) from foo;
{code}

{code}
| Group By Operator |
|   aggregations: count(CASE WHEN (((x = 1) or false)) THEN (1) ELSE (0) END) |
|   Group By Vectorization: |
|   aggregators: VectorUDAFCount(IfExprLongScalarLongScalar(col 3:boolean, val 1, val 0)(children: VectorUDFAdaptor(((x = 1) or false))(children: LongColEqualLongScalar(col 0:int, val 1) -> 2:boolean) -> 3:boolean) -> 4:int) -> bigint |
|   className: VectorGroupByOperator |
{code}

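With CBO off, the plan falls back to VectorUDFAdaptor for ((x = 1) or false) instead of using a native vectorized expression. A pass through Calcite fixes this by folding away the redundant "or false":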

{code}
| OPTIMIZED SQL: SELECT COUNT(CASE WHEN `x` = 1 THEN 1 ELSE 0 END) AS `$f0` |
| FROM `default`.`foo` |
{code}
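The folding that Calcite performs here can be sketched on a toy expression tree. This is only an illustration of the rewrite rule (x OR false => x, x OR true => true); the class and method names below (OrFalseFolding, Expr, simplify) are hypothetical and are not Hive's ExprNodeDesc or Calcite's RexNode APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Toy expression tree for illustration only; these are not Hive or Calcite classes.
final class OrFalseFolding {

  enum Kind { LITERAL, LEAF, OR }

  static final class Expr {
    final Kind kind;
    final boolean value;        // meaningful only for LITERAL
    final String text;          // meaningful only for LEAF
    final List<Expr> children;  // meaningful only for OR

    private Expr(Kind kind, boolean value, String text, List<Expr> children) {
      this.kind = kind;
      this.value = value;
      this.text = text;
      this.children = children;
    }

    static Expr literal(boolean v) {
      return new Expr(Kind.LITERAL, v, null, new ArrayList<Expr>());
    }

    static Expr leaf(String text) {
      return new Expr(Kind.LEAF, false, text, new ArrayList<Expr>());
    }

    static Expr or(Expr... operands) {
      List<Expr> list = new ArrayList<Expr>();
      for (Expr e : operands) {
        list.add(e);
      }
      return new Expr(Kind.OR, false, null, list);
    }

    @Override
    public String toString() {
      switch (kind) {
        case LITERAL:
          return String.valueOf(value);
        case LEAF:
          return text;
        default:
          StringBuilder sb = new StringBuilder("(");
          for (int i = 0; i < children.size(); i++) {
            if (i > 0) {
              sb.append(" or ");
            }
            sb.append(children.get(i));
          }
          return sb.append(")").toString();
      }
    }
  }

  // Removes literal FALSE operands from an OR and short-circuits on literal TRUE.
  // Both rewrites also hold under SQL three-valued logic.
  static Expr simplify(Expr e) {
    if (e.kind != Kind.OR) {
      return e;
    }
    List<Expr> kept = new ArrayList<Expr>();
    for (Expr child : e.children) {
      Expr s = simplify(child);
      if (s.kind == Kind.LITERAL) {
        if (s.value) {
          return Expr.literal(true);  // x OR true => true
        }
        continue;                     // x OR false => x
      }
      kept.add(s);
    }
    if (kept.isEmpty()) {
      return Expr.literal(false);
    }
    if (kept.size() == 1) {
      return kept.get(0);
    }
    return new Expr(Kind.OR, false, null, kept);
  }

  public static void main(String[] args) {
    Expr before = Expr.or(Expr.leaf("x = 1"), Expr.literal(false));
    System.out.println(before + " => " + simplify(before));
  }
}
```

Once the OR node is gone, only the plain comparison (x = 1) remains, which is the shape the LongColEqualLongScalar expression in the plan above already handles natively.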



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21780) SetHashGroupByMinReduction should check parent operator number of rows to compute reduction

2019-05-22 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21780:
--

 Summary: SetHashGroupByMinReduction should check parent operator 
number of rows to compute reduction
 Key: HIVE-21780
 URL: https://issues.apache.org/jira/browse/HIVE-21780
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, SetHashGroupByMinReduction incorrectly uses the number of rows of the 
Group By operator itself instead of the number of rows of its parent operator.
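
A minimal sketch of why the choice of row count matters, assuming the expected reduction is estimated as 1 - (estimated groups / input rows). The formula, the class name MinReductionSketch, and the numbers are illustrative assumptions; the actual computation in SetHashGroupByMinReduction may differ.

```java
// Illustrative only: not the Hive implementation.
final class MinReductionSketch {

  // Fraction of input rows that hash aggregation is expected to eliminate,
  // under the assumed formula 1 - groups / inputRows.
  static double expectedReduction(long estimatedGroups, long inputRows) {
    if (inputRows <= 0) {
      return 0.0;  // no usable statistics, assume no reduction
    }
    long groups = Math.min(estimatedGroups, inputRows);
    return 1.0 - ((double) groups / inputRows);
  }

  public static void main(String[] args) {
    long parentRows = 1_000_000L;   // made-up: rows flowing into the Group By
    long groupByRows = 1_000L;      // made-up: rows estimated for the Group By itself
    long estimatedGroups = 1_000L;  // made-up: distinct grouping keys

    // Using the parent's row count reflects the real input size.
    System.out.println("parent rows:   " + expectedReduction(estimatedGroups, parentRows));
    // Using the Group By's own row count compares the output with itself.
    System.out.println("group-by rows: " + expectedReduction(estimatedGroups, groupByRows));
  }
}
```

With these made-up numbers, the estimate based on the parent's rows is close to 1, while the one based on the Group By's own rows collapses to 0, so the two choices can lead to very different decisions.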





[jira] [Created] (HIVE-21779) ACID: Delete deltas should not project the row struct

2019-05-22 Thread Gopal V (JIRA)
Gopal V created HIVE-21779:
--

 Summary: ACID: Delete deltas should not project the row struct
 Key: HIVE-21779
 URL: https://issues.apache.org/jira/browse/HIVE-21779
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1, 4.0.0
Reporter: Gopal V


{code}
if (useDecimal64ColumnVector) {
  this.batch = deleteDeltaReader.getSchema().createRowBatchV2();
} else {
  this.batch = deleteDeltaReader.getSchema().createRowBatch();
}
{code}

This creates a full-width row batch, even though all row columns have to be 
NULL in a delete delta.

createRowBatch() should follow the includes and avoid over-allocating column 
vectors in the row batch.
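
The idea can be sketched with a toy batch that allocates storage only for included columns. Everything below (IncludeAwareBatchSketch, createBatch, allocatedColumns, the flattened column list with hypothetical user columns row.a and row.b) is an assumption for illustration; it does not use ORC's TypeDescription or Hive's VectorizedRowBatch.

```java
// Toy model of a row batch; not the ORC or Hive classes.
final class IncludeAwareBatchSketch {

  // Allocates a backing array only for columns whose include flag is set.
  // Excluded columns stay null, so they cost no memory.
  static long[][] createBatch(boolean[] include, int batchSize) {
    long[][] columns = new long[include.length][];
    for (int i = 0; i < include.length; i++) {
      if (include[i]) {
        columns[i] = new long[batchSize];
      }
    }
    return columns;
  }

  // Counts how many columns actually received storage.
  static int allocatedColumns(long[][] batch) {
    int count = 0;
    for (long[] column : batch) {
      if (column != null) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // ACID metadata fields followed by two hypothetical user columns of the row struct.
    String[] names = {
      "operation", "originalTransaction", "bucket", "rowId", "currentTransaction",
      "row.a", "row.b"
    };
    // A delete delta only needs the metadata fields; the row struct is always NULL.
    boolean[] include = {true, true, true, true, true, false, false};

    long[][] batch = createBatch(include, 1024);
    System.out.println(allocatedColumns(batch) + " of " + names.length + " columns allocated");
  }
}
```

For a wide table, skipping the row struct in this way avoids allocating one vector per user column for data that is never read.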





[jira] [Created] (HIVE-21778) CBO: "Struct is not null" gets evaluated as `nullable` always causing pushdown miss in the query

2019-05-22 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-21778:
---

 Summary: CBO: "Struct is not null" gets evaluated as `nullable` 
always causing pushdown miss in the query
 Key: HIVE-21778
 URL: https://issues.apache.org/jira/browse/HIVE-21778
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 2.3.5
Reporter: Rajesh Balamohan



{noformat}
drop table if exists test_struct;
CREATE external TABLE test_struct
(
  f1 string,
  demo_struct struct,
  datestr string
);

set hive.cbo.enable=true;
explain select * from etltmp.test_struct where datestr='2019-01-01' and demo_struct is not null;



STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test_struct
          filterExpr: (datestr = '2019-01-01') (type: boolean) <- Note that the demo_struct filter is not added here
          Filter Operator
            predicate: (datestr = '2019-01-01') (type: boolean)
            Select Operator
              expressions: f1 (type: string), demo_struct (type: struct), '2019-01-01' (type: string)
              outputColumnNames: _col0, _col1, _col2
              ListSink


set hive.cbo.enable=false;
explain select * from etltmp.test_struct where datestr='2019-01-01' and demo_struct is not null;


STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: test_struct
          filterExpr: ((datestr = '2019-01-01') and demo_struct is not null) (type: boolean) <- Note that the demo_struct filter is added when CBO is turned off
          Filter Operator
            predicate: ((datestr = '2019-01-01') and demo_struct is not null) (type: boolean)
            Select Operator
              expressions: f1 (type: string), demo_struct (type: struct), '2019-01-01' (type: string)
              outputColumnNames: _col0, _col1, _col2
              ListSink

{noformat}

In CalcitePlanner::genFilterRelNode, the following code fails to evaluate this 
filter correctly:
{noformat}
RexNode factoredFilterExpr = RexUtil
  .pullFactors(cluster.getRexBuilder(), convertedFilterExpr);
{noformat}

Note that if we add `demo_struct.f1` to the predicate, the filter is pushed 
down correctly. I suspect {code}RexCall::isAlwaysTrue{code} evaluates to true 
in this case.





[jira] [Created] (HIVE-21777) Maven jar goal is producing warning due to missing dependency

2019-05-22 Thread Aron Hamvas (JIRA)
Aron Hamvas created HIVE-21777:
--

 Summary: Maven jar goal is producing warning due to missing 
dependency
 Key: HIVE-21777
 URL: https://issues.apache.org/jira/browse/HIVE-21777
 Project: Hive
  Issue Type: Improvement
Affects Versions: 3.1.1, 4.0.0
Reporter: Aron Hamvas
Assignee: Aron Hamvas


org.apache.directory.client.ldap:ldap-client-directory is a test-scope 
dependency. Hive uses version 0.1, but 0.1-SNAPSHOT is also pulled in as a 
transitive dependency (omitted because it conflicts with 0.1, which is already 
declared at the top level), causing a warning during the Maven default 
lifecycle execution:

[WARNING] The POM for org.apache.directory.client.ldap:ldap-client-api:jar:0.1-SNAPSHOT is missing, no dependency information available

The warning appears in the jar goal logs and it can easily be removed by 
excluding this transitive dependency. 





[jira] [Created] (HIVE-21776) Add test for incremental replication of a UDF with jar on HDFS

2019-05-22 Thread Ashutosh Bapat (JIRA)
Ashutosh Bapat created HIVE-21776:
-

 Summary: Add test for incremental replication of a UDF with jar on 
HDFS
 Key: HIVE-21776
 URL: https://issues.apache.org/jira/browse/HIVE-21776
 Project: Hive
  Issue Type: Test
Affects Versions: 4.0.0
Reporter: Ashutosh Bapat
Assignee: Ashutosh Bapat
 Fix For: 4.0.0


TestReplicationScenariosAcrossInstances has a test for bootstrap replication of 
a UDF with its jar on HDFS, but no test for incremental replication. Add one.


