[jira] [Created] (HIVE-26605) Remove reviewer pattern

2022-10-07 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-26605:
---

 Summary: Remove reviewer pattern
 Key: HIVE-26605
 URL: https://issues.apache.org/jira/browse/HIVE-26605
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HIVE-26138) Fix mapjoin_memcheck

2022-04-12 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-26138:
---

 Summary: Fix mapjoin_memcheck
 Key: HIVE-26138
 URL: https://issues.apache.org/jira/browse/HIVE-26138
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


this test fails very frequently

http://ci.hive.apache.org/job/hive-precommit/job/master/1169/testReport/junit/org.apache.hadoop.hive.cli.split7/TestCliDriver/Testing___split_01___PostProcess___testCliDriver_mapjoin_memcheck_/





[jira] [Created] (HIVE-26135) Invalid Anti join conversion may cause missing results

2022-04-12 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-26135:
---

 Summary: Invalid Anti join conversion may cause missing results
 Key: HIVE-26135
 URL: https://issues.apache.org/jira/browse/HIVE-26135
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


right now I think the following is needed to trigger the issue:
* left outer join
* only left hand side columns are selected
* a filter condition which uses some udf
* the result of the udf is checked for null

repro sql; in case the conversion happens, the row with 'a' will be missing
{code}
drop table if exists t;
drop table if exists n;

create table t(a string) stored as orc;
create table n(a string) stored as orc;

insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);


explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;


select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;

{code}
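
The difference between the two plans can be sketched outside of Hive. This is a minimal model, not Hive code: the class and method names are made up for illustration, tables are plain lists of strings, and `castToFloat` only approximates `cast(x as float)` by returning null for unparsable input.

```java
import java.util.ArrayList;
import java.util.List;

final class AntiJoinSketch {
    // Approximates cast(x as float): NULL or an unparsable string yields NULL.
    static Float castToFloat(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Float.valueOf(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // n LEFT OUTER JOIN t ON n.a = t.a WHERE cast(t.a as float) IS NULL
    static List<String> outerJoinThenFilter(List<String> n, List<String> t) {
        List<String> out = new ArrayList<>();
        for (String left : n) {
            boolean matched = false;
            for (String right : t) {
                // SQL equality: NULL never equals anything, including NULL.
                if (left != null && left.equals(right)) {
                    matched = true;
                    if (castToFloat(right) == null) {
                        out.add(left);
                    }
                }
            }
            // Unmatched rows are padded with NULL, so the filter accepts them.
            if (!matched) {
                out.add(left);
            }
        }
        return out;
    }

    // n ANTI JOIN t ON n.a = t.a: keeps only the rows of n without a match.
    static List<String> antiJoin(List<String> n, List<String> t) {
        List<String> out = new ArrayList<>();
        for (String left : n) {
            boolean matched = false;
            for (String right : t) {
                if (left != null && left.equals(right)) {
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.add(left);
            }
        }
        return out;
    }
}
```

With the data of the repro, `outerJoinThenFilter` keeps 'a' (it matches, but the cast of 'a' is null), while `antiJoin` drops it. So the two are only interchangeable when the checked expression can never be null for a matched row.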



workaround could be to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}






[jira] [Created] (HIVE-25994) Analyze table runs into ClassNotFoundException-s in case binary distribution is used

2022-03-01 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25994:
---

 Summary: Analyze table runs into ClassNotFoundException-s in case 
binary distribution is used
 Key: HIVE-25994
 URL: https://issues.apache.org/jira/browse/HIVE-25994
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


any nightly release can be used to reproduce this:

{code}
create table t (a integer); insert into t values (1) ; analyze table t compute statistics for columns;
{code}

results in
{code}
Caused by: java.lang.NoClassDefFoundError: org/antlr/runtime/tree/CommonTree
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
at java.lang.Class.getConstructor0(Class.java:3075)
at java.lang.Class.getDeclaredConstructor(Class.java:2178)
at org.apache.hive.com.esotericsoftware.reflectasm.ConstructorAccess.get(ConstructorAccess.java:65)
at org.apache.hive.com.esotericsoftware.kryo.util.DefaultInstantiatorStrategy.newInstantiatorOf(DefaultInstantiatorStrategy.java:60)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.newInstantiator(Kryo.java:1119)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1128)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.create(FieldSerializer.java:153)
at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:118)
at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:729)
at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
at org.apache.hive.com.esotericsoftware.kryo.serializers.ReflectField.read(ReflectField.java:125)
... 38 more
Caused by: java.lang.ClassNotFoundException: org.antlr.runtime.tree.CommonTree
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
{code}





[jira] [Created] (HIVE-25977) Enhance Compaction Cleaner to skip when there is nothing to do #2

2022-02-23 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25977:
---

 Summary: Enhance Compaction Cleaner to skip when there is nothing 
to do #2
 Key: HIVE-25977
 URL: https://issues.apache.org/jira/browse/HIVE-25977
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


initially this was just an addendum to the original patch; but it got delayed and 
altered - so it should have its own ticket





[jira] [Created] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader

2022-02-23 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25976:
---

 Summary: Cleaner may remove files being accessed from a 
fetch-task-converted reader
 Key: HIVE-25976
 URL: https://issues.apache.org/jira/browse/HIVE-25976
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


in a nutshell the following happens:
* query is compiled in fetch-task-converted mode
* no real execution happens, but the locks are released
* the HS2 is communicating with the client and uses the fetch-task to get the 
rows - which in this case will directly read files from the table's 
directory
* client sleeps between reads - so there is ample time for other events...
* cleaner wakes up and removes some files
* in the next read the fetch-task encounters a read error...





[jira] [Created] (HIVE-25944) Format pom.xml-s

2022-02-09 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25944:
---

 Summary: Format pom.xml-s
 Key: HIVE-25944
 URL: https://issues.apache.org/jira/browse/HIVE-25944
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


at the moment I touch pom.xml-s with xmlstarlet it starts fixing indentation 
which makes seeing real diffs harder.

fix and enforce that the pom.xmls are indented correctly





[jira] [Created] (HIVE-25883) Enhance Compaction Cleaner to skip when there is nothing to do

2022-01-20 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25883:
---

 Summary: Enhance Compaction Cleaner to skip when there is nothing 
to do
 Key: HIVE-25883
 URL: https://issues.apache.org/jira/browse/HIVE-25883
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


the cleaner works the following way:
* it identifies obsolete directories (delta dirs which don't have open txns)
* removes them and it's done

if there are no obsolete directories, that is attributed to open txns which 
might still be there, so the request should be retried later.

however if for some reason the directory was already cleaned, it similarly 
has no obsolete directories, and thus the request is retried forever
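
The distinction described above can be sketched as a small decision function. This is only an illustration under assumptions: `CleanerDecisionSketch`, its `Outcome` values and the two inputs are hypothetical names, not the Cleaner's actual API, and how the Cleaner would learn whether open txns are blocking is left open.

```java
final class CleanerDecisionSketch {
    enum Outcome { CLEANED, RETRY_LATER, NOTHING_TO_DO }

    // obsoleteDirs: number of obsolete directories found for the request (hypothetical input)
    // blockedByOpenTxns: true if open txns may still need some directory (hypothetical input)
    static Outcome decide(int obsoleteDirs, boolean blockedByOpenTxns) {
        if (obsoleteDirs > 0) {
            // The normal path: there is something to remove.
            return Outcome.CLEANED;
        }
        // Nothing obsolete: retry only if open txns could be the reason.
        return blockedByOpenTxns ? Outcome.RETRY_LATER : Outcome.NOTHING_TO_DO;
    }
}
```

The point of the third outcome is that "no obsolete directories" alone does not tell the two situations apart, so a request for an already cleaned directory needs its own exit.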






[jira] [Created] (HIVE-25874) Slow filter evaluation of nest struct fields in vectorized executions

2022-01-18 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25874:
---

 Summary: Slow filter evaluation of nest struct fields in 
vectorized executions
 Key: HIVE-25874
 URL: https://issues.apache.org/jira/browse/HIVE-25874
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


{code:java}

create table t as
select
named_struct('id',13,'str','string','nest',named_struct('id',12,'str','string','arr',array('value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value')))
s;

-- go up to 1M rows
insert into table t select * from t union all select * from t union all select 
* from t union all select * from t union all select * from t union all select * 
from t union all select * from t union all select * from t union all select * 
from t;
insert into table t select * from t union all select * from t union all select 
* from t union all select * from t union all select * from t union all select * 
from t union all select * from t union all select * from t union all select * 
from t;
insert into table t select * from t union all select * from t union all select 
* from t union all select * from t union all select * from t union all select * 
from t union all select * from t union all select * from t union all select * 
from t;
insert into table t select * from t union all select * from t union all select 
* from t union all select * from t union all select * from t union all select * 
from t union all select * from t union all select * from t union all select * 
from t;
insert into table t select * from t union all select * from t union all select 
* from t union all select * from t union all select * from t union all select * 
from t union all select * from t union all select * from t union all select * 
from t;
-- insert into table t select * from t union all select * from t union all 
select * from t union all select * from t union all select * from t union all 
select * from t union all select * from t union all select * from t union all 
select * from t;


set hive.fetch.task.conversion=none;

select count(1) from t;
--explain
select s
.id from t
where 
s
.nest
.id  > 0;

 {code}


interestingly; the issue is not present:
* for a query not looking into the nested struct
* and in case the struct with the array is at the top level

{code}
select count(1) from t;
--explain
select s
.id from t
where 
s
-- .nest
.id  > 0;
{code}





[jira] [Created] (HIVE-25844) Exception deserialization error-s may cause beeline to terminate immediately

2022-01-04 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25844:
---

 Summary: Exception deserialization error-s may cause beeline to 
terminate immediately
 Key: HIVE-25844
 URL: https://issues.apache.org/jira/browse/HIVE-25844
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.1.2
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


the exception happens on the server side:
 * fetch task conversion is on
 * there is an exception while reading the table, and the error bubbles up
 * => the server transmits a message to beeline that the error class name is: 
"org.apache.phoenix.schema.ColumnNotFoundException" + the message
 * beeline tries to reconstruct the exception around HiveSQLException
 * but during the constructor call 
org.apache.phoenix.exception.SQLExceptionCode is needed, which fails to load 
org/apache/hadoop/hbase/shaded/com/google/protobuf/Service
 * a java.lang.NoClassDefFoundError: 
org/apache/hadoop/hbase/shaded/com/google/protobuf/Service is thrown - which is 
not handled in that method - so it becomes a real error and shuts down the 
client

{code:java}
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/shaded/com/google/protobuf/Service
[...]
at java.lang.Class.forName(Class.java:264)
at org.apache.hive.service.cli.HiveSQLException.newInstance(HiveSQLException.java:245)
at org.apache.hive.service.cli.HiveSQLException.toStackTrace(HiveSQLException.java:211)
[...]
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.shaded.com.google.protobuf.Service
[...]
{code}
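
A defensive way of rebuilding an exception from a class name could look like the sketch below. It is not the HiveSQLException code: `SafeExceptionRebuild` and `rebuild` are hypothetical names, and the fallback to a plain RuntimeException is an assumption about what would be acceptable for the client. What it shows is that catching `LinkageError` next to the reflective exceptions keeps a `NoClassDefFoundError` from escaping.

```java
final class SafeExceptionRebuild {
    // Tries to rebuild an exception of the named class through its (String) constructor.
    // Any loading or linking problem falls back to a RuntimeException carrying the name.
    static Throwable rebuild(String className, String message) {
        try {
            Class<?> clazz = Class.forName(className);
            if (!Throwable.class.isAssignableFrom(clazz)) {
                return new RuntimeException(className + ": " + message);
            }
            return (Throwable) clazz.getConstructor(String.class).newInstance(message);
        } catch (ReflectiveOperationException | LinkageError e) {
            // ClassNotFoundException is a ReflectiveOperationException;
            // NoClassDefFoundError and ExceptionInInitializerError are LinkageErrors.
            return new RuntimeException(className + ": " + message, e);
        }
    }
}
```

For example `rebuild("java.io.IOException", "boom")` gives back an IOException, while a class name which can't be loaded on the client gives a RuntimeException instead of terminating the caller.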





[jira] [Created] (HIVE-25823) Incorrect false positive results for outer join using non-satisfiable residual filters

2021-12-20 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25823:
---

 Summary: Incorrect false positive results for outer join using 
non-satisfiable residual filters
 Key: HIVE-25823
 URL: https://issues.apache.org/jira/browse/HIVE-25823
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


similar to HIVE-25822 
{code}
create table t_y (id integer,s string);
create table t_xy (id integer,s string);

insert into t_y values(0,'a'),(1,'y'),(1,'x');
insert into t_xy values(1,'x'),(1,'y');
select * from t_xy l full outer join t_y r on (l.id=r.id and l.s='y' and 
l.id+2*r.id=1);
{code}

the rows full of NULLs are incorrect
{code}
+-------+-------+-------+-------+
| l.id  |  l.s  | r.id  |  r.s  |
+-------+-------+-------+-------+
| NULL  | NULL  | 0     | a     |
| NULL  | NULL  | NULL  | NULL  |
| 1     | y     | NULL  | NULL  |
| NULL  | NULL  | NULL  | NULL  |
| NULL  | NULL  | 1     | y     |
| NULL  | NULL  | 1     | x     |
+-------+-------+-------+-------+
{code}





[jira] [Created] (HIVE-25822) Unexpected result rows in case of outer join contains conditions only affecting one side

2021-12-18 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25822:
---

 Summary: Unexpected result rows in case of outer join contains 
conditions only affecting one side
 Key: HIVE-25822
 URL: https://issues.apache.org/jira/browse/HIVE-25822
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


needed
* outer join
* the on condition has at least one condition which affects only one side of the join
* in a single reducer:
** a right hand side only row is output right before
** >=2 rows on LHS and 1 on RHS match in the join keys, but the first LHS 
row doesn't satisfy the filter condition
** the second LHS row satisfies the filter condition

{code}
with
t_y as (select col1 as id,col2 as s from (VALUES(0,'a'),(1,'y')) as c),
t_xy as (select col1 as id,col2 as s from (VALUES(1,'x'),(1,'y')) as c) 
select * from t_xy l full outer join t_y r on (l.id=r.id and l.s='y');
{code}

null,null,1,y is an unexpected result
{code}
+-------+-------+-------+-------+
| l.id  |  l.s  | r.id  |  r.s  |
+-------+-------+-------+-------+
| NULL  | NULL  | 0     | a     |
| 1     | x     | NULL  | NULL  |
| NULL  | NULL  | 1     | y     |
| 1     | y     | 1     | y     |
+-------+-------+-------+-------+
{code}
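
For comparison, the expected rows can be computed with a naive nested-loop full outer join. This is a reference sketch written for this query only (the on condition is hard-coded, the class and its `Row` type are made up for illustration); it has nothing to do with how Hive's reducer side join is implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class FullOuterJoinReference {
    static final class Row {
        final Integer id;
        final String s;

        Row(Integer id, String s) {
            this.id = id;
            this.s = s;
        }
    }

    // l FULL OUTER JOIN r ON (l.id = r.id AND l.s = 'y'), rows rendered as "l.id,l.s,r.id,r.s".
    static List<String> join(List<Row> left, List<Row> right) {
        List<String> out = new ArrayList<>();
        boolean[] rightMatched = new boolean[right.size()];
        for (Row l : left) {
            boolean leftMatched = false;
            for (int i = 0; i < right.size(); i++) {
                Row r = right.get(i);
                if (l.id != null && l.id.equals(r.id) && "y".equals(l.s)) {
                    out.add(l.id + "," + l.s + "," + r.id + "," + r.s);
                    leftMatched = true;
                    rightMatched[i] = true;
                }
            }
            // A left row without any match is emitted once, padded with NULLs.
            if (!leftMatched) {
                out.add(l.id + "," + l.s + ",NULL,NULL");
            }
        }
        // A right row is padded with NULLs only if no left row matched it.
        for (int i = 0; i < right.size(); i++) {
            if (!rightMatched[i]) {
                Row r = right.get(i);
                out.add("NULL,NULL," + r.id + "," + r.s);
            }
        }
        return out;
    }
}
```

For t_xy = (1,x),(1,y) and t_y = (0,a),(1,y) this gives three rows: `1,x,NULL,NULL`, `1,y,1,y` and `NULL,NULL,0,a`. The RHS row (1,y) has a match, so it must not show up again padded with NULLs.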





[jira] [Created] (HIVE-25820) Provide a way to disable join filters

2021-12-17 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25820:
---

 Summary: Provide a way to disable join filters
 Key: HIVE-25820
 URL: https://issues.apache.org/jira/browse/HIVE-25820
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich








[jira] [Created] (HIVE-25792) Multi Insert query fails on CBO path

2021-12-09 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25792:
---

 Summary: Multi Insert query fails on CBO path 
 Key: HIVE-25792
 URL: https://issues.apache.org/jira/browse/HIVE-25792
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


{code}
set hive.cbo.enable=true;

drop table if exists aa1;
drop table if exists bb1;
drop table if exists cc1;
drop table if exists dd1;
drop table if exists ee1;
drop table if exists ff1;

create table aa1 ( stf_id string);
create table bb1 ( stf_id string);
create table cc1 ( stf_id string);
create table ff1 ( x string);

explain
from ff1 as a join cc1 as b 
insert overwrite table aa1 select   stf_id GROUP BY b.stf_id
insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id
;

{code}





[jira] [Created] (HIVE-25791) Improve SFS exception messages

2021-12-09 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25791:
---

 Summary: Improve SFS exception messages
 Key: HIVE-25791
 URL: https://issues.apache.org/jira/browse/HIVE-25791
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Especially for cases when the path is already known to be invalid; like: 
`sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#`





[jira] [Created] (HIVE-25780) DistinctExpansion creates more than 64 grouping sets II

2021-12-06 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25780:
---

 Summary: DistinctExpansion creates more than 64 grouping sets II
 Key: HIVE-25780
 URL: https://issues.apache.org/jira/browse/HIVE-25780
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


HIVE-25498 has fixed this for the case when there are only count(distinct x) queries.

however, after the rewrite happens, grouping sets are used to handle the group by 
columns as well





[jira] [Created] (HIVE-25770) AST is corrupted after CBO fallback for CTAS queries

2021-12-03 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25770:
---

 Summary: AST is corrupted after CBO fallback for CTAS queries
 Key: HIVE-25770
 URL: https://issues.apache.org/jira/browse/HIVE-25770
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
 Attachments: repro.q

reproduce:
* revert ec44c6081c88b81245185fa6a552d8c3631e47fa to force cbo fallbacks for 
>64 grouping sets
* use repro.q test

* the query would run with cbo turned off
* but with cbo enabled it would fail in conservative mode as well





[jira] [Created] (HIVE-25752) Fix incremental compilation of parser module

2021-11-30 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25752:
---

 Summary: Fix incremental compilation of parser module
 Key: HIVE-25752
 URL: https://issues.apache.org/jira/browse/HIVE-25752
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


this issue doesn't happen all the time - but when it does it's really annoying

the problem is that the antlr files are not regenerated; however the 
"HiveParser.java Fix" is run regardless...which corrupts the java files after a 
second run and causes compilation errors
{code}
[INFO] --- antlr3-maven-plugin:3.5.2:antlr (default) @ hive-parser ---
[INFO] ANTLR: Processing source directory /home/dev/hive/parser/src/java
ANTLR Parser Generator  Version 3.5.2
Grammar 
/home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g is 
up to date - build skipped
Grammar 
/home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g is 
up to date - build skipped
Grammar 
/home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexerStandard.g
 is up to date - build skipped
Grammar 
/home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HintParser.g is 
up to date - build skipped
[INFO] 
[INFO] --- exec-maven-plugin:3.0.0:exec (HiveParser.java fix) @ hive-parser ---
[INFO] 
{code}

errors like:
{code}
[ERROR] 
/home/dev/hive/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParser.java:[50,16]
 class, interface, or enum expected
{code}

but I've also seen
{code}
[ERROR] 
/home/dev/hive/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParser.java:[49,32]
 cannot find symbol
[ERROR]   symbol:   class statement_return
[ERROR]   location: class org.apache.hadoop.hive.ql.parse.HiveParser
[ERROR] 
/home/dev/hive/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParserTokens.java:[13,19]
 cannot find symbol
{code}






[jira] [Created] (HIVE-25748) Investigate Union comparision

2021-11-29 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25748:
---

 Summary: Investigate Union comparision
 Key: HIVE-25748
 URL: https://issues.apache.org/jira/browse/HIVE-25748
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


both of the following cases change the "non-used" part of the union (note: 
`create_union(idx,o0,o1)` creates a union which uses the `idx`-th object)

{code}
SELECT (NULLIF(create_union(0,1,2),create_union(0,1,3)) is not null);
false
SELECT (NULLIF(create_union(0,1,2),create_union(1,2,1)) is not null);
true
{code}
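
The two results above are what one would get if a union is compared by its tag and its active object only. A minimal sketch of that comparison follows; `TaggedUnionSketch` is a made-up model for illustration and not Hive's union object or its ObjectInspector based comparison.

```java
import java.util.Objects;

final class TaggedUnionSketch {
    final int tag;
    final Object[] alternatives;

    // Mirrors create_union(idx, o0, o1, ...): the idx-th object is the active one.
    TaggedUnionSketch(int tag, Object... alternatives) {
        this.tag = tag;
        this.alternatives = alternatives;
    }

    Object active() {
        return alternatives[tag];
    }

    // Equal when the tag and the active object are equal; inactive objects are ignored.
    static boolean sameValue(TaggedUnionSketch a, TaggedUnionSketch b) {
        return a.tag == b.tag && Objects.equals(a.active(), b.active());
    }

    // NULLIF(a, b): null when a equals b, otherwise a.
    static TaggedUnionSketch nullIf(TaggedUnionSketch a, TaggedUnionSketch b) {
        return sameValue(a, b) ? null : a;
    }
}
```

Under this model the first query compares (tag 0, value 1) with (tag 0, value 1) and yields null, and the second compares (tag 0, value 1) with (tag 1, value 1) and yields a non-null value. Whether the tag should take part in the comparison is the open question of this ticket.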






[jira] [Created] (HIVE-25738) NullIf doesn't support complex types

2021-11-24 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25738:
---

 Summary: NullIf doesn't support complex types
 Key: HIVE-25738
 URL: https://issues.apache.org/jira/browse/HIVE-25738
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


{code}
SELECT NULLIF(array(1,2,3),array(1,2,3))
{code}

results in:
{code}
 java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector
at org.apache.hadoop.hive.ql.udf.generic.GenericUDFNullif.evaluate(GenericUDFNullif.java:96)
at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:177)
at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getReturnType(HiveFunctionHelper.java:135)
at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:647)
[...]
{code}
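
The expected behaviour for complex values can be stated in a few lines. The sketch below models an array as a `java.util.List` and relies on its element-wise `equals`; it is an assumption about the desired semantics and does not touch GenericUDFNullif or the ObjectInspector API.

```java
import java.util.Objects;

final class NullIfSketch {
    // NULLIF(a, b): null when a equals b, otherwise a.
    // For lists, Objects.equals delegates to List.equals, which compares element by element.
    static <T> T nullIf(T a, T b) {
        return Objects.equals(a, b) ? null : a;
    }
}
```

So `nullIf(Arrays.asList(1, 2, 3), Arrays.asList(1, 2, 3))` is null, and changing any element of the second list gives back the first list.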





[jira] [Created] (HIVE-25735) Improve statestimator in UDFWhen/UDFCase

2021-11-24 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25735:
---

 Summary: Improve statestimator in UDFWhen/UDFCase
 Key: HIVE-25735
 URL: https://issues.apache.org/jira/browse/HIVE-25735
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich








[jira] [Created] (HIVE-25732) Improve HLL insert performance

2021-11-23 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25732:
---

 Summary: Improve HLL insert performance
 Key: HIVE-25732
 URL: https://issues.apache.org/jira/browse/HIVE-25732
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


HIVE-23095 has fixed a correctness issue and removed a temporary list which 
was supposed to speed up the algorithm, and thus the algorithm suffered some 
performance degradation.

There are ways to put back some of that stuff; or consider other options to 
gain back the lost performance - now that the bug is fixed it should be a 
performance only improvement ticket.

It would be interesting to know how much time we spend on updating this DS 
during a large insert to know the weight of such an improvement.





[jira] [Created] (HIVE-25725) Upgrade used docker-in-docker container version

2021-11-19 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25725:
---

 Summary: Upgrade used docker-in-docker container version
 Key: HIVE-25725
 URL: https://issues.apache.org/jira/browse/HIVE-25725
 Project: Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


in HIVE-25714 I came to the conclusion that there might be something wrong with 
dind - upgrading it would be the first step; while doing so, the storage 
driver should also be checked to see whether it's appropriate





[jira] [Created] (HIVE-25720) Fix flaky test TestScheduledReplicationScenarios

2021-11-17 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25720:
---

 Summary: Fix flaky test TestScheduledReplicationScenarios
 Key: HIVE-25720
 URL: https://issues.apache.org/jira/browse/HIVE-25720
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


failed at the first attempt; the issue happened during
{code}
drop scheduled query repl_load_p2
{code}
which is in a finally block ; so this exception may be shadowing another 
exception
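
How a failure in a finally block hides the original one, and one way to keep both, can be sketched as follows. The class and the two failing helper methods are made up for illustration; the test itself may be structured differently.

```java
final class FinallyShadowSketch {
    static void body() {
        throw new IllegalStateException("test body failed");
    }

    static void cleanup() {
        throw new IllegalArgumentException("cleanup failed");
    }

    // The exception thrown from finally replaces the one thrown by body().
    static void shadowing() {
        try {
            body();
        } finally {
            cleanup();
        }
    }

    // Keeps the original exception and attaches the cleanup failure as suppressed.
    static void preserving() {
        RuntimeException primary = null;
        try {
            body();
        } catch (RuntimeException e) {
            primary = e;
            throw e;
        } finally {
            try {
                cleanup();
            } catch (RuntimeException cleanupFailure) {
                if (primary == null) {
                    throw cleanupFailure;
                }
                primary.addSuppressed(cleanupFailure);
            }
        }
    }
}
```

`shadowing()` only reports "cleanup failed", while `preserving()` reports "test body failed" with the cleanup failure listed under it as suppressed, which would make the first failure visible in the report.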

http://ci.hive.apache.org/job/hive-flaky-check/463/







[jira] [Created] (HIVE-25719) Fix flaky test TestMiniLlapLocalCliDriver#testCliDriver[replication_metrics_ingest]

2021-11-17 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25719:
---

 Summary: Fix flaky test 
TestMiniLlapLocalCliDriver#testCliDriver[replication_metrics_ingest]
 Key: HIVE-25719
 URL: https://issues.apache.org/jira/browse/HIVE-25719
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


flaky checker failed after 3 attempts with a q.out difference

there seems to be some ID difference - maybe 2 events happened in a different 
order?

http://ci.hive.apache.org/job/hive-flaky-check/465/testReport/junit/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_replication_metrics_ingest_/





[jira] [Created] (HIVE-25715) Provide nightly builds

2021-11-17 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25715:
---

 Summary: Provide nightly builds
 Key: HIVE-25715
 URL: https://issues.apache.org/jira/browse/HIVE-25715
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


provide nightly builds for the master branch



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25714) Some tests are flaky because docker is not able to start in 5 seconds

2021-11-17 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25714:
---

 Summary: Some tests are flaky because docker is not able to start 
in 5 seconds
 Key: HIVE-25714
 URL: https://issues.apache.org/jira/browse/HIVE-25714
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


some test runs are failing with the exception below; on the test site multiple pods are 
running in parallel - it's not an ideal environment for tight deadlines
{code}
Unexpected exception java.lang.RuntimeException: Process docker failed to run in 5 seconds
 at org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.runCmd(AbstractExternalDB.java:92)
 at org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.launchDockerContainer(AbstractExternalDB.java:123)
 at org.apache.hadoop.hive.ql.qoption.QTestDatabaseHandler.beforeTest(QTestDatabaseHandler.java:111)
 at org.apache.hadoop.hive.ql.qoption.QTestOptionDispatcher.beforeTest(QTestOptionDispatcher.java:79)
{code}
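
One possible direction is to make the deadline configurable instead of a fixed 5 seconds. The sketch below is not the AbstractExternalDB code: the class, the method names and the way the configured value is passed in are assumptions made for illustration; only `ProcessBuilder` and `Process#waitFor(long, TimeUnit)` are standard JDK calls.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

final class DockerTimeoutSketch {
    // Parses a configured timeout; falls back to the default for missing or invalid values.
    static long timeoutSeconds(String configured, long defaultSeconds) {
        if (configured == null || configured.trim().isEmpty()) {
            return defaultSeconds;
        }
        try {
            long value = Long.parseLong(configured.trim());
            return value > 0 ? value : defaultSeconds;
        } catch (NumberFormatException e) {
            return defaultSeconds;
        }
    }

    // Runs the command and waits at most timeoutSeconds for it to finish.
    static int runCmd(long timeoutSeconds, String... cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new RuntimeException(
                "Process " + cmd[0] + " failed to run in " + timeoutSeconds + " seconds");
        }
        return process.exitValue();
    }
}
```

On a loaded CI host the value could then be raised without touching the code, for example `runCmd(timeoutSeconds(System.getProperty("some.timeout.property"), 5), "docker", "ps")`, where the property name is a placeholder.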

http://ci.hive.apache.org/job/hive-precommit/job/PR-1674/4/testReport/junit/org.apache.hadoop.hive.cli.split19/TestMiniLlapLocalCliDriver/Testing___split_14___PostProcess___testCliDriver_qt_database_all_/





[jira] [Created] (HIVE-25713) Fix test TestLlapTaskSchedulerService#testPreemption

2021-11-17 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25713:
---

 Summary: Fix test TestLlapTaskSchedulerService#testPreemption
 Key: HIVE-25713
 URL: https://issues.apache.org/jira/browse/HIVE-25713
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


when this test passes it passes in under 100ms - but when it fails it keeps 
waiting for more than 10 seconds - the test seems to be using signal/await 
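
The root cause is not known at this point; one common pitfall with signal/await is that a signal sent before the waiter starts waiting is lost, unless the waiter checks some state in a loop. A minimal sketch of the guarded form, with made-up names and unrelated to the actual test code:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class SignalAwaitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private boolean done;

    // Records the state change before signalling, so late waiters can still see it.
    void markDone() {
        lock.lock();
        try {
            done = true;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Returns true if done was observed within the timeout, false otherwise.
    boolean awaitDone(long timeout, TimeUnit unit) {
        long nanos = unit.toNanos(timeout);
        lock.lock();
        try {
            // The loop handles both an early signal and spurious wakeups.
            while (!done) {
                if (nanos <= 0L) {
                    return false;
                }
                nanos = changed.awaitNanos(nanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```

With this shape calling `markDone()` before `awaitDone(...)` returns immediately, while a bare `await()` without the flag would wait until the timeout, which would fit the "under 100ms or more than 10 seconds" pattern.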

http://ci.hive.apache.org/job/hive-flaky-check/462/





[jira] [Created] (HIVE-25712) Fix test TestContribCliDriver#testCliDriver[url_hook]

2021-11-16 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25712:
---

 Summary: Fix test TestContribCliDriver#testCliDriver[url_hook]
 Key: HIVE-25712
 URL: https://issues.apache.org/jira/browse/HIVE-25712
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


The test makes use of SampleURLHook - which could change the JDO url
http://ci.hive.apache.org/job/hive-flaky-check/460/





[jira] [Created] (HIVE-25711) Make Table#isEmpty more efficient

2021-11-16 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25711:
---

 Summary: Make Table#isEmpty more efficient
 Key: HIVE-25711
 URL: https://issues.apache.org/jira/browse/HIVE-25711
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


[~stevel] suggested in another ticket that we could make our isEmpty method 
faster:

https://issues.apache.org/jira/browse/HIVE-24849?focusedCommentId=17372145&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17372145






[jira] [Created] (HIVE-25707) SchemaTool may leave the metastore in-between upgrade steps

2021-11-16 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25707:
---

 Summary: SchemaTool may leave the metastore in-between upgrade 
steps
 Key: HIVE-25707
 URL: https://issues.apache.org/jira/browse/HIVE-25707
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


it seems like:
* schematool runs the sql files via beeline
* autocommit is turned on
* pressing ctrl+c or killing the process will result in an invalid schema

https://github.com/apache/hive/blob/6e02f6164385a370ee8014c795bee1fa423d7937/beeline/src/java/org/apache/hive/beeline/schematool/HiveSchemaTool.java#L79





[jira] [Created] (HIVE-25703) Postgres metastore test failures

2021-11-16 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25703:
---

 Summary: Postgres metastore test failures
 Key: HIVE-25703
 URL: https://issues.apache.org/jira/browse/HIVE-25703
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


all recent builds are failing because the postgres metastore doesn't start

underlying issue is that the docker container can't start because of:
```
ls: cannot access '/docker-entrypoint-initdb.d/': Operation not permitted
```





[jira] [Created] (HIVE-25692) ExceptionHandler may mask checked exceptions

2021-11-12 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25692:
---

 Summary: ExceptionHandler may mask checked exceptions
 Key: HIVE-25692
 URL: https://issues.apache.org/jira/browse/HIVE-25692
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


HIVE-25055 has changed the way exceptions are rethrown - but one of the 
methods may let checked exceptions out without them being declared on the method 
(and avoids the compile time error for it)

testcase for:
org.apache.hadoop.hive.metastore.TestExceptionHandler

{code}
  @Test
  public void testInvalid() throws MetaException {
    try {
      throw new IOException("IOException test");
    } catch (Exception e) {
      throw handleException(e).throwIfInstance(AccessControlException.class, IOException.class).defaultMetaException();
    }
  }
{code}

this testcase should not compile - as it may throw IOException or 
AccessControlException as well
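
For illustration, one way a checked exception can slip past the compiler is a generic `throws T` clause whose type parameter is not tied to any argument: Java then infers `T` as `RuntimeException`. The sketch below is not the ExceptionHandler code itself; the class and method names are hypothetical, and it only assumes that a similar generic signature is involved.

```java
import java.io.IOException;

final class SneakyThrowSketch {
  // Hypothetical helper: T appears only in the throws clause, so at the
  // call site the compiler infers T as RuntimeException.
  @SuppressWarnings("unchecked")
  static <T extends Exception> void throwUnchecked(Exception e) throws T {
    throw (T) e;
  }

  // No "throws IOException" is declared here, yet this compiles.
  static void leak() {
    throwUnchecked(new IOException("IOException test"));
  }

  public static void main(String[] args) {
    try {
      leak();
    } catch (Exception e) {
      // The undeclared checked exception arrives here at runtime.
      System.out.println(e.getClass().getName());
    }
  }
}
```

A caller of `leak()` gets no compile time hint that an IOException may arrive, which is the kind of masking described above.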



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Created] (HIVE-25634) Eclipse compiler bumps into AIOBE during ObjectStore compilation

2021-10-21 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25634:
---

 Summary: Eclipse compiler bumps into AIOBE during ObjectStore 
compilation
 Key: HIVE-25634
 URL: https://issues.apache.org/jira/browse/HIVE-25634
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


this issue seems to have started appearing after HIVE-23633



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25633) Prevent shutdown of MetaStore scheduled worker ThreadPool

2021-10-21 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25633:
---

 Summary: Prevent shutdown of MetaStore scheduled worker ThreadPool
 Key: HIVE-25633
 URL: https://issues.apache.org/jira/browse/HIVE-25633
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


[~lpinter] has noticed that this patch has a side effect:

the patch in HIVE-23164 added a {{ThreadPool#shutdown}} call to 
{{HMSHandler#shutdown}} - which could cause trouble in case an {{HMSHandler}} is 
shut down and a new one is created

I was looking for cases in which an HMSHandler is created inside the metastore 
(beyond the one HiveMetaStore is using) - and I think tasks like Msck use it to 
access the metastore - and they close the client - which closes the HMSHandler 
client; which will shut down the threadpool
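
A minimal sketch of the failure mode, assuming the pool is shared between handler instances (the `Handler` class and the static `POOL` field are hypothetical stand-ins, not the HMSHandler code): once one handler shuts the shared pool down, a handler created later can no longer schedule work.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class SharedPoolSketch {
  // Assumption: one scheduled pool shared by every handler instance.
  static final ScheduledExecutorService POOL = Executors.newScheduledThreadPool(1);

  // Hypothetical handler that shuts the shared pool down on shutdown().
  static final class Handler {
    void schedule(Runnable task) {
      POOL.schedule(task, 1, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
      POOL.shutdown();
    }
  }

  // Returns true if a handler created after the first one was shut down
  // is rejected when it tries to schedule a task.
  static boolean secondHandlerRejected() {
    Handler first = new Handler();
    first.shutdown();
    Handler second = new Handler();
    try {
      second.schedule(() -> { });
      return false;
    } catch (RejectedExecutionException e) {
      return true;
    }
  }
}
```

Under these assumptions `secondHandlerRejected()` returns true: the second handler pays for the first handler's shutdown.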




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25630) Translator fixes

2021-10-21 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25630:
---

 Summary: Translator fixes
 Key: HIVE-25630
 URL: https://issues.apache.org/jira/browse/HIVE-25630
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


there are some issues:
* AlreadyExistsException might be suppressed by the translator
* uppercase letter usage may cause problems for some clients
* add a way to suppress location checks for legacy clients




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25569) Enable table definition over a single file

2021-09-28 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25569:
---

 Summary: Enable table definition over a single file
 Key: HIVE-25569
 URL: https://issues.apache.org/jira/browse/HIVE-25569
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Suppose there is a directory where multiple files are present - and for a 3rd 
party database system this is perfectly normal - because it's treating a single 
file as the contents of the table.

Tables defined in the metastore follow a different principle - tables are 
considered to be under a directory - and all files under that directory are the 
contents of that table.

To enable seamless migration/evaluation of Hive and other databases using HMS 
as a metadata backend the ability to define a table over a single file would be 
useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25531) Remove the core classified hive-exec artifact

2021-09-16 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25531:
---

 Summary: Remove the core classified hive-exec artifact
 Key: HIVE-25531
 URL: https://issues.apache.org/jira/browse/HIVE-25531
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


* this artifact was introduced in HIVE-7423 
* loading this artifact and the shaded hive-exec (along with the jdbc driver) 
could create interesting classpath problems
* if other projects have issues with the shaded hive-exec artifact we must 
start fixing those problems



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25508) Partitioned tables created with CTAS queries doesnt have lineage informations

2021-09-09 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25508:
---

 Summary: Partitioned tables created with CTAS queries doesnt have 
lineage informations
 Key: HIVE-25508
 URL: https://issues.apache.org/jira/browse/HIVE-25508
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25485) Transform selects of literals under a UNION ALL to inline table scan

2021-08-26 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25485:
---

 Summary: Transform selects of literals under a UNION ALL to inline 
table scan
 Key: HIVE-25485
 URL: https://issues.apache.org/jira/browse/HIVE-25485
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich



{code}
select 1
union all
select 1
union all
[...]
union all
select 1
{code}

results in a very big plan, which will have a number of vertices proportional to 
the number of union all branches - hence it could be slow to execute



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25404) Inserts inside merge statements are rewritten incorrectly for partitioned tables

2021-07-29 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25404:
---

 Summary: Inserts inside merge statements are rewritten incorrectly 
for partitioned tables
 Key: HIVE-25404
 URL: https://issues.apache.org/jira/browse/HIVE-25404
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


{code}
drop table u;drop table t;

create table t(value string default 'def') partitioned by (id integer);
create table u(id integer);
{code}

#1 id&value specified
rewritten
{code}
FROM
  `default`.`t`
  RIGHT OUTER JOIN
  `default`.`u`
  ON `t`.`id`=`u`.`id`
INSERT INTO `default`.`t` (`id`,`value`) partition (`id`)-- insert clause
  SELECT `u`.`id`,'x'
   WHERE `t`.`id` IS NULL
{code}
it should be
{code}
[...]
INSERT INTO `default`.`t` partition (`id`) (`value`)-- insert clause
[...]
{code}

#2 when value is not specified

{code}
merge into t using u on t.id=u.id when not matched then insert (id) values 
(u.id);
{code}

rewritten query:
{code}
FROM
  `default`.`t`
  RIGHT OUTER JOIN
  `default`.`u`
  ON `t`.`id`=`u`.`id`
INSERT INTO `default`.`t` (`id`) partition (`id`)-- insert clause
  SELECT `u`.`id`
   WHERE `t`.`id` IS NULL
{code}

it should be
{code}
[...]
INSERT INTO `default`.`t` partition (`id`) ()-- insert clause
[...]
{code}

however we don't accept empty column lists



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25395) Update hadoop to a more recent version

2021-07-27 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25395:
---

 Summary: Update hadoop to a more recent version
 Key: HIVE-25395
 URL: https://issues.apache.org/jira/browse/HIVE-25395
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


we are still depending on hadoop 3.1.0

which doesn't have source attachments - and makes development harder



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25378) Enable removal of old builds on hive ci

2021-07-23 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25378:
---

 Summary: Enable removal of old builds on hive ci
 Key: HIVE-25378
 URL: https://issues.apache.org/jira/browse/HIVE-25378
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


We are using the github plugin to run builds on PRs

However, to remove old builds that plugin needs to have periodic branch scanning 
enabled - and since we also use the plugin's merge mechanism, this will 
cause all open PRs to be rediscovered after there is a new commit on the target 
branch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25370) Improve SharedWorkOptimizer performance

2021-07-22 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25370:
---

 Summary: Improve SharedWorkOptimizer performance
 Key: HIVE-25370
 URL: https://issues.apache.org/jira/browse/HIVE-25370
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


for queries which are unioning ~800 constant rows the SWO is doing around n*n/2 
operations trying to find 2 TS-es which could be merged

{code}
select constants
UNION ALL
...
UNION ALL
select constants
{code}
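
To give a feel for the n*n/2 figure: if every pair of table scans is compared once, the number of comparisons is n*(n-1)/2. The sketch below just counts those pairs; it is not the SWO code, and the class and method names are made up for illustration.

```java
final class PairCountSketch {
  // Counts the unordered pairs (i, j) with i < j among n items,
  // i.e. how many candidate pairs a naive pairwise search visits.
  static long candidatePairs(int n) {
    long pairs = 0;
    for (int i = 0; i < n; i++) {
      for (int j = i + 1; j < n; j++) {
        pairs++;
      }
    }
    return pairs;
  }
}
```

For n = 800 this gives 319600 candidate pairs.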




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25313) Upgrade commons-codec to 1.15

2021-07-07 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25313:
---

 Summary: Upgrade commons-codec to 1.15
 Key: HIVE-25313
 URL: https://issues.apache.org/jira/browse/HIVE-25313
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25312) Upgrade netty to 4.1.65.Final

2021-07-07 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25312:
---

 Summary: Upgrade netty to 4.1.65.Final
 Key: HIVE-25312
 URL: https://issues.apache.org/jira/browse/HIVE-25312
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25311) Slow compilation of union operators with >100 branches

2021-07-07 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25311:
---

 Summary: Slow compilation of union operators with >100 branches
 Key: HIVE-25311
 URL: https://issues.apache.org/jira/browse/HIVE-25311
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


during the processing of an N way union operator the full plan is cloned N 
times; which might hurt compilation time performance



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25290) Stabilize TestTxnHandler

2021-06-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25290:
---

 Summary: Stabilize TestTxnHandler
 Key: HIVE-25290
 URL: https://issues.apache.org/jira/browse/HIVE-25290
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/271/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25289) Fix external_jdbc_table3 and external_jdbc_table4

2021-06-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25289:
---

 Summary: Fix external_jdbc_table3 and external_jdbc_table4
 Key: HIVE-25289
 URL: https://issues.apache.org/jira/browse/HIVE-25289
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/265/
http://ci.hive.apache.org/job/hive-flaky-check/266/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25288) Fix TestMmCompactorOnTez

2021-06-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25288:
---

 Summary: Fix TestMmCompactorOnTez
 Key: HIVE-25288
 URL: https://issues.apache.org/jira/browse/HIVE-25288
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/240/





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25285) Retire HiveProjectJoinTransposeRule

2021-06-24 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25285:
---

 Summary: Retire HiveProjectJoinTransposeRule
 Key: HIVE-25285
 URL: https://issues.apache.org/jira/browse/HIVE-25285
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


we don't necessarily need our own rule anymore - a plain 
ProjectJoinTransposeRule could probably work





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25278) HiveProjectJoinTransposeRule may do invalid transformations with windowing expressions

2021-06-23 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25278:
---

 Summary: HiveProjectJoinTransposeRule may do invalid 
transformations with windowing expressions 
 Key: HIVE-25278
 URL: https://issues.apache.org/jira/browse/HIVE-25278
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


running
{code}
create table table1 (acct_num string, interest_rate decimal(10,7)) stored as orc;
create table table2 (act_id string) stored as orc;
CREATE TABLE temp_output AS
SELECT act_nbr, row_num
FROM (SELECT t2.act_id as act_nbr,
row_number() over (PARTITION BY trim(acct_num) ORDER BY interest_rate DESC) AS row_num
FROM table1 t1
INNER JOIN table2 t2
ON trim(acct_num) = t2.act_id) t
WHERE t.row_num = 1;
{code}

may result in an error like:

{code}
Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 
Invalid column reference 'interest_rate': (possible column names are: 
interest_rate, trim) (state=42000,code=4)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25267) Fix TestReplicationScenariosAcidTables

2021-06-18 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25267:
---

 Summary: Fix TestReplicationScenariosAcidTables
 Key: HIVE-25267
 URL: https://issues.apache.org/jira/browse/HIVE-25267
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


test is unstable
http://ci.hive.apache.org/job/hive-flaky-check/242/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25266) Fix TestWarehouseExternalDir

2021-06-18 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25266:
---

 Summary: Fix TestWarehouseExternalDir
 Key: HIVE-25266
 URL: https://issues.apache.org/jira/browse/HIVE-25266
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


test is unstable 
http://ci.hive.apache.org/job/hive-flaky-check/244/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25265) Fix TestHiveIcebergStorageHandlerWithEngine

2021-06-18 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25265:
---

 Summary: Fix TestHiveIcebergStorageHandlerWithEngine
 Key: HIVE-25265
 URL: https://issues.apache.org/jira/browse/HIVE-25265
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


test is unstable:
http://ci.hive.apache.org/job/hive-flaky-check/251/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25250) Fix TestHS2ImpersonationWithRemoteMS.testImpersonation

2021-06-15 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25250:
---

 Summary: Fix TestHS2ImpersonationWithRemoteMS.testImpersonation
 Key: HIVE-25250
 URL: https://issues.apache.org/jira/browse/HIVE-25250
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/235/testReport/org.apache.hive.service/TestHS2ImpersonationWithRemoteMS/testImpersonation/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25249) Fix TestWorker

2021-06-15 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25249:
---

 Summary: Fix TestWorker
 Key: HIVE-25249
 URL: https://issues.apache.org/jira/browse/HIVE-25249
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich



http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/

http://ci.hive.apache.org/job/hive-flaky-check/236/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25248) Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1

2021-06-15 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25248:
---

 Summary: Fix 
.TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
 Key: HIVE-25248
 URL: https://issues.apache.org/jira/browse/HIVE-25248
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


This test has been failing randomly recently

http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25247) Fix TestWMMetricsWithTrigger

2021-06-15 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25247:
---

 Summary: Fix TestWMMetricsWithTrigger
 Key: HIVE-25247
 URL: https://issues.apache.org/jira/browse/HIVE-25247
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


this test seems to be unstable:

http://ci.hive.apache.org/job/hive-flaky-check/226/

it was introduced by HIVE-24803 a few months ago 

cc: [~gupta.nikhil0007]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error

2021-06-09 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25224:
---

 Summary: Multi insert statements involving tables with different 
bucketing_versions results in error
 Key: HIVE-25224
 URL: https://issues.apache.org/jira/browse/HIVE-25224
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich



{code}
drop table if exists t;
drop table if exists t2;
drop table if exists t3;
create table t (a integer);
create table t2 (a integer);
create table t3 (a integer);
alter table t set tblproperties ('bucketing_version'='1');
explain from t3 insert into t select a insert into t2 select a;
{code}

results in
{code}
Error: Error while compiling statement: FAILED: RuntimeException Error setting 
bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: FS[11], 
bucketingVersion=2]] (state=42000,code=4)
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25180) Update netty to 4.1.60.Final

2021-05-31 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25180:
---

 Summary: Update netty to 4.1.60.Final
 Key: HIVE-25180
 URL: https://issues.apache.org/jira/browse/HIVE-25180
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25171) Use ACID_HOUSEKEEPER_SERVICE_START

2021-05-27 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25171:
---

 Summary: Use ACID_HOUSEKEEPER_SERVICE_START
 Key: HIVE-25171
 URL: https://issues.apache.org/jira/browse/HIVE-25171
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


seems to be unused right now



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25138) Auto disable scheduled queries after repeated failures

2021-05-19 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25138:
---

 Summary: Auto disable scheduled queries after repeated failures
 Key: HIVE-25138
 URL: https://issues.apache.org/jira/browse/HIVE-25138
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25044) Parallel edge fixer may not be able to process semijoin edges

2021-04-21 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25044:
---

 Summary: Parallel edge fixer may not be able to process semijoin 
edges
 Key: HIVE-25044
 URL: https://issues.apache.org/jira/browse/HIVE-25044
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


SJ filter edges are removed from the main operator graph - which could cause 
a parallel edge to remain after the remover was executed



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25036) Unstable testcase script_broken_pipe2

2021-04-20 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25036:
---

 Summary: Unstable testcase script_broken_pipe2
 Key: HIVE-25036
 URL: https://issues.apache.org/jira/browse/HIVE-25036
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


http://ci.hive.apache.org/job/hive-flaky-check/224/

{code}
Client Execution succeeded but contained differences (error code = 1) after 
executing script_broken_pipe2.q 
24c24
< Caused by: java.io.IOException: Broken pipe
---
> Caused by: java.io.IOException: Stream closed
46c46
< Caused by: java.io.IOException: Broken pipe
---
> Caused by: java.io.IOException: Stream closed
49,58d48
< FAILED: AssertionError java.lang.AssertionError: Client Execution succeeded 
but contained differences (error code = 1) after executing 
script_broken_pipe2.q 
< 24c24
< < Caused by: java.io.IOException: Broken pipe
< ---
< > Caused by: java.io.IOException: Stream closed
< 46c46
< < Caused by: java.io.IOException: Broken pipe
< ---
< > Caused by: java.io.IOException: Stream closed
< 
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25029) Remove travis builds

2021-04-19 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-25029:
---

 Summary: Remove travis builds
 Key: HIVE-25029
 URL: https://issues.apache.org/jira/browse/HIVE-25029
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


travis only compiles the project - we already do much more than that during 
precommit testing.
(and it sometimes delays builds because travis can't allocate executors/etc)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24986) Support aggregates on columns present in rollups

2021-04-07 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24986:
---

 Summary: Support aggregates on columns present in rollups
 Key: HIVE-24986
 URL: https://issues.apache.org/jira/browse/HIVE-24986
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


{code}
SELECT key, value, count(key) FROM src GROUP BY key, value with rollup;
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24979) Tests should not load confs from places like /etc/hive/hive-site.xml

2021-04-06 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24979:
---

 Summary: Tests should not load confs from places like 
/etc/hive/hive-site.xml
 Key: HIVE-24979
 URL: https://issues.apache.org/jira/browse/HIVE-24979
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


for example: 
TestEmbeddedHiveMetaStore

may load a value for the metastore.metadata.transformer.class key from 
/etc/hive/hive-site.xml





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24963) Windowing expression may loose its input in some cases

2021-03-31 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24963:
---

 Summary: Windowing expression may loose its input in some cases
 Key: HIVE-24963
 URL: https://issues.apache.org/jira/browse/HIVE-24963
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


{code}
drop table if exists sss;
 CREATE TABLE `sss`(
   `user_id` bigint,
   `user_mid` string
 )
 PARTITIONED BY (
   `dt` string)
STORED AS ORC
   ;

insert into sss partition(dt='part1') VALUES (12345,'user_mid v1'),(12345,'user_mid v1'),(12345,'user_mid v1'),(12345,'user_mid v1'),(12345,'user_mid v1');


set hive.auto.convert.join.noconditionaltask.size=1;
WITH
 unioned_user AS (
 SELECT
 *,
 row_number() OVER (PARTITION BY user_mid ORDER BY dt ASC) AS r_asc,
 row_number() OVER (PARTITION BY user_mid ORDER BY dt DESC) AS r_desc
 FROM (
 SELECT DISTINCT
 dt,
 user_mid
 FROM sss
 WHERE dt = '20210228'
 UNION ALL
 SELECT DISTINCT
dt,
 user_mid
 FROM sss
 ) AS uni
 ),
 merged_user AS (
 SELECT
 a.user_mid
 FROM (SELECT * FROM unioned_user WHERE r_asc = 1) AS a
 INNER JOIN (SELECT * FROM unioned_user WHERE r_desc = 1) AS d
 ON a.user_mid = d.user_mid
 )
 Select count(*) from merged_user;
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24954) MetastoreTransformer is disabled during testing

2021-03-29 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24954:
---

 Summary: MetastoreTransformer is disabled during testing
 Key: HIVE-24954
 URL: https://issues.apache.org/jira/browse/HIVE-24954
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich



all calls are fortified with "isInTest" guards to avoid testing those calls 
(!@#$#)

https://github.com/apache/hive/blob/86fa9b30fe347c7fc78a2930f4d20ece2e124f03/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L1647

this causes some weird behaviour:
an out of the box hive installation creates TRANSLATED_TO_EXTERNAL external tables 
for plain CREATE TABLE commands
meanwhile when most of the tests are executed CREATE TABLE creates regular 
MANAGED tables...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24940) PartitionPruner may reject partitionfilter expressions evaluating to unknown if the filter is safe

2021-03-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24940:
---

 Summary: PartitionPruner may reject partitionfilter expressions 
evaluating to unknown if the filter is safe
 Key: HIVE-24940
 URL: https://issues.apache.org/jira/browse/HIVE-24940
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


{code}
CREATE TABLE t1pstring (col1 string) PARTITIONED BY (p1 string);
INSERT INTO t1pstring PARTITION (p1) VALUES 
("2020","2020"),("2021","2021"),("2021_backup","2021_backup"),('',''),('9','9'),('_a','_a'),('a','a');

explain extended SELECT count(*) FROM t1pstring WHERE p1=;
[...]
| Truncated Path -> Alias:   |
|   /t1pstring/p1=2021_backup [t1pstring] |
|   /t1pstring/p1= [t1pstring]   |
|   /t1pstring/p1=_a [t1pstring] |
|   /t1pstring/p1=a [t1pstring]  |
[...]
{code}

note:
* for all values which are interpretable as integers - the equals is evaluated 
and the result is false
* for {{NULL}} values and for non-integer values ({{a}}) the comparison 
results in a {{NULL}} which is retained
* {{NULL}} -s are retained because there is a preprocessing step which removes 
functions which are dependent on non-partition columns as well - and replaces 
them with a {{NULL}}





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24939) PartitionPruner incorrectly assumes that all builtin expressions are metastore side supported

2021-03-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24939:
---

 Summary: PartitionPruner incorrectly assumes that all builtin 
expressions are metastore side supported
 Key: HIVE-24939
 URL: https://issues.apache.org/jira/browse/HIVE-24939
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


when ObjectStore is in use, messages like:
{code}
2021-03-20 12:37:36,568 INFO  
org.apache.hadoop.hive.metastore.PartFilterExprUtil: [pool-6-thread-170]: 
Unable to make the expression tree from expression string [(UDFToDouble(p1) = 
2021.0D)]Error parsing partition filter; lexer error: null; exception 
NoViableAltException(24@[])
{code}

may occur in the HMS log; while the metastore call returns all partitions...




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location

2021-03-22 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24920:
---

 Summary: TRANSLATED_TO_EXTERNAL tables may write to the same 
location
 Key: HIVE-24920
 URL: https://issues.apache.org/jira/browse/HIVE-24920
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


{code}
create table t (a integer);
insert into t values(1);
alter table t rename to t2;
create table t (a integer); -- I expected an exception from this command (location already exists) but because it's an external table there is no exception
insert into t values(2);
select * from t;  -- shows 1 and 2
drop table t2;-- wipes out data location
select * from t;  -- empty resultset
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24841) Parallel edge fixer may run into NPE when RS is missing a duplicate column from the output schema

2021-03-02 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24841:
---

 Summary: Parallel edge fixer may run into NPE when RS is missing a 
duplicate column from the output schema
 Key: HIVE-24841
 URL: https://issues.apache.org/jira/browse/HIVE-24841
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


This may mean that the RS has an incorrect schema - but that will be 
investigated separately



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24830) Revise RowSchema mutability usage

2021-02-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24830:
---

 Summary: Revise RowSchema mutability usage
 Key: HIVE-24830
 URL: https://issues.apache.org/jira/browse/HIVE-24830
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


RowSchema is essentially a container class for a list of fields.

* it can be constructed from a "list"
* the list can be set
* the list can be accessed

none of the above methods try to protect the data inside; hence the following 
could easily happen:
{code}
s=o1.getSchema();
col=s.getCol("favourite");
col.setInternalName("asd"); // will modify o1 schema
newSchema.add(col);
o2.setSchema(newSchema);

o2.getSchema().get("asd").setInternalName("xxx"); // will modify o1 and o2 schema
[...]
{code}

not sure how much of this is actually crucial; an exploratory testrun revealed 
some cases
https://github.com/apache/hive/pull/2019
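
The aliasing problem can be reproduced with a small self-contained sketch. The `Column` and `Schema` classes below are hypothetical stand-ins, not Hive's RowSchema or ColumnInfo; the second method shows one possible mitigation (copying the column before reuse), which is only an assumption about how this could be addressed.

```java
import java.util.ArrayList;
import java.util.List;

final class SchemaAliasingSketch {
  // Hypothetical stand-in for a column descriptor with a mutable name.
  static final class Column {
    String internalName;

    Column(String internalName) {
      this.internalName = internalName;
    }

    Column copy() {
      return new Column(internalName);
    }
  }

  // Hypothetical stand-in for a schema that exposes its columns directly.
  static final class Schema {
    final List<Column> columns = new ArrayList<>();
  }

  // Both schemas hold the same Column instance: renaming it through s2
  // silently changes s1 as well.
  static String renameShared() {
    Schema s1 = new Schema();
    s1.columns.add(new Column("favourite"));
    Schema s2 = new Schema();
    s2.columns.add(s1.columns.get(0));
    s2.columns.get(0).internalName = "xxx";
    return s1.columns.get(0).internalName;
  }

  // Copying the column before reuse keeps s1 untouched.
  static String renameCopied() {
    Schema s1 = new Schema();
    s1.columns.add(new Column("favourite"));
    Schema s2 = new Schema();
    s2.columns.add(s1.columns.get(0).copy());
    s2.columns.get(0).internalName = "xxx";
    return s1.columns.get(0).internalName;
  }
}
```

In this sketch `renameShared()` reports the first schema's column as "xxx", while `renameCopied()` still reports "favourite".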





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24829) CorrelationUtilities#replaceReduceSinkWithSelectOperator misses KEY mappings

2021-02-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24829:
---

 Summary: CorrelationUtilities#replaceReduceSinkWithSelectOperator 
misses KEY mappings
 Key: HIVE-24829
 URL: https://issues.apache.org/jira/browse/HIVE-24829
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


* it totally misses KEY column mappings - in the case I was looking at the KEY 
columns were at the end... in case they are not referenced it will be okay
* the exprMap keys don't match the rowSchema



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24823) Fix ide error in BasePartitionEvaluator

2021-02-24 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24823:
---

 Summary: Fix ide error in BasePartitionEvaluator
 Key: HIVE-24823
 URL: https://issues.apache.org/jira/browse/HIVE-24823
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24821) Restrict parallel edge creation for invertable RS operators

2021-02-24 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24821:
---

 Summary: Restrict parallel edge creation for invertable RS 
operators
 Key: HIVE-24821
 URL: https://issues.apache.org/jira/browse/HIVE-24821
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Apparently there are some cases in which the RS may do some other things as 
well - restricting is the safest first option



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24812) Disable sharedworkoptimizer remove semijoin by default

2021-02-23 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24812:
---

 Summary: Disable sharedworkoptimizer remove semijoin by default
 Key: HIVE-24812
 URL: https://issues.apache.org/jira/browse/HIVE-24812
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


SJ removal backfired a bit when I was testing stuff - because of the additional 
opportunities parallel edges may enable, it will increase the shuffled 
memory amount and/or even make MJ broadcast inputs larger

set hive.optimize.shared.work.semijoin=false by default for now

right now it's better to leave dppunion to pick up these cases instead of 
removing the SJ fully - after HIVE-24376 we might re-enable it






[jira] [Created] (HIVE-24799) Rewise RS rowschema/colExprMap consistency in case of mapjoins

2021-02-19 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24799:
---

 Summary: Rewise RS rowschema/colExprMap consistency in case of 
mapjoins
 Key: HIVE-24799
 URL: https://issues.apache.org/jira/browse/HIVE-24799
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I've seen some odd things around MJ - so I added an if to skip those errors 
while I was fixing others.

IIRC this issue seemed more serious; the code did "know" that the RS had a 
bad schema and it went to the parent - I guess there was a reason to do that - 
and I guess fixing it won't be easy either.

https://github.com/apache/hive/pull/1929#discussion_r579200496





[jira] [Created] (HIVE-24771) Fix hang of TransactionalKafkaWriterTest

2021-02-10 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24771:
---

 Summary: Fix hang of TransactionalKafkaWriterTest 
 Key: HIVE-24771
 URL: https://issues.apache.org/jira/browse/HIVE-24771
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


this test seems to hang randomly - I've launched 3 checks against it - all of 
which started to hang after some time
http://ci.hive.apache.org/job/hive-flaky-check/187/
http://ci.hive.apache.org/job/hive-flaky-check/188/
http://ci.hive.apache.org/job/hive-flaky-check/189/

{code}
"main" #1 prio=5 os_prio=0 tid=0x7f1d5400a800 nid=0x31e waiting on 
condition [0x7f1d59381000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x894b3ed8> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1308)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:56)
at org.apache.hadoop.hive.kafka.HiveKafkaProducer.flushNewPartitions(HiveKafkaProducer.java:187)
at org.apache.hadoop.hive.kafka.HiveKafkaProducer.flush(HiveKafkaProducer.java:123)
at org.apache.hadoop.hive.kafka.TransactionalKafkaWriter.close(TransactionalKafkaWriter.java:189)
at org.apache.hadoop.hive.kafka.TransactionalKafkaWriterTest.writeAndCommit(TransactionalKafkaWriterTest.java:182)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:377)
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:138)
at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:465)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:451)
{code}
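
The dump shows the main thread parked in an untimed CountDownLatch.await(). As a generic illustration only (this is not the fix for this ticket and not Kafka's or Hive's code), a wait with a timeout turns such a hang into a visible failure. The names BoundedWait and awaitOrFail are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Hive or Kafka.
final class BoundedWait {
  /** Waits for the latch, but gives up after the timeout instead of parking forever. */
  static void awaitOrFail(CountDownLatch latch, long timeout, TimeUnit unit)
      throws InterruptedException, TimeoutException {
    // CountDownLatch.await(long, TimeUnit) returns false when the time runs out.
    if (!latch.await(timeout, unit)) {
      throw new TimeoutException("latch not released within " + timeout + " " + unit);
    }
  }

  public static void main(String[] args) throws Exception {
    CountDownLatch never = new CountDownLatch(1); // nobody counts this one down
    try {
      awaitOrFail(never, 100, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      System.out.println("timed out: " + e.getMessage());
    }
  }
}
```

Whether a timeout is appropriate inside the producer's flush path is a separate question; the sketch only shows the difference between the timed and untimed wait.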








[jira] [Created] (HIVE-24767) Stabilize dyn_part3.q test

2021-02-10 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24767:
---

 Summary: Stabilize dyn_part3.q test
 Key: HIVE-24767
 URL: https://issues.apache.org/jira/browse/HIVE-24767
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich



http://ci.hive.apache.org/job/hive-flaky-check/186/





[jira] [Created] (HIVE-24766) Fix TestScheduledReplication

2021-02-10 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24766:
---

 Summary: Fix TestScheduledReplication
 Key: HIVE-24766
 URL: https://issues.apache.org/jira/browse/HIVE-24766
 Project: Hive
  Issue Type: Bug
 Environment: test seems to be unstable 

http://ci.hive.apache.org/job/hive-flaky-check/184/

Reporter: Zoltan Haindrich








[jira] [Created] (HIVE-24704) Ensure that all Operator column expressions refer to a column in the RowSchema

2021-01-29 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24704:
---

 Summary: Ensure that all Operator column expressions refer to a 
column in the RowSchema
 Key: HIVE-24704
 URL: https://issues.apache.org/jira/browse/HIVE-24704
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Hive Operators should satisfy the invariant that all keys of the columnExprMap 
are present in the schema
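
The invariant can be sketched as a standalone check. This is a minimal illustration, not Hive's implementation: ColExprMapCheck, missingKeys and the plain Map/Set stand-ins for an operator's columnExprMap and schema are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in: real Hive operators use their own descriptor types.
final class ColExprMapCheck {
  /** Returns the columnExprMap keys that have no matching schema column, sorted. */
  static List<String> missingKeys(Map<String, String> colExprMap, Set<String> schemaColumns) {
    return colExprMap.keySet().stream()
        .filter(key -> !schemaColumns.contains(key))
        .sorted()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, String> colExprMap = Map.of("KEY.col0", "a", "VALUE.col1", "b");
    Set<String> schema = Set.of("VALUE.col1");
    // The KEY column is mapped but absent from the schema, so it is reported.
    System.out.println(missingKeys(colExprMap, schema)); // prints [KEY.col0]
  }
}
```

An empty result means the invariant holds for that operator.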





[jira] [Created] (HIVE-24700) Run Mssql integration tests during precommit

2021-01-28 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24700:
---

 Summary: Run Mssql integration tests during precommit
 Key: HIVE-24700
 URL: https://issues.apache.org/jira/browse/HIVE-24700
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


add mssql to the jenkinsfile and run our metastore schema tests on it 





[jira] [Created] (HIVE-24699) Run Oracle integration tests during precommit

2021-01-28 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24699:
---

 Summary: Run Oracle integration tests during precommit
 Key: HIVE-24699
 URL: https://issues.apache.org/jira/browse/HIVE-24699
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich


this will need a working oracle docker image - and possibly some smaller 
changes to make sure that the tests can run reliably





[jira] [Created] (HIVE-24678) Add feature toggle to control SWO parallel edge support

2021-01-21 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24678:
---

 Summary: Add feature toggle to control SWO parallel edge support
 Key: HIVE-24678
 URL: https://issues.apache.org/jira/browse/HIVE-24678
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I can't foresee the future - but it might give better diagnosability 
opportunities to have a direct knob on this feature (I wanted to add it in the 
base patch ; but eventually forgot to do so)





[jira] [Created] (HIVE-24677) Fix typoed vectorization package declaration

2021-01-21 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24677:
---

 Summary: Fix typoed vectorization package declaration
 Key: HIVE-24677
 URL: https://issues.apache.org/jira/browse/HIVE-24677
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


HIVE-24510 has added 
ql/src/gen/vectorization/UDAFTemplates/VectorUDAFComputeBitVector.txt
but its package declaration doesn't align with its folder name





[jira] [Created] (HIVE-24671) Semijoinremoval should not run into an NPE in case the SJ filter contains an UDF

2021-01-20 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24671:
---

 Summary: Semijoinremoval should not run into an NPE in case the SJ 
filter contains an UDF
 Key: HIVE-24671
 URL: https://issues.apache.org/jira/browse/HIVE-24671
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


{code}
set hive.optimize.index.filter=true;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.vectorized.execution.enabled=true;



drop table if exists t1;
drop table if exists t2;

create table t1 (
v1 string
);

create table t2 (
v2 string
);

insert into t1 values ('e123456789'),('x123456789');
insert into t2 values
('123'),
 ('e123456789');


-- alter table t1 update statistics set ('numRows'='9348843574','rawDataSize'='0');

alter table t1 update statistics set ('numRows'='934884357','rawDataSize'='0');
alter table t2 update statistics set ('numRows'='9348','rawDataSize'='0');

alter table t1 update statistics for column v1 set ('numNulls'='0','numDVs'='15541355','avgColLen'='10.0','maxColLen'='10');
alter table t2 update statistics for column v2 set ('numNulls'='0','numDVs'='155','avgColLen'='5.0','maxColLen'='10');
-- alter table t2 update statistics for column k set ('numNulls'='0','numDVs'='13876472','avgColLen'='15.9836','maxColLen'='16');

explain
select v1,v2 from t1 join t2 on (substr(v1,1,3) = v2);
{code}

results in:
{code}
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.parse.TezCompiler.removeSemijoinOptimizationByBenefit(TezCompiler.java:1944)
at org.apache.hadoop.hive.ql.parse.TezCompiler.semijoinRemovalBasedTransformations(TezCompiler.java:544)
at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:240)
at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:161)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.compilePlan(SemanticAnalyzer.java:12467)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12672)
at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:455)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301)
at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171)
[...]
{code}





[jira] [Created] (HIVE-24547) Fix acid_vectorization_original

2020-12-16 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24547:
---

 Summary: Fix acid_vectorization_original
 Key: HIVE-24547
 URL: https://issues.apache.org/jira/browse/HIVE-24547
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich


the failure was hidden by the failed-to-read issue

the test most likely first failed after HIVE-24274





[jira] [Created] (HIVE-24525) Invite reviewers automatically by file name patterns

2020-12-11 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24525:
---

 Summary: Invite reviewers automatically by file name patterns
 Key: HIVE-24525
 URL: https://issues.apache.org/jira/browse/HIVE-24525
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I wrote about this in an 
[email|http://mail-archives.apache.org/mod_mbox/hive-dev/202006.mbox/%3c324a0a23-5841-09fe-a993-1a095035e...@rxd.hu%3e]
 a long time ago...

it could help in keeping an eye on some specific parts...eg: thrift and parser 
changes 
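
A rough sketch of the idea, assuming rules are plain regular expressions matched against changed file paths. ReviewerRules, the patterns and the reviewer names below are made up for illustration; the ticket does not say how the rules would be stored or evaluated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical rule table: maps a path regex to a reviewer handle.
final class ReviewerRules {
  private final List<Map.Entry<Pattern, String>> rules = new ArrayList<>();

  ReviewerRules add(String regex, String reviewer) {
    rules.add(Map.entry(Pattern.compile(regex), reviewer));
    return this;
  }

  /** Returns each reviewer whose pattern matches at least one changed file, in rule order. */
  List<String> reviewersFor(List<String> changedFiles) {
    List<String> result = new ArrayList<>();
    for (Map.Entry<Pattern, String> rule : rules) {
      boolean hit = changedFiles.stream().anyMatch(f -> rule.getKey().matcher(f).find());
      if (hit && !result.contains(rule.getValue())) {
        result.add(rule.getValue());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    ReviewerRules rules = new ReviewerRules()
        .add("\\.thrift$", "thrift-reviewer")   // made-up handle
        .add("^parser/", "parser-reviewer");    // made-up handle
    System.out.println(rules.reviewersFor(List.of("parser/src/java/HiveParser.g", "README.md")));
    // prints [parser-reviewer]
  }
}
```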





[jira] [Created] (HIVE-24488) Make docker host configurable for metastoredb/perf tests

2020-12-04 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24488:
---

 Summary: Make docker host configurable for metastoredb/perf tests
 Key: HIVE-24488
 URL: https://issues.apache.org/jira/browse/HIVE-24488
 Project: Hive
  Issue Type: Improvement
  Components: Test
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


I tend to develop patches inside containers (hive-dev-box) to be able to work 
on multiple patches in parallel

Running tests which use docker was always a bit problematic for me - when I 
wanted to do it before: I manually exposed /var/lib/docker and added a rinetd 
forward by hand (which is not nice)

...the current move to run Perf tests as well against a dockerized 
metastore exposes this problem a bit more for me.

I'm also considering adding the ability to use minikube with hive-dev-box; but 
that still needs exploring

it would be much easier to expose the address of the docker host I'm using...
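
One way this could look, as a sketch only: read the docker host from a system property and fall back to localhost. The property name test.docker.host and the DockerHost class are assumptions made for this example, not names taken from Hive.

```java
import java.util.Properties;

// Hypothetical helper; the ticket does not define a property name.
final class DockerHost {
  static final String PROPERTY = "test.docker.host"; // assumed name

  /** Returns the configured docker host, or "localhost" when nothing is set. */
  static String resolve(Properties props) {
    String host = props.getProperty(PROPERTY);
    return (host == null || host.trim().isEmpty()) ? "localhost" : host.trim();
  }

  /** Builds a PostgreSQL JDBC URL that points at the resolved host. */
  static String jdbcUrl(Properties props, int port, String db) {
    return "jdbc:postgresql://" + resolve(props) + ":" + port + "/" + db;
  }

  public static void main(String[] args) {
    // e.g. java -Dtest.docker.host=172.17.0.1 DockerHost
    System.out.println(jdbcUrl(System.getProperties(), 5432, "metastore"));
  }
}
```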





[jira] [Created] (HIVE-24487) Use alternate ports for dockerized databases during testing

2020-12-04 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24487:
---

 Summary: Use alternate ports for dockerized databases during 
testing
 Key: HIVE-24487
 URL: https://issues.apache.org/jira/browse/HIVE-24487
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


instead of the default ones - like 5432 for postgres and 3306 for mysql

https://github.com/apache/hive/blob/52cf467836df71485e95b08c9e91e197e9898b79/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Postgres.java#L35
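
As one possible approach (an assumption, the ticket does not prescribe how the alternate port is chosen), the test could ask the OS for a currently free port and map the container's database port to it. FreePort is a hypothetical name.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not taken from the Hive test rules.
final class FreePort {
  /** Binds to port 0 so the OS assigns a free port, then releases it and returns the number. */
  static int pick() throws IOException {
    try (ServerSocket socket = new ServerSocket(0)) {
      return socket.getLocalPort();
    }
  }

  public static void main(String[] args) throws IOException {
    int port = pick();
    // The chosen port would then be used for the host side of the docker port mapping.
    System.out.println("host port for the dockerized database: " + port);
  }
}
```

There is a small race here: another process can take the port between pick() returning and docker binding it. A fixed non-default port avoids that at the cost of possible clashes between parallel runs.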





[jira] [Created] (HIVE-24486) Enhance operator merge logic to also consider going thru RS operators

2020-12-04 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24486:
---

 Summary: Enhance operator merge logic to also consider going thru 
RS operators
 Key: HIVE-24486
 URL: https://issues.apache.org/jira/browse/HIVE-24486
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


the targeted situation looks like this:
{code}
OP1 -> RS1.1 -> JOIN1.1
OP1 -> RS1.2 -> JOIN1.2 

OP2 -> RS2.1 -> JOIN1.1 -> RS3.1 
OP2 -> RS2.2 -> JOIN1.2 -> RS3.2 
{code}








[jira] [Created] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart

2020-11-26 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24435:
---

 Summary: Vectorized unix_timestamp is inconsistent with 
non-vectorized counterpart
 Key: HIVE-24435
 URL: https://issues.apache.org/jira/browse/HIVE-24435
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


{code}
create table t (d string);
insert into t values('2020-11-16 22:18:40 UTC');

select
  '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
to_date(from_unixtime(unix_timestamp(d)))
from t
;

set hive.fetch.task.conversion=none;

select
  '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), 
to_date(from_unixtime(unix_timestamp(d)))
from t
;

{code}

results:
{code}
-- std udf:
>2020-11-16 22:18:40 UTC<   1605593920  2020-11-16 22:18:40  2020-11-16
-- vectorized udf
>2020-11-16 22:18:40 UTC<   NULL  NULL  NULL
{code}
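
A plain-Java sketch of how two parsers can disagree on input with trailing text such as " UTC". This is only an illustration of the kind of inconsistency shown above; whether it is the actual cause in Hive is not confirmed by this report. TrailingTextParse is a hypothetical class, and the America/Los_Angeles zone is an assumption picked because it reproduces the reported number (the report does not state the test time zone).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class; does not use any Hive code.
final class TrailingTextParse {
  static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

  /** Lenient: DateFormat.parse(String) may stop before the end of the input. */
  static long lenientEpochSeconds(String text, String zoneId) throws ParseException {
    SimpleDateFormat fmt = new SimpleDateFormat(PATTERN, Locale.ROOT);
    fmt.setTimeZone(TimeZone.getTimeZone(zoneId));
    return fmt.parse(text).getTime() / 1000L;
  }

  /** Strict: returns null when the text does not match the pattern exactly. */
  static LocalDateTime strictOrNull(String text) {
    try {
      return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(PATTERN, Locale.ROOT));
    } catch (DateTimeParseException e) {
      return null; // unparsed trailing text such as " UTC" ends up here
    }
  }

  public static void main(String[] args) throws ParseException {
    String d = "2020-11-16 22:18:40 UTC";
    // The " UTC" suffix is ignored and the value is read in the given zone.
    System.out.println(lenientEpochSeconds(d, "America/Los_Angeles")); // prints 1605593920
    System.out.println(strictOrNull(d)); // prints null
  }
}
```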





[jira] [Created] (HIVE-24428) Concurrent add_partitions requests may lead to data loss

2020-11-25 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24428:
---

 Summary: Concurrent add_partitions requests may lead to data loss
 Key: HIVE-24428
 URL: https://issues.apache.org/jira/browse/HIVE-24428
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


in case multiple clients are adding partitions to the same table - when the 
same partition is being added there is a chance that the data dir is removed 
after the other client has already written its data

https://github.com/apache/hive/blob/5e96b14a2357c66a0640254d5414bc706d8be852/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L3958







[jira] [Created] (HIVE-24388) Enhance swo optimizations to merge EventOperators

2020-11-16 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24388:
---

 Summary: Enhance swo optimizations to merge EventOperators
 Key: HIVE-24388
 URL: https://issues.apache.org/jira/browse/HIVE-24388
 Project: Hive
  Issue Type: Sub-task
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


{code}
EVENT1->TS1
EVENT2->TS2
{code}

are not merged because a TS may only handle the first event properly; sending 
2 events would cause one of them to be ignored





[jira] [Created] (HIVE-24384) SharedWorkOptimizer improvements

2020-11-13 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24384:
---

 Summary: SharedWorkOptimizer improvements
 Key: HIVE-24384
 URL: https://issues.apache.org/jira/browse/HIVE-24384
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


this started as a small feature addition but due to the sheer volume of the 
q.out changes - it's better to do smaller changes at a time; which means more 
tickets...





[jira] [Created] (HIVE-24382) Organize replaceTabAlias methods

2020-11-12 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24382:
---

 Summary: Organize replaceTabAlias methods
 Key: HIVE-24382
 URL: https://issues.apache.org/jira/browse/HIVE-24382
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


* move to the OperatorDesc / etc

https://github.com/apache/hive/pull/1661#discussion_r522693729





[jira] [Created] (HIVE-24376) SharedWorkOptimizer may retain the SJ filter condition during RemoveSemijoin mode

2020-11-12 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-24376:
---

 Summary: SharedWorkOptimizer may retain the SJ filter condition 
during RemoveSemijoin mode
 Key: HIVE-24376
 URL: https://issues.apache.org/jira/browse/HIVE-24376
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich


the mode name is also a bit confusing..but here is what happens:

{code}
TS[A1] -> ...
TS[A2] -> JOIN
TS[B] -> JOIN
{code}

we have an SJ edge between TS[B] -> TS[A2] to communicate information about 
the join keys; let's assume the reduction ratio was r.


RemoveSemijoin right now does the following:
* removes the semijoin edge (so TS[A2] will become a full scan)
* merges TS[A1] and TS[A2]

w.r.t. reading data from disk: this is great - we accessed A twice; of which 1 
was a full scan - and now we only read it once.

but from a row traffic perspective: TS[A2] emits more rows from now on because we 
don't have the r ratio semijoin reduction anymore.







