[jira] [Created] (HIVE-26605) Remove reviewer pattern
Zoltan Haindrich created HIVE-26605: --- Summary: Remove reviewer pattern Key: HIVE-26605 URL: https://issues.apache.org/jira/browse/HIVE-26605 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (HIVE-26138) Fix mapjoin_memcheck
Zoltan Haindrich created HIVE-26138: --- Summary: Fix mapjoin_memcheck Key: HIVE-26138 URL: https://issues.apache.org/jira/browse/HIVE-26138 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich this test fails very frequently http://ci.hive.apache.org/job/hive-precommit/job/master/1169/testReport/junit/org.apache.hadoop.hive.cli.split7/TestCliDriver/Testing___split_01___PostProcess___testCliDriver_mapjoin_memcheck_/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26135) Invalid Anti join conversion may cause missing results
Zoltan Haindrich created HIVE-26135: --- Summary: Invalid Anti join conversion may cause missing results Key: HIVE-26135 URL: https://issues.apache.org/jira/browse/HIVE-26135 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

right now I think the following is needed to trigger the issue:
* left outer join
* only select left hand side columns
* conditional which is using some udf
* the nullness of the udf is checked

repro sql; in case the conversion happens the row with 'a' will be missing
{code}
drop table if exists t;
drop table if exists n;
create table t(a string) stored as orc;
create table n(a string) stored as orc;
insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);
explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;
{code}

workaround could be to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}

-- This message was sent by Atlassian Jira (v8.20.1#820001)
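To see why the row with 'a' matters, the two plans can be compared on the repro data outside of Hive. The following is only a plain-Java sketch of the expected semantics, not Hive code: the class name, the method names and the `castToFloat` stand-in for `cast(t.a as float)` are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AntiJoinSketch {
    // Hypothetical stand-in for cast(x as float): NULL for NULL or non-numeric input.
    static Float castToFloat(String s) {
        if (s == null) {
            return null;
        }
        try {
            return Float.parseFloat(s);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null
    static List<String> leftOuterJoinThenFilter(List<String> n, List<String> t) {
        List<String> out = new ArrayList<>();
        for (String l : n) {
            boolean matched = false;
            for (String r : t) {
                if (l != null && l.equals(r)) { // SQL equality: NULL never matches
                    matched = true;
                    if (castToFloat(r) == null) {
                        out.add(l); // matched row survives because the udf returned NULL
                    }
                }
            }
            if (!matched) {
                out.add(l); // t.a is NULL for unmatched rows, so the cast is NULL too
            }
        }
        return out;
    }

    // What an anti join computes: rows of n without any match in t.
    static List<String> antiJoin(List<String> n, List<String> t) {
        List<String> out = new ArrayList<>();
        for (String l : n) {
            boolean matched = false;
            for (String r : t) {
                if (l != null && l.equals(r)) {
                    matched = true;
                }
            }
            if (!matched) {
                out.add(l);
            }
        }
        return out;
    }

    static List<String> tableT() {
        return Arrays.asList("a", "1", "2", null);
    }

    static List<String> tableN() {
        return Arrays.asList("a", "b", "1", "3", null);
    }

    public static void main(String[] args) {
        System.out.println(leftOuterJoinThenFilter(tableN(), tableT()));
        System.out.println(antiJoin(tableN(), tableT()));
    }
}
```

The outer join plus filter keeps 'a' because it does match in t but the cast of 'a' is NULL; the anti join drops every matched row, so 'a' goes missing.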
[jira] [Created] (HIVE-25994) Analyze table runs into ClassNotFoundException-s in case binary distribution is used
Zoltan Haindrich created HIVE-25994: --- Summary: Analyze table runs into ClassNotFoundException-s in case binary distribution is used Key: HIVE-25994 URL: https://issues.apache.org/jira/browse/HIVE-25994 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich

any nightly release can be used to reproduce this:
{code}
create table t (a integer);
insert into t values (1) ;
analyze table t compute statistics for columns;
{code}
results in
{code}
Caused by: java.lang.NoClassDefFoundError: org/antlr/runtime/tree/CommonTree
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	at java.lang.Class.getDeclaredConstructors0(Native Method)
	at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
	at java.lang.Class.getConstructor0(Class.java:3075)
	at java.lang.Class.getDeclaredConstructor(Class.java:2178)
	at org.apache.hive.com.esotericsoftware.reflectasm.ConstructorAccess.get(ConstructorAccess.java:65)
	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultInstantiatorStrategy.newInstantiatorOf(DefaultInstantiatorStrategy.java:60)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.newInstantiator(Kryo.java:1119)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1128)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.create(FieldSerializer.java:153)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:118)
	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:729)
	at org.apache.hadoop.hive.ql.exec.SerializationUtilities$KryoWithHooks.readObject(SerializationUtilities.java:216)
	at org.apache.hive.com.esotericsoftware.kryo.serializers.ReflectField.read(ReflectField.java:125)
	... 38 more
Caused by: java.lang.ClassNotFoundException: org.antlr.runtime.tree.CommonTree
	at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
{code}

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25977) Enhance Compaction Cleaner to skip when there is nothing to do #2
Zoltan Haindrich created HIVE-25977: --- Summary: Enhance Compaction Cleaner to skip when there is nothing to do #2 Key: HIVE-25977 URL: https://issues.apache.org/jira/browse/HIVE-25977 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich initially this was just an addendum to the original patch ; but got delayed and altered - so it should have its own ticket -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25976) Cleaner may remove files being accessed from a fetch-task-converted reader
Zoltan Haindrich created HIVE-25976: --- Summary: Cleaner may remove files being accessed from a fetch-task-converted reader Key: HIVE-25976 URL: https://issues.apache.org/jira/browse/HIVE-25976 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich

in a nutshell the following happens:
* query is compiled in fetch-task-converted mode
* no real execution happens but the locks are released
* the HS2 is communicating with the client and uses the fetch-task to get the rows - which in this case will directly read files from the table's directory
* client sleeps between reads - so there is ample time for other events...
* cleaner wakes up and removes some files
* in the next read the fetch-task encounters a read error...

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25944) Format pom.xml-s
Zoltan Haindrich created HIVE-25944: --- Summary: Format pom.xml-s Key: HIVE-25944 URL: https://issues.apache.org/jira/browse/HIVE-25944 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

the moment I touch pom.xml-s with xmlstarlet, it starts fixing the indentation, which makes seeing the real diffs harder. fix the indentation of the pom.xml-s and enforce that they stay indented correctly

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25883) Enhance Compaction Cleaner to skip when there is nothing to do
Zoltan Haindrich created HIVE-25883: --- Summary: Enhance Compaction Cleaner to skip when there is nothing to do Key: HIVE-25883 URL: https://issues.apache.org/jira/browse/HIVE-25883 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

the cleaner works the following way:
* it identifies obsolete directories (delta dirs which don't have open txns)
* removes them and it's done

if there are no obsolete directories, that is attributed to the possibility that there might be open txns, so the request should be retried later. however if for some reason the directory was already cleaned, it similarly has no obsolete directories, and thus the request is retried forever

-- This message was sent by Atlassian Jira (v8.20.1#820001)
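The missing case can be written down as a small decision function. This is only a sketch of the idea in the ticket, not the Cleaner's code: the `Action` enum, the `decide` method and the `openTxnsMayBlock` flag are hypothetical names, and how the real Cleaner would learn whether open txns are in the way is left open.

```java
final class CleanerDecisionSketch {
    enum Action { REMOVE_OBSOLETE, RETRY_LATER, MARK_DONE }

    // obsoleteDirs: number of obsolete directories found for the request
    // openTxnsMayBlock: hypothetical flag, true if open txns could still be
    // keeping some directories from becoming obsolete
    static Action decide(int obsoleteDirs, boolean openTxnsMayBlock) {
        if (obsoleteDirs > 0) {
            return Action.REMOVE_OBSOLETE;
        }
        // Nothing obsolete: retry only if something could still change later;
        // otherwise the directory was already cleaned and the request can be closed.
        return openTxnsMayBlock ? Action.RETRY_LATER : Action.MARK_DONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(0, false));
    }
}
```

Without the `MARK_DONE` branch, an already cleaned directory always ends up in `RETRY_LATER`, which is the endless retry described above.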
[jira] [Created] (HIVE-25874) Slow filter evaluation of nest struct fields in vectorized executions
Zoltan Haindrich created HIVE-25874: --- Summary: Slow filter evaluation of nest struct fields in vectorized executions Key: HIVE-25874 URL: https://issues.apache.org/jira/browse/HIVE-25874 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich {code:java} create table t as select named_struct('id',13,'str','string','nest',named_struct('id',12,'str','string','arr',array('value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value','value'))) s; -- go up to 1M rows insert into table t select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t; insert into table t select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t; insert into table t select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t; insert into table t select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t; insert into table t select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t; -- insert into table t select * from t union all select * from t union all select * from t union all select * 
from t union all select * from t union all select * from t union all select * from t union all select * from t union all select * from t; set hive.fetch.task.conversion=none; select count(1) from t; --explain select s .id from t where s .nest .id > 0; {code} interestingly; the issue is not present: * for a query not looking into the nested struct * and in case the struct with the array is at the top level {code} select count(1) from t; --explain select s .id from t where s -- .nest .id > 0; {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25844) Exception deserialization error-s may cause beeline to terminate immediately
Zoltan Haindrich created HIVE-25844: --- Summary: Exception deserialization error-s may cause beeline to terminate immediately Key: HIVE-25844 URL: https://issues.apache.org/jira/browse/HIVE-25844 Project: Hive Issue Type: Bug Components: Beeline Affects Versions: 3.1.2 Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

the exception on the server side happens when:
* fetch task conversion is on
* there is an exception during reading the table

the error bubbles up:
* => a message is transmitted to beeline that the error class name is "org.apache.phoenix.schema.ColumnNotFoundException" + the message
* beeline tries to reconstruct the exception around HiveSQLException
* but during the constructor call org.apache.phoenix.exception.SQLExceptionCode is needed, which fails to load org/apache/hadoop/hbase/shaded/com/google/protobuf/Service
* a java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/shaded/com/google/protobuf/Service is thrown - which is not handled in that method - so it becomes a real error and shuts down the client

{code:java}
java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/shaded/com/google/protobuf/Service
[...]
	at java.lang.Class.forName(Class.java:264)
	at org.apache.hive.service.cli.HiveSQLException.newInstance(HiveSQLException.java:245)
	at org.apache.hive.service.cli.HiveSQLException.toStackTrace(HiveSQLException.java:211)
[...]
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.shaded.com.google.protobuf.Service
[...]
{code}

-- This message was sent by Atlassian Jira (v8.20.1#820001)
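The reason the error is "not handled" is plain Java: NoClassDefFoundError is a LinkageError, not an Exception, so a `catch (Exception e)` block does not see it. A minimal sketch follows; `rebuild`, `narrowCatch` and `wideCatch` are made-up names, and this is not the code of HiveSQLException.

```java
final class ErrorHandlingSketch {
    // Hypothetical stand-in for reflectively rebuilding a server-side exception
    // whose class cannot be initialized on the client side.
    static RuntimeException rebuild(String className) {
        throw new NoClassDefFoundError(className);
    }

    // Catches Exception only: the NoClassDefFoundError escapes to the caller.
    static String narrowCatch(String className) {
        try {
            return rebuild(className).toString();
        } catch (Exception e) {
            return "fallback: " + className;
        }
    }

    // Also catches LinkageError (the parent of NoClassDefFoundError) and falls back.
    static String wideCatch(String className) {
        try {
            return rebuild(className).toString();
        } catch (Exception | LinkageError e) {
            return "fallback: " + className;
        }
    }

    public static void main(String[] args) {
        System.out.println(wideCatch("some.missing.Dependency"));
    }
}
```

Whether the reconstruction should fall back to a generic exception or do something else is a separate question; the sketch only shows which catch clause is able to intercept the error.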
[jira] [Created] (HIVE-25823) Incorrect false positive results for outer join using non-satisfiable residual filters
Zoltan Haindrich created HIVE-25823: --- Summary: Incorrect false positive results for outer join using non-satisfiable residual filters Key: HIVE-25823 URL: https://issues.apache.org/jira/browse/HIVE-25823 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich

similar to HIVE-25822
{code}
create table t_y (id integer,s string);
create table t_xy (id integer,s string);
insert into t_y values(0,'a'),(1,'y'),(1,'x');
insert into t_xy values(1,'x'),(1,'y');
select * from t_xy l full outer join t_y r on (l.id=r.id and l.s='y' and l.id+2*r.id=1);
{code}
the rows full of NULLs are incorrect
{code}
+-------+-------+-------+-------+
| l.id  | l.s   | r.id  | r.s   |
+-------+-------+-------+-------+
| NULL  | NULL  | 0     | a     |
| NULL  | NULL  | NULL  | NULL  |
| 1     | y     | NULL  | NULL  |
| NULL  | NULL  | NULL  | NULL  |
| NULL  | NULL  | 1     | y     |
| NULL  | NULL  | 1     | x     |
+-------+-------+-------+-------+
{code}

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25822) Unexpected result rows in case of outer join contains conditions only affecting one side
Zoltan Haindrich created HIVE-25822: --- Summary: Unexpected result rows in case of outer join contains conditions only affecting one side Key: HIVE-25822 URL: https://issues.apache.org/jira/browse/HIVE-25822 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

needed
* outer join
* on condition has at least one condition for one side of the join
* in a single reducer:
** a right hand side only row outputted right before
** >=2 rows on LHS and 1 on RHS matching in the join keys but the first LHS row doesn't satisfy the filter condition
** second LHS row with good filter condition

{code}
with
t_y as (select col1 as id,col2 as s from (VALUES(0,'a'),(1,'y')) as c),
t_xy as (select col1 as id,col2 as s from (VALUES(1,'x'),(1,'y')) as c)
select * from t_xy l full outer join t_y r on (l.id=r.id and l.s='y');
{code}
null,null,1,y is an unexpected result
{code}
+-------+-------+-------+-------+
| l.id  | l.s   | r.id  | r.s   |
+-------+-------+-------+-------+
| NULL  | NULL  | 0     | a     |
| 1     | x     | NULL  | NULL  |
| NULL  | NULL  | 1     | y     |
| 1     | y     | 1     | y     |
+-------+-------+-------+-------+
{code}

-- This message was sent by Atlassian Jira (v8.20.1#820001)
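For reference, the expected result of the query above can be computed with a naive nested-loop full outer join. This is a plain-Java sketch of the SQL semantics only (class and method names are made up; it has nothing to do with Hive's join operators), with rows modeled as `{id, s}` string pairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

final class FullOuterJoinSketch {
    // Naive full outer join: every left and right row appears at least once,
    // padded with NULLs when no row on the other side satisfies the ON condition.
    static List<String> fullOuterJoin(List<String[]> left, List<String[]> right,
                                      BiPredicate<String[], String[]> on) {
        List<String> out = new ArrayList<>();
        boolean[] rightMatched = new boolean[right.size()];
        for (String[] l : left) {
            boolean matched = false;
            for (int i = 0; i < right.size(); i++) {
                if (on.test(l, right.get(i))) {
                    matched = true;
                    rightMatched[i] = true;
                    out.add(row(l, right.get(i)));
                }
            }
            if (!matched) {
                out.add(row(l, null));
            }
        }
        for (int i = 0; i < right.size(); i++) {
            if (!rightMatched[i]) {
                out.add(row(null, right.get(i)));
            }
        }
        return out;
    }

    static String row(String[] l, String[] r) {
        String[] nulls = {"NULL", "NULL"};
        return String.join(",", l == null ? nulls : l) + "," + String.join(",", r == null ? nulls : r);
    }

    // The data and ON condition of the repro: l.id=r.id and l.s='y'
    static List<String> repro() {
        List<String[]> tXy = Arrays.asList(new String[] {"1", "x"}, new String[] {"1", "y"});
        List<String[]> tY = Arrays.asList(new String[] {"0", "a"}, new String[] {"1", "y"});
        return fullOuterJoin(tXy, tY, (l, r) -> l[0].equals(r[0]) && "y".equals(l[1]));
    }

    public static void main(String[] args) {
        repro().forEach(System.out::println);
    }
}
```

With these semantics the right row (1,y) is matched by the left row (1,y), so it should not show up a second time padded with NULLs; three rows are expected instead of the four above.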
[jira] [Created] (HIVE-25820) Provide a way to disable join filters
Zoltan Haindrich created HIVE-25820: --- Summary: Provide a way to disable join filters Key: HIVE-25820 URL: https://issues.apache.org/jira/browse/HIVE-25820 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25792) Multi Insert query fails on CBO path
Zoltan Haindrich created HIVE-25792: --- Summary: Multi Insert query fails on CBO path Key: HIVE-25792 URL: https://issues.apache.org/jira/browse/HIVE-25792 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich {code} set hive.cbo.enable=true; drop table if exists aa1; drop table if exists bb1; drop table if exists cc1; drop table if exists dd1; drop table if exists ee1; drop table if exists ff1; create table aa1 ( stf_id string); create table bb1 ( stf_id string); create table cc1 ( stf_id string); create table ff1 ( x string); explain from ff1 as a join cc1 as b insert overwrite table aa1 select stf_id GROUP BY b.stf_id insert overwrite table bb1 select b.stf_id GROUP BY b.stf_id ; {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25791) Improve SFS exception messages
Zoltan Haindrich created HIVE-25791: --- Summary: Improve SFS exception messages Key: HIVE-25791 URL: https://issues.apache.org/jira/browse/HIVE-25791 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich Especially for cases when the path is already known to be invalid; like: `sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#` -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25780) DistinctExpansion creates more than 64 grouping sets II
Zoltan Haindrich created HIVE-25780: --- Summary: DistinctExpansion creates more than 64 grouping sets II Key: HIVE-25780 URL: https://issues.apache.org/jira/browse/HIVE-25780 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

HIVE-25498 has fixed this for the case when there are only count(distinct x) queries. however, after the rewrite happens, grouping sets are used to handle the group by columns as well

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25770) AST is corrupted after CBO fallback for CTAS queries
Zoltan Haindrich created HIVE-25770: --- Summary: AST is corrupted after CBO fallback for CTAS queries Key: HIVE-25770 URL: https://issues.apache.org/jira/browse/HIVE-25770 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Attachments: repro.q reproduce: * revert ec44c6081c88b81245185fa6a552d8c3631e47fa to force cbo fallbacks for >64 grouping sets * use repro.q test * the query would run with cbo turned off * but with cbo enabled it would fail in conservative mode as well -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25752) Fix incremental compilation of parser module
Zoltan Haindrich created HIVE-25752: --- Summary: Fix incremental compilation of parser module Key: HIVE-25752 URL: https://issues.apache.org/jira/browse/HIVE-25752 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

this issue doesn't happen all the time - but when it does it's really annoying

the problem is that the antlr files are not regenerated; however the "HiveParser.java fix" is run regardless... which corrupts the java files after a second run and causes compilation errors
{code}
[INFO] --- antlr3-maven-plugin:3.5.2:antlr (default) @ hive-parser ---
[INFO] ANTLR: Processing source directory /home/dev/hive/parser/src/java
ANTLR Parser Generator Version 3.5.2
Grammar /home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g is up to date - build skipped
Grammar /home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g is up to date - build skipped
Grammar /home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexerStandard.g is up to date - build skipped
Grammar /home/dev/hive/parser/src/java/org/apache/hadoop/hive/ql/parse/HintParser.g is up to date - build skipped
[INFO]
[INFO] --- exec-maven-plugin:3.0.0:exec (HiveParser.java fix) @ hive-parser ---
[INFO]
{code}
errors like:
{code}
[ERROR] /home/dev/hive/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParser.java:[50,16] class, interface, or enum expected
{code}
but I've also seen
{code}
[ERROR] /home/dev/hive/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParser.java:[49,32] cannot find symbol
[ERROR] symbol: class statement_return
[ERROR] location: class org.apache.hadoop.hive.ql.parse.HiveParser
[ERROR] /home/dev/hive/parser/target/generated-sources/antlr3/org/apache/hadoop/hive/ql/parse/HiveParserTokens.java:[13,19] cannot find symbol
{code}

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25748) Investigate Union comparision
Zoltan Haindrich created HIVE-25748: --- Summary: Investigate Union comparision Key: HIVE-25748 URL: https://issues.apache.org/jira/browse/HIVE-25748 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich both of the following cases change the "non-used" part of the union (note: `create_union(idx,o0,o1)` creates a union which uses the `idx`-th object) {code} SELECT (NULLIF(create_union(0,1,2),create_union(0,1,3)) is not null); false SELECT (NULLIF(create_union(0,1,2),create_union(1,2,1)) is not null); true {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25738) NullIf doesn't support complex types
Zoltan Haindrich created HIVE-25738: --- Summary: NullIf doesn't support complex types Key: HIVE-25738 URL: https://issues.apache.org/jira/browse/HIVE-25738 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich {code} SELECT NULLIF(array(1,2,3),array(1,2,3)) {code} results in: {code} java.lang.ClassCastException: org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector cannot be cast to org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector at org.apache.hadoop.hive.ql.udf.generic.GenericUDFNullif.evaluate(GenericUDFNullif.java:96) at org.apache.hadoop.hive.ql.udf.generic.GenericUDF.initializeAndFoldConstants(GenericUDF.java:177) at org.apache.hadoop.hive.ql.parse.type.HiveFunctionHelper.getReturnType(HiveFunctionHelper.java:135) at org.apache.hadoop.hive.ql.parse.type.RexNodeExprFactory.createFuncCallExpr(RexNodeExprFactory.java:647) [...] {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25735) Improve statestimator in UDFWhen/UDFCase
Zoltan Haindrich created HIVE-25735: --- Summary: Improve statestimator in UDFWhen/UDFCase Key: HIVE-25735 URL: https://issues.apache.org/jira/browse/HIVE-25735 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25732) Improve HLL insert performance
Zoltan Haindrich created HIVE-25732: --- Summary: Improve HLL insert performance Key: HIVE-25732 URL: https://issues.apache.org/jira/browse/HIVE-25732 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich

HIVE-23095 has fixed a correctness issue and removed a temporary list which was supposed to speed up the algorithm, and thus the algorithm suffered some performance degradation. There are ways to put back some of that stuff, or to consider other options to gain back the lost performance - now that the bug is fixed this should be a performance-only improvement ticket. It would be interesting to know how much time we spend on updating this DS during a large insert to know the weight of such an improvement.

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25725) Upgrade used docker-in-docker container version
Zoltan Haindrich created HIVE-25725: --- Summary: Upgrade used docker-in-docker container version Key: HIVE-25725 URL: https://issues.apache.org/jira/browse/HIVE-25725 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

in HIVE-25714 I came to the conclusion that there might be something wrong with dind - upgrading it would be the first step.. and while doing so it should be checked whether the storage driver is appropriate/etc

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25720) Fix flaky test TestScheduledReplicationScenarios
Zoltan Haindrich created HIVE-25720: --- Summary: Fix flaky test TestScheduledReplicationScenarios Key: HIVE-25720 URL: https://issues.apache.org/jira/browse/HIVE-25720 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich failed at the first attempt; the issue happened during {code} drop scheduled query repl_load_p2 {code} which is in a finally block ; so this exception may be shadowing another exception http://ci.hive.apache.org/job/hive-flaky-check/463/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
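How a failure in a finally block hides the real problem, and one way to keep both, can be shown in isolation. This is a generic Java sketch and not the test's code; `cleanup()` merely stands in for the `drop scheduled query` call and all names are made up.

```java
final class FinallyShadowSketch {
    // Hypothetical cleanup step that fails, like the drop in the finally block.
    static void cleanup() throws Exception {
        throw new Exception("drop scheduled query failed");
    }

    // The exception from the finally block replaces the one from the try body.
    static void shadowing() throws Exception {
        try {
            throw new IllegalStateException("test body failed");
        } finally {
            cleanup();
        }
    }

    // Keeps the primary exception and attaches the cleanup failure as suppressed.
    static void preserving() throws Exception {
        Exception primary = null;
        try {
            throw new IllegalStateException("test body failed");
        } catch (Exception e) {
            primary = e;
            throw e;
        } finally {
            try {
                cleanup();
            } catch (Exception cleanupFailure) {
                if (primary != null) {
                    primary.addSuppressed(cleanupFailure);
                } else {
                    throw cleanupFailure;
                }
            }
        }
    }

    public static void main(String[] args) {
        try {
            preserving();
        } catch (Exception e) {
            System.out.println(e + " suppressed=" + e.getSuppressed().length);
        }
    }
}
```

With the second form a flaky run would report the original failure and still list the failed drop under it.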
[jira] [Created] (HIVE-25719) Fix flaky test TestMiniLlapLocalCliDriver#testCliDriver[replication_metrics_ingest]
Zoltan Haindrich created HIVE-25719: --- Summary: Fix flaky test TestMiniLlapLocalCliDriver#testCliDriver[replication_metrics_ingest] Key: HIVE-25719 URL: https://issues.apache.org/jira/browse/HIVE-25719 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich flaky checker failed after 3 attempts with a q.out difference there seems to be some ID difference - maybe 2 events happened in a different order? http://ci.hive.apache.org/job/hive-flaky-check/465/testReport/junit/org.apache.hadoop.hive.cli/TestMiniLlapLocalCliDriver/testCliDriver_replication_metrics_ingest_/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25715) Provide nightly builds
Zoltan Haindrich created HIVE-25715: --- Summary: Provide nightly builds Key: HIVE-25715 URL: https://issues.apache.org/jira/browse/HIVE-25715 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich provide nightly builds for the master branch -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25714) Some tests are flaky because docker is not able to start in 5 seconds
Zoltan Haindrich created HIVE-25714: --- Summary: Some tests are flaky because docker is not able to start in 5 seconds Key: HIVE-25714 URL: https://issues.apache.org/jira/browse/HIVE-25714 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

there are some test runs failing with the exception below; on the test site multiple pods are running in parallel - it's not an ideal environment for tight deadlines
{code}
Unexpected exception java.lang.RuntimeException: Process docker failed to run in 5 seconds
	at org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.runCmd(AbstractExternalDB.java:92)
	at org.apache.hadoop.hive.ql.externalDB.AbstractExternalDB.launchDockerContainer(AbstractExternalDB.java:123)
	at org.apache.hadoop.hive.ql.qoption.QTestDatabaseHandler.beforeTest(QTestDatabaseHandler.java:111)
	at org.apache.hadoop.hive.ql.qoption.QTestOptionDispatcher.beforeTest(QTestOptionDispatcher.java:79)
{code}
http://ci.hive.apache.org/job/hive-precommit/job/PR-1674/4/testReport/junit/org.apache.hadoop.hive.cli.split19/TestMiniLlapLocalCliDriver/Testing___split_14___PostProcess___testCliDriver_qt_database_all_/

-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25713) Fix test TestLlapTaskSchedulerService#testPreemption
Zoltan Haindrich created HIVE-25713: --- Summary: Fix test TestLlapTaskSchedulerService#testPreemption Key: HIVE-25713 URL: https://issues.apache.org/jira/browse/HIVE-25713 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich

when this test passes it passes in under 100ms - but when it fails it keeps waiting for more than 10 seconds - the test seems to be using signal/await

http://ci.hive.apache.org/job/hive-flaky-check/462/

-- This message was sent by Atlassian Jira (v8.20.1#820001)
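One common way a signal/await based test hangs is a signal that fires before the waiter starts awaiting. Whether that is what happens in testPreemption is only a guess; the sketch below just shows the usual guarded-wait pattern that is immune to it. The class, the `preempted` flag and the method names are made up.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class GuardedAwaitSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private boolean preempted; // state guarded by lock

    // Records the event in a flag before signalling, so an early signal is not lost.
    void signalPreempted() {
        lock.lock();
        try {
            preempted = true;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Waits on the flag, not on the signal itself; returns false on timeout.
    boolean awaitPreempted(long timeoutMs) throws InterruptedException {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (!preempted) {
                if (nanos <= 0) {
                    return false;
                }
                nanos = changed.awaitNanos(nanos);
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedAwaitSketch s = new GuardedAwaitSketch();
        s.signalPreempted(); // the signal arrives before anyone is waiting
        System.out.println(s.awaitPreempted(100));
    }
}
```

Because the waiter checks the flag before blocking, the order of `signalPreempted()` and `awaitPreempted()` no longer matters.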
[jira] [Created] (HIVE-25712) Fix test TestContribCliDriver#testCliDriver[url_hook]
Zoltan Haindrich created HIVE-25712: --- Summary: Fix test TestContribCliDriver#testCliDriver[url_hook] Key: HIVE-25712 URL: https://issues.apache.org/jira/browse/HIVE-25712 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich The test makes use of SampleURLHook - which could change the JDO url http://ci.hive.apache.org/job/hive-flaky-check/460/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25711) Make Table#isEmpty more efficient
Zoltan Haindrich created HIVE-25711: --- Summary: Make Table#isEmpty more efficient Key: HIVE-25711 URL: https://issues.apache.org/jira/browse/HIVE-25711 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich [~stevel] suggested in another ticket that we could make our isEmpty method faster: https://issues.apache.org/jira/browse/HIVE-24849?focusedCommentId=17372145&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17372145 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25707) SchemaTool may leave the metastore in-between upgrade steps
Zoltan Haindrich created HIVE-25707: --- Summary: SchemaTool may leave the metastore in-between upgrade steps Key: HIVE-25707 URL: https://issues.apache.org/jira/browse/HIVE-25707 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich it seems like: * schematool runs the sql files via beeline * autocommit is turned on * pressing ctrl+c or killing the process will result in an invalid schema https://github.com/apache/hive/blob/6e02f6164385a370ee8014c795bee1fa423d7937/beeline/src/java/org/apache/hive/beeline/schematool/HiveSchemaTool.java#L79 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25703) Postgres metastore test failures
Zoltan Haindrich created HIVE-25703: --- Summary: Postgres metastore test failures Key: HIVE-25703 URL: https://issues.apache.org/jira/browse/HIVE-25703 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich all recent builds are failing because postgres metastore don't start underlying issue is that the docker container can't start because of: ``` ls: cannot access '/docker-entrypoint-initdb.d/': Operation not permitted ``` -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-25692) ExceptionHandler may mask checked exceptions
Zoltan Haindrich created HIVE-25692: --- Summary: ExceptionHandler may mask checked exceptions Key: HIVE-25692 URL: https://issues.apache.org/jira/browse/HIVE-25692 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich

HIVE-25055 has changed the way exceptions are rethrown - but one of the methods may let checked exceptions out without them being declared on the method (and avoids the compile time error for it)

testcase for: org.apache.hadoop.hive.metastore.TestExceptionHandler
{code}
@Test
public void testInvalid() throws MetaException {
  try {
    throw new IOException("IOException test");
  } catch (Exception e) {
    throw handleException(e).throwIfInstance(AccessControlException.class, IOException.class).defaultMetaException();
  }
}
{code}
this testcase should not compile - as it may throw IOException or AccessControlException as well

-- This message was sent by Atlassian Jira (v8.20.1#820001)
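A checked exception can leave a method without being declared when it is thrown through an unchecked generic cast. Whether ExceptionHandler does exactly this is an assumption on my side; the sketch below only demonstrates the language-level mechanism with made-up names.

```java
import java.io.IOException;

final class SneakyThrowSketch {
    // The cast to T is erased at runtime, so any Throwable passes through
    // unchanged while the compiler only sees "throws T".
    @SuppressWarnings("unchecked")
    static <T extends Throwable> void throwAs(Throwable t) throws T {
        throw (T) t;
    }

    // No "throws IOException" here, yet an IOException escapes at runtime,
    // because T is bound to the unchecked RuntimeException at the call site.
    static void undeclared() {
        SneakyThrowSketch.<RuntimeException>throwAs(new IOException("IOException test"));
    }

    public static void main(String[] args) {
        try {
            undeclared();
        } catch (Throwable t) {
            System.out.println(t.getClass().getName());
        }
    }
}
```

Note that a caller can't even write `catch (IOException e)` around `undeclared()`, since the compiler believes that exception is never thrown there.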
[jira] [Created] (HIVE-25634) Eclipse compiler bumps into AIOBE during ObjectStore compilation
Zoltan Haindrich created HIVE-25634: --- Summary: Eclipse compiler bumps into AIOBE during ObjectStore compilation Key: HIVE-25634 URL: https://issues.apache.org/jira/browse/HIVE-25634 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

this issue seems to have started appearing after HIVE-23633

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25633) Prevent shutdown of MetaStore scheduled worker ThreadPool
Zoltan Haindrich created HIVE-25633: --- Summary: Prevent shutdown of MetaStore scheduled worker ThreadPool Key: HIVE-25633 URL: https://issues.apache.org/jira/browse/HIVE-25633 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich

[~lpinter] has noticed that this patch has a side effect: in HIVE-23164 the patch has added a {{ThreadPool#shutdown}} to {{HMSHandler#shutdown}} - which could cause trouble in case an {{HMSHandler}} is shut down and a new one is created

I was looking for cases in which an HMSHandler is created inside the metastore (beyond the one HiveMetaStore is using) - and I think tasks like Msck use it to access the metastore - and they close the client - which closes the hmshandler client, which will shut down the threadpool

-- This message was sent by Atlassian Jira (v8.3.4#803005)
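The effect can be reproduced with the JDK alone: once a shared scheduled pool is shut down by one owner, every later owner gets its tasks rejected. The `Handler` class below is a made-up stand-in for the handler/pool relation and does not mirror the real HMSHandler or ThreadPool classes.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class SharedPoolSketch {
    // Hypothetical handler that uses, and on shutdown also stops, a shared pool.
    static final class Handler {
        private final ScheduledExecutorService pool;

        Handler(ScheduledExecutorService pool) {
            this.pool = pool;
        }

        void schedule(Runnable task) {
            pool.schedule(task, 1, TimeUnit.MILLISECONDS);
        }

        void shutdown() {
            pool.shutdown(); // stops the pool for every other handler as well
        }
    }

    // Returns true if a handler created after the first one's shutdown is rejected.
    static boolean secondHandlerRejected() {
        ScheduledExecutorService shared = Executors.newScheduledThreadPool(1);
        new Handler(shared).shutdown();
        try {
            new Handler(shared).schedule(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondHandlerRejected());
    }
}
```

So the pool either has to outlive the individual handlers, or be recreated when a new handler shows up after a shutdown.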
[jira] [Created] (HIVE-25630) Translator fixes
Zoltan Haindrich created HIVE-25630: --- Summary: Translator fixes Key: HIVE-25630 URL: https://issues.apache.org/jira/browse/HIVE-25630 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich there are some issues: * AlreadyExistsException might be suppressed by the translator * uppercase letter usage may cause problems for some clients * add a way to suppress location checks for legacy clients -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25569) Enable table definition over a single file
Zoltan Haindrich created HIVE-25569: --- Summary: Enable table definition over a single file Key: HIVE-25569 URL: https://issues.apache.org/jira/browse/HIVE-25569 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich Suppose there is a directory where multiple files are present. For a 3rd party database system this is perfectly normal, because it treats a single file as the contents of a table. Tables defined in the metastore follow a different principle: a table is considered to be under a directory, and all files under that directory are the contents of that table. To enable seamless migration/evaluation of Hive and other databases using HMS as a metadata backend, the ability to define a table over a single file would be useful. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25531) Remove the core classified hive-exec artifact
Zoltan Haindrich created HIVE-25531: --- Summary: Remove the core classified hive-exec artifact Key: HIVE-25531 URL: https://issues.apache.org/jira/browse/HIVE-25531 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich * this artifact was introduced in HIVE-7423 * loading this artifact and the shaded hive-exec (along with the jdbc driver) could create interesting classpath problems * if other projects have issues with the shaded hive-exec artifact we must start fixing those problems -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25508) Partitioned tables created with CTAS queries doesnt have lineage informations
Zoltan Haindrich created HIVE-25508: --- Summary: Partitioned tables created with CTAS queries doesnt have lineage informations Key: HIVE-25508 URL: https://issues.apache.org/jira/browse/HIVE-25508 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25485) Transform selects of literals under a UNION ALL to inline table scan
Zoltan Haindrich created HIVE-25485: --- Summary: Transform selects of literals under a UNION ALL to inline table scan Key: HIVE-25485 URL: https://issues.apache.org/jira/browse/HIVE-25485 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} select 1 union all select 1 union all [...] union all select 1 {code} results in a very big plan, which will have a number of vertices proportional to the number of union all branches - hence it could be slow to execute -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25404) Inserts inside merge statements are rewritten incorrectly for partitioned tables
Zoltan Haindrich created HIVE-25404: --- Summary: Inserts inside merge statements are rewritten incorrectly for partitioned tables Key: HIVE-25404 URL: https://issues.apache.org/jira/browse/HIVE-25404 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich {code} drop table u;drop table t; create table t(value string default 'def') partitioned by (id integer); create table u(id integer); {code} #1 id&value specified rewritten {code} FROM `default`.`t` RIGHT OUTER JOIN `default`.`u` ON `t`.`id`=`u`.`id` INSERT INTO `default`.`t` (`id`,`value`) partition (`id`)-- insert clause SELECT `u`.`id`,'x' WHERE `t`.`id` IS NULL {code} it should be {code} [...] INSERT INTO `default`.`t` partition (`id`) (`value`)-- insert clause [...] {code} #2 when values is not specified {code} merge into t using u on t.id=u.id when not matched then insert (id) values (u.id); {code} rewritten query: {code} FROM `default`.`t` RIGHT OUTER JOIN `default`.`u` ON `t`.`id`=`u`.`id` INSERT INTO `default`.`t` (`id`) partition (`id`)-- insert clause SELECT `u`.`id` WHERE `t`.`id` IS NULL {code} it should be {code} [...] INSERT INTO `default`.`t` partition (`id`) ()-- insert clause [...] {code} however we don't accept empty column lists -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25395) Update hadoop to a more recent version
Zoltan Haindrich created HIVE-25395: --- Summary: Update hadoop to a more recent version Key: HIVE-25395 URL: https://issues.apache.org/jira/browse/HIVE-25395 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich we are still depending on hadoop 3.1.0 which doesn't have source attachments - and makes development harder -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25378) Enable removal of old builds on hive ci
Zoltan Haindrich created HIVE-25378: --- Summary: Enable removal of old builds on hive ci Key: HIVE-25378 URL: https://issues.apache.org/jira/browse/HIVE-25378 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich We are using the github plugin to run builds on PRs. However, to remove old builds that plugin needs to have periodic branch scanning enabled; since we also use the plugin's merge mechanism, this will cause all open PRs to be rediscovered after there is a new commit on the target branch. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25370) Improve SharedWorkOptimizer performance
Zoltan Haindrich created HIVE-25370: --- Summary: Improve SharedWorkOptimizer performance Key: HIVE-25370 URL: https://issues.apache.org/jira/browse/HIVE-25370 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich for queries which are unioning ~800 constant rows the SWO is doing around n*n/2 operations trying to find 2 TS-es which could be merged {code} select constants UNION ALL ... UNION ALL select constants {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
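To put the "around n*n/2 operations" estimate from HIVE-25370 into numbers, the sketch below counts the unordered pairs that a naive all-pairs scan over n table scans would examine. It is an illustration only: `pairsExamined` is a hypothetical helper and does not reproduce the actual SharedWorkOptimizer logic.

```java
// Counts how many unordered pairs a naive all-pairs scan would examine.
final class PairwiseScanCost {
    static long pairsExamined(int n) {
        long count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                count++; // one candidate pair (i, j) of table scans
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // For n = 800 this is 800 * 799 / 2 = 319600 pairs.
        System.out.println(pairsExamined(800));
    }
}
```

For the ~800 branch case mentioned in the report this already amounts to roughly 320 thousand pair checks, and the count grows quadratically with the number of branches.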
[jira] [Created] (HIVE-25313) Upgrade commons-codec to 1.15
Zoltan Haindrich created HIVE-25313: --- Summary: Upgrade commons-codec to 1.15 Key: HIVE-25313 URL: https://issues.apache.org/jira/browse/HIVE-25313 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25312) Upgrade netty to 4.1.65.Final
Zoltan Haindrich created HIVE-25312: --- Summary: Upgrade netty to 4.1.65.Final Key: HIVE-25312 URL: https://issues.apache.org/jira/browse/HIVE-25312 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25311) Slow compilation of union operators with >100 branches
Zoltan Haindrich created HIVE-25311: --- Summary: Slow compilation of union operators with >100 branches Key: HIVE-25311 URL: https://issues.apache.org/jira/browse/HIVE-25311 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich during the processing of an N way union operator the full plan is cloned N times; which might hurt compilation time performance -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25290) Stabilize TestTxnHandler
Zoltan Haindrich created HIVE-25290: --- Summary: Stabilize TestTxnHandler Key: HIVE-25290 URL: https://issues.apache.org/jira/browse/HIVE-25290 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich http://ci.hive.apache.org/job/hive-flaky-check/271/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25289) Fix external_jdbc_table3 and external_jdbc_table4
Zoltan Haindrich created HIVE-25289: --- Summary: Fix external_jdbc_table3 and external_jdbc_table4 Key: HIVE-25289 URL: https://issues.apache.org/jira/browse/HIVE-25289 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich http://ci.hive.apache.org/job/hive-flaky-check/265/ http://ci.hive.apache.org/job/hive-flaky-check/266/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25288) Fix TestMmCompactorOnTez
Zoltan Haindrich created HIVE-25288: --- Summary: Fix TestMmCompactorOnTez Key: HIVE-25288 URL: https://issues.apache.org/jira/browse/HIVE-25288 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich http://ci.hive.apache.org/job/hive-flaky-check/240/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25285) Retire HiveProjectJoinTransposeRule
Zoltan Haindrich created HIVE-25285: --- Summary: Retire HiveProjectJoinTransposeRule Key: HIVE-25285 URL: https://issues.apache.org/jira/browse/HIVE-25285 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich we don't necessarily need our own rule anymore - a plain ProjectJoinTransposeRule could probably work -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25278) HiveProjectJoinTransposeRule may do invalid transformations with windowing expressions
Zoltan Haindrich created HIVE-25278: --- Summary: HiveProjectJoinTransposeRule may do invalid transformations with windowing expressions Key: HIVE-25278 URL: https://issues.apache.org/jira/browse/HIVE-25278 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich running {code} create table table1 (acct_num string, interest_rate decimal(10,7)) stored as orc; create table table2 (act_id string) stored as orc; CREATE TABLE temp_output AS SELECT act_nbr, row_num FROM (SELECT t2.act_id as act_nbr, row_number() over (PARTITION BY trim(acct_num) ORDER BY interest_rate DESC) AS row_num FROM table1 t1 INNER JOIN table2 t2 ON trim(acct_num) = t2.act_id) t WHERE t.row_num = 1; {code} may result in error like: {code} Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Invalid column reference 'interest_rate': (possible column names are: interest_rate, trim) (state=42000,code=4) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25267) Fix TestReplicationScenariosAcidTables
Zoltan Haindrich created HIVE-25267: --- Summary: Fix TestReplicationScenariosAcidTables Key: HIVE-25267 URL: https://issues.apache.org/jira/browse/HIVE-25267 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich test is unstable http://ci.hive.apache.org/job/hive-flaky-check/242/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25266) Fix TestWarehouseExternalDir
Zoltan Haindrich created HIVE-25266: --- Summary: Fix TestWarehouseExternalDir Key: HIVE-25266 URL: https://issues.apache.org/jira/browse/HIVE-25266 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich test is unstable http://ci.hive.apache.org/job/hive-flaky-check/244/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25265) Fix TestHiveIcebergStorageHandlerWithEngine
Zoltan Haindrich created HIVE-25265: --- Summary: Fix TestHiveIcebergStorageHandlerWithEngine Key: HIVE-25265 URL: https://issues.apache.org/jira/browse/HIVE-25265 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich test is unstable: http://ci.hive.apache.org/job/hive-flaky-check/251/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25250) Fix TestHS2ImpersonationWithRemoteMS.testImpersonation
Zoltan Haindrich created HIVE-25250: --- Summary: Fix TestHS2ImpersonationWithRemoteMS.testImpersonation Key: HIVE-25250 URL: https://issues.apache.org/jira/browse/HIVE-25250 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich http://ci.hive.apache.org/job/hive-flaky-check/235/testReport/org.apache.hive.service/TestHS2ImpersonationWithRemoteMS/testImpersonation/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25249) Fix TestWorker
Zoltan Haindrich created HIVE-25249: --- Summary: Fix TestWorker Key: HIVE-25249 URL: https://issues.apache.org/jira/browse/HIVE-25249 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich http://ci.hive.apache.org/job/hive-precommit/job/PR-2381/1/ http://ci.hive.apache.org/job/hive-flaky-check/236/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25248) Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1
Zoltan Haindrich created HIVE-25248: --- Summary: Fix .TestLlapTaskSchedulerService#testForcedLocalityMultiplePreemptionsSameHost1 Key: HIVE-25248 URL: https://issues.apache.org/jira/browse/HIVE-25248 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich This test is failing randomly recently http://ci.hive.apache.org/job/hive-flaky-check/233/testReport/org.apache.hadoop.hive.llap.tezplugins/TestLlapTaskSchedulerService/testForcedLocalityMultiplePreemptionsSameHost1/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25247) Fix TestWMMetricsWithTrigger
Zoltan Haindrich created HIVE-25247: --- Summary: Fix TestWMMetricsWithTrigger Key: HIVE-25247 URL: https://issues.apache.org/jira/browse/HIVE-25247 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich this test seems to be unstable: http://ci.hive.apache.org/job/hive-flaky-check/226/ it was introduced by HIVE-24803 a few months ago cc: [~gupta.nikhil0007] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25224) Multi insert statements involving tables with different bucketing_versions results in error
Zoltan Haindrich created HIVE-25224: --- Summary: Multi insert statements involving tables with different bucketing_versions results in error Key: HIVE-25224 URL: https://issues.apache.org/jira/browse/HIVE-25224 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} drop table if exists t; drop table if exists t2; drop table if exists t3; create table t (a integer); create table t2 (a integer); create table t3 (a integer); alter table t set tblproperties ('bucketing_version'='1'); explain from t3 insert into t select a insert into t2 select a; {code} results in {code} Error: Error while compiling statement: FAILED: RuntimeException Error setting bucketingVersion for group: [[op: FS[2], bucketingVersion=1], [op: FS[11], bucketingVersion=2]] (state=42000,code=4) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25180) Update netty to 4.1.60.Final
Zoltan Haindrich created HIVE-25180: --- Summary: Update netty to 4.1.60.Final Key: HIVE-25180 URL: https://issues.apache.org/jira/browse/HIVE-25180 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25171) Use ACID_HOUSEKEEPER_SERVICE_START
Zoltan Haindrich created HIVE-25171: --- Summary: Use ACID_HOUSEKEEPER_SERVICE_START Key: HIVE-25171 URL: https://issues.apache.org/jira/browse/HIVE-25171 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich seems to be unused right now -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25138) Auto disable scheduled queries after repeated failures
Zoltan Haindrich created HIVE-25138: --- Summary: Auto disable scheduled queries after repeated failures Key: HIVE-25138 URL: https://issues.apache.org/jira/browse/HIVE-25138 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25044) Parallel edge fixer may not be able to process semijoin edges
Zoltan Haindrich created HIVE-25044: --- Summary: Parallel edge fixer may not be able to process semijoin edges Key: HIVE-25044 URL: https://issues.apache.org/jira/browse/HIVE-25044 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich SJ filter edges are removed from the main operator graph - which could cause a parallel edge to remain after the remover was executed -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25036) Unstable testcase script_broken_pipe2
Zoltan Haindrich created HIVE-25036: --- Summary: Unstable testcase script_broken_pipe2 Key: HIVE-25036 URL: https://issues.apache.org/jira/browse/HIVE-25036 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich http://ci.hive.apache.org/job/hive-flaky-check/224/ {code} Client Execution succeeded but contained differences (error code = 1) after executing script_broken_pipe2.q 24c24 < Caused by: java.io.IOException: Broken pipe --- > Caused by: java.io.IOException: Stream closed 46c46 < Caused by: java.io.IOException: Broken pipe --- > Caused by: java.io.IOException: Stream closed 49,58d48 < FAILED: AssertionError java.lang.AssertionError: Client Execution succeeded but contained differences (error code = 1) after executing script_broken_pipe2.q < 24c24 < < Caused by: java.io.IOException: Broken pipe < --- < > Caused by: java.io.IOException: Stream closed < 46c46 < < Caused by: java.io.IOException: Broken pipe < --- < > Caused by: java.io.IOException: Stream closed < {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-25029) Remove travis builds
Zoltan Haindrich created HIVE-25029: --- Summary: Remove travis builds Key: HIVE-25029 URL: https://issues.apache.org/jira/browse/HIVE-25029 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich travis only compiles the project - we already do much more than that during precommit testing. (and it sometimes delays builds because travis can't allocate executors/etc) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24986) Support aggregates on columns present in rollups
Zoltan Haindrich created HIVE-24986: --- Summary: Support aggregates on columns present in rollups Key: HIVE-24986 URL: https://issues.apache.org/jira/browse/HIVE-24986 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} SELECT key, value, count(key) FROM src GROUP BY key, value with rollup; {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24979) Tests should not load confs from places like /etc/hive/hive-site.xml
Zoltan Haindrich created HIVE-24979: --- Summary: Tests should not load confs from places like /etc/hive/hive-site.xml Key: HIVE-24979 URL: https://issues.apache.org/jira/browse/HIVE-24979 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich for example: TestEmbeddedHiveMetaStore may load a value for the metastore.metadata.transformer.class key from /etc/hive/hive-site.xml -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24963) Windowing expression may loose its input in some cases
Zoltan Haindrich created HIVE-24963: --- Summary: Windowing expression may loose its input in some cases Key: HIVE-24963 URL: https://issues.apache.org/jira/browse/HIVE-24963 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} drop table if exists sss; CREATE TABLE `sss`( `user_id` bigint, `user_mid` string ) PARTITIONED BY ( `dt` string) STORED AS ORC ; insert into sss partition(dt='part1') VALUES (12345,'user_mid v1'),(12345,'user_mid v1'),(12345,'user_mid v1'),(12345,'user_mid v1'),(12345,'user_mid v1'); set hive.auto.convert.join.noconditionaltask.size=1; WITH unioned_user AS ( SELECT *, row_number() OVER (PARTITION BY user_mid ORDER BY dt ASC) AS r_asc, row_number() OVER (PARTITION BY user_mid ORDER BY dt DESC) AS r_desc FROM ( SELECT DISTINCT dt, user_mid FROM sss WHERE dt = '20210228' UNION ALL SELECT DISTINCT dt, user_mid FROM sss ) AS uni ), merged_user AS ( SELECT a.user_mid FROM (SELECT * FROM unioned_user WHERE r_asc = 1) AS a INNER JOIN (SELECT * FROM unioned_user WHERE r_desc = 1) AS d ON a.user_mid = d.user_mid ) Select count(*) from merged_user; {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24954) MetastoreTransformer is disabled during testing
Zoltan Haindrich created HIVE-24954: --- Summary: MetastoreTransformer is disabled during testing Key: HIVE-24954 URL: https://issues.apache.org/jira/browse/HIVE-24954 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich all calls are fortified with "isInTest" guards to avoid testing those calls (!@#$#) https://github.com/apache/hive/blob/86fa9b30fe347c7fc78a2930f4d20ece2e124f03/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java#L1647 this causes some weird behaviour: an out of the box hive installation creates TRANSLATED_TO_EXTERNAL external tables for plain CREATE TABLE commands; meanwhile, when most testing is executed, CREATE TABLE creates regular MANAGED tables... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24940) PartitionPruner may reject partitionfilter expressions evaluating to unknown if the filter is safe
Zoltan Haindrich created HIVE-24940: --- Summary: PartitionPruner may reject partitionfilter expressions evaluating to unknown if the filter is safe Key: HIVE-24940 URL: https://issues.apache.org/jira/browse/HIVE-24940 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} CREATE TABLE t1pstring (col1 string) PARTITIONED BY (p1 string); INSERT INTO t1pstring PARTITION (p1) VALUES ("2020","2020"),("2021","2021"),("2021_backup","2021_backup"),('',''),('9','9'),('_a','_a'),('a','a'); explain extended SELECT count(*) FROM t1pstring WHERE p1=; [...] | Truncated Path -> Alias: | | /t1pstring/p1=2021_backup [t1pstring] | | /t1pstring/p1= [t1pstring] | | /t1pstring/p1=_a [t1pstring] | | /t1pstring/p1=a [t1pstring] | [...] {code} note: * for all values which are interpretable as integers - the equals is evaluated and the result is false * for {{NULL}} values and for non-integer values ({{a}}) the comparison results in a {{NULL}} which is retained * {{NULL}} -s are retained because there is a preprocessing step which removes functions which are dependent on non-partition columns as well - and replaces them with a {{NULL}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24939) PartitionPruner incorrectly assumes that all builtin expressions are metastore side supported
Zoltan Haindrich created HIVE-24939: --- Summary: PartitionPruner incorrectly assumes that all builtin expressions are metastore side supported Key: HIVE-24939 URL: https://issues.apache.org/jira/browse/HIVE-24939 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich in case ObjectStore is in use, log entries like: {code} 2021-03-20 12:37:36,568 INFO org.apache.hadoop.hive.metastore.PartFilterExprUtil: [pool-6-thread-170]: Unable to make the expression tree from expression string [(UDFToDouble(p1) = 2021.0D)]Error parsing partition filter; lexer error: null; exception NoViableAltException(24@[]) {code} may occur in the HMS log, while the metastore call returns all partitions... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24920) TRANSLATED_TO_EXTERNAL tables may write to the same location
Zoltan Haindrich created HIVE-24920: --- Summary: TRANSLATED_TO_EXTERNAL tables may write to the same location Key: HIVE-24920 URL: https://issues.apache.org/jira/browse/HIVE-24920 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} create table t (a integer); insert into t values(1); alter table t rename to t2; create table t (a integer); -- I expected an exception from this command (location already exists) but because it's an external table there is no exception insert into t values(2); select * from t; -- shows 1 and 2 drop table t2;-- wipes out data location select * from t; -- empty resultset {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24841) Parallel edge fixer may run into NPE when RS is missing a duplicate column from the output schema
Zoltan Haindrich created HIVE-24841: --- Summary: Parallel edge fixer may run into NPE when RS is missing a duplicate column from the output schema Key: HIVE-24841 URL: https://issues.apache.org/jira/browse/HIVE-24841 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich This may mean that the RS has an incorrect schema - but that will be investigated separately -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24830) Revise RowSchema mutability usage
Zoltan Haindrich created HIVE-24830: --- Summary: Revise RowSchema mutability usage Key: HIVE-24830 URL: https://issues.apache.org/jira/browse/HIVE-24830 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich RowSchema is essentially a container class for a list of fields. * it can be constructed from a "list" * the list can be set * the list can be accessed none of the above methods try to protect the data inside; hence the following could easily happen: {code} s=o1.getSchema(); col=s.getCol("favourite"); col.setInternalName("asd"); // will modify o1 schema newSchema.add(col); o2.setSchema(newSchema); o2.getSchema().get("asd").setInternalName("xxx"); // will modify o1 and o2 schema [...] {code} not sure how much of this is actually crucial; exploratory testrun revealed some cases https://github.com/apache/hive/pull/2019 -- This message was sent by Atlassian Jira (v8.3.4#803005)
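The aliasing problem from HIVE-24830 can be reproduced in a few lines of self-contained Java. This is a sketch under stated assumptions: `Column` and `Schema` are hypothetical simplified classes, not Hive's `ColumnInfo` or `RowSchema`, and `getColsCopy` is just one possible defensive variant, not necessarily the fix the issue intends.

```java
import java.util.ArrayList;
import java.util.List;

final class SchemaAliasingSketch {
    // Hypothetical mutable column, standing in for a schema field.
    static final class Column {
        String internalName;
        Column(String internalName) { this.internalName = internalName; }
        Column copy() { return new Column(internalName); }
    }

    // Hypothetical container that hands out its internal list unprotected.
    static final class Schema {
        private final List<Column> cols = new ArrayList<>();
        List<Column> getCols() { return cols; }

        // Defensive variant: returns copies, so callers cannot touch the originals.
        List<Column> getColsCopy() {
            List<Column> out = new ArrayList<>();
            for (Column c : cols) {
                out.add(c.copy());
            }
            return out;
        }
    }

    // Shares one column object between two schemas, then renames it via the second schema.
    static String renameThroughSharedColumn() {
        Schema s1 = new Schema();
        s1.getCols().add(new Column("favourite"));
        Schema s2 = new Schema();
        s2.getCols().add(s1.getCols().get(0)); // same object in both schemas
        s2.getCols().get(0).internalName = "xxx";
        return s1.getCols().get(0).internalName; // s1 was modified as a side effect
    }

    // Same steps, but the second schema receives copies of the columns.
    static String renameThroughCopiedColumn() {
        Schema s1 = new Schema();
        s1.getCols().add(new Column("favourite"));
        Schema s2 = new Schema();
        s2.getCols().addAll(s1.getColsCopy());
        s2.getCols().get(0).internalName = "xxx";
        return s1.getCols().get(0).internalName; // s1 is untouched
    }

    public static void main(String[] args) {
        System.out.println("shared: " + renameThroughSharedColumn());
        System.out.println("copied: " + renameThroughCopiedColumn());
    }
}
```

Copying trades extra allocations for isolation; making the column objects immutable would be another way to get the same guarantee.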
[jira] [Created] (HIVE-24829) CorrelationUtilities#replaceReduceSinkWithSelectOperator misses KEY mappings
Zoltan Haindrich created HIVE-24829: --- Summary: CorrelationUtilities#replaceReduceSinkWithSelectOperator misses KEY mappings Key: HIVE-24829 URL: https://issues.apache.org/jira/browse/HIVE-24829 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich * it totally misses KEY column mappings - in the case I was looking at the KEY columns were at the end... in case they are not referenced then it will be okay * the exprMap keys don't match the rowSchema -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24823) Fix ide error in BasePartitionEvaluator
Zoltan Haindrich created HIVE-24823: --- Summary: Fix ide error in BasePartitionEvaluator Key: HIVE-24823 URL: https://issues.apache.org/jira/browse/HIVE-24823 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24821) Restrict parallel edge creation for invertable RS operators
Zoltan Haindrich created HIVE-24821: --- Summary: Restrict parallel edge creation for invertable RS operators Key: HIVE-24821 URL: https://issues.apache.org/jira/browse/HIVE-24821 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich Apparently there are some cases in which the RS may do some other things as well - restricting is the first safest option -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24812) Disable sharedworkoptimizer remove semijoin by default
Zoltan Haindrich created HIVE-24812: --- Summary: Disable sharedworkoptimizer remove semijoin by default Key: HIVE-24812 URL: https://issues.apache.org/jira/browse/HIVE-24812 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich SJ removal backfired a bit when I was testing stuff - because of the additional opportunities parallel edges may enable; it will increase the amount of shuffled memory and/or even make MJ broadcast inputs larger. Set hive.optimize.shared.work.semijoin=false by default for now. Right now it's better to leave dppunion to pick up these cases instead of removing the SJ fully - after HIVE-24376 we might enable it back -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24799) Rewise RS rowschema/colExprMap consistency in case of mapjoins
Zoltan Haindrich created HIVE-24799: --- Summary: Rewise RS rowschema/colExprMap consistency in case of mapjoins Key: HIVE-24799 URL: https://issues.apache.org/jira/browse/HIVE-24799 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich I've seen some odd things around MJ - so I added an if to skip those errors while I was fixing others. IIRC this issue seemed more serious; the code did "know" that the RS had a bad schema and it went to the parent - I guess there was a reason to do that - and I guess fixing it won't be easy either. https://github.com/apache/hive/pull/1929#discussion_r579200496 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24771) Fix hang of TransactionalKafkaWriterTest
Zoltan Haindrich created HIVE-24771: --- Summary: Fix hang of TransactionalKafkaWriterTest Key: HIVE-24771 URL: https://issues.apache.org/jira/browse/HIVE-24771 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich this test seems to hang randomly - I've launched 3 checks against it - all of which started to hang after some time http://ci.hive.apache.org/job/hive-flaky-check/187/ http://ci.hive.apache.org/job/hive-flaky-check/188/ http://ci.hive.apache.org/job/hive-flaky-check/189/ {code} "main" #1 prio=5 os_prio=0 tid=0x7f1d5400a800 nid=0x31e waiting on condition [0x7f1d59381000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x894b3ed8> (a java.util.concurrent.CountDownLatch$Sync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:837) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1308) at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231) at org.apache.kafka.clients.producer.internals.TransactionalRequestResult.await(TransactionalRequestResult.java:56) at org.apache.hadoop.hive.kafka.HiveKafkaProducer.flushNewPartitions(HiveKafkaProducer.java:187) at org.apache.hadoop.hive.kafka.HiveKafkaProducer.flush(HiveKafkaProducer.java:123) at org.apache.hadoop.hive.kafka.TransactionalKafkaWriter.close(TransactionalKafkaWriter.java:189) at org.apache.hadoop.hive.kafka.TransactionalKafkaWriterTest.writeAndCommit(TransactionalKafkaWriterTest.java:182) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at org.junit.runners.ParentRunner.run(ParentRunner.java:413) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at 
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:377) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:138) at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:465) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:451) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
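The dump above shows the main thread parked in CountDownLatch.await with no time limit, so a lost response turns into an endless hang. A minimal sketch of the alternative, a bounded wait that turns the hang into a visible failure, could look like this. It is a Python stand-in, not the Kafka or Hive code; the class and function names are made up for illustration.

```python
import threading


class CountDownLatch:
    """Illustrative stand-in for java.util.concurrent.CountDownLatch."""

    def __init__(self, count):
        self._count = count
        self._cond = threading.Condition()

    def count_down(self):
        with self._cond:
            if self._count > 0:
                self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def wait_for_zero(self, timeout=None):
        # Returns True if the count reached zero, False if the timeout expired.
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)


def flush_or_fail(latch, timeout_s):
    """Hypothetical helper: fail loudly instead of parking forever."""
    if not latch.wait_for_zero(timeout=timeout_s):
        raise TimeoutError("no response within %.2fs" % timeout_s)
    return True
```

With a bound like this a flaky run would end in a test failure carrying a message, instead of a stuck surefire fork.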
[jira] [Created] (HIVE-24767) Stabilize dyn_part3.q test
Zoltan Haindrich created HIVE-24767: --- Summary: Stabilize dyn_part3.q test Key: HIVE-24767 URL: https://issues.apache.org/jira/browse/HIVE-24767 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich http://ci.hive.apache.org/job/hive-flaky-check/186/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24766) Fix TestScheduledReplication
Zoltan Haindrich created HIVE-24766: --- Summary: Fix TestScheduledReplication Key: HIVE-24766 URL: https://issues.apache.org/jira/browse/HIVE-24766 Project: Hive Issue Type: Bug Environment: test seems to be unstable http://ci.hive.apache.org/job/hive-flaky-check/184/ Reporter: Zoltan Haindrich -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24704) Ensure that all Operator column expressions refer to a column in the RowSchema
Zoltan Haindrich created HIVE-24704: --- Summary: Ensure that all Operator column expressions refer to a column in the RowSchema Key: HIVE-24704 URL: https://issues.apache.org/jira/browse/HIVE-24704 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich Hive Operators should satisfy the invariant that every key of the columnExprMap is present in the schema -- This message was sent by Atlassian Jira (v8.3.4#803005)
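The invariant in HIVE-24704 can be sketched as a simple set check. This is only an illustration in Python: the function name is hypothetical, and plain dicts and lists stand in for Hive's columnExprMap and RowSchema.

```python
def missing_schema_columns(column_expr_map, row_schema):
    """Return the columnExprMap keys that have no column in the schema, sorted.

    column_expr_map: dict of output column name -> expression (stand-in)
    row_schema: iterable of column names (stand-in for a RowSchema)
    """
    return sorted(set(column_expr_map) - set(row_schema))


def check_operator(column_expr_map, row_schema):
    """Raise if the invariant is violated; otherwise return True."""
    missing = missing_schema_columns(column_expr_map, row_schema)
    if missing:
        raise ValueError("columnExprMap keys not in schema: %s" % missing)
    return True
```

An operator with keys `_col0` and `_col1` but a schema containing only `_col0` would be reported with `_col1` as the offending key.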
[jira] [Created] (HIVE-24700) Run Mssql integration tests during precommit
Zoltan Haindrich created HIVE-24700: --- Summary: Run Mssql integration tests during precommit Key: HIVE-24700 URL: https://issues.apache.org/jira/browse/HIVE-24700 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich add mssql to the jenkinsfile and run our metastore schema tests on it -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24699) Run Oracle integration tests during precommit
Zoltan Haindrich created HIVE-24699: --- Summary: Run Oracle integration tests during precommit Key: HIVE-24699 URL: https://issues.apache.org/jira/browse/HIVE-24699 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich this will need a working oracle docker image - and possibly some smaller changes to make sure that the tests can run reliably -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24678) Add feature toggle to control SWO parallel edge support
Zoltan Haindrich created HIVE-24678: --- Summary: Add feature toggle to control SWO parallel edge support Key: HIVE-24678 URL: https://issues.apache.org/jira/browse/HIVE-24678 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich I can't foresee the future - but it might give better diagnosability opportunities to have a direct knob on this feature (I wanted to add it in the base patch ; but eventually forgot to do so) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24677) Fix typoed vectorization package declaration
Zoltan Haindrich created HIVE-24677: --- Summary: Fix typoed vectorization package declaration Key: HIVE-24677 URL: https://issues.apache.org/jira/browse/HIVE-24677 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich HIVE-24510 has added ql/src/gen/vectorization/UDAFTemplates/VectorUDAFComputeBitVector.txt but its package declaration doesn't align with its folder name -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24671) Semijoinremoval should not run into an NPE in case the SJ filter contains an UDF
Zoltan Haindrich created HIVE-24671: --- Summary: Semijoinremoval should not run into an NPE in case the SJ filter contains an UDF Key: HIVE-24671 URL: https://issues.apache.org/jira/browse/HIVE-24671 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} set hive.optimize.index.filter=true; set hive.support.concurrency=true; set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set hive.exec.dynamic.partition.mode=nonstrict; set hive.exec.dynamic.partition=true; set hive.vectorized.execution.enabled=true; drop table if exists t1; drop table if exists t2; create table t1 ( v1 string ); create table t2 ( v2 string ); insert into t1 values ('e123456789'),('x123456789'); insert into t2 values ('123'), ('e123456789'); -- alter table t1 update statistics set ('numRows'='9348843574','rawDataSize'='0'); alter table t1 update statistics set ('numRows'='934884357','rawDataSize'='0'); alter table t2 update statistics set ('numRows'='9348','rawDataSize'='0'); alter table t1 update statistics for column v1 set ('numNulls'='0','numDVs'='15541355','avgColLen'='10.0','maxColLen'='10'); alter table t2 update statistics for column v2 set ('numNulls'='0','numDVs'='155','avgColLen'='5.0','maxColLen'='10'); -- alter table t2 update statistics for column k set ('numNulls'='0','numDVs'='13876472','avgColLen'='15.9836','maxColLen'='16'); explain select v1,v2 from t1 join t2 on (substr(v1,1,3) = v2); {code} results in: {code} java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.TezCompiler.removeSemijoinOptimizationByBenefit(TezCompiler.java:1944) at org.apache.hadoop.hive.ql.parse.TezCompiler.semijoinRemovalBasedTransformations(TezCompiler.java:544) at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:240) at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:161) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.compilePlan(SemanticAnalyzer.java:12467) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12672) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:455) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:301) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:171) [...] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24547) Fix acid_vectorization_original
Zoltan Haindrich created HIVE-24547: --- Summary: Fix acid_vectorization_original Key: HIVE-24547 URL: https://issues.apache.org/jira/browse/HIVE-24547 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich the failure was hidden by the failed-to-read issue; the test most likely first failed after HIVE-24274 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24525) Invite reviewers automatically by file name patterns
Zoltan Haindrich created HIVE-24525: --- Summary: Invite reviewers automatically by file name patterns Key: HIVE-24525 URL: https://issues.apache.org/jira/browse/HIVE-24525 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich I wrote about this in an [email|http://mail-archives.apache.org/mod_mbox/hive-dev/202006.mbox/%3c324a0a23-5841-09fe-a993-1a095035e...@rxd.hu%3e] a long time ago... it could help in keeping an eye on some specific parts, e.g. thrift and parser changes -- This message was sent by Atlassian Jira (v8.3.4#803005)
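The idea in HIVE-24525 could be sketched roughly as below: map file name patterns to reviewers and collect everyone whose pattern matches a changed file. The patterns, reviewer handles and function name are all made up for illustration; the ticket does not say how the rules would actually be stored or evaluated.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern -> reviewers table; patterns and handles are invented.
REVIEW_RULES = [
    ("*.thrift", ["thrift-watcher"]),
    ("*/parse/*.g", ["parser-watcher"]),
]


def reviewers_for(changed_files, rules=REVIEW_RULES):
    """Return the sorted reviewers whose pattern matches any changed file."""
    invited = set()
    for path in changed_files:
        for pattern, reviewers in rules:
            # fnmatchcase's "*" also matches "/" so patterns span directories.
            if fnmatchcase(path, pattern):
                invited.update(reviewers)
    return sorted(invited)
```

A change touching only a `.thrift` file would invite just the thrift watcher, while unrelated files invite nobody.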
[jira] [Created] (HIVE-24488) Make docker host configurable for metastoredb/perf tests
Zoltan Haindrich created HIVE-24488: --- Summary: Make docker host configurable for metastoredb/perf tests Key: HIVE-24488 URL: https://issues.apache.org/jira/browse/HIVE-24488 Project: Hive Issue Type: Improvement Components: Test Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich I tend to develop patches inside containers (hive-dev-box) to be able to work on multiple patches in parallel. Running tests which use docker was always a bit problematic for me - when I wanted to do it before: I manually exposed /var/lib/docker and added a rinetd forward by hand (which is not nice) ...the current move to also run Perf tests against a dockerized metastore exposes this problem a bit more for me. I'm also considering adding the ability to use minikube with hive-dev-box ; but that still needs exploring. it would be much easier to expose the address of the docker host I'm using... -- This message was sent by Atlassian Jira (v8.3.4#803005)
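A minimal sketch of what "configurable docker host" could mean for HIVE-24488: read the host from the environment and fall back to localhost. The variable name `HIVE_TEST_DOCKER_HOST` and both function names are assumptions made for this example, not existing Hive settings.

```python
import os


def docker_host(env=None):
    """Return the docker host to connect to, defaulting to localhost.

    HIVE_TEST_DOCKER_HOST is a hypothetical variable name.
    """
    env = os.environ if env is None else env
    return env.get("HIVE_TEST_DOCKER_HOST", "localhost")


def postgres_jdbc_url(port, database, env=None):
    """Build a JDBC URL that points at the configured docker host."""
    return "jdbc:postgresql://%s:%d/%s" % (docker_host(env), port, database)
```

Inside a dev container one would export the variable to the address of the outer docker host; everywhere else the default keeps the current behaviour.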
[jira] [Created] (HIVE-24487) Use alternate ports for dockerized databases during testing
Zoltan Haindrich created HIVE-24487: --- Summary: Use alternate ports for dockerized databases during testing Key: HIVE-24487 URL: https://issues.apache.org/jira/browse/HIVE-24487 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich like 5432 for postgres and 3306 for mysql https://github.com/apache/hive/blob/52cf467836df71485e95b08c9e91e197e9898b79/standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Postgres.java#L35 -- This message was sent by Atlassian Jira (v8.3.4#803005)
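One way to avoid colliding with a database already listening on a well-known port is to let the OS hand out a free one and map the container to it. The sketch below shows only the port-picking part in Python; it is an illustration, not what the Hive test rules do, and the function name is made up.

```python
import socket


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        # The kernel has assigned a concrete port; read it back.
        return s.getsockname()[1]
```

Note that the port is released before the container binds it, so another process could grab it in between; a fixed non-default port avoids that window at the cost of possible clashes between parallel runs.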
[jira] [Created] (HIVE-24486) Enhance operator merge logic to also consider going thru RS operators
Zoltan Haindrich created HIVE-24486: --- Summary: Enhance operator merge logic to also consider going thru RS operators Key: HIVE-24486 URL: https://issues.apache.org/jira/browse/HIVE-24486 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich the targeted situation looks like this: {code} OP1 -> RS1.1 -> JOIN1.1 OP1 -> RS1.2 -> JOIN1.2 OP2 -> RS2.1 -> JOIN1.1 -> RS3.1 OP2 -> RS2.2 -> JOIN1.2 -> RS3.2 {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24435) Vectorized unix_timestamp is inconsistent with non-vectorized counterpart
Zoltan Haindrich created HIVE-24435: --- Summary: Vectorized unix_timestamp is inconsistent with non-vectorized counterpart Key: HIVE-24435 URL: https://issues.apache.org/jira/browse/HIVE-24435 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} create table t (d string); insert into t values('2020-11-16 22:18:40 UTC'); select '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), to_date(from_unixtime(unix_timestamp(d))) from t ; set hive.fetch.task.conversion=none; select '>' || d || '<' , unix_timestamp(d), from_unixtime(unix_timestamp(d)), to_date(from_unixtime(unix_timestamp(d))) from t ; {code} results: {code} -- std udf: >2020-11-16 22:18:40 UTC< 1605593920 2020-11-16 22:18:40 >2020-11-16 -- vectorized udf >2020-11-16 22:18:40 UTC< NULL NULL NULL {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24428) Concurrent add_partitions requests may lead to data loss
Zoltan Haindrich created HIVE-24428: --- Summary: Concurrent add_partitions requests may lead to data loss Key: HIVE-24428 URL: https://issues.apache.org/jira/browse/HIVE-24428 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich in case multiple clients are adding partitions to the same table - when the same partition is being added there is a chance that the data dir is removed after the other client has already written its data https://github.com/apache/hive/blob/5e96b14a2357c66a0640254d5414bc706d8be852/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L3958 -- This message was sent by Atlassian Jira (v8.3.4#803005)
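The hazard described in HIVE-24428 can be replayed with a toy model: a failed duplicate request that cleans up the partition directory also wipes data the first client already wrote. The sketch below is a simplified Python simulation, not the HiveMetaStore logic; the "only clean up what this request created" guard is an assumed mitigation, and every name is hypothetical.

```python
class FakeWarehouse:
    """In-memory stand-in for the file system and the metastore tables."""

    def __init__(self):
        self.dirs = {}           # path -> list of file names
        self.partitions = set()  # registered partition names


def add_partition(wh, name, guard_cleanup):
    path = "/warehouse/t/" + name
    created_here = path not in wh.dirs
    if created_here:
        wh.dirs[path] = []
    if name in wh.partitions:
        # Duplicate request: roll back. Without the guard the directory is
        # removed even though another client owns (and has filled) it.
        if created_here or not guard_cleanup:
            wh.dirs.pop(path, None)
        return False
    wh.partitions.add(name)
    return True


def replay(guard_cleanup):
    wh = FakeWarehouse()
    add_partition(wh, "p=1", guard_cleanup)      # client A registers p=1
    wh.dirs["/warehouse/t/p=1"].append("A.orc")  # client A writes its data
    add_partition(wh, "p=1", guard_cleanup)      # client B repeats the request
    return wh.dirs.get("/warehouse/t/p=1")
```

In this model `replay(False)` ends with the directory gone, while `replay(True)` keeps client A's file.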
[jira] [Created] (HIVE-24388) Enhance swo optimizations to merge EventOperators
Zoltan Haindrich created HIVE-24388: --- Summary: Enhance swo optimizations to merge EventOperators Key: HIVE-24388 URL: https://issues.apache.org/jira/browse/HIVE-24388 Project: Hive Issue Type: Sub-task Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich {code} EVENT1->TS1 EVENT2->TS2 {code} are not merged because a TS may only handle the first event properly; sending 2 events would cause one of them to be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24384) SharedWorkOptimizer improvements
Zoltan Haindrich created HIVE-24384: --- Summary: SharedWorkOptimizer improvements Key: HIVE-24384 URL: https://issues.apache.org/jira/browse/HIVE-24384 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich this started as a small feature addition but due to the sheer volume of the q.out changes - it's better to do smaller changes at a time; which means more tickets... -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24382) Organize replaceTabAlias methods
Zoltan Haindrich created HIVE-24382: --- Summary: Organize replaceTabAlias methods Key: HIVE-24382 URL: https://issues.apache.org/jira/browse/HIVE-24382 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich * move to the OperatorDesc / etc https://github.com/apache/hive/pull/1661#discussion_r522693729 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (HIVE-24376) SharedWorkOptimizer may retain the SJ filter condition during RemoveSemijoin mode
Zoltan Haindrich created HIVE-24376: --- Summary: SharedWorkOptimizer may retain the SJ filter condition during RemoveSemijoin mode Key: HIVE-24376 URL: https://issues.apache.org/jira/browse/HIVE-24376 Project: Hive Issue Type: Improvement Reporter: Zoltan Haindrich the mode name is also a bit confusing..but here is what happens: {code} TS[A1] -> ... TS[A2] -> JOIN TS[B] -> JOIN {code} we have an SJ edge between TS[B] -> TS[A2] to communicate information about the join keys; let's assume the reduction ratio was r. RemoveSemijoin right now does the following: * removes the semijoin edge (so TS[A2] will become a full scan) * merges TS[A1] and TS[A2] w.r.t. reading data from disk: this is great - we accessed A twice, of which 1 was a full scan - and now we only read it once. but from a row traffic perspective: TS[A2] emits more rows from now on because we don't have the r ratio semijoin reduction anymore. -- This message was sent by Atlassian Jira (v8.3.4#803005)
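The trade-off in HIVE-24376 can be put into numbers with a toy cost model. The sketch below is only an illustration: it assumes r is the fraction of A's rows that survive the semijoin filter, and the function name and the two "cost" fields are invented for the example.

```python
def plan_costs(n_rows, r, merged):
    """Toy model of the A2 branch before and after RemoveSemijoin merging.

    n_rows: number of rows in table A
    r: assumed fraction of rows kept by the semijoin filter (0 < r <= 1)
    merged: True once TS[A1] and TS[A2] are merged and the SJ edge is gone
    """
    if not merged:
        # Two scans of A; the A2 branch forwards only the filtered rows.
        return {"scans_of_A": 2, "rows_to_join": round(n_rows * r)}
    # One shared scan, but every row now travels towards the JOIN.
    return {"scans_of_A": 1, "rows_to_join": n_rows}
```

For a table of 1,000,000 rows and r = 0.01 this model halves the scans of A but sends 100 times more rows into the join, which is the cost that retaining the SJ filter condition would avoid.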