[jira] [Created] (HIVE-24734) Sanity check in HiveSplitGenerator available slot calculation

2021-02-04 Thread Zoltan Matyus (Jira)
Zoltan Matyus created HIVE-24734:


 Summary: Sanity check in HiveSplitGenerator available slot 
calculation
 Key: HIVE-24734
 URL: https://issues.apache.org/jira/browse/HIVE-24734
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: 4.0.0
Reporter: Zoltan Matyus


HiveSplitGenerator calculates the number of available slots from available 
memory like this:

{code:java}
if (getContext() != null) {
  totalResource = getContext().getTotalAvailableResource().getMemory();
  taskResource = getContext().getVertexTaskResource().getMemory();
  availableSlots = totalResource / taskResource;
}
{code}

I had a scenario where the total memory was calculated correctly, but the task 
memory returned -1. This led to errors like these:

{noformat}
tez.HiveSplitGenerator: Number of input splits: 1. -3641 available slots, 1.7 
waves. Input format is: org.apache.hadoop.hive.ql.io.HiveInputFormat

Estimated number of tasks: -6189 for bucket 1

java.lang.IllegalArgumentException: Illegal Capacity: -6189
{noformat}

Admittedly, this happened during development, and hopefully will not occur on a 
properly configured cluster. (Although I'm not sure what the issue was on my 
setup, possibly XMX set higher than physical memory.)

In any case, a value of availableSlots below 1 can never lead to the desired 
behavior, so in such cases we could emit a warning and correct the value to 1.
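
A minimal sketch of the proposed check, assuming the two memory values have already been read as ints. The class {{SlotSanityCheck}} and the method {{availableSlots}} are hypothetical names for illustration, not existing Hive code, and {{java.util.logging}} is used only to keep the example self-contained (Hive itself logs through SLF4J).

```java
import java.util.logging.Logger;

final class SlotSanityCheck {
  private static final Logger LOG = Logger.getLogger(SlotSanityCheck.class.getName());

  /**
   * Hypothetical helper (not part of Hive): derives the slot count from the
   * two memory figures and falls back to 1 when the result is not usable.
   */
  static int availableSlots(int totalResource, int taskResource) {
    // A non-positive task size (such as the -1 seen above) makes the division
    // meaningless, and 0 would throw an ArithmeticException.
    if (taskResource <= 0) {
      LOG.warning("Invalid task resource " + taskResource + ", assuming 1 available slot");
      return 1;
    }
    int slots = totalResource / taskResource;
    if (slots < 1) {
      LOG.warning("Calculated " + slots + " available slots, correcting to 1");
      return 1;
    }
    return slots;
  }
}
```

With the values from the log above, {{availableSlots(3641, -1)}} would return 1 instead of -3641.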



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-24691) Ban commons-logging (again)

2021-01-27 Thread Zoltan Matyus (Jira)
Zoltan Matyus created HIVE-24691:


 Summary: Ban commons-logging (again)
 Key: HIVE-24691
 URL: https://issues.apache.org/jira/browse/HIVE-24691
 Project: Hive
  Issue Type: Bug
  Components: Logging
Affects Versions: 4.0.0
Reporter: Zoltan Matyus
Assignee: Zoltan Matyus


Usage of commons-logging was completely removed from Hive once already, in 
HIVE-20019. However, new usage has been added since, despite an attempt to ban 
it (bannedDependencies). I'm removing all usage again and adding another way to 
ban it (restrictImports).





[jira] [Created] (HIVE-24213) Incorrect exception in the Merge MapJoinTask into its child MapRedTask optimizer

2020-09-30 Thread Zoltan Matyus (Jira)
Zoltan Matyus created HIVE-24213:


 Summary: Incorrect exception in the Merge MapJoinTask into its 
child MapRedTask optimizer
 Key: HIVE-24213
 URL: https://issues.apache.org/jira/browse/HIVE-24213
 Project: Hive
  Issue Type: Bug
  Components: Physical Optimizer
Affects Versions: 4.0.0
Reporter: Zoltan Matyus
Assignee: Zoltan Matyus


The {{CommonJoinTaskDispatcher#mergeMapJoinTaskIntoItsChildMapRedTask}} method 
throws a {{SemanticException}} if the number of {{FileSinkOperator}}s it finds 
is not exactly 1. The exception is valid if zero operators are found, but there 
can be valid use cases where multiple FileSinkOperators exist.

Example: the MapJoin and its child are used in a common table expression, which 
is used for multiple inserts.





[jira] [Created] (HIVE-22767) beeline doesn't parse semicolons in comments properly

2020-01-23 Thread Zoltan Matyus (Jira)
Zoltan Matyus created HIVE-22767:


 Summary: beeline doesn't parse semicolons in comments properly
 Key: HIVE-22767
 URL: https://issues.apache.org/jira/browse/HIVE-22767
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Zoltan Matyus


HIVE-12646 fixed the handling of semicolons in quoted strings, but left the 
problem of semicolons in comments unresolved. E.g. with beeline connected to 
any database...

this works: {code:sql}select 1; select /*   */ 2; select /*   */ 3;{code}
this doesn't work: {code:sql}select 1; select /* ; */ 2; select /* ; */ 3;{code}

This has been fixed and reintroduced before (possibly multiple times). Ideally, 
there should be a single utility method somewhere to separate comments, strings 
and commands -- with the proper testing in place (q files).
However, I'm trying to make this fix back-portable, so a light touch is needed. 
I'm focusing on beeline for now, and only writing (very thorough) unit tests, 
as I cannot exclude any new q files from TestCliDriver (which would break, 
since it's using a different parsing method).
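
To illustrate the kind of single utility method meant above, here is a rough sketch that splits a line into commands while skipping semicolons inside quoted strings, block comments and line comments. {{StatementSplitter}} is a hypothetical name and this is not beeline's actual parsing code. It assumes that only single and double quotes delimit strings, that backslash escapes the next character inside a string, and that {{--}} always starts a line comment.

```java
import java.util.ArrayList;
import java.util.List;

final class StatementSplitter {

  /** Splits on semicolons that are outside strings, block comments and line comments. */
  static List<String> split(String line) {
    List<String> commands = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    char quote = 0;                 // active quote character, 0 when outside a string
    boolean inBlockComment = false; // inside /* ... */
    boolean inLineComment = false;  // inside -- ... up to the end of the line
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      char next = i + 1 < line.length() ? line.charAt(i + 1) : 0;
      if (inBlockComment) {
        if (c == '*' && next == '/') {
          current.append("*/");
          i++;
          inBlockComment = false;
          continue;
        }
      } else if (inLineComment) {
        if (c == '\n') {
          inLineComment = false;
        }
      } else if (quote != 0) {
        if (c == '\\' && next != 0) {
          // keep the escaped character so it cannot close the string
          current.append(c).append(next);
          i++;
          continue;
        }
        if (c == quote) {
          quote = 0;
        }
      } else if (c == '/' && next == '*') {
        current.append("/*");
        i++;
        inBlockComment = true;
        continue;
      } else if (c == '-' && next == '-') {
        current.append("--");
        i++;
        inLineComment = true;
        continue;
      } else if (c == '\'' || c == '"') {
        quote = c;
      } else if (c == ';') {
        // a real command separator: finish the current command
        addIfNotBlank(commands, current);
        continue;
      }
      current.append(c);
    }
    addIfNotBlank(commands, current);
    return commands;
  }

  private static void addIfNotBlank(List<String> commands, StringBuilder current) {
    String command = current.toString().trim();
    if (!command.isEmpty()) {
      commands.add(command);
    }
    current.setLength(0);
  }
}
```

Under these assumptions, the failing input above would be split into three commands, with each comment kept intact inside its command.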



P.S. excerpt of the error message:

{noformat}
0: jdbc:hive2://...> select 1; select /* ; */ 2; select /* ; */ 3;
INFO  : Compiling command(queryId=...): select 1
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, 
type:int, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=...); Time taken: 0.38 seconds
INFO  : Executing command(queryId=...): select 1
INFO  : Completed executing command(queryId=...); Time taken: 0.004 seconds
INFO  : OK
+------+
| _c0  |
+------+
| 1    |
+------+
1 row selected (2.007 seconds)
INFO  : Compiling command(queryId=...): select /*
ERROR : FAILED: ParseException line 1:9 cannot recognize input near '' 
'' '' in select clause
org.apache.hadoop.hive.ql.parse.ParseException: line 1:9 cannot recognize input 
near '' '' '' in select clause
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:233)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:79)
at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:72)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:598)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1505)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1452)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1447)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126)
at ...
{noformat}



Similarly, the following query also fails:
{code:sql}select /* ' */ 1; select /* ' */ 2;{code}
I suspect line comments are also not handled properly but I cannot reproduce 
this in interactive beeline...





[jira] [Created] (HIVE-22236) Fail to create View selecting View containing NOT IN subquery

2019-09-24 Thread Zoltan Matyus (Jira)
Zoltan Matyus created HIVE-22236:


 Summary: Fail to create View selecting View containing NOT IN 
subquery
 Key: HIVE-22236
 URL: https://issues.apache.org/jira/browse/HIVE-22236
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Matyus
Assignee: Zoltan Matyus


* Given a complicated view whose select statement has a subquery containing 
"{{NOT IN}}",
* Hive fails to create a simple view defined as {{SELECT * FROM complicated_view}} 
* (with CBO disabled).

The unparse replacements of the complicated view end up being applied to the 
text of the simple view, which results in {{IllegalArgumentException: replace: 
range invalid}} exceptions from {{org.antlr.runtime.TokenRewriteStream.replace}}.





[jira] [Created] (HIVE-21584) Java 11 preparation: system class loader is not URLClassLoader

2019-04-05 Thread Zoltan Matyus (JIRA)
Zoltan Matyus created HIVE-21584:


 Summary: Java 11 preparation: system class loader is not 
URLClassLoader
 Key: HIVE-21584
 URL: https://issues.apache.org/jira/browse/HIVE-21584
 Project: Hive
  Issue Type: Task
  Components: Hive
Affects Versions: 4.0.0
Reporter: Zoltan Matyus
Assignee: Zoltan Matyus


Currently, Hive assumes that the system class loader is an instance of 
{{URLClassLoader}}. In Java 11 this is not the case. There are a few 
(unresolved) JIRAs about specific occurrences of {{URLClassLoader}} (e.g. 
[HIVE-21237|https://issues.apache.org/jira/browse/HIVE-21237], 
[HIVE-17909|https://issues.apache.org/jira/browse/HIVE-17909]), but none to 
_"remove all occurrences"_. Also, I couldn't find an umbrella "Java 11 upgrade" 
JIRA.

This ticket is to remove all unconditional casts of arbitrary class loaders to 
{{URLClassLoader}}.
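
As a rough illustration of what replacing an unconditional cast could look like, the sketch below checks the loader type first and falls back to the {{java.class.path}} system property otherwise. {{ClassPathUrls}} and its methods are hypothetical names, not existing Hive code, and the fallback is only one possible choice: it reflects the JVM's launch class path, not whatever a custom loader may have added.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

final class ClassPathUrls {

  /** Returns the loader's URLs when it is a URLClassLoader, else the launch class path. */
  static URL[] urlsOf(ClassLoader loader) {
    if (loader instanceof URLClassLoader) {
      // Safe: the cast happens only after the type check.
      return ((URLClassLoader) loader).getURLs();
    }
    // From Java 9 on, the system class loader is no longer a URLClassLoader,
    // so fall back to the class path the JVM was started with.
    return urlsFromClassPath(System.getProperty("java.class.path", ""));
  }

  /** Converts a path-separator-delimited class path string into URLs. */
  static URL[] urlsFromClassPath(String classPath) {
    List<URL> urls = new ArrayList<>();
    for (String entry : classPath.split(File.pathSeparator)) {
      if (entry.isEmpty()) {
        continue;
      }
      try {
        urls.add(new File(entry).toURI().toURL());
      } catch (MalformedURLException e) {
        // Skip entries that cannot be expressed as a URL.
      }
    }
    return urls.toArray(new URL[0]);
  }
}
```

A caller would then use {{ClassPathUrls.urlsOf(ClassLoader.getSystemClassLoader())}} instead of casting the system class loader directly.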



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)