[jira] [Created] (HIVE-22235) CommandProcessorResponse should not be an exception

2019-09-24 Thread Miklos Gergely (Jira)
Miklos Gergely created HIVE-22235:
-

 Summary: CommandProcessorResponse should not be an exception
 Key: HIVE-22235
 URL: https://issues.apache.org/jira/browse/HIVE-22235
 Project: Hive
  Issue Type: Sub-task
  Components: Hive
Reporter: Miklos Gergely
Assignee: Miklos Gergely
 Fix For: 4.0.0


The CommandProcessorResponse class extends Exception. This may be convenient, 
but it is wrong, as a response is not an exception.
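
As a rough illustration of the separation being asked for, the sketch below keeps a plain response value apart from a dedicated exception type. The names {{ResponseSketch}}, {{ProcessorFailure}} and {{run}} are hypothetical and are not taken from the Hive code base; this is one possible shape, not the actual change.

```java
// Hypothetical sketch: failures travel as an exception, results as a plain value.
final class ProcessorFailure extends Exception {
    private final int errorCode;

    ProcessorFailure(int errorCode, String message) {
        super(message);
        this.errorCode = errorCode;
    }

    int getErrorCode() {
        return errorCode;
    }
}

// A response is an ordinary object; it does not extend Exception.
final class ResponseSketch {
    private final String message;

    ResponseSketch(String message) {
        this.message = message;
    }

    String getMessage() {
        return message;
    }

    // Success returns a response; failure is signalled by throwing.
    static ResponseSketch run(String command) throws ProcessorFailure {
        if (command == null || command.trim().isEmpty()) {
            throw new ProcessorFailure(1, "empty command");
        }
        return new ResponseSketch("OK: " + command.trim());
    }
}
```

With this split, callers handle the error path in a {{catch}} block and never have to inspect a returned object to learn whether it is really a failure.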



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-22236) Fail to create View selecting View containing NOT IN subquery

2019-09-24 Thread Zoltan Matyus (Jira)
Zoltan Matyus created HIVE-22236:


 Summary: Fail to create View selecting View containing NOT IN 
subquery
 Key: HIVE-22236
 URL: https://issues.apache.org/jira/browse/HIVE-22236
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Matyus
Assignee: Zoltan Matyus


* Given a complicated view whose select statement has a subquery containing "{{NOT IN}}"
* Hive fails to create a simple view defined as {{SELECT * FROM complicated_view}} (with CBO disabled).

The unparse replacements recorded for the complicated view are applied to the 
text of the simple view, resulting in {{IllegalArgumentException: replace: range 
invalid}} exceptions from {{org.antlr.runtime.TokenRewriteStream.replace}}.
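
The failure mode can be pictured with a small stand-alone sketch: a replacement range that was recorded against the token list of a long statement no longer fits when it is applied to a much shorter one. The class {{RewriteRangeSketch}} and its bounds check are hypothetical simplifications written for this illustration; they do not call ANTLR and only imitate the kind of check that produces a "range invalid" error.

```java
import java.util.List;

// Hypothetical, simplified model of applying a recorded replacement range
// to a token list; this is not ANTLR code.
final class RewriteRangeSketch {

    // Rejects ranges that do not fit inside a list of tokenCount tokens.
    static void checkRange(int from, int to, int tokenCount) {
        if (from > to || from < 0 || to >= tokenCount) {
            throw new IllegalArgumentException(
                "replace: range invalid: " + from + ".." + to
                    + "(size=" + tokenCount + ")");
        }
    }

    // Returns true when the range from..to can be replaced in the given tokens.
    static boolean fits(int from, int to, List<String> tokens) {
        try {
            checkRange(from, to, tokens.size());
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

A range such as 9..12 is valid for a 14-token statement but invalid for the 4 tokens of {{SELECT * FROM complicated_view}}.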





[jira] [Created] (HIVE-22237) Show operator id in non-user explain

2019-09-24 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-22237:
---

 Summary: Show operator id in non-user explain
 Key: HIVE-22237
 URL: https://issues.apache.org/jira/browse/HIVE-22237
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich








[jira] [Created] (HIVE-22238) PK/FK selectivity estimation underscales estimations

2019-09-24 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created HIVE-22238:
---

 Summary: PK/FK selectivity estimation underscales estimations
 Key: HIVE-22238
 URL: https://issues.apache.org/jira/browse/HIVE-22238
 Project: Hive
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


At [this point|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2182]
the parent operator's row count is scaled according to pkfkselectivity.

However, [pkfkselectivity is computed|https://github.com/apache/hive/blob/5098d155a1e6a164253f5fa98755273bc34085df/ql/src/java/org/apache/hadoop/hive/ql/optimizer/stats/annotation/StatsRulesProcFactory.java#L2157]
on a whole subtree.

Scaling by that amount counts in estimation that was already applied when 
parentstats was calculated. Depending on the number of upstream joins, this may 
lead to severe underestimation.
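
The double counting can be made concrete with a toy calculation. The model below is an assumption made for illustration only: it treats the subtree selectivity as the product of the upstream selectivities and treats the parent's row count as already reduced by those same factors. The class {{PkFkScalingSketch}} and its methods are hypothetical and do not mirror {{StatsRulesProcFactory}}.

```java
// Hypothetical toy model of applying subtree selectivity to an
// already-scaled parent row count.
final class PkFkScalingSketch {

    // Product of the selectivities found in the subtree (assumed model).
    static double subtreeSelectivity(double[] upstreamSelectivities) {
        double product = 1.0;
        for (double s : upstreamSelectivities) {
            product *= s;
        }
        return product;
    }

    // Parent row count after the upstream selectivities were applied once.
    static double parentRows(double baseRows, double[] upstreamSelectivities) {
        return baseRows * subtreeSelectivity(upstreamSelectivities);
    }

    // Scaling the parent rows by the subtree selectivity applies it a second time.
    static double scaledAgain(double baseRows, double[] upstreamSelectivities) {
        return parentRows(baseRows, upstreamSelectivities)
            * subtreeSelectivity(upstreamSelectivities);
    }
}
```

With 1,000,000 base rows and two upstream joins of selectivity 0.5 each, the parent estimate in this model is 250,000 rows; scaling it again by the subtree selectivity of 0.25 yields 62,500, a fourfold underestimate that grows with every additional upstream join.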





[jira] [Created] (HIVE-22239) Scale data size using column value ranges

2019-09-24 Thread Jesus Camacho Rodriguez (Jira)
Jesus Camacho Rodriguez created HIVE-22239:
--

 Summary: Scale data size using column value ranges
 Key: HIVE-22239
 URL: https://issues.apache.org/jira/browse/HIVE-22239
 Project: Hive
  Issue Type: Improvement
  Components: Physical Optimizer
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Currently, min/max values for columns are only used to determine whether a 
certain range filter falls out of range and thus filters all rows or none at 
all. If it does not, we just use a heuristic that the condition will filter 1/3 
of the input rows. Instead of using that heuristic, we can use another one that 
assumes that data will be uniformly distributed across that range, and 
calculate the selectivity for the condition accordingly.
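
A minimal sketch of the uniform-distribution estimate for a condition of the form {{col < value}} follows. The class {{RangeSelectivitySketch}} and its method names are hypothetical, columns are reduced to plain {{double}} values, and the clamping at the boundaries is an assumption of this sketch rather than a description of the patch.

```java
// Hypothetical sketch: selectivity of "col < value" under the assumption
// that values are uniformly distributed across [min, max].
final class RangeSelectivitySketch {

    static double lessThan(double value, double min, double max) {
        if (value <= min) {
            return 0.0; // no row can satisfy the condition
        }
        if (value > max) {
            return 1.0; // every row satisfies the condition
        }
        // Here min < value <= max, so max - min is strictly positive.
        return (value - min) / (max - min);
    }

    // Estimated number of rows that pass the filter.
    static long estimatedRows(long inputRows, double value, double min, double max) {
        return Math.round(inputRows * lessThan(value, min, max));
    }
}
```

For a column ranging from 0 to 100, the condition {{col < 25}} gets selectivity 0.25 under this assumption, so 1000 input rows are estimated at 250 output rows instead of a fixed fraction.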

This patch also propagates min/max column values from statistics to the 
optimizer for the timestamp type.





[jira] [Created] (HIVE-22240) Function percentile_cont fails when array parameter passed

2019-09-24 Thread Krisztian Kasa (Jira)
Krisztian Kasa created HIVE-22240:
-

 Summary: Function percentile_cont fails when array parameter passed
 Key: HIVE-22240
 URL: https://issues.apache.org/jira/browse/HIVE-22240
 Project: Hive
  Issue Type: Bug
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa
 Fix For: 4.0.0


{code}
SELECT
percentile_cont(array(0.2, 0.5, 0.9)) WITHIN GROUP (ORDER BY value)
FROM t_test;
{code}

hive.log:
{code}
2019-09-24T21:00:43,203 ERROR [LocalJobRunner Map Task Executor #0] mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:793)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
    ... 11 more
Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to org.apache.hadoop.hive.serde2.io.HiveDecimalWritable
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFPercentileCont$PercentileContEvaluator.iterate(GenericUDAFPercentileCont.java:259)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:214)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.updateAggregations(GroupByOperator.java:639)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processHashAggr(GroupByOperator.java:814)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.processKey(GroupByOperator.java:720)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:788)
    ... 17 more
{code}
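
The innermost cause shows a {{java.util.ArrayList}} being cast to a single decimal value, which suggests the percentile argument is assumed to be a scalar. One possible guard, sketched below with plain Java types, is to normalize the argument into a list before use. The class {{PercentileArgSketch}} and its {{normalize}} method are hypothetical; the real evaluator works with Hive object inspectors and writables, which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: accept either one percentile or a list of percentiles.
final class PercentileArgSketch {

    // Always returns a list, whether the caller passed a scalar or a list.
    static List<Double> normalize(Object arg) {
        List<Double> percentiles = new ArrayList<>();
        if (arg instanceof List) {
            for (Object element : (List<?>) arg) {
                percentiles.add(toDouble(element));
            }
        } else {
            percentiles.add(toDouble(arg));
        }
        return percentiles;
    }

    // Converts one element, rejecting anything that is not numeric.
    private static double toDouble(Object value) {
        if (!(value instanceof Number)) {
            throw new IllegalArgumentException("not a numeric percentile: " + value);
        }
        return ((Number) value).doubleValue();
    }
}
```

Checking the runtime type before casting turns the {{ClassCastException}} into either a valid list of percentiles or a descriptive argument error.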





[jira] [Created] (HIVE-22241) Implement UDF to convert a date/timestamp from Gregorian-Julian hybrid calendar to proleptic Gregorian calendar

2019-09-24 Thread Jesus Camacho Rodriguez (Jira)
Jesus Camacho Rodriguez created HIVE-22241:
--

 Summary: Implement UDF to convert a date/timestamp from 
Gregorian-Julian hybrid calendar to proleptic Gregorian calendar
 Key: HIVE-22241
 URL: https://issues.apache.org/jira/browse/HIVE-22241
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Implement a UDF that converts a date/timestamp from the *Gregorian-Julian 
hybrid* calendar to the *proleptic Gregorian calendar* (ISO 8601 standard). The 
hybrid calendar supports both the Julian and Gregorian calendar systems with a 
single discontinuity, which by default corresponds to the Gregorian date when 
the Gregorian calendar was instituted. The proleptic Gregorian calendar is 
produced by extending the Gregorian calendar backward to dates preceding its 
official introduction in 1582.
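
The difference between the two calendars can be seen with the JDK alone: {{java.util.GregorianCalendar}} is a hybrid calendar with a default cutover in October 1582, while {{java.time.LocalDate}} uses the proleptic ISO calendar. The sketch below takes year/month/day fields read in the hybrid calendar and re-expresses the same physical day in the proleptic calendar. The class {{HybridToProleptic}} is hypothetical, and keeping the physical day fixed is an assumption of this sketch; the UDF's exact semantics are not specified above.

```java
import java.time.LocalDate;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical sketch: same physical day, relabelled in the proleptic calendar.
final class HybridToProleptic {

    private static final long MILLIS_PER_DAY = 86_400_000L;

    // month is 1-based (1 = January), unlike java.util.Calendar's 0-based months.
    static LocalDate convert(int year, int month, int day) {
        GregorianCalendar hybrid = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        hybrid.clear();                    // midnight UTC, no leftover fields
        hybrid.set(year, month - 1, day);  // fields interpreted by the hybrid calendar
        long epochDay = Math.floorDiv(hybrid.getTimeInMillis(), MILLIS_PER_DAY);
        return LocalDate.ofEpochDay(epochDay); // proleptic Gregorian label
    }
}
```

Dates on or after the cutover are unchanged, while 4 October 1582 in the hybrid calendar (the last Julian day) maps to 14 October 1582 in the proleptic calendar.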


