[jira] [Created] (HIVE-17175) Improve desc formatted for bitvectors
Pengcheng Xiong created HIVE-17175: -- Summary: Improve desc formatted for bitvectors Key: HIVE-17175 URL: https://issues.apache.org/jira/browse/HIVE-17175 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17137) Fix javolution conflict
Pengcheng Xiong created HIVE-17137: -- Summary: Fix javolution conflict Key: HIVE-17137 URL: https://issues.apache.org/jira/browse/HIVE-17137 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong as reported by [~jcamachorodriguez] {code} [WARNING] Some problems were encountered while building the effective model for org.apache.hive:hive-exec:jar:3.0.0-SNAPSHOT [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: javolution:javolution:jar -> duplicate declaration of version ${javolution.version} @ org.apache.hive:hive-exec:[unknown-version], /grid/5/dev/jcamachorodriguez/dist/tez-autobuild/hive/ql/pom.xml, line 366, column 17 {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17096) Fix test failures in 2.3 branch
Pengcheng Xiong created HIVE-17096: -- Summary: Fix test failures in 2.3 branch Key: HIVE-17096 URL: https://issues.apache.org/jira/browse/HIVE-17096 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17071) Make hive 2.3 depend on storage-api-2.3
Pengcheng Xiong created HIVE-17071: -- Summary: Make hive 2.3 depend on storage-api-2.3 Key: HIVE-17071 URL: https://issues.apache.org/jira/browse/HIVE-17071 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17062) make hive.optimize.bucketingsorting work for smb_mapjoin_20.q
Pengcheng Xiong created HIVE-17062: -- Summary: make hive.optimize.bucketingsorting work for smb_mapjoin_20.q Key: HIVE-17062 URL: https://issues.apache.org/jira/browse/HIVE-17062 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong follow-up of HIVE-16981 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17045) Add HyperLogLog as an UDAF
Pengcheng Xiong created HIVE-17045: -- Summary: Add HyperLogLog as an UDAF Key: HIVE-17045 URL: https://issues.apache.org/jira/browse/HIVE-17045 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16997) Extend object store to store bit vectors
Pengcheng Xiong created HIVE-16997: -- Summary: Extend object store to store bit vectors Key: HIVE-16997 URL: https://issues.apache.org/jira/browse/HIVE-16997 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16995) Merge NDV across partitions using bit vectors
Pengcheng Xiong created HIVE-16995: -- Summary: Merge NDV across partitions using bit vectors Key: HIVE-16995 URL: https://issues.apache.org/jira/browse/HIVE-16995 Project: Hive Issue Type: Improvement Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16996) Add HLL as an alternative to FM sketch to compute stats
Pengcheng Xiong created HIVE-16996: -- Summary: Add HLL as an alternative to FM sketch to compute stats Key: HIVE-16996 URL: https://issues.apache.org/jira/browse/HIVE-16996 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16986) Support vectorization for UDAF compute_stats
Pengcheng Xiong created HIVE-16986: -- Summary: Support vectorization for UDAF compute_stats Key: HIVE-16986 URL: https://issues.apache.org/jira/browse/HIVE-16986 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16981) hive.optimize.bucketingsorting should compare the schema before removing RS
Pengcheng Xiong created HIVE-16981: -- Summary: hive.optimize.bucketingsorting should compare the schema before removing RS Key: HIVE-16981 URL: https://issues.apache.org/jira/browse/HIVE-16981 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong On master, in smb_mapjoin_20.q, run {code} select * from test_table3; {code} You will get {code} val_0 0 NULL1 ... {code} The correct result is {code} val_0 0 val_01 ... {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16971) improve explain when invalidate stats
Pengcheng Xiong created HIVE-16971: -- Summary: improve explain when invalidate stats Key: HIVE-16971 URL: https://issues.apache.org/jira/browse/HIVE-16971 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong for example, in a load statement, we use statsTask to invalidate stats. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16957) Support CTAS for auto gather column stats
Pengcheng Xiong created HIVE-16957: -- Summary: Support CTAS for auto gather column stats Key: HIVE-16957 URL: https://issues.apache.org/jira/browse/HIVE-16957 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16956) Support date type for merging column stats
Pengcheng Xiong created HIVE-16956: -- Summary: Support date type for merging column stats Key: HIVE-16956 URL: https://issues.apache.org/jira/browse/HIVE-16956 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16916) UpdateColumnStatsTask should set column stats as inaccurate
Pengcheng Xiong created HIVE-16916: -- Summary: UpdateColumnStatsTask should set column stats as inaccurate Key: HIVE-16916 URL: https://issues.apache.org/jira/browse/HIVE-16916 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong It seems that column stats are currently set as accurate by default. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16836) improve query28 for count distinct rewrite
Pengcheng Xiong created HIVE-16836: -- Summary: improve query28 for count distinct rewrite Key: HIVE-16836 URL: https://issues.apache.org/jira/browse/HIVE-16836 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16837) improve query28 for count distinct rewrite
Pengcheng Xiong created HIVE-16837: -- Summary: improve query28 for count distinct rewrite Key: HIVE-16837 URL: https://issues.apache.org/jira/browse/HIVE-16837 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16827) Merge stats task and column stats task into a single task
Pengcheng Xiong created HIVE-16827: -- Summary: Merge stats task and column stats task into a single task Key: HIVE-16827 URL: https://issues.apache.org/jira/browse/HIVE-16827 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Within the task, we can specify whether to compute basic stats only or column stats only or both. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16798) Flaky test query14.q
Pengcheng Xiong created HIVE-16798: -- Summary: Flaky test query14.q Key: HIVE-16798 URL: https://issues.apache.org/jira/browse/HIVE-16798 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16797) Support a new rule RemoveUnionBranchRule
Pengcheng Xiong created HIVE-16797: -- Summary: Support a new rule RemoveUnionBranchRule Key: HIVE-16797 URL: https://issues.apache.org/jira/browse/HIVE-16797 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong In query4.q, the query creates a CTE that is a union all of 3 branches and then does a 3-way self-join of the CTE with predicates. The predicates actually specify only one of the branches in the CTE to participate in the join. Thus, in some cases, e.g., {code} /- filter(false) -TS0 union all - filter(false) -TS1 \-TS2 {code} we can cut the branches of TS0 and TS1, and the union reduces to TS2 alone. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
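The branch pruning described in HIVE-16797 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Calcite rule itself: the `Node` class, the `always_false` check, and `prune_union` are hypothetical names invented here, and a filter is treated as always false only when its predicate is the literal string "false".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical plan node: op is "TS", "FIL", or "UNION"."""
    op: str
    name: str = ""
    predicate: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def always_false(node: Node) -> bool:
    # Assumption: an always-false filter is marked by the literal predicate "false".
    return node.op == "FIL" and node.predicate == "false"


def prune_union(union: Node) -> Node:
    """Drop union inputs that sit under filter(false)."""
    live = [child for child in union.children if not always_false(child)]
    if not live:
        # Every branch is empty; handling that case is out of scope for this sketch.
        return union
    if len(live) == 1:
        # A union with a single remaining input is just that input.
        return live[0]
    return Node("UNION", children=live)


# The shape from the report: filter(false) over TS0, filter(false) over TS1, and TS2.
plan = Node("UNION", children=[
    Node("FIL", predicate="false", children=[Node("TS", "TS0")]),
    Node("FIL", predicate="false", children=[Node("TS", "TS1")]),
    Node("TS", "TS2"),
])
pruned = prune_union(plan)
print(pruned.op, pruned.name)
```

With the plan shape from the report, only the TS2 input survives, so the union node disappears entirely.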
[jira] [Created] (HIVE-16775) Augment ASTConverter for TPCDS queries
Pengcheng Xiong created HIVE-16775: -- Summary: Augment ASTConverter for TPCDS queries Key: HIVE-16775 URL: https://issues.apache.org/jira/browse/HIVE-16775 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong query4.q,query74.q {code} [7e490527-156a-48c7-aa87-8c80093cdfa8 main] ql.Driver: FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter$QBVisitor.visit(ASTConverter.java:457) at org.apache.calcite.rel.RelVisitor.go(RelVisitor.java:61) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:110) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convertSource(ASTConverter.java:393) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:115) {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16774) Support position in ORDER BY when using SELECT *
Pengcheng Xiong created HIVE-16774: -- Summary: Support position in ORDER BY when using SELECT * Key: HIVE-16774 URL: https://issues.apache.org/jira/browse/HIVE-16774 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong query47.q query57.q -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16773) Support non-equi join predicate in scalar subqueries with aggregate
Pengcheng Xiong created HIVE-16773: -- Summary: Support non-equi join predicate in scalar subqueries with aggregate Key: HIVE-16773 URL: https://issues.apache.org/jira/browse/HIVE-16773 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong query41.q {code} [5e84b202-205a-4fea-a457-94f28e63f0b4 main] ql.Driver: FAILED: SemanticException [Error 10250]: org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSubquerySemanticException: Line 8:13 Invalid SubQuery expression ''medium'': Scalar subqueries with aggregate cannot have non-equi join predicate org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSubquerySemanticException: Line 8:13 Invalid SubQuery expression ''medium'': Scalar subqueries with aggregate cannot have non-equi join predicate at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:466) {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16772) Support TPCDS query11.q in PerfCliDriver
Pengcheng Xiong created HIVE-16772: -- Summary: Support TPCDS query11.q in PerfCliDriver Key: HIVE-16772 URL: https://issues.apache.org/jira/browse/HIVE-16772 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong {code} org.apache.hadoop.hive.ql.parse.SemanticException: Line 54:22 Invalid column reference 'customer_preferred_cust_flag' at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:11744) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:11692) {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16764) Support numeric as same as decimal
Pengcheng Xiong created HIVE-16764: -- Summary: Support numeric as same as decimal Key: HIVE-16764 URL: https://issues.apache.org/jira/browse/HIVE-16764 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong for example numeric(12,2) -> decimal(12,2) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
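The mapping requested in HIVE-16764 amounts to treating the type keyword `numeric` as a synonym for `decimal`. A minimal Python sketch of that normalization follows; `normalize_type` is a hypothetical helper written for illustration and says nothing about how the Hive parser implements the alias.

```python
import re


def normalize_type(type_name: str) -> str:
    """Rewrite a leading 'numeric' keyword to 'decimal', keeping precision and scale."""
    # \b keeps names that merely start with "numeric" (e.g. "numerical") untouched.
    return re.sub(r"^numeric\b", "decimal", type_name.strip(), flags=re.IGNORECASE)


print(normalize_type("numeric(12,2)"))
```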
[jira] [Created] (HIVE-16763) Support space in quoted column alias
Pengcheng Xiong created HIVE-16763: -- Summary: Support space in quoted column alias Key: HIVE-16763 URL: https://issues.apache.org/jira/browse/HIVE-16763 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong {code} select key as 'k y' from src; {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16762) Support unmodified TPCDS queries in Hive
Pengcheng Xiong created HIVE-16762: -- Summary: Support unmodified TPCDS queries in Hive Key: HIVE-16762 URL: https://issues.apache.org/jira/browse/HIVE-16762 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16734) Support original tpcds queries in perfclidriver after order by unselect column feature is done
Pengcheng Xiong created HIVE-16734: -- Summary: Support original tpcds queries in perfclidriver after order by unselect column feature is done Key: HIVE-16734 URL: https://issues.apache.org/jira/browse/HIVE-16734 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16733) Support conflict column name in order by
Pengcheng Xiong created HIVE-16733: -- Summary: Support conflict column name in order by Key: HIVE-16733 URL: https://issues.apache.org/jira/browse/HIVE-16733 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong There is a bug in RR which is exposed in HIVE-15160. After resolving the bug, we can support both: select key as value from src order by src.value select key as value from src order by value -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16654) Optimize a combination of avg(), sum(), count(distinct) etc
Pengcheng Xiong created HIVE-16654: -- Summary: Optimize a combination of avg(), sum(), count(distinct) etc Key: HIVE-16654 URL: https://issues.apache.org/jira/browse/HIVE-16654 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong an example rewrite for q28 of tpcds is {code} (select LP as B1_LP ,CNT as B1_CNT,CNTD as B1_CNTD from (select sum(xc0) / sum(xc1) as LP, sum(xc1) as CNT, count(1) as CNTD from (select sum(ss_list_price) as xc0, count(ss_list_price) as xc1 from store_sales where ss_list_price is not null and ss_quantity between 0 and 5 and (ss_list_price between 11 and 11+10 or ss_coupon_amt between 460 and 460+1000 or ss_wholesale_cost between 14 and 14+20) group by ss_list_price) ss0) ss1) B1 {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
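The idea behind the q28 rewrite in HIVE-16654 is that avg, count, and count(distinct) over one column can all be derived from a single group-by on that column. The Python sketch below checks this equivalence on a small in-memory list; the function names and sample data are made up for illustration, and the input is assumed to contain at least one non-null value.

```python
from collections import defaultdict


def direct(prices):
    """avg(p), count(p), count(distinct p), computed the straightforward way."""
    vals = [p for p in prices if p is not None]
    return sum(vals) / len(vals), len(vals), len(set(vals))


def rewritten(prices):
    """Same three values via an inner 'group by p' followed by an outer aggregate."""
    groups = defaultdict(lambda: [0, 0])  # p -> [xc0 = sum(p), xc1 = count(p)]
    for p in prices:
        if p is None:  # mirrors "where ss_list_price is not null"
            continue
        groups[p][0] += p
        groups[p][1] += 1
    total = sum(xc0 for xc0, _ in groups.values())
    cnt = sum(xc1 for _, xc1 in groups.values())
    # sum(xc0) / sum(xc1) is the average, sum(xc1) the count,
    # and count(1) over the groups is the distinct count.
    return total / cnt, cnt, len(groups)


sample = [10, 10, 20, None, 30, 20]
print(direct(sample), rewritten(sample))
```

The inner aggregation plays the role of subquery `ss0` in the report, and the final `return` corresponds to the outer select that produces LP, CNT, and CNTD.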
[jira] [Created] (HIVE-16653) Mergejoin should give itself a correct tag
Pengcheng Xiong created HIVE-16653: -- Summary: Mergejoin should give itself a correct tag Key: HIVE-16653 URL: https://issues.apache.org/jira/browse/HIVE-16653 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16629) Print thread name when thread pool is used in Hive.java
Pengcheng Xiong created HIVE-16629: -- Summary: Print thread name when thread pool is used in Hive.java Key: HIVE-16629 URL: https://issues.apache.org/jira/browse/HIVE-16629 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16628) Fix query27 when it uses a mix of MergeJoin and MapJoin
Pengcheng Xiong created HIVE-16628: -- Summary: Fix query27 when it uses a mix of MergeJoin and MapJoin Key: HIVE-16628 URL: https://issues.apache.org/jira/browse/HIVE-16628 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16627) Improve user level explain
Pengcheng Xiong created HIVE-16627: -- Summary: Improve user level explain Key: HIVE-16627 URL: https://issues.apache.org/jira/browse/HIVE-16627 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong We are trying to support user level explain in both Tez and Spark. We can use this JIRA as an umbrella to track. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16566) Set column stats default as true when creating new tables/partitions
Pengcheng Xiong created HIVE-16566: -- Summary: Set column stats default as true when creating new tables/partitions Key: HIVE-16566 URL: https://issues.apache.org/jira/browse/HIVE-16566 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16543) Deprecate HIVESTATSAUTOGATHER
Pengcheng Xiong created HIVE-16543: -- Summary: Deprecate HIVESTATSAUTOGATHER Key: HIVE-16543 URL: https://issues.apache.org/jira/browse/HIVE-16543 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong The overhead for auto collecting table stats is quite low. We can deprecate this configuration and treat it always as true. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16540) dynamic_semijoin_user_level is failing on MiniLlap
Pengcheng Xiong created HIVE-16540: -- Summary: dynamic_semijoin_user_level is failing on MiniLlap Key: HIVE-16540 URL: https://issues.apache.org/jira/browse/HIVE-16540 Project: Hive Issue Type: Test Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Priority: Trivial -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16537) Add missing AL files
Pengcheng Xiong created HIVE-16537: -- Summary: Add missing AL files Key: HIVE-16537 URL: https://issues.apache.org/jira/browse/HIVE-16537 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16495) ColumnStats merge should consider the accuracy of the current stats
Pengcheng Xiong created HIVE-16495: -- Summary: ColumnStats merge should consider the accuracy of the current stats Key: HIVE-16495 URL: https://issues.apache.org/jira/browse/HIVE-16495 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16493) Skip column stats when colStats is empty
Pengcheng Xiong created HIVE-16493: -- Summary: Skip column stats when colStats is empty Key: HIVE-16493 URL: https://issues.apache.org/jira/browse/HIVE-16493 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16485) Enable outputName for RS operator in explain formatted
Pengcheng Xiong created HIVE-16485: -- Summary: Enable outputName for RS operator in explain formatted Key: HIVE-16485 URL: https://issues.apache.org/jira/browse/HIVE-16485 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16440) Fix failing test columnstats_partlvl_invalid_values when autogather column stats is on
Pengcheng Xiong created HIVE-16440: -- Summary: Fix failing test columnstats_partlvl_invalid_values when autogather column stats is on Key: HIVE-16440 URL: https://issues.apache.org/jira/browse/HIVE-16440 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16421) Runtime filtering breaks user-level explain
Pengcheng Xiong created HIVE-16421: -- Summary: Runtime filtering breaks user-level explain Key: HIVE-16421 URL: https://issues.apache.org/jira/browse/HIVE-16421 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16387) Fix test failing org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData
Pengcheng Xiong created HIVE-16387: -- Summary: Fix test failing org.apache.hive.jdbc.TestJdbcDriver2.testResultSetMetaData Key: HIVE-16387 URL: https://issues.apache.org/jira/browse/HIVE-16387 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16379) Can not compute column stats when partition column is decimal
Pengcheng Xiong created HIVE-16379: -- Summary: Can not compute column stats when partition column is decimal Key: HIVE-16379 URL: https://issues.apache.org/jira/browse/HIVE-16379 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong To reproduce, run {code} set hive.compute.query.using.stats=false; set hive.stats.column.autogather=false; drop table if exists partcoltypeothers; create table partcoltypeothers (key int, value string) partitioned by (decpart decimal(6,2), datepart date); set hive.typecheck.on.insert=false; insert into partcoltypeothers partition (decpart = 1000.01BD, datepart = date '2015-4-13') select key, value from src limit 10; show partitions partcoltypeothers; analyze table partcoltypeothers partition (decpart = 1000.01BD, datepart = date '2015-4-13') compute statistics for columns; {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16378) Derby throws java.lang.StackOverflowError when it tries to get column stats from a table with thousands columns
Pengcheng Xiong created HIVE-16378: -- Summary: Derby throws java.lang.StackOverflowError when it tries to get column stats from a table with thousands columns Key: HIVE-16378 URL: https://issues.apache.org/jira/browse/HIVE-16378 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong To reproduce, set hive.stats.column.autogather=true and run orc_wide_table.q. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16373) Enable DDL statement for non-native tables (rename table)
Pengcheng Xiong created HIVE-16373: -- Summary: Enable DDL statement for non-native tables (rename table) Key: HIVE-16373 URL: https://issues.apache.org/jira/browse/HIVE-16373 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16372) Enable DDL statement for non-native tables (add/remove table properties)
Pengcheng Xiong created HIVE-16372: -- Summary: Enable DDL statement for non-native tables (add/remove table properties) Key: HIVE-16372 URL: https://issues.apache.org/jira/browse/HIVE-16372 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16366) Hive 2.3 release planning
Pengcheng Xiong created HIVE-16366: -- Summary: Hive 2.3 release planning Key: HIVE-16366 URL: https://issues.apache.org/jira/browse/HIVE-16366 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16349) Enable DDL statement for non-native tables
Pengcheng Xiong created HIVE-16349: -- Summary: Enable DDL statement for non-native tables Key: HIVE-16349 URL: https://issues.apache.org/jira/browse/HIVE-16349 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16310) Get the output operators of Reducesink when vectorization is on
Pengcheng Xiong created HIVE-16310: -- Summary: Get the output operators of Reducesink when vectorization is on Key: HIVE-16310 URL: https://issues.apache.org/jira/browse/HIVE-16310 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16293) Column pruner should continue to work when SEL has more than 1 child
Pengcheng Xiong created HIVE-16293: -- Summary: Column pruner should continue to work when SEL has more than 1 child Key: HIVE-16293 URL: https://issues.apache.org/jira/browse/HIVE-16293 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16274) Improve column stats computation using density function
Pengcheng Xiong created HIVE-16274: -- Summary: Improve column stats computation using density function Key: HIVE-16274 URL: https://issues.apache.org/jira/browse/HIVE-16274 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong The computation should take row counts into consideration. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16262) Inconsistent result when casting integer to timestamp
Pengcheng Xiong created HIVE-16262: -- Summary: Inconsistent result when casting integer to timestamp Key: HIVE-16262 URL: https://issues.apache.org/jira/browse/HIVE-16262 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong As reported by [~jcamachorodriguez]: {code} To give a concrete example, consider the following query: select cast(0 as timestamp) from src limit 1; The result if Hive is running in Santa Clara is: 1969-12-31 16:00:00 While the result if Hive is running in London is: 1970-01-01 00:00:00 This is not the behavior defined by the standard for TIMESTAMP type. The result should be consistent, in this case the correct result is: 1970-01-01 00:00:00 {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
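The inconsistency reported in HIVE-16262 comes from rendering the same instant (epoch second 0) in the local zone of wherever the server runs. The Python sketch below reproduces the two outputs quoted in the report using fixed UTC offsets; `render_epoch` and `PST_FIXED` are hypothetical names, and the fixed offsets (-8 hours for Santa Clara, zero for the value the report attributes to London) are assumptions standing in for real time zone rules.

```python
from datetime import datetime, timedelta, timezone


def render_epoch(seconds: int, tz: timezone) -> str:
    """Format an epoch offset as a wall-clock string in the given zone."""
    return datetime.fromtimestamp(seconds, tz).strftime("%Y-%m-%d %H:%M:%S")


# Assumption: a fixed -08:00 offset stands in for the Santa Clara server zone.
PST_FIXED = timezone(timedelta(hours=-8))

print(render_epoch(0, PST_FIXED))     # zone-dependent rendering
print(render_epoch(0, timezone.utc))  # zone-independent rendering
```

Rendering in one fixed zone regardless of server location gives the consistent answer the report asks for.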
[jira] [Created] (HIVE-16249) With column stats, mergejoin.q throws NPE
Pengcheng Xiong created HIVE-16249: -- Summary: With column stats, mergejoin.q throws NPE Key: HIVE-16249 URL: https://issues.apache.org/jira/browse/HIVE-16249 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong stack trace: {code} 2017-03-17T16:00:26,356 ERROR [3d512d4d-72b5-48fc-92cb-0c72f7c876e5 main] parse.CalcitePlanner: CBO failed, skipping CBO. java.lang.NullPointerException at org.apache.calcite.rel.metadata.RelMdUtil.estimateFilteredRows(RelMdUtil.java:719) ~[calcite-core-1.10.0.jar:1.10.0] at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:123) ~[calcite-core-1.10.0.jar:1.10.0] at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) ~[?:?] at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?] at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) ~[?:?] at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?] at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201) ~[calcite-core-1.10.0.jar:1.10.0] at org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(RelMdRowCount.java:132) ~[calcite-core-1.10.0.jar:1.10.0] at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) ~[?:?] at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?] at GeneratedMetadataHandler_RowCount.getRowCount_$(Unknown Source) ~[?:?] at GeneratedMetadataHandler_RowCount.getRowCount(Unknown Source) ~[?:?] 
at org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:201) ~[calcite-core-1.10.0.jar:1.10.0] at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.swapInputs(LoptOptimizeJoinRule.java:1866) ~[calcite-core-1.10.0.jar:1.10.0] at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.createJoinSubtree(LoptOptimizeJoinRule.java:1739) ~[calcite-core-1.10.0.jar:1.10.0] at org.apache.calcite.rel.rules.LoptOptimizeJoinRule.addToTop(LoptOptimizeJoinRule.java:1216) ~[calcite-core-1.10.0.jar:1.10.0] {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16246) Support auto gather column stats for columns with trailing white spaces
Pengcheng Xiong created HIVE-16246: -- Summary: Support auto gather column stats for columns with trailing white spaces Key: HIVE-16246 URL: https://issues.apache.org/jira/browse/HIVE-16246 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16232) Support stats computation for column in QuotedIdentifier
Pengcheng Xiong created HIVE-16232: -- Summary: Support stats computation for column in QuotedIdentifier Key: HIVE-16232 URL: https://issues.apache.org/jira/browse/HIVE-16232 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Right now, if a column name contains a doubled backquote (``), we cannot compute its stats. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16227) GenMRFileSink1.java should refer to its nearest MR task
Pengcheng Xiong created HIVE-16227: -- Summary: GenMRFileSink1.java should refer to its nearest MR task Key: HIVE-16227 URL: https://issues.apache.org/jira/browse/HIVE-16227 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16190) Support expression in merge statement
Pengcheng Xiong created HIVE-16190: -- Summary: Support expression in merge statement Key: HIVE-16190 URL: https://issues.apache.org/jira/browse/HIVE-16190 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Right now, we only support atomExpression, rather than expression in values in MergeStatement. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16169) Improve StatsOptimizer to deal with groupby partition columns
Pengcheng Xiong created HIVE-16169: -- Summary: Improve StatsOptimizer to deal with groupby partition columns Key: HIVE-16169 URL: https://issues.apache.org/jira/browse/HIVE-16169 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong As reported by [~ashutoshc]: 1) select sum(c), count(c),... from T group by b; 2) select max(c), min(c), ... from T group by b; If b happens to be a partition column, we can also answer these from metadata. Currently, StatsOptimizer doesn't handle these queries, but we can extend it to handle them as well. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
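The observation in HIVE-16169 is that when the group-by key is a partition column, each group is exactly one partition, so per-partition column stats already hold the answer. The Python sketch below compares a scan with a metadata lookup for max, min, and count. The stats layout (`max`, `min`, `num_rows`, `num_nulls`) and both function names are hypothetical. sum(c) is left out because this sketch does not assume a per-partition sum is stored.

```python
def answer_by_scan(rows):
    """select max(c), min(c), count(c) from T group by b, by reading every row."""
    groups = {}
    for b, c in rows:
        groups.setdefault(b, [])
        if c is not None:
            groups[b].append(c)
    return {b: (max(v, default=None), min(v, default=None), len(v))
            for b, v in groups.items()}


def answer_from_metadata(partition_stats):
    """Same answer from per-partition stats, one entry per value of partition column b."""
    return {b: (s["max"], s["min"], s["num_rows"] - s["num_nulls"])
            for b, s in partition_stats.items()}


rows = [("p1", 3), ("p1", None), ("p1", 7), ("p2", 5)]
stats = {
    "p1": {"max": 7, "min": 3, "num_rows": 3, "num_nulls": 1},
    "p2": {"max": 5, "min": 5, "num_rows": 1, "num_nulls": 0},
}
print(answer_by_scan(rows) == answer_from_metadata(stats))
```

The lookup is only valid when the stored stats are accurate for every partition involved; stale stats would make the two answers diverge.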
[jira] [Created] (HIVE-16163) Remove unnecessary parentheses in HiveParser
Pengcheng Xiong created HIVE-16163: -- Summary: Remove unnecessary parentheses in HiveParser Key: HIVE-16163 URL: https://issues.apache.org/jira/browse/HIVE-16163 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong in HiveParser.g L2145: {code} columnParenthesesList @init { pushMsg("column parentheses list", state); } @after { popMsg(state); } : LPAREN columnNameList RPAREN ; {code} should be changed to {code} columnParenthesesList @init { pushMsg("column parentheses list", state); } @after { popMsg(state); } : LPAREN! columnNameList RPAREN! ; {code} However, we also need to refactor our code. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16142) ATSHook NPE via LLAP
Pengcheng Xiong created HIVE-16142: -- Summary: ATSHook NPE via LLAP Key: HIVE-16142 URL: https://issues.apache.org/jira/browse/HIVE-16142 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Exceptions in the log of the form: 2017-03-06T15:42:30,046 WARN [ATS Logger 0]: hooks.ATSHook (ATSHook.java:run(318)) - Failed to submit to ATS for hive_20170306154227_f41bc7cb-1a2f-40f1-a85b-b2bc260a451a java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.ExplainTask.outputPlan(ExplainTask.java:608) ~[hive-exec-2.1.0.2.6.0.0-585.jar:2.1.0.2.6.0.0-585] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-16018) Add more information for DynamicPartitionPruningOptimization
Pengcheng Xiong created HIVE-16018: -- Summary: Add more information for DynamicPartitionPruningOptimization Key: HIVE-16018 URL: https://issues.apache.org/jira/browse/HIVE-16018 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15955) make explain formatted to dump JSONObject when user level explain is on
Pengcheng Xiong created HIVE-15955: -- Summary: make explain formatted to dump JSONObject when user level explain is on Key: HIVE-15955 URL: https://issues.apache.org/jira/browse/HIVE-15955 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15903) Compute table stats when user computes column stats
Pengcheng Xiong created HIVE-15903: -- Summary: Compute table stats when user computes column stats Key: HIVE-15903 URL: https://issues.apache.org/jira/browse/HIVE-15903 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15884) Optimize not between for vectorization
Pengcheng Xiong created HIVE-15884: -- Summary: Optimize not between for vectorization Key: HIVE-15884 URL: https://issues.apache.org/jira/browse/HIVE-15884 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15845) create view on subquery_exists_having may not work well with unparseTranslator
Pengcheng Xiong created HIVE-15845: -- Summary: create view on subquery_exists_having may not work well with unparseTranslator Key: HIVE-15845 URL: https://issues.apache.org/jira/browse/HIVE-15845 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong On master, {code} hive> create view v as select b.key, count(*) > from src b > group by b.key > having exists > (select a.key > from src a > where a.key = b.key and a.value > 'val_9' > ); {code} {code} View Expanded Text: select `b`.`key`, count(*) from `default`.`src` `b` group by `b`.`key` having exists (select `a`.`key` from `default`.`src` `a` where `a`.`key` = b.key and `a`.`value` > 'val_9' ) {code} You can see that b.key is not quoted. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] (HIVE-15769) Support view creation in CBO
Pengcheng Xiong created HIVE-15769: -- Summary: Support view creation in CBO Key: HIVE-15769 URL: https://issues.apache.org/jira/browse/HIVE-15769 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Right now, set operators need to run in CBO. If a view contains a set op, it will throw an exception. We need to support view creation in CBO. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (HIVE-15719) Hive cast decimal to int is not consistent with postgres or oracle.
Pengcheng Xiong created HIVE-15719: -- Summary: Hive cast decimal to int is not consistent with postgres or oracle. Key: HIVE-15719 URL: https://issues.apache.org/jira/browse/HIVE-15719 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15716) Add TPCDS query14.q to HivePerfCliDriver
Pengcheng Xiong created HIVE-15716: -- Summary: Add TPCDS query14.q to HivePerfCliDriver Key: HIVE-15716 URL: https://issues.apache.org/jira/browse/HIVE-15716 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15685) count(distinct) generates different result than expected
Pengcheng Xiong created HIVE-15685: -- Summary: count(distinct) generates different result than expected Key: HIVE-15685 URL: https://issues.apache.org/jira/browse/HIVE-15685 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong The following query with count(distinct) generates a different result than expected on Hive master: {noformat} select count(distinct ss_ticket_number), count(distinct ss_sold_date_sk) from store_sales; {noformat} Expected output generated using postgres: {noformat} select count(distinct ss_ticket_number), count(distinct ss_sold_date_sk) from store_sales; count | count +--- 24 | 1823 (1 row) {noformat} Actual output: {noformat} 24 1824 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15674) Add more setOp tests to HivePerfCliDriver
Pengcheng Xiong created HIVE-15674: -- Summary: Add more setOp tests to HivePerfCliDriver Key: HIVE-15674 URL: https://issues.apache.org/jira/browse/HIVE-15674 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15663) Add more interval tests to HivePerfCliDriver
Pengcheng Xiong created HIVE-15663: -- Summary: Add more interval tests to HivePerfCliDriver Key: HIVE-15663 URL: https://issues.apache.org/jira/browse/HIVE-15663 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong following HIVE-13557 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15646) Column level lineage is not available for table Views
Pengcheng Xiong created HIVE-15646: -- Summary: Column level lineage is not available for table Views Key: HIVE-15646 URL: https://issues.apache.org/jira/browse/HIVE-15646 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15591) Hive can not use "," in quoted column name
Pengcheng Xiong created HIVE-15591: -- Summary: Hive can not use "," in quoted column name Key: HIVE-15591 URL: https://issues.apache.org/jira/browse/HIVE-15591 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15578) Simplify IdentifiersParser
Pengcheng Xiong created HIVE-15578: -- Summary: Simplify IdentifiersParser Key: HIVE-15578 URL: https://issues.apache.org/jira/browse/HIVE-15578 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15577) Simplify current parser
Pengcheng Xiong created HIVE-15577: -- Summary: Simplify current parser Key: HIVE-15577 URL: https://issues.apache.org/jira/browse/HIVE-15577 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong We have frequently encountered the "code too large" problem. We need to reduce the code size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15467) escape1.q hangs in TestMiniLlapLocalCliDriver
Pengcheng Xiong created HIVE-15467: -- Summary: escape1.q hangs in TestMiniLlapLocalCliDriver Key: HIVE-15467 URL: https://issues.apache.org/jira/browse/HIVE-15467 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Prasanth Jayachandran Here is part of the log before it hangs: {code} 2016-12-19T15:21:05,779 INFO [LlapScheduler] tezplugins.LlapTaskSchedulerService: ScheduleResult for Task: TaskInfo{task=attempt_1482189645956_0001_33_00_00_1, priority=1, startTime=0, containerId=null, assignedNode=, uniqueId=54, localityDelayTimeout=0} = DELAYED_RESOURCES 2016-12-19T15:21:05,779 DEBUG [LlapScheduler] tezplugins.LlapTaskSchedulerService: Attempting to preempt on any host for task=attempt_1482189645956_0001_33_00_00_1, pendingPreemptions=0 2016-12-19T15:21:05,779 INFO [LlapScheduler] tezplugins.LlapTaskSchedulerService: Preempting for task=attempt_1482189645956_0001_33_00_00_1 on any available host 2016-12-19T15:21:05,779 DEBUG [LlapScheduler] tezplugins.LlapTaskSchedulerService: Unable to schedule all requests at priority=1. Skipping subsequent priority levels 2016-12-19T15:21:07,953 DEBUG [AMReporterQueueDrainer] impl.AMReporter: Removing am localhost:61788 with last associated dag QueryIdentifier{appIdentifier='application_1482189645956_0001', dagIdentifier=33} from heartbeat with taskCount=0, amFailed=false 2016-12-19T15:21:08,634 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:11,700 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:14,755 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:17,814 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:20,871 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:23,931 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:26,977 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:30,027 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:33,078 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:36,133 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 2016-12-19T15:21:39,179 INFO [86edca30-bf12-42f8-90cd-a9fbdfbcb546 main] SessionState: Map 1: 0(+1,-1)/1 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15399) Parser change for UniqueJoin
Pengcheng Xiong created HIVE-15399: -- Summary: Parser change for UniqueJoin Key: HIVE-15399 URL: https://issues.apache.org/jira/browse/HIVE-15399 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong UniqueJoin was introduced in HIVE-591. Add Unique Join. (Emil Ibrishimov via namit). It seems that there is only one q test for unique join, i.e., uniquejoin.q. In the q test, the unique join source can only come from a table. However, in the parser, its source can be any of the following, not just tableSource: {code} partitionedTableFunction | tableSource | subQuerySource | virtualTableSource {code} I think it would be better to change the parser and limit it to meet the user's real requirement. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15311) Analyze column stats should skip non-primitive column types
Pengcheng Xiong created HIVE-15311: -- Summary: Analyze column stats should skip non-primitive column types Key: HIVE-15311 URL: https://issues.apache.org/jira/browse/HIVE-15311 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15297) Hive should not split semicolon within quoted string literals
Pengcheng Xiong created HIVE-15297: -- Summary: Hive should not split semicolon within quoted string literals Key: HIVE-15297 URL: https://issues.apache.org/jira/browse/HIVE-15297 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong String literals in a query cannot contain reserved symbols. The same set of queries works fine in MySQL and PostgreSQL. {code} hive> CREATE TABLE ts(s varchar(550)); OK Time taken: 0.075 seconds hive> INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0'); MismatchedTokenException(14!=326) at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617) at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valueRowConstructor(HiveParser_FromClauseParser.java:7271) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesTableConstructor(HiveParser_FromClauseParser.java:7370) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.valuesClause(HiveParser_FromClauseParser.java:7510) at org.apache.hadoop.hive.ql.parse.HiveParser.valuesClause(HiveParser.java:51854) at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:45432) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:44578) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:8) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1694) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1176) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:204) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:402) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:326) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1169) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1288) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1095) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1083) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) FAILED: ParseException line 1:31 mismatched input '/' expecting ) near 'Mozilla' in value row constructor hive> {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
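The behavior HIVE-15297 asks for can be sketched as a statement splitter that treats ';' as a terminator only outside quoted literals. This is an illustrative Python sketch and not the actual Hive patch; the function name `split_statements` is hypothetical, and comments and backtick-quoted identifiers are deliberately not handled.

```python
def split_statements(script):
    """Split a script on ';' only when the ';' is outside a quoted literal."""
    statements = []
    current = []
    quote = None     # the quote character we are currently inside, if any
    escaped = False  # True when the previous character was a backslash
    for ch in script:
        if quote is not None:
            # Inside a literal: copy everything, watch for the closing quote.
            current.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
        elif ch in ("'", '"'):
            quote = ch
            current.append(ch)
        elif ch == ";":
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


script = (
    "CREATE TABLE ts(s varchar(550)); "
    "INSERT INTO ts VALUES ('Mozilla/5.0 (iPhone; CPU iPhone OS 5_0');"
)
parts = split_statements(script)
```

With the input from the report, the ';' inside the user-agent string stays part of the INSERT statement, so the script yields two statements instead of three fragments.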
[jira] [Created] (HIVE-15200) Support setOp in subQuery with parentheses
Pengcheng Xiong created HIVE-15200: -- Summary: Support setOp in subQuery with parentheses Key: HIVE-15200 URL: https://issues.apache.org/jira/browse/HIVE-15200 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong {code} explain select key from ((select key from src) union (select key from src))subq; {code} will throw {code} FAILED: ParseException line 1:47 cannot recognize input near 'union' '(' 'select' in subquery source {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15160) Can't group by an unselected column
Pengcheng Xiong created HIVE-15160: -- Summary: Can't group by an unselected column Key: HIVE-15160 URL: https://issues.apache.org/jira/browse/HIVE-15160 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong If a grouping key hasn't been selected, Hive complains. For comparison, Postgres does not. Example. Notice i_item_id is not selected: {code} select i_item_desc ,i_category ,i_class ,i_current_price ,sum(cs_ext_sales_price) as itemrevenue ,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over (partition by i_class) as revenueratio from catalog_sales ,item ,date_dim where cs_item_sk = i_item_sk and i_category in ('Jewelry', 'Sports', 'Books') and cs_sold_date_sk = d_date_sk and d_date between cast('2001-01-12' as date) and (cast('2001-01-12' as date) + 30 days) group by i_item_id ,i_item_desc ,i_category ,i_class ,i_current_price order by i_category ,i_class ,i_item_id ,i_item_desc ,revenueratio limit 100; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15150) Flaky test: explainanalyze_4, explainanalyze_5
Pengcheng Xiong created HIVE-15150: -- Summary: Flaky test: explainanalyze_4, explainanalyze_5 Key: HIVE-15150 URL: https://issues.apache.org/jira/browse/HIVE-15150 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15042) Support intersect/except without distinct keyword
Pengcheng Xiong created HIVE-15042: -- Summary: Support intersect/except without distinct keyword Key: HIVE-15042 URL: https://issues.apache.org/jira/browse/HIVE-15042 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-15023) SimpleFetchOptimizer needs to optimize limit=0
Pengcheng Xiong created HIVE-15023: -- Summary: SimpleFetchOptimizer needs to optimize limit=0 Key: HIVE-15023 URL: https://issues.apache.org/jira/browse/HIVE-15023 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong on current master {code} hive> explain select key from src limit 0; OK STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: 0 Processor Tree: TableScan alias: src Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: key (type: string) outputColumnNames: _col0 Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE Limit Number of rows: 0 Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column stats: NONE ListSink Time taken: 7.534 seconds, Fetched: 20 row(s) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14982) Remove some reserved keywords in 2.2
Pengcheng Xiong created HIVE-14982: -- Summary: Remove some reserved keywords in 2.2 Key: HIVE-14982 URL: https://issues.apache.org/jira/browse/HIVE-14982 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong It seems that CACHE, DAYOFWEEK, and VIEWS are reserved keywords in master. This conflicts with the SQL:2011 standard. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
Pengcheng Xiong created HIVE-14957: -- Summary: HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union Key: HIVE-14957 URL: https://issues.apache.org/jira/browse/HIVE-14957 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-14957.01.patch {code} call.transformTo(parent.copy(parent.getTraitSet(), ImmutableList.of(relBuilder.build()))); {code} When parent is a union operator that has 2 inputs, parent.copy will only copy the one that has SortLimit and ignore the other branches. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
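The problem described in HIVE-14957 can be illustrated with a small Python model of an operator tree. This is a sketch only: `Node` and `replace_input` are hypothetical stand-ins, not Calcite or Hive classes, and the code does not reproduce the contents of HIVE-14957.01.patch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical stand-in for a relational operator with ordered inputs.
    name: str
    inputs: List["Node"] = field(default_factory=list)

    def copy(self, inputs):
        """Return a copy of this operator with the given inputs."""
        return Node(self.name, list(inputs))


def replace_input(parent, old_child, new_child):
    """Copy parent, swapping only old_child; every other branch is kept."""
    return parent.copy(
        [new_child if child is old_child else child for child in parent.inputs]
    )


sort_limit = Node("SortLimit")
other_branch = Node("TableScan")
union = Node("Union", [sort_limit, other_branch])
rewritten = Node("Project+SortLimit")

# Passing a single-element input list drops the second Union branch.
buggy = union.copy([rewritten])
# Replacing only the matching input keeps both branches.
fixed = replace_input(union, sort_limit, rewritten)
```

The single-element list is harmless when the parent has one input, which is presumably why the issue only shows up under a Union.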
[jira] [Created] (HIVE-14917) explainanalyze_2.q fails after HIVE-14861
Pengcheng Xiong created HIVE-14917: -- Summary: explainanalyze_2.q fails after HIVE-14861 Key: HIVE-14917 URL: https://issues.apache.org/jira/browse/HIVE-14917 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14908) Upgrade ANTLR to 3.5.2
Pengcheng Xiong created HIVE-14908: -- Summary: Upgrade ANTLR to 3.5.2 Key: HIVE-14908 URL: https://issues.apache.org/jira/browse/HIVE-14908 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Antlr v4 is also available but it does not support "->" which is widely used in our grammar. Antlr 3.5.2 is the latest v3 version. It will reduce the code size: {code} Here is summary of current parser code size 422345 HiveLexer.java 2436601 HiveParser.java 814184 HiveParser_FromClauseParser.java 2705920 HiveParser_IdentifiersParser.java 777665 HiveParser_SelectClauseParser.java After change, it will become 319589 HiveLexer.java 1853104 HiveParser.java 574156 HiveParser_FromClauseParser.java 1799195 HiveParser_IdentifiersParser.java 587305 HiveParser_SelectClauseParser.java {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
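To put the figures in HIVE-14908 in proportion, the short Python calculation below totals the sizes listed before and after the upgrade. The numbers are copied from the report; the unit is not stated there, so none is assumed.

```python
# Generated parser sizes as listed in the report (unit not stated in the source).
before = {
    "HiveLexer.java": 422345,
    "HiveParser.java": 2436601,
    "HiveParser_FromClauseParser.java": 814184,
    "HiveParser_IdentifiersParser.java": 2705920,
    "HiveParser_SelectClauseParser.java": 777665,
}
after = {
    "HiveLexer.java": 319589,
    "HiveParser.java": 1853104,
    "HiveParser_FromClauseParser.java": 574156,
    "HiveParser_IdentifiersParser.java": 1799195,
    "HiveParser_SelectClauseParser.java": 587305,
}

total_before = sum(before.values())
total_after = sum(after.values())
saved = total_before - total_after
percent_saved = 100.0 * saved / total_before

print(f"before={total_before} after={total_after} saved={saved} ({percent_saved:.1f}%)")
```

The totals come to 7156715 before and 5133349 after, a reduction of roughly 28%.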
[jira] [Created] (HIVE-14872) Deprecate the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS
Pengcheng Xiong created HIVE-14872: -- Summary: Deprecate the configuration HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS Key: HIVE-14872 URL: https://issues.apache.org/jira/browse/HIVE-14872 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong The main purpose of the HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS configuration is backward compatibility, because a lot of reserved keywords have been used as identifiers in previous releases. We have already had several releases with this configuration. Now, when I try to add new set operators to the parser, ANTLR always complains "code too large". I think it is time to remove this configuration. (1) It will simplify the parser logic and largely reduce the size of the generated parser code; (2) it leaves room for new features, especially those which require parser changes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14861) Support precedence for set operator using parentheses
Pengcheng Xiong created HIVE-14861: -- Summary: Support precedence for set operator using parentheses Key: HIVE-14861 URL: https://issues.apache.org/jira/browse/HIVE-14861 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong We should support precedence for set operator by using parentheses. For example {code} select * from src union all (select * from src union select * from src); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
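Why the parentheses in HIVE-14861 matter can be shown with a small Python model of the two operators. The helper functions are hypothetical, and the unparenthesized case assumes plain left-to-right evaluation of UNION and UNION ALL, which is an assumption about the engine rather than something stated in the report.

```python
def union_all(a, b):
    """Bag union: keep every row from both inputs, duplicates included."""
    return list(a) + list(b)


def union_distinct(a, b):
    """Set union: keep the first occurrence of each row, in input order."""
    seen = []
    for row in list(a) + list(b):
        if row not in seen:
            seen.append(row)
    return seen


src = [1, 1, 2]

# select * from src union all (select * from src union select * from src);
with_parens = union_all(src, union_distinct(src, src))

# Same operators without parentheses, assuming left-to-right evaluation.
left_to_right = union_distinct(union_all(src, src), src)
```

With the parentheses, duplicates from the first `src` survive (five rows); evaluated left to right, the trailing distinct union removes them (two rows).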
[jira] [Created] (HIVE-14806) Support UDTF in CBO (AST return path)
Pengcheng Xiong created HIVE-14806: -- Summary: Support UDTF in CBO (AST return path) Key: HIVE-14806 URL: https://issues.apache.org/jira/browse/HIVE-14806 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14768) Add a new UDTF ExplodeByNumber
Pengcheng Xiong created HIVE-14768: -- Summary: Add a new UDTF ExplodeByNumber Key: HIVE-14768 URL: https://issues.apache.org/jira/browse/HIVE-14768 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong This is needed to implement intersect all and except all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
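One way such a function could serve intersect all and except all is sketched below in Python. The report does not describe the actual design, so this is an assumption: count each row per input, compute how many copies the result needs, and let an explode-by-number step emit that many copies. All names here are hypothetical.

```python
from collections import Counter


def explode_by_number(row, n):
    """Emit `row` n times (nothing when n <= 0)."""
    for _ in range(max(n, 0)):
        yield row


def intersect_all(a, b):
    """Each row appears min(count in a, count in b) times."""
    count_a, count_b = Counter(a), Counter(b)
    out = []
    for row in count_a:
        out.extend(explode_by_number(row, min(count_a[row], count_b[row])))
    return out


def except_all(a, b):
    """Each row appears (count in a - count in b) times, never below zero."""
    count_a, count_b = Counter(a), Counter(b)
    out = []
    for row in count_a:
        out.extend(explode_by_number(row, count_a[row] - count_b[row]))
    return out


a = ["x", "x", "x", "y"]
b = ["x", "y", "y"]
common = intersect_all(a, b)
remaining = except_all(a, b)
```

Reducing both operators to "group, compute a multiplicity, explode" means only the multiplicity formula differs between them.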
[jira] [Created] (HIVE-14563) StatsOptimizer treats NULL in a wrong way
Pengcheng Xiong created HIVE-14563: -- Summary: StatsOptimizer treats NULL in a wrong way Key: HIVE-14563 URL: https://issues.apache.org/jira/browse/HIVE-14563 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong {code} POSTHOOK: query: explain select count(key) from (select null as key from src)src POSTHOOK: type: QUERY STAGE DEPENDENCIES: Stage-0 is a root stage STAGE PLANS: Stage: Stage-0 Fetch Operator limit: 1 Processor Tree: ListSink PREHOOK: query: select count(key) from (select null as key from src)src PREHOOK: type: QUERY PREHOOK: Input: default@src A masked pattern was here POSTHOOK: query: select count(key) from (select null as key from src)src POSTHOOK: type: QUERY POSTHOOK: Input: default@src A masked pattern was here 500 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
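The result above (500) is what count(*) would return, whereas in SQL count(key) counts only non-NULL values, so a column that is NULL in every row should give 0. A small Python model of the distinction, with hypothetical helper names:

```python
def count_star(values):
    """count(*): every row counts, NULL or not."""
    return len(values)


def count_column(values):
    """count(col): only rows where the column is not NULL count."""
    return sum(1 for v in values if v is not None)


# 'select null as key from src' over a 500-row table.
rows = [None] * 500
star = count_star(rows)
col = count_column(rows)
```

An optimizer that answers count(key) from the table's row count alone therefore has to know that the column contains no NULLs.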
[jira] [Created] (HIVE-14511) Improve MSCK for partitioned table to deal with special cases
Pengcheng Xiong created HIVE-14511: -- Summary: Improve MSCK for partitioned table to deal with special cases Key: HIVE-14511 URL: https://issues.apache.org/jira/browse/HIVE-14511 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Some users will have a folder rather than a file under the last partition folder. However, msck is going to search for the leaf folder rather than the last partition folder. We need to improve that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
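The improvement can be sketched as deriving the partition spec from only the first N path components under the table root, where N is the number of partition columns, instead of descending to the leaf folder. This Python sketch is an illustration under that assumption; `partition_spec` and the example paths are hypothetical and not Hive's MSCK code.

```python
def partition_spec(table_root, path, partition_columns):
    """Map a directory under table_root to {partition column: value}.

    Only as many path components as there are partition columns are read,
    so extra folders below the last partition folder are ignored.
    Assumes `path` starts with `table_root`.
    """
    relative = path[len(table_root):].strip("/")
    components = relative.split("/") if relative else []
    spec = {}
    for column, component in zip(partition_columns, components):
        name, sep, value = component.partition("=")
        if sep != "=" or name != column:
            return None  # not a 'column=value' folder for this column
        spec[column] = value
    # Too few levels means this is not a complete partition directory.
    return spec if len(spec) == len(partition_columns) else None


spec = partition_spec(
    "/warehouse/t",
    "/warehouse/t/ds=2016-08-01/hr=12/extra_dir",
    ["ds", "hr"],
)
incomplete = partition_spec("/warehouse/t", "/warehouse/t/ds=2016-08-01", ["ds", "hr"])
```

Here `extra_dir` sits below the last partition folder and does not affect the resulting spec.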
[jira] [Created] (HIVE-14362) Support explain analyze in Hive
Pengcheng Xiong created HIVE-14362: -- Summary: Support explain analyze in Hive Key: HIVE-14362 URL: https://issues.apache.org/jira/browse/HIVE-14362 Project: Hive Issue Type: New Feature Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Right now all the explain levels only support stats before query runs. We would like to have an explain analyze similar to Postgres for real stats after query runs. This will help to identify the major gap between estimated/real stats and make not only query optimization better but also query performance debugging easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14338) Delete/Alter table calls failing with HiveAccessControlException
Pengcheng Xiong created HIVE-14338: -- Summary: Delete/Alter table calls failing with HiveAccessControlException Key: HIVE-14338 URL: https://issues.apache.org/jira/browse/HIVE-14338 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Many Hcatalog/Webhcat tests are failing with below error, when tests try to alter/delete/describe tables. Error is thrown when the same user or a different user (same group) who created the table is trying to run the delete/alter table call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-14317) Make the print of COLUMN_STATS_ACCURATE more stable.
Pengcheng Xiong created HIVE-14317: -- Summary: Make the print of COLUMN_STATS_ACCURATE more stable. Key: HIVE-14317 URL: https://issues.apache.org/jira/browse/HIVE-14317 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Depending on the version, we may get either COLUMN_STATS_ACCURATE {"BASIC_STATS":"true","COLUMN_STATS":{"key":"true","value":"true"}} or COLUMN_STATS_ACCURATE {"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
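One common way to make such output stable is to serialize with sorted keys, so the two orderings above print identically. The Python sketch below illustrates the technique with the standard `json` module; it is not the change made in Hive, and the helper name `stable_print` is hypothetical.

```python
import json

first = '{"BASIC_STATS":"true","COLUMN_STATS":{"key":"true","value":"true"}}'
second = '{"COLUMN_STATS":{"key":"true","value":"true"},"BASIC_STATS":"true"}'


def stable_print(text):
    """Re-serialize JSON with sorted keys and compact separators."""
    return json.dumps(json.loads(text), sort_keys=True, separators=(",", ":"))


normalized_first = stable_print(first)
normalized_second = stable_print(second)
```

Both inputs normalize to the first form, with BASIC_STATS ahead of COLUMN_STATS.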
[jira] [Created] (HIVE-14291) count(*) on a table written by hcatstorer returns incorrect result
Pengcheng Xiong created HIVE-14291: -- Summary: count(*) on a table written by hcatstorer returns incorrect result Key: HIVE-14291 URL: https://issues.apache.org/jira/browse/HIVE-14291 Project: Hive Issue Type: Sub-task Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong {code} count(*) on a table written by hcatstorer returns wrong result. {code} steps to repro the issue: 1) create hive table {noformat} create table ${DEST_TABLE}(name string, age int, gpa float) row format delimited fields terminated by '\t' stored as textfile; {noformat} 2) load data into table using hcatstorer {noformat} A = LOAD '$DATA_1' USING PigStorage() AS (name:chararray, age:int, gpa:float); B = LOAD '$DATA_2' USING PigStorage() AS (name:chararray, age:int, gpa:float); C = UNION A, B; STORE C INTO '$HIVE_TABLE' USING org.apache.hive.hcatalog.pig.HCatStorer(); {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)