[jira] [Updated] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7231: -- Attachment: HIVE-7231.8.patch Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
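To make the padding overhead described in this issue concrete, here is a minimal sketch under a simplified model. The model is an assumption for illustration, not the ORC writer's actual logic: stripes have a fixed size, a stripe never straddles a block boundary, and the tail of a block that cannot hold another full stripe is filled with padding. The function name and all sizes are hypothetical.

```python
def padding_overhead(block_size: int, stripe_size: int) -> float:
    """Fraction of each block lost to padding under the simplified model.

    Assumption (not taken from the ORC source): fixed-size stripes are
    packed into a block, and the tail that cannot hold another full
    stripe is filled with padding.
    """
    if stripe_size <= 0 or stripe_size > block_size:
        raise ValueError("stripe_size must be in (0, block_size]")
    wasted = block_size % stripe_size  # tail too small for another stripe
    return wasted / block_size


MB = 1024 * 1024
# Hypothetical sizes, chosen only for illustration.
for stripe_mb in (64, 100, 150):
    pct = 100 * padding_overhead(256 * MB, stripe_mb * MB)
    print(f"block=256MB stripe={stripe_mb}MB padding={pct:.1f}%")
```

Under this model a 64 MB stripe divides a 256 MB block evenly and wastes nothing, while a 100 MB stripe leaves a 56 MB tail, roughly 22% of the block.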
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053242#comment-14053242 ] Hive QA commented on HIVE-7231: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12654227/HIVE-7231.8.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5677 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/683/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/683/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-683/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12654227 Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053253#comment-14053253 ] Gopal V commented on HIVE-7231: --- Test failures unrelated. Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7231) Improve ORC padding
[ https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7231: -- Resolution: Fixed Fix Version/s: 0.14.0 Release Note: HIVE-7231 : Improve ORC padding (Prasanth J, reviewed by Gopal V) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, thanks [~prasanth_j]! Improve ORC padding --- Key: HIVE-7231 URL: https://issues.apache.org/jira/browse/HIVE-7231 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.14.0 Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, HIVE-7231.8.patch Current ORC padding is not optimal because of fixed stripe sizes within block. The padding overhead will be significant in some cases. Also padding percentage relative to stripe size is not configurable. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6122) Implement show grant on resource
[ https://issues.apache.org/jira/browse/HIVE-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053264#comment-14053264 ] Navis commented on HIVE-6122: - With this patch, principal_specification is now optional. For example, SHOW GRANT ON TABLE x returns all principals that hold any privilege on table x. Furthermore, 'ON' can take the 'ALL' keyword, which represents all resources of any type. The above spec should be changed to something like: {code} SHOW GRANT [ principal_specification | ON object_type priv_level [(column_list)] | ON ALL ] {code} Could you turn this into proper documentation? I tried, but you know --;; Implement show grant on resource -- Key: HIVE-6122 URL: https://issues.apache.org/jira/browse/HIVE-6122 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Labels: TODOC13 Fix For: 0.13.0 Attachments: HIVE-6122.1.patch.txt, HIVE-6122.2.patch.txt, HIVE-6122.3.patch.txt, HIVE-6122.4.patch, HIVE-6122.4.patch, HIVE-6122.5.patch, HIVE-6122.6.patch Currently, Hive shows the privileges owned by a principal. A reverse API is also needed, which shows all principals for a resource. {noformat} show grant user hive_test_user on database default; show grant user hive_test_user on table dummy; show grant user hive_test_user on all; {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7350) Changes related to TEZ-692, TEZ-1169, TEZ-1234
[ https://issues.apache.org/jira/browse/HIVE-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-7350: --- Attachment: HIVE-7350.3.patch Added waitTillReady() in TezSessionState. Changes related to TEZ-692, TEZ-1169, TEZ-1234 -- Key: HIVE-7350 URL: https://issues.apache.org/jira/browse/HIVE-7350 Project: Hive Issue Type: Sub-task Reporter: Rajesh Balamohan Assignee: Rajesh Balamohan Attachments: HIVE-7350.1.patch, HIVE-7350.2.patch, HIVE-7350.3.patch
Address changes related to
TEZ-692 - Unify job submission in either TezClient or TezSession
TEZ-1169 - Allow numPhysicalInputs to be specified for RootInputs
TEZ-1234 - Replace Interfaces with Abstract classes for VertexManagerPlugin and EdgeManager
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-2925) Support non-MR fetching for simple queries with select/limit/filter operations only
[ https://issues.apache.org/jira/browse/HIVE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053271#comment-14053271 ] Navis commented on HIVE-2925: - bq. What does RS mean? Yes, it's ReduceSinkOperator, which needs an MR task to be evaluated, negating fetch task conversion. bq. LIMIT only and LIMIT only (TABLESAMPLE, virtual columns) mean? In 'minimal', virtual columns and sampling directives were not allowed. But with 'more', the optimizer can convert those, too. bq. Does FILTER just mean the WHERE and HAVING clauses? Yes, and yes. It's any form of predicate in the query. And uppercase would be better. Support non-MR fetching for simple queries with select/limit/filter operations only --- Key: HIVE-2925 URL: https://issues.apache.org/jira/browse/HIVE-2925 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Priority: Trivial Fix For: 0.10.0 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.4.patch, HIVE-2925.1.patch.txt, HIVE-2925.2.patch.txt, HIVE-2925.3.patch.txt It's trivial but frequently asked by end-users. Currently, select queries with simple conditions or a limit must run an MR job, which takes some time, especially for big tables, and irritates people. For such simple queries, using a fetch task would make them happy. -- This message was sent by Atlassian JIRA (v6.2#6252)
Review Request 23298: Wrong results in multi-table insert aggregating without group by clause
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23298/ ---
Review request for hive.
Bugs: HIVE-7045 https://issues.apache.org/jira/browse/HIVE-7045
Repository: hive-git
Description
---
This happens whenever there is more than one reducer. The scenario:

CREATE TABLE t1 (a int, b int);
CREATE TABLE t2 (cnt int) PARTITIONED BY (var_name string);
insert into table t1 select 1,1 from asd limit 1;
insert into table t1 select 2,2 from asd limit 1;

t1 contains:
1 1
2 2

from t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt
insert overwrite table t2 partition(var_name='b') select count(b) cnt;
select * from t2;

returns:
2 a
2 b
as expected.

Setting the number of reducers higher than 1:

set mapred.reduce.tasks=2;
from t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt
insert overwrite table t2 partition(var_name='b') select count(b) cnt;
select * from t2;

1 a
1 a
1 b
1 b

Wrong results. This also happens whenever t1 is big enough for more than one reducer to be generated automatically, without the number being specified directly.

Adding "group by 1" at the end of each insert solves the problem:

from t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1
insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 1;

generates:
2 a
2 b

This should work without the group by. The number of rows for each partition will equal the number of reducers: each reducer calculates a subtotal of the count.

Diffs
-
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 399f92a
Diff: https://reviews.apache.org/r/23298/diff/
Testing
---
Thanks, Navis Ryu
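The symptom reported in this review request can be reproduced outside Hive with a small simulation. This is a sketch of the reported behaviour only, not of Hive's execution code: the function names and the round-robin distribution of rows to reducers are assumptions made for illustration.

```python
from typing import List, Sequence


def count_per_reducer(rows: Sequence[int], num_reducers: int) -> List[int]:
    """Reported (wrong) behaviour: every reducer emits its own partial count."""
    partitions: List[List[int]] = [[] for _ in range(num_reducers)]
    for i, row in enumerate(rows):
        partitions[i % num_reducers].append(row)  # assumed round-robin routing
    return [len(p) for p in partitions]


def count_global(rows: Sequence[int], num_reducers: int) -> List[int]:
    """Expected behaviour: the partial counts are merged into a single row."""
    return [sum(count_per_reducer(rows, num_reducers))]


t1_a = [1, 2]  # column a of t1 from the scenario above

print(count_per_reducer(t1_a, 1))  # [2]     one reducer: one row, correct
print(count_per_reducer(t1_a, 2))  # [1, 1]  two reducers: two rows of subtotals
print(count_global(t1_a, 2))       # [2]     what the query should return
```

With two reducers the partition receives one row per reducer, each holding a subtotal, which matches the "1 a / 1 a" output in the report.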
[jira] [Updated] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates
[ https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7326: Attachment: HIVE-7326.2.patch.txt Hive complains invalid column reference with 'having' aggregate predicates -- Key: HIVE-7326 URL: https://issues.apache.org/jira/browse/HIVE-7326 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-7326.1.patch.txt, HIVE-7326.2.patch.txt CREATE TABLE TestV1_Staples ( Item_Count INT, Ship_Priority STRING, Order_Priority STRING, Order_Status STRING, Order_Quantity DOUBLE, Sales_Total DOUBLE, Discount DOUBLE, Tax_Rate DOUBLE, Ship_Mode STRING, Fill_Time DOUBLE, Gross_Profit DOUBLE, Price DOUBLE, Ship_Handle_Cost DOUBLE, Employee_Name STRING, Employee_Dept STRING, Manager_Name STRING, Employee_Yrs_Exp DOUBLE, Employee_Salary DOUBLE, Customer_Name STRING, Customer_State STRING, Call_Center_Region STRING, Customer_Balance DOUBLE, Customer_Segment STRING, Prod_Type1 STRING, Prod_Type2 STRING, Prod_Type3 STRING, Prod_Type4 STRING, Product_Name STRING, Product_Container STRING, Ship_Promo STRING, Supplier_Name STRING, Supplier_Balance DOUBLE, Supplier_Region STRING, Supplier_State STRING, Order_ID STRING, Order_Year INT, Order_Month INT, Order_Day INT, Order_Date_ STRING, Order_Quarter STRING, Product_Base_Margin DOUBLE, Product_ID STRING, Receive_Time DOUBLE, Received_Date_ STRING, Ship_Date_ STRING, Ship_Charge DOUBLE, Total_Cycle_Time DOUBLE, Product_In_Stock STRING, PID INT, Market_Segment STRING ); Query that works: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM default.testv1_staples s1 GROUP BY customer_name HAVING ( (COUNT(s1.discount) = 822) AND (SUM(customer_balance) = 4074689.00041) ); Query that fails: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM default.testv1_staples s1 GROUP BY customer_name HAVING ( (SUM(customer_balance) = 4074689.00041) AND 
(COUNT(s1.discount) = 822) ); -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5690) Support subquery for single sourced multi query
[ https://issues.apache.org/jira/browse/HIVE-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5690: Attachment: HIVE-5690.6.patch.txt Support subquery for single sourced multi query --- Key: HIVE-5690 URL: https://issues.apache.org/jira/browse/HIVE-5690 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: D13791.1.patch, HIVE-5690.2.patch.txt, HIVE-5690.3.patch.txt, HIVE-5690.4.patch.txt, HIVE-5690.5.patch.txt, HIVE-5690.6.patch.txt A single-sourced multi (insert) query is very useful for various ETL processes, but it does not allow subqueries. For example, {noformat} explain from src insert overwrite table x1 select * from (select distinct key,value) b order by key insert overwrite table x2 select * from (select distinct key,value) c order by value; {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5343) Add equals method to ObjectInspectorUtils
[ https://issues.apache.org/jira/browse/HIVE-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5343: Attachment: HIVE-5343.2.patch.txt Add equals method to ObjectInspectorUtils - Key: HIVE-5343 URL: https://issues.apache.org/jira/browse/HIVE-5343 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13053.1.patch, HIVE-5343.2.patch.txt Might provide shortcut for some use cases. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7079) Hive logs errors about missing tables when parsing CTE expressions
[ https://issues.apache.org/jira/browse/HIVE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7079: Attachment: HIVE-7079.2.patch.txt Hive logs errors about missing tables when parsing CTE expressions -- Key: HIVE-7079 URL: https://issues.apache.org/jira/browse/HIVE-7079 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.13.0 Reporter: Craig Condit Assignee: Navis Priority: Minor Attachments: HIVE-7079.1.patch.txt, HIVE-7079.2.patch.txt Given a query containing common table expressions (CTE) such as: WITH a AS (SELECT ...), b AS (SELECT ...) SELECT * FROM a JOIN b on a.col = b.col ...; Hive CLI executes the query, but logs stack traces at ERROR level during query parsing: {noformat} ERROR metadata.Hive: NoSuchObjectException(message:ccondit.a table not found) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29338) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29306) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result.read(ThriftHiveMetastore.java:29237) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1036) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1022) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at com.sun.proxy.$Proxy7.getTable(Unknown Source) at 
org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:967) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:909) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1223) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1192) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9209) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:391) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:291) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:944) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1009) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) {noformat} It looks like Hive is attempting to resolve the CTE aliases as physical tables. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7111) Extend join transitivity PPD to non-column expressions
[ https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7111: Attachment: HIVE-7111.3.patch.txt Extend join transitivity PPD to non-column expressions -- Key: HIVE-7111 URL: https://issues.apache.org/jira/browse/HIVE-7111 Project: Hive Issue Type: Task Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt, HIVE-7111.3.patch.txt Join transitive in PPD only supports column expressions, but it's possible to extend this to generic expressions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7111) Extend join transitivity PPD to non-column expressions
[ https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7111: Attachment: (was: HIVE-7111.3.patch.txt) Extend join transitivity PPD to non-column expressions -- Key: HIVE-7111 URL: https://issues.apache.org/jira/browse/HIVE-7111 Project: Hive Issue Type: Task Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt, HIVE-7111.3.patch.txt Join transitive in PPD only supports column expressions, but it's possible to extend this to generic expressions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7111) Extend join transitivity PPD to non-column expressions
[ https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-7111: Attachment: HIVE-7111.3.patch.txt Extend join transitivity PPD to non-column expressions -- Key: HIVE-7111 URL: https://issues.apache.org/jira/browse/HIVE-7111 Project: Hive Issue Type: Task Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt, HIVE-7111.3.patch.txt Join transitive in PPD only supports column expressions, but it's possible to extend this to generic expressions. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6259) Support truncate for non-native tables
[ https://issues.apache.org/jira/browse/HIVE-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-6259: Attachment: HIVE-6259.5.patch.txt Support truncate for non-native tables -- Key: HIVE-6259 URL: https://issues.apache.org/jira/browse/HIVE-6259 Project: Hive Issue Type: Bug Components: StorageHandler Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-6259.1.patch.txt, HIVE-6259.2.patch.txt, HIVE-6259.3.patch.txt, HIVE-6259.4.patch.txt, HIVE-6259.5.patch.txt Tables on HBase could be truncated by a method similar to the one in HBaseShell. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.
[ https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5718: Attachment: HIVE-5718.4.patch.txt Support direct fetch for lateral views, sub queries, etc. - Key: HIVE-5718 URL: https://issues.apache.org/jira/browse/HIVE-5718 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D13857.1.patch, D13857.2.patch, D13857.3.patch, HIVE-5718.4.patch.txt Extend HIVE-2925 with LV and SubQ. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7243) Print padding information in ORC file dump
[ https://issues.apache.org/jira/browse/HIVE-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7243: -- Attachment: HIVE-7243.3.patch Rebase patch unit-tests after HIVE-7231 Print padding information in ORC file dump -- Key: HIVE-7243 URL: https://issues.apache.org/jira/browse/HIVE-7243 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Minor Labels: orcfile Attachments: HIVE-7243.1.patch, HIVE-7243.2.patch, HIVE-7243.3.patch It will be useful to print the padding information in orc file dump utility. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-7243) Print padding information in ORC file dump
[ https://issues.apache.org/jira/browse/HIVE-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-7243: -- Status: Patch Available (was: Open) Print padding information in ORC file dump -- Key: HIVE-7243 URL: https://issues.apache.org/jira/browse/HIVE-7243 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.14.0 Reporter: Prasanth J Assignee: Prasanth J Priority: Minor Labels: orcfile Attachments: HIVE-7243.1.patch, HIVE-7243.2.patch, HIVE-7243.3.patch It will be useful to print the padding information in orc file dump utility. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-5025) Column aliases for input argument of GenericUDFs
[ https://issues.apache.org/jira/browse/HIVE-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-5025: Attachment: HIVE-5025.4.patch.txt Column aliases for input argument of GenericUDFs - Key: HIVE-5025 URL: https://issues.apache.org/jira/browse/HIVE-5025 Project: Hive Issue Type: Improvement Components: UDF Reporter: Navis Assignee: Navis Priority: Trivial Attachments: D12093.2.patch, D12093.3.patch, HIVE-5025.4.patch.txt, HIVE-5025.D12093.1.patch In some cases, it is very useful to know the column aliases of the input arguments. But I am not sure about this, in the sense that UDFs should not depend on contextual information like column aliases. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-4790) MapredLocalTask task does not make virtual columns
[ https://issues.apache.org/jira/browse/HIVE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4790: Attachment: HIVE-4790.10.patch.txt MapredLocalTask task does not make virtual columns -- Key: HIVE-4790 URL: https://issues.apache.org/jira/browse/HIVE-4790 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: D11511.3.patch, D11511.4.patch, HIVE-4790.10.patch.txt, HIVE-4790.5.patch.txt, HIVE-4790.6.patch.txt, HIVE-4790.7.patch.txt, HIVE-4790.8.patch.txt, HIVE-4790.9.patch.txt, HIVE-4790.D11511.1.patch, HIVE-4790.D11511.2.patch From mailing list, http://www.mail-archive.com/user@hive.apache.org/msg08264.html {noformat} SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = a.number; fails with this error: SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = a.number; Automatically selecting local only mode for query Total MapReduce jobs = 1 setting HADOOP_USER_NAMEpmarron 13/06/25 10:52:56 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore. 
Execution log at: /tmp/pmarron/.log 2013-06-25 10:52:56 Starting to launch local task to process map join; maximum memory = 932118528 java.lang.RuntimeException: cannot find field block__offset__inside__file from [0:rownumber, 1:offset] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366) at org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168) at org.apache.hadoop.hive.serde2.objectinspector.DelegatedStructObjectInspector.getStructFieldRef(DelegatedStructObjectInspector.java:74) at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57) at org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68) at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:222) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407) at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:186) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375) at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:394) at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277) at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Execution failed with exit status: 2 {noformat} -- This 
message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-6406) Introduce immutable-table table property and if set, disallow insert-into
[ https://issues.apache.org/jira/browse/HIVE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053325#comment-14053325 ] Lefty Leverenz commented on HIVE-6406: -- This is documented in the wiki in two places: * [DML -- Inserting data into Hive tables from queries (see Synopsis after syntax) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries] * [DDL -- Create Table (see bullet list after syntax) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable] Introduce immutable-table table property and if set, disallow insert-into - Key: HIVE-6406 URL: https://issues.apache.org/jira/browse/HIVE-6406 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.13.0 Attachments: HIVE-6406.2.patch, HIVE-6406.3.patch, HIVE-6406.patch As part of HIVE-6405's attempt to make HCatalog and Hive behave in similar ways with regard to immutable tables, this is a companion task to introduce the notion of an immutable table, controlled by a table property; tables are not immutable by default. If this property is set for a table and we attempt to write to a table (or a partition) that already has data, disallow INSERT INTO into it from Hive (if the destination directory is non-empty). Setting this property will allow Hive to mimic HCatalog's current immutable-table property. -- This message was sent by Atlassian JIRA (v6.2#6252)
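The rule described in this issue can be sketched as follows. This is an illustration of the stated rule, not Hive's implementation: the function name, the exception type, the property key `immutable`, and the use of a local directory in place of the table's storage location are all assumptions.

```python
import os
import tempfile


def check_insert_into(table_properties: dict, destination_dir: str) -> None:
    """Raise if INSERT INTO targets a non-empty immutable table (sketch)."""
    # Assumed property key; tables are treated as mutable when it is absent.
    immutable = table_properties.get("immutable", "false").lower() == "true"
    has_data = os.path.isdir(destination_dir) and bool(os.listdir(destination_dir))
    if immutable and has_data:
        raise PermissionError(
            f"INSERT INTO not allowed: immutable table at {destination_dir} is non-empty"
        )


with tempfile.TemporaryDirectory() as table_dir:
    check_insert_into({"immutable": "true"}, table_dir)  # empty destination: allowed
    open(os.path.join(table_dir, "000000_0"), "w").close()  # simulate existing data
    check_insert_into({}, table_dir)  # property not set: allowed
    try:
        check_insert_into({"immutable": "true"}, table_dir)
    except PermissionError as err:
        print(err)  # immutable and non-empty: rejected
```

The check only rejects the combination of both conditions, so the first load into an empty immutable table still succeeds.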
[jira] [Commented] (HIVE-6475) Implement support for appending to mutable tables in HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053326#comment-14053326 ] Lefty Leverenz commented on HIVE-6475: -- Does this need any user doc? Implement support for appending to mutable tables in HCatalog - Key: HIVE-6475 URL: https://issues.apache.org/jira/browse/HIVE-6475 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.13.0 Attachments: 6475.log, 6475.log.hadoop2, HIVE-6475.2.patch, HIVE-6475.patch Part of HIVE-6405, this is the implementation of the append feature on the HCatalog side. If a table is mutable, we must support being able to append to existing data instead of erroring out as a duplicate publish. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates
[ https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053327#comment-14053327 ] Hive QA commented on HIVE-7326: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12654241/HIVE-7326.2.patch.txt {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5678 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/685/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/685/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-685/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12654241 Hive complains invalid column reference with 'having' aggregate predicates -- Key: HIVE-7326 URL: https://issues.apache.org/jira/browse/HIVE-7326 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-7326.1.patch.txt, HIVE-7326.2.patch.txt CREATE TABLE TestV1_Staples ( Item_Count INT, Ship_Priority STRING, Order_Priority STRING, Order_Status STRING, Order_Quantity DOUBLE, Sales_Total DOUBLE, Discount DOUBLE, Tax_Rate DOUBLE, Ship_Mode STRING, Fill_Time DOUBLE, Gross_Profit DOUBLE, Price DOUBLE, Ship_Handle_Cost DOUBLE, Employee_Name STRING, Employee_Dept STRING, Manager_Name STRING, Employee_Yrs_Exp DOUBLE, Employee_Salary DOUBLE, Customer_Name STRING, Customer_State STRING, Call_Center_Region STRING, Customer_Balance DOUBLE, Customer_Segment STRING, Prod_Type1 STRING, Prod_Type2 STRING, Prod_Type3 STRING, Prod_Type4 STRING, Product_Name STRING, Product_Container STRING, Ship_Promo STRING, Supplier_Name STRING, Supplier_Balance DOUBLE, Supplier_Region STRING, Supplier_State STRING, Order_ID STRING, Order_Year INT, Order_Month INT, Order_Day INT, Order_Date_ STRING, Order_Quarter STRING, Product_Base_Margin DOUBLE, Product_ID STRING, Receive_Time DOUBLE, Received_Date_ STRING, Ship_Date_ STRING, Ship_Charge DOUBLE, Total_Cycle_Time DOUBLE, Product_In_Stock STRING, PID INT, Market_Segment STRING ); Query that works: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM default.testv1_staples s1 GROUP BY customer_name HAVING ( (COUNT(s1.discount) = 822) AND (SUM(customer_balance) = 4074689.00041) ); Query that fails: SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM default.testv1_staples s1 GROUP BY customer_name HAVING ( (SUM(customer_balance) = 4074689.00041) AND (COUNT(s1.discount) = 822) ); -- This message was sent by Atlassian JIRA (v6.2#6252)
Re: User doc for table properties
Review request: Predefined TBLPROPERTIES are documented briefly in the Create Table section
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
of the wiki -- see the notes immediately following the syntax.

I'd still like to know if there are any other predefined table properties.

By the way, one of the properties mentioned in the previous message isn't a table property, it's a SerDe property (hbase.table.default.storage.type
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-ColumnMapping ).

-- Lefty

On Fri, Feb 14, 2014 at 6:45 PM, Lefty Leverenz leftylever...@gmail.com wrote:

> The user doc for TBLPROPERTIES needs work. Currently the DDL wikidoc only says this:
>
>   The TBLPROPERTIES clause allows you to tag the table definition with your own metadata key/value pairs.
>
> But some table properties have predefined keys and values. HIVE-6406 https://issues.apache.org/jira/browse/HIVE-6406 will add immutable -- how many others already exist? Are they all listed in one file and distinguishable from internal parameters, or just scattered throughout the code?
>
> A quick search found orc.compress (example in HIVE-6083 https://issues.apache.org/jira/browse/HIVE-6083) and hbase.table.name and hbase.table.default.storage.type (in TestPigHBaseStorageHandler.java). OrcFile.java has several more listed after orc.compress (some mentioned in HIVE-4221 https://issues.apache.org/jira/browse/HIVE-4221 comments).
>
> This might be a can of worms, but the wiki should list all predefined keys and their possible values, with version information where needed. I suggest a new subsection in the Create Table section of DDL:
>
> - Language Manual – DDL – Create Table https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/TruncateTable
>
> Then particular table properties can be mentioned in their topic docs (like ORC) with links to the DDL doc.
>
> This message can be converted to a JIRA ticket later, but for now I'm just looking for information.
>
> Hearts, flowers, and chocolate to all on Valentine's Day.
>
> -- Lefty
[jira] [Commented] (HIVE-5343) Add equals method to ObjectInspectorUtils
[ https://issues.apache.org/jira/browse/HIVE-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053362#comment-14053362 ]

Hive QA commented on HIVE-5343:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654245/HIVE-5343.2.patch.txt

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5692 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_comparison
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_serde
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/686/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/686/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-686/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654245

Add equals method to ObjectInspectorUtils
-

Key: HIVE-5343
URL: https://issues.apache.org/jira/browse/HIVE-5343
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
Attachments: D13053.1.patch, HIVE-5343.2.patch.txt

Might provide shortcut for some use cases.
[jira] [Updated] (HIVE-4616) Simple reconnection support for jdbc2
[ https://issues.apache.org/jira/browse/HIVE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-4616:

Attachment: HIVE-4616.4.patch.txt

Simple reconnection support for jdbc2
-

Key: HIVE-4616
URL: https://issues.apache.org/jira/browse/HIVE-4616
Project: Hive
Issue Type: Improvement
Components: JDBC
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-4616.3.patch.txt, HIVE-4616.4.patch.txt, HIVE-4616.D10953.1.patch, HIVE-4616.D10953.2.patch

jdbc:hive2://localhost:1/db2;autoReconnect=true

Simple reconnection on TransportException. If hiveserver2 has not been shut down, the session could be reused.
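
The reconnect-on-transport-failure idea described in HIVE-4616 can be illustrated with a minimal Java sketch. This is not the code in the attached patches: `TransportFailure` and `ReconnectingClient` are hypothetical names standing in for the driver's transport exception and connection handling, and the retry-once policy is an assumption made for illustration.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for a transport-level failure; not a Hive or Thrift class. */
final class TransportFailure extends RuntimeException {
    TransportFailure(String message) {
        super(message);
    }
}

/**
 * Illustrative wrapper (hypothetical, not part of the Hive JDBC driver):
 * runs an operation against a connection and, if the transport fails,
 * reopens the connection once and retries the same operation.
 */
final class ReconnectingClient<C> {
    private final Supplier<C> connector; // opens a fresh connection
    private C connection;
    private int reconnects = 0;

    ReconnectingClient(Supplier<C> connector) {
        this.connector = connector;
        this.connection = connector.get();
    }

    <R> R call(Function<C, R> operation) {
        try {
            return operation.apply(connection);
        } catch (TransportFailure first) {
            // Reopen the transport once; a second failure propagates to the caller.
            connection = connector.get();
            reconnects++;
            return operation.apply(connection);
        }
    }

    int reconnectCount() {
        return reconnects;
    }
}
```

A caller would pass a supplier that opens the connection and wrap each statement execution in `call(...)`. Whether the server-side session survives the reconnect depends on the server still running, as the issue description notes; this sketch only covers the client-side retry.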