[jira] [Updated] (HIVE-7231) Improve ORC padding

2014-07-06 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7231:
--

Attachment: HIVE-7231.8.patch

 Improve ORC padding
 ---

 Key: HIVE-7231
 URL: https://issues.apache.org/jira/browse/HIVE-7231
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, 
 HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, 
 HIVE-7231.8.patch


 The current ORC padding is not optimal because stripe sizes are fixed within a 
 block. The padding overhead can be significant in some cases. Also, the padding 
 percentage relative to the stripe size is not configurable.
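
To make the overhead concrete, here is a rough Python sketch that is not taken from the patch. It assumes a simplified model: every stripe has the same fixed size, a stripe never straddles an HDFS block boundary, and the unused tail of each block is padded. The function name and the sizes are illustrative assumptions.

```python
def padding_overhead(block_size: int, stripe_size: int) -> float:
    """Fraction of each block lost to padding in the simplified model."""
    if stripe_size <= 0 or stripe_size > block_size:
        raise ValueError("stripe_size must be in (0, block_size]")
    # Only whole stripes are placed in a block (assumption of this model).
    stripes_per_block = block_size // stripe_size
    padding = block_size - stripes_per_block * stripe_size
    return padding / block_size


MB = 1024 * 1024
# 256 MB block, 100 MB stripes: two stripes fit, 56 MB is padding.
print(padding_overhead(256 * MB, 100 * MB))  # 0.21875
# 256 MB block, 64 MB stripes: the stripes tile the block exactly.
print(padding_overhead(256 * MB, 64 * MB))   # 0.0
```

In this model the overhead depends entirely on how well the fixed stripe size divides the block size, which is why adjustable stripe sizes and a configurable padding tolerance help.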



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HIVE-7231) Improve ORC padding

2014-07-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053242#comment-14053242
 ] 

Hive QA commented on HIVE-7231:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654227/HIVE-7231.8.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/683/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/683/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-683/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654227






[jira] [Commented] (HIVE-7231) Improve ORC padding

2014-07-06 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053253#comment-14053253
 ] 

Gopal V commented on HIVE-7231:
---

Test failures unrelated.






[jira] [Updated] (HIVE-7231) Improve ORC padding

2014-07-06 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7231:
--

   Resolution: Fixed
Fix Version/s: 0.14.0
 Release Note: HIVE-7231 : Improve ORC padding (Prasanth J, reviewed by 
Gopal V)
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk, thanks [~prasanth_j]!

 Improve ORC padding
 ---

 Key: HIVE-7231
 URL: https://issues.apache.org/jira/browse/HIVE-7231
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Fix For: 0.14.0

 Attachments: HIVE-7231.1.patch, HIVE-7231.2.patch, HIVE-7231.3.patch, 
 HIVE-7231.4.patch, HIVE-7231.5.patch, HIVE-7231.6.patch, HIVE-7231.7.patch, 
 HIVE-7231.8.patch


 The current ORC padding is not optimal because stripe sizes are fixed within a 
 block. The padding overhead can be significant in some cases. Also, the padding 
 percentage relative to the stripe size is not configurable.





[jira] [Commented] (HIVE-6122) Implement show grant on resource

2014-07-06 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053264#comment-14053264
 ] 

Navis commented on HIVE-6122:
-

With this patch, principal_specification is now optional. For example, SHOW 
GRANT ON TABLE x returns all principals that have been granted any privilege on 
table x. Furthermore, 'ON' can take the 'ALL' keyword, representing any resource 
of any type. The above spec should be changed to something like:
{code}
SHOW GRANT [
  principal_specification |
  ON object_type priv_level [(column_list)] |
  ON ALL
]
{code}

Could you turn this into proper documentation? I tried, but you know --;;

 Implement show grant on resource
 --

 Key: HIVE-6122
 URL: https://issues.apache.org/jira/browse/HIVE-6122
 Project: Hive
  Issue Type: Improvement
  Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: TODOC13
 Fix For: 0.13.0

 Attachments: HIVE-6122.1.patch.txt, HIVE-6122.2.patch.txt, 
 HIVE-6122.3.patch.txt, HIVE-6122.4.patch, HIVE-6122.4.patch, 
 HIVE-6122.5.patch, HIVE-6122.6.patch


 Currently, Hive shows the privileges owned by a principal. A reverse API is 
 also needed, which shows all principals for a resource. 
 {noformat}
 show grant user hive_test_user on database default;
 show grant user hive_test_user on table dummy;
 show grant user hive_test_user on all;
 {noformat}





[jira] [Updated] (HIVE-7350) Changes related to TEZ-692, TEZ-1169, TEZ-1234

2014-07-06 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-7350:
---

Attachment: HIVE-7350.3.patch

Added waitTillReady() in TezSessionState.

 Changes related to TEZ-692, TEZ-1169, TEZ-1234
 --

 Key: HIVE-7350
 URL: https://issues.apache.org/jira/browse/HIVE-7350
 Project: Hive
  Issue Type: Sub-task
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: HIVE-7350.1.patch, HIVE-7350.2.patch, HIVE-7350.3.patch


 Address changes related to 
  TEZ-692 - Unify job submission in either TezClient or TezSession
  TEZ-1169 -  Allow numPhysicalInputs to be specified for RootInputs
  TEZ-1234 - Replace Interfaces with Abstract classes for VertexManagerPlugin 
 and EdgeManager





[jira] [Commented] (HIVE-2925) Support non-MR fetching for simple queries with select/limit/filter operations only

2014-07-06 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053271#comment-14053271
 ] 

Navis commented on HIVE-2925:
-

bq. What does RS mean?
Yes, it's ReduceSinkOperator, which needs an MR task to be evaluated and so 
rules out fetch task conversion.

bq. LIMIT only and LIMIT only (TABLESAMPLE, virtual columns) mean?
With 'minimal', virtual columns and sampling directives are not allowed. 
With 'more', the optimizer can convert those, too.

bq. Does FILTER just mean the WHERE and HAVING clauses?
Yes, and yes. It's any form of predicate in the query. And uppercase would be 
better.
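
The three answers above can be summarized in a small Python sketch. It encodes only the rules stated in this comment; the function and parameter names are hypothetical, and the real optimizer checks more conditions than these.

```python
def can_convert_to_fetch(mode: str,
                         has_reduce_sink: bool,
                         uses_virtual_columns: bool,
                         uses_tablesample: bool) -> bool:
    """Decide fetch task conversion using only the rules from the comment."""
    # A ReduceSinkOperator (RS) needs an MR task, so conversion is ruled out.
    if has_reduce_sink:
        return False
    if mode == "minimal":
        # 'minimal' allows neither virtual columns nor sampling directives.
        return not (uses_virtual_columns or uses_tablesample)
    if mode == "more":
        # 'more' can also convert queries that use those features.
        return True
    raise ValueError(f"unknown conversion mode: {mode!r}")


print(can_convert_to_fetch("minimal", False, True, False))  # False
print(can_convert_to_fetch("more", False, True, True))      # True
```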

 Support non-MR fetching for simple queries with select/limit/filter 
 operations only
 ---

 Key: HIVE-2925
 URL: https://issues.apache.org/jira/browse/HIVE-2925
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.10.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2925.D2607.4.patch, HIVE-2925.1.patch.txt, 
 HIVE-2925.2.patch.txt, HIVE-2925.3.patch.txt


 It's trivial but frequently asked for by end users. Currently, select queries 
 with simple conditions or a limit still run an MR job, which takes some time, 
 especially for big tables, and irritates people.
 For that kind of simple query, using a fetch task would make them happy.





Review Request 23298: Wrong results in multi-table insert aggregating without group by clause

2014-07-06 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23298/
---

Review request for hive.


Bugs: HIVE-7045
https://issues.apache.org/jira/browse/HIVE-7045


Repository: hive-git


Description
---

This happens whenever there is more than 1 reducer.

The scenario :

CREATE  TABLE t1 (a int, b int);
CREATE  TABLE t2 (cnt int) PARTITIONED BY (var_name string);

insert into table t1 select 1,1 from asd limit 1;
insert into table t1 select 2,2 from asd limit 1;

t1 contains :
1 1
2 2

from  t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt 
insert overwrite table t2 partition(var_name='b') select count(b) cnt ;

select * from t2;
returns : 
2 a
2 b

as expected.

Setting the number of reducers higher than 1 :

set mapred.reduce.tasks=2;

from  t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt
insert overwrite table t2 partition(var_name='b') select count(b) cnt;

select * from t2;
1   a
1   a
1   b
1   b

Wrong results.

This happens whenever t1 is big enough to automatically generate more than 1 
reducer, even without specifying it directly.

Adding group by 1 at the end of each insert solves the problem:

from  t1
insert overwrite table t2 partition(var_name='a') select count(a) cnt group by 1
insert overwrite table t2 partition(var_name='b') select count(b) cnt group by 
1;

generates : 
2 a
2 b

This should work without the group by...
The number of rows in each partition equals the number of reducers, because 
each reducer calculates only a subtotal of the count.
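
The symptom can be reproduced with a toy Python model, which is not Hive code. It assumes rows are spread round-robin over the reducers (the real distribution may differ) and that each reducer emits its own count with no final merge step. The function name is hypothetical.

```python
from typing import List, Sequence


def count_without_final_merge(rows: Sequence[int], num_reducers: int) -> List[int]:
    """Each reducer emits its own subtotal; nothing combines them."""
    partitions: List[List[int]] = [[] for _ in range(num_reducers)]
    for i, row in enumerate(rows):
        # Round-robin assignment is an assumption made for this sketch.
        partitions[i % num_reducers].append(row)
    return [len(p) for p in partitions]


t1_column_a = [1, 2]
print(count_without_final_merge(t1_column_a, 1))  # [2]
print(count_without_final_merge(t1_column_a, 2))  # [1, 1]
```

With one reducer the single output row is the correct total. With two reducers the partition receives two rows of 1, matching the wrong result shown above.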


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 399f92a 

Diff: https://reviews.apache.org/r/23298/diff/


Testing
---


Thanks,

Navis Ryu



[jira] [Updated] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7326:


Attachment: HIVE-7326.2.patch.txt

 Hive complains invalid column reference with 'having' aggregate predicates
 --

 Key: HIVE-7326
 URL: https://issues.apache.org/jira/browse/HIVE-7326
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-7326.1.patch.txt, HIVE-7326.2.patch.txt


 CREATE TABLE TestV1_Staples (
   Item_Count INT,
   Ship_Priority STRING,
   Order_Priority STRING,
   Order_Status STRING,
   Order_Quantity DOUBLE,
   Sales_Total DOUBLE,
   Discount DOUBLE,
   Tax_Rate DOUBLE,
   Ship_Mode STRING,
   Fill_Time DOUBLE,
   Gross_Profit DOUBLE,
   Price DOUBLE,
   Ship_Handle_Cost DOUBLE,
   Employee_Name STRING,
   Employee_Dept STRING,
   Manager_Name STRING,
   Employee_Yrs_Exp DOUBLE,
   Employee_Salary DOUBLE,
   Customer_Name STRING,
   Customer_State STRING,
   Call_Center_Region STRING,
   Customer_Balance DOUBLE,
   Customer_Segment STRING,
   Prod_Type1 STRING,
   Prod_Type2 STRING,
   Prod_Type3 STRING,
   Prod_Type4 STRING,
   Product_Name STRING,
   Product_Container STRING,
   Ship_Promo STRING,
   Supplier_Name STRING,
   Supplier_Balance DOUBLE,
   Supplier_Region STRING,
   Supplier_State STRING,
   Order_ID STRING,
   Order_Year INT,
   Order_Month INT,
   Order_Day INT,
   Order_Date_ STRING,
   Order_Quarter STRING,
   Product_Base_Margin DOUBLE,
   Product_ID STRING,
   Receive_Time DOUBLE,
   Received_Date_ STRING,
   Ship_Date_ STRING,
   Ship_Charge DOUBLE,
   Total_Cycle_Time DOUBLE,
   Product_In_Stock STRING,
   PID INT,
   Market_Segment STRING
   );
 Query that works:
 SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
 default.testv1_staples s1 GROUP BY customer_name HAVING (
 (COUNT(s1.discount) = 822) AND
 (SUM(customer_balance) = 4074689.00041)
 );
 Query that fails:
 SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
 default.testv1_staples s1 GROUP BY customer_name HAVING (
 (SUM(customer_balance) = 4074689.00041)
 AND (COUNT(s1.discount) = 822)
 );





[jira] [Updated] (HIVE-5690) Support subquery for single sourced multi query

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5690:


Attachment: HIVE-5690.6.patch.txt

 Support subquery for single sourced multi query
 ---

 Key: HIVE-5690
 URL: https://issues.apache.org/jira/browse/HIVE-5690
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D13791.1.patch, HIVE-5690.2.patch.txt, 
 HIVE-5690.3.patch.txt, HIVE-5690.4.patch.txt, HIVE-5690.5.patch.txt, 
 HIVE-5690.6.patch.txt


 Single-sourced multi-insert queries are very useful for various ETL processes, 
 but they do not allow subqueries. For example, 
 {noformat}
 explain from src 
 insert overwrite table x1 select * from (select distinct key,value) b order 
 by key
 insert overwrite table x2 select * from (select distinct key,value) c order 
 by value;
 {noformat}





[jira] [Updated] (HIVE-5343) Add equals method to ObjectInspectorUtils

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5343:


Attachment: HIVE-5343.2.patch.txt

 Add equals method to ObjectInspectorUtils
 -

 Key: HIVE-5343
 URL: https://issues.apache.org/jira/browse/HIVE-5343
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: D13053.1.patch, HIVE-5343.2.patch.txt


 Might provide a shortcut for some use cases.





[jira] [Updated] (HIVE-7079) Hive logs errors about missing tables when parsing CTE expressions

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7079:


Attachment: HIVE-7079.2.patch.txt

 Hive logs errors about missing tables when parsing CTE expressions
 --

 Key: HIVE-7079
 URL: https://issues.apache.org/jira/browse/HIVE-7079
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Craig Condit
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7079.1.patch.txt, HIVE-7079.2.patch.txt


 Given a query containing common table expressions (CTE) such as:
 WITH a AS (SELECT ...), b AS (SELECT ...)
 SELECT * FROM a JOIN b on a.col = b.col ...;
 Hive CLI executes the query, but logs stack traces at ERROR level during 
 query parsing:
 {noformat}
 ERROR metadata.Hive: NoSuchObjectException(message:ccondit.a table not found)
   at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29338)
   at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result$get_table_resultStandardScheme.read(ThriftHiveMetastore.java:29306)
   at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_result.read(ThriftHiveMetastore.java:29237)
   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
   at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table(ThriftHiveMetastore.java:1036)
   at 
 org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table(ThriftHiveMetastore.java:1022)
   at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:997)
   at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
   at com.sun.proxy.$Proxy7.getTable(Unknown Source)
   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:967)
   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:909)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1223)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1192)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9209)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:391)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:291)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:944)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1009)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:880)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:870)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
 {noformat}
 It looks like Hive is attempting to resolve the CTE aliases as physical 
 tables.
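
One way to avoid the spurious lookup is to check the CTE aliases in scope before consulting the metastore. The Python sketch below illustrates that idea only; it is not the attached patch, and the function name and the dictionary standing in for the metastore are hypothetical.

```python
from typing import Dict, Set


def resolve_table(name: str, cte_aliases: Set[str], metastore: Dict[str, str]) -> str:
    """Return 'cte' or 'table'; consult the metastore only for non-CTE names."""
    if name in cte_aliases:
        # CTE aliases are resolved locally, so no metastore call is made
        # and no "table not found" error is logged for them.
        return "cte"
    if name in metastore:
        return "table"
    raise LookupError(f"{name} table not found")


metastore = {"orders": "hdfs://.../orders"}  # stand-in for the real metastore
print(resolve_table("a", {"a", "b"}, metastore))       # cte
print(resolve_table("orders", {"a", "b"}, metastore))  # table
```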





[jira] [Updated] (HIVE-7111) Extend join transitivity PPD to non-column expressions

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7111:


Attachment: HIVE-7111.3.patch.txt

 Extend join transitivity PPD to non-column expressions
 --

 Key: HIVE-7111
 URL: https://issues.apache.org/jira/browse/HIVE-7111
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-7111.1.patch.txt, HIVE-7111.2.patch.txt, 
 HIVE-7111.3.patch.txt


 Join transitivity in PPD only supports column expressions, but it's possible 
 to extend this to generic expressions.





[jira] [Updated] (HIVE-7111) Extend join transitivity PPD to non-column expressions

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7111:


Attachment: (was: HIVE-7111.3.patch.txt)






[jira] [Updated] (HIVE-7111) Extend join transitivity PPD to non-column expressions

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7111:


Attachment: HIVE-7111.3.patch.txt






[jira] [Updated] (HIVE-6259) Support truncate for non-native tables

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-6259:


Attachment: HIVE-6259.5.patch.txt

 Support truncate for non-native tables
 --

 Key: HIVE-6259
 URL: https://issues.apache.org/jira/browse/HIVE-6259
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: HIVE-6259.1.patch.txt, HIVE-6259.2.patch.txt, 
 HIVE-6259.3.patch.txt, HIVE-6259.4.patch.txt, HIVE-6259.5.patch.txt


 Tables on HBase could be truncated with a method similar to the one in the 
 HBase shell.





[jira] [Updated] (HIVE-5718) Support direct fetch for lateral views, sub queries, etc.

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5718:


Attachment: HIVE-5718.4.patch.txt

 Support direct fetch for lateral views, sub queries, etc.
 -

 Key: HIVE-5718
 URL: https://issues.apache.org/jira/browse/HIVE-5718
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: D13857.1.patch, D13857.2.patch, D13857.3.patch, 
 HIVE-5718.4.patch.txt


 Extend HIVE-2925 with LV and SubQ.





[jira] [Updated] (HIVE-7243) Print padding information in ORC file dump

2014-07-06 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7243:
--

Attachment: HIVE-7243.3.patch

Rebase patch and unit tests after HIVE-7231

 Print padding information in ORC file dump
 --

 Key: HIVE-7243
 URL: https://issues.apache.org/jira/browse/HIVE-7243
 Project: Hive
  Issue Type: Improvement
  Components: File Formats
Affects Versions: 0.14.0
Reporter: Prasanth J
Assignee: Prasanth J
Priority: Minor
  Labels: orcfile
 Attachments: HIVE-7243.1.patch, HIVE-7243.2.patch, HIVE-7243.3.patch


 It would be useful to print the padding information in the ORC file dump 
 utility.





[jira] [Updated] (HIVE-7243) Print padding information in ORC file dump

2014-07-06 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-7243:
--

Status: Patch Available  (was: Open)






[jira] [Updated] (HIVE-5025) Column aliases for input argument of GenericUDFs

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-5025:


Attachment: HIVE-5025.4.patch.txt

 Column aliases for input argument of GenericUDFs 
 -

 Key: HIVE-5025
 URL: https://issues.apache.org/jira/browse/HIVE-5025
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: D12093.2.patch, D12093.3.patch, HIVE-5025.4.patch.txt, 
 HIVE-5025.D12093.1.patch


 In some cases, it is very useful to know the column aliases of the input 
 arguments. But I am not sure about this, in the sense that UDFs should not 
 depend on contextual information like column aliases.





[jira] [Updated] (HIVE-4790) MapredLocalTask task does not make virtual columns

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4790:


Attachment: HIVE-4790.10.patch.txt

 MapredLocalTask task does not make virtual columns
 --

 Key: HIVE-4790
 URL: https://issues.apache.org/jira/browse/HIVE-4790
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: D11511.3.patch, D11511.4.patch, HIVE-4790.10.patch.txt, 
 HIVE-4790.5.patch.txt, HIVE-4790.6.patch.txt, HIVE-4790.7.patch.txt, 
 HIVE-4790.8.patch.txt, HIVE-4790.9.patch.txt, HIVE-4790.D11511.1.patch, 
 HIVE-4790.D11511.2.patch


 From mailing list, 
 http://www.mail-archive.com/user@hive.apache.org/msg08264.html
 {noformat}
 SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON 
 b.rownumber = a.number;
 fails with this error:
  
  SELECT *,b.BLOCK__OFFSET__INSIDE__FILE FROM a JOIN b ON b.rownumber = 
 a.number;
 Automatically selecting local only mode for query
 Total MapReduce jobs = 1
 setting HADOOP_USER_NAMEpmarron
 13/06/25 10:52:56 WARN conf.HiveConf: DEPRECATED: Configuration property 
 hive.metastore.local no longer has any effect. Make sure to provide a valid 
 value for hive.metastore.uris if you are connecting to a remote metastore.
 Execution log at: /tmp/pmarron/.log
 2013-06-25 10:52:56 Starting to launch local task to process map join;
   maximum memory = 932118528
 java.lang.RuntimeException: cannot find field block__offset__inside__file 
 from [0:rownumber, 1:offset]
 at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:366)
 at 
 org.apache.hadoop.hive.serde2.lazy.objectinspector.LazySimpleStructObjectInspector.getStructFieldRef(LazySimpleStructObjectInspector.java:168)
 at 
 org.apache.hadoop.hive.serde2.objectinspector.DelegatedStructObjectInspector.getStructFieldRef(DelegatedStructObjectInspector.java:74)
 at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
 at 
 org.apache.hadoop.hive.ql.exec.JoinUtil.getObjectInspectorsFromEvaluators(JoinUtil.java:68)
 at 
 org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.initializeOp(HashTableSinkOperator.java:222)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:451)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:407)
 at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:186)
 at 
 org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
 at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:394)
 at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:277)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:676)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Execution failed with exit status: 2
 {noformat}





[jira] [Commented] (HIVE-6406) Introduce immutable-table table property and if set, disallow insert-into

2014-07-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053325#comment-14053325
 ] 

Lefty Leverenz commented on HIVE-6406:
--

This is documented in the wiki in two places:

* [DML -- Inserting data into Hive tables from queries (see Synopsis after 
syntax) | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries]
* [DDL -- Create Table (see bullet list after syntax) | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable]

 Introduce immutable-table table property and if set, disallow insert-into
 -

 Key: HIVE-6406
 URL: https://issues.apache.org/jira/browse/HIVE-6406
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 0.13.0

 Attachments: HIVE-6406.2.patch, HIVE-6406.3.patch, HIVE-6406.patch


 As part of HIVE-6405's attempt to make HCatalog and Hive behave in similar 
 ways with regard to immutable tables, this is a companion task to introduce 
 the notion of an immutable table as a table property; tables are not 
 immutable by default. If this property is set for a table and we attempt to 
 write to a table (or a partition) that already has data, INSERT INTO from 
 Hive is disallowed (if the destination directory is non-empty). Setting this 
 property allows Hive to mimic HCatalog's current immutable-table property.
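
The rule described above can be sketched in Python as follows. This is an illustration of the described behavior, not Hive's implementation; the property key "immutable", the exception class, and the function name are assumptions made for the example.

```python
from typing import Mapping


class ImmutableTableError(Exception):
    """Raised when INSERT INTO targets a non-empty immutable table."""


def check_insert_into(table_properties: Mapping[str, str],
                      destination_is_empty: bool) -> None:
    # Tables are not immutable by default, so a missing property means mutable.
    # The key name "immutable" is an assumption for this sketch.
    immutable = table_properties.get("immutable", "false").lower() == "true"
    if immutable and not destination_is_empty:
        raise ImmutableTableError(
            "INSERT INTO is not allowed: table is immutable and already has data")


# Allowed: immutable table whose destination directory is still empty.
check_insert_into({"immutable": "true"}, destination_is_empty=True)
# Allowed: mutable table (the default) that already contains data.
check_insert_into({}, destination_is_empty=False)
```

Only the combination of an immutable table and a non-empty destination raises the error.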





[jira] [Commented] (HIVE-6475) Implement support for appending to mutable tables in HCatalog

2014-07-06 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053326#comment-14053326
 ] 

Lefty Leverenz commented on HIVE-6475:
--

Does this need any user doc?

 Implement support for appending to mutable tables in HCatalog
 -

 Key: HIVE-6475
 URL: https://issues.apache.org/jira/browse/HIVE-6475
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 0.13.0

 Attachments: 6475.log, 6475.log.hadoop2, HIVE-6475.2.patch, 
 HIVE-6475.patch


 Part of HIVE-6405, this is the implementation of the append feature on the 
 HCatalog side. If a table is mutable, we must support being able to append to 
 existing data instead of erroring out as a duplicate publish.





[jira] [Commented] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates

2014-07-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053327#comment-14053327
 ] 

Hive QA commented on HIVE-7326:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654241/HIVE-7326.2.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5678 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/685/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/685/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-685/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654241

 Hive complains invalid column reference with 'having' aggregate predicates
 --

 Key: HIVE-7326
 URL: https://issues.apache.org/jira/browse/HIVE-7326
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-7326.1.patch.txt, HIVE-7326.2.patch.txt


 CREATE TABLE TestV1_Staples (
   Item_Count INT,
   Ship_Priority STRING,
   Order_Priority STRING,
   Order_Status STRING,
   Order_Quantity DOUBLE,
   Sales_Total DOUBLE,
   Discount DOUBLE,
   Tax_Rate DOUBLE,
   Ship_Mode STRING,
   Fill_Time DOUBLE,
   Gross_Profit DOUBLE,
   Price DOUBLE,
   Ship_Handle_Cost DOUBLE,
   Employee_Name STRING,
   Employee_Dept STRING,
   Manager_Name STRING,
   Employee_Yrs_Exp DOUBLE,
   Employee_Salary DOUBLE,
   Customer_Name STRING,
   Customer_State STRING,
   Call_Center_Region STRING,
   Customer_Balance DOUBLE,
   Customer_Segment STRING,
   Prod_Type1 STRING,
   Prod_Type2 STRING,
   Prod_Type3 STRING,
   Prod_Type4 STRING,
   Product_Name STRING,
   Product_Container STRING,
   Ship_Promo STRING,
   Supplier_Name STRING,
   Supplier_Balance DOUBLE,
   Supplier_Region STRING,
   Supplier_State STRING,
   Order_ID STRING,
   Order_Year INT,
   Order_Month INT,
   Order_Day INT,
   Order_Date_ STRING,
   Order_Quarter STRING,
   Product_Base_Margin DOUBLE,
   Product_ID STRING,
   Receive_Time DOUBLE,
   Received_Date_ STRING,
   Ship_Date_ STRING,
   Ship_Charge DOUBLE,
   Total_Cycle_Time DOUBLE,
   Product_In_Stock STRING,
   PID INT,
   Market_Segment STRING
   );
 Query that works:
 SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
 default.testv1_staples s1 GROUP BY customer_name HAVING (
 (COUNT(s1.discount) = 822) AND
 (SUM(customer_balance) = 4074689.00041)
 );
 Query that fails:
 SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
 default.testv1_staples s1 GROUP BY customer_name HAVING (
 (SUM(customer_balance) = 4074689.00041)
 AND (COUNT(s1.discount) = 822)
 );





Re: User doc for table properties

2014-07-06 Thread Lefty Leverenz
Review request:  Predefined TBLPROPERTIES are documented briefly in the Create
Table section
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable
of the wiki -- see the notes immediately following the syntax.  I'd still
like to know if there are any other predefined table properties.

By the way, one of the properties mentioned in the previous message isn't a
table property, it's a SerDe property (hbase.table.default.storage.type
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-ColumnMapping
).

-- Lefty


On Fri, Feb 14, 2014 at 6:45 PM, Lefty Leverenz leftylever...@gmail.com
wrote:

 The user doc for TBLPROPERTIES needs work.  Currently the DDL wikidoc only
 says this:

 The TBLPROPERTIES clause allows you to tag the table definition with your
 own metadata key/value pairs.

 But some table properties have predefined keys and values.  HIVE-6406
 https://issues.apache.org/jira/browse/HIVE-6406 will add immutable --
 how many others already exist?  Are they all listed in one file and
 distinguishable from internal parameters, or just scattered throughout the
 code?

 A quick search found orc.compress (example in HIVE-6083
 https://issues.apache.org/jira/browse/HIVE-6083) and hbase.table.name &
 hbase.table.default.storage.type (in TestPigHBaseStorageHandler.java).
 OrcFile.java has several more listed after orc.compress (some mentioned in
 HIVE-4221 https://issues.apache.org/jira/browse/HIVE-4221 comments).

 This might be a can of worms but the wiki should list all predefined keys
 and their possible values, with version information where needed.  I
 suggest a new subsection in the Create Table section of DDL:

- Language Manual – DDL – Create Table

 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/TruncateTable

 Then particular table properties can be mentioned in their topic docs
 (like ORC) with links to the DDL doc.

 This message can be converted to a JIRA ticket later, but now I'm just
 looking for information.

 Hearts & flowers & chocolate to all on Valentine's Day. -- Lefty



[jira] [Commented] (HIVE-5343) Add equals method to ObjectInspectorUtils

2014-07-06 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053362#comment-14053362
 ] 

Hive QA commented on HIVE-5343:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654245/HIVE-5343.2.patch.txt

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 5692 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_comparison
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_serde
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/686/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/686/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-686/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654245

 Add equals method to ObjectInspectorUtils
 -

 Key: HIVE-5343
 URL: https://issues.apache.org/jira/browse/HIVE-5343
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: D13053.1.patch, HIVE-5343.2.patch.txt


 Might provide a shortcut for some use cases.





[jira] [Updated] (HIVE-4616) Simple reconnection support for jdbc2

2014-07-06 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-4616:


Attachment: HIVE-4616.4.patch.txt

 Simple reconnection support for jdbc2
 -

 Key: HIVE-4616
 URL: https://issues.apache.org/jira/browse/HIVE-4616
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: HIVE-4616.3.patch.txt, HIVE-4616.4.patch.txt, 
 HIVE-4616.D10953.1.patch, HIVE-4616.D10953.2.patch


 jdbc:hive2://localhost:1/db2;autoReconnect=true
 Simple reconnection on TransportException. If HiveServer2 has not been 
 shut down, the session could be reused.
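
 A minimal sketch of the "reconnect once and retry" idea follows. It is not
 the code from the attached patches: the class ReconnectingCaller and its
 Call interface are hypothetical, and java.io.IOException stands in for the
 transport exception so that the example has no Hive or Thrift dependency.

```java
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical illustration; not the JDBC driver's actual implementation.
final class ReconnectingCaller<C> {
    // An operation that uses a connection and may fail.
    interface Call<C, R> {
        R apply(C conn) throws Exception;
    }

    private final Supplier<C> connector; // opens a new connection
    private C conn;

    ReconnectingCaller(Supplier<C> connector) {
        this.connector = connector;
        this.conn = connector.get();
    }

    // Runs the operation; on a transport-level failure, reopens the
    // connection and retries exactly once. A second failure propagates.
    <R> R call(Call<C, R> op) throws Exception {
        try {
            return op.apply(conn);
        } catch (IOException transportFailure) {
            conn = connector.get();
            return op.apply(conn);
        }
    }

    public static void main(String[] args) throws Exception {
        int[] opened = {0};
        // Each "connection" is just a counter value in this demo.
        ReconnectingCaller<Integer> caller =
            new ReconnectingCaller<>(() -> ++opened[0]);
        // The first connection always fails; the reopened one succeeds.
        String result = caller.call(id -> {
            if (id == 1) {
                throw new IOException("connection lost");
            }
            return "ok on connection " + id;
        });
        System.out.println(result);
    }
}
```

 The sketch covers only the client-side retry. Whether the server-side
 session can actually be reused after reconnecting depends on HiveServer2
 and is outside what this example shows.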


