[jira] [Commented] (HIVE-5510) [WebHCat] GET job/queue return wrong job information

2014-07-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052813#comment-14052813
 ] 

Lefty Leverenz commented on HIVE-5510:
--

This bug fix is documented in the wiki here:

* [GET Jobs -- Curl Command (fields) | 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Jobs#WebHCatReferenceJobs-CurlCommand(fields)]
* [GET Jobs -- JSON Output (fields, Hive 0.12.0 and later) | 
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Jobs#WebHCatReferenceJobs-JSONOutput(fields,Hive0.12.0andlater)]

> [WebHCat] GET job/queue return wrong job information
> 
>
> Key: HIVE-5510
> URL: https://issues.apache.org/jira/browse/HIVE-5510
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 0.12.0
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Fix For: 0.13.0
>
> Attachments: HIVE-5510-1.patch, HIVE-5510-2.patch, HIVE-5510-3.patch, 
> HIVE-5510-4.patch, test_harnesss_1381798977
>
>
> GET job/queue of a TempletonController job returns weird information: it is a 
> mix of the child job and the controller job itself. It should only pull the 
> information of the controller job itself.





[jira] [Commented] (HIVE-6172) Whitespaces and comments on Tez

2014-07-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052775#comment-14052775
 ] 

Lefty Leverenz commented on HIVE-6172:
--

The wiki documents *hive.jar.directory* and *hive.user.install.directory* here:

* [Configuration Properties -- hive.jar.directory | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.jar.directory]
* [Configuration Properties -- hive.user.install.directory | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.user.install.directory]
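
For reference, a minimal sketch of how these two Tez-related properties can be set (the HDFS paths below are hypothetical; hive.user.install.directory is only used as a per-user fallback when no usable jar is found under hive.jar.directory):

{code}
-- hypothetical paths; adjust to the cluster layout
SET hive.jar.directory=hdfs:///apps/hive/install;   -- site-wide location of the Hive jar for Tez
SET hive.user.install.directory=hdfs:///user;       -- fallback: the jar is uploaded to <dir>/<username>
{code}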

> Whitespaces and comments on Tez
> ---
>
> Key: HIVE-6172
> URL: https://issues.apache.org/jira/browse/HIVE-6172
> Project: Hive
>  Issue Type: Bug
>Affects Versions: tez-branch
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
> Fix For: tez-branch
>
> Attachments: HIVE-6172.1.patch
>
>






[jira] [Created] (HIVE-7349) Consuming published Hive HCatalog artifacts in a Hadoop 2 build environment fails

2014-07-04 Thread Venkat Ranganathan (JIRA)
Venkat Ranganathan created HIVE-7349:


 Summary: Consuming published Hive HCatalog artifacts in a Hadoop 
2 build environment fails
 Key: HIVE-7349
 URL: https://issues.apache.org/jira/browse/HIVE-7349
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.13.0
Reporter: Venkat Ranganathan


The published Hive artifacts are built with the Hadoop 1 profile. Even though 
Hive has Hadoop 1 and Hadoop 2 shims, some of the HCatalog MapReduce classes 
still depend on the Hadoop version they were compiled against.

For example, consuming the published Hive artifacts in a Sqoop HCatalog build 
against Hadoop 2 results in the following failure:

{noformat}
Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:104)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getOutputFormat(HCatBaseOutputFormat.java:84)
    at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.checkOutputSpecs(HCatBaseOutputFormat.java:73)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:418)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:333)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
{noformat}








[jira] [Commented] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-07-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052481#comment-14052481
 ] 

Hive QA commented on HIVE-7220:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654060/HIVE-7220.5.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5691 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.cli.TestPermsGrp.testCustomPerms
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/681/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/681/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-681/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654060

> Empty dir in external table causes issue (root_dir_external_table.q failure)
> 
>
> Key: HIVE-7220
> URL: https://issues.apache.org/jira/browse/HIVE-7220
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
> HIVE-7220.5.patch, HIVE-7220.5.patch, HIVE-7220.patch
>
>
> While looking at the root_dir_external_table.q failure, which runs a query on 
> an external table located at root ('/'), I noticed that the latest Hadoop 2 
> CombineFileInputFormat returns splits representing empty directories (like 
> '/Users'), which leads to a failure in Hive's CombineFileRecordReader as it 
> tries to open the directory for processing.
> Tried with an external table in a normal HDFS directory, and it also returns 
> the same error.  Looks like a real bug.
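
For context, a hedged repro sketch of the "normal HDFS directory" case mentioned above (the path and table name are hypothetical; CombineHiveInputFormat is Hive's default input format and wraps Hadoop's CombineFileInputFormat):

{code}
-- hypothetical path and table; the empty subdirectory is what produces the bad split on Hadoop 2
dfs -mkdir -p /tmp/ext_src/empty_subdir;
CREATE EXTERNAL TABLE ext_src (line STRING) LOCATION '/tmp/ext_src';
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;  -- the default
SELECT count(*) FROM ext_src;  -- may fail in CombineFileRecordReader when a split points at the empty directory
{code}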





[jira] [Commented] (HIVE-3227) Implement data loading from user provided string directly for test

2014-07-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052444#comment-14052444
 ] 

Hive QA commented on HIVE-3227:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654059/HIVE-3227.1.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5677 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/680/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/680/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-680/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654059

> Implement data loading from user provided string directly for test
> --
>
> Key: HIVE-3227
> URL: https://issues.apache.org/jira/browse/HIVE-3227
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor, Testing Infrastructure
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3227.1.patch.txt
>
>
> {code}
> load data instream 'key value\nkey2 value2' into table test;
> {code}
> This will make test easier and also can reduce test time. For example,
> {code}
> -- ppr_pushdown.q
> create table ppr_test (key string) partitioned by (ds string);
> alter table ppr_test add partition (ds = '1234');
> insert overwrite table ppr_test partition(ds = '1234') select * from (select 
> '1234' from src limit 1 union all select 'abcd' from src limit 1) s;
> {code}
> last query is 4MR job. But can be replaced by
> {code}
> create table ppr_test (key string) partitioned by (ds string) ROW FORMAT 
> delimited fields terminated by ' ';
> alter table ppr_test add partition (ds = '1234');
> load data local instream '1234\nabcd' overwrite into table ppr_test 
> partition(ds = '1234');
> {code}





[jira] [Commented] (HIVE-7045) Wrong results in multi-table insert aggregating without group by clause

2014-07-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052353#comment-14052353
 ] 

Hive QA commented on HIVE-7045:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654053/HIVE-7045.1.patch.txt

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5676 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/679/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/679/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-679/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654053

> Wrong results in multi-table insert aggregating without group by clause
> ---
>
> Key: HIVE-7045
> URL: https://issues.apache.org/jira/browse/HIVE-7045
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0, 0.12.0
>Reporter: dima machlin
>Assignee: Navis
>Priority: Blocker
> Attachments: HIVE-7045.1.patch.txt
>
>
> This happens whenever there is more than one reducer.
> The scenario :
> CREATE  TABLE t1 (a int, b int);
> CREATE  TABLE t2 (cnt int) PARTITIONED BY (var_name string);
> insert into table t1 select 1,1 from asd limit 1;
> insert into table t1 select 2,2 from asd limit 1;
> t1 contains :
> 1 1
> 2 2
> from  t1
> insert overwrite table t2 partition(var_name='a') select count(a) cnt 
> insert overwrite table t2 partition(var_name='b') select count(b) cnt ;
> select * from t2;
> returns : 
> 2 a
> 2 b
> as expected.
> Setting the number of reducers higher than 1 :
> set mapred.reduce.tasks=2;
> from  t1
> insert overwrite table t2 partition(var_name='a') select count(a) cnt
> insert overwrite table t2 partition(var_name='b') select count(b) cnt;
> select * from t2;
> 1 a
> 1 a
> 1 b
> 1 b
> Wrong results.
> This happens whenever t1 is big enough to automatically generate more than one 
> reducer, without specifying the number directly.
> adding "group by 1" in the end of each insert solves the problem :
> from  t1
> insert overwrite table t2 partition(var_name='a') select count(a) cnt group 
> by 1
> insert overwrite table t2 partition(var_name='b') select count(b) cnt group 
> by 1;
> generates : 
> 2 a
> 2 b
> This should work without the group by...
> The number of rows for each partition will be the number of reducers.
> Each reducer calculates a subtotal of the count.





[jira] [Commented] (HIVE-707) add group_concat

2014-07-04 Thread Jian Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052301#comment-14052301
 ] 

Jian Wang commented on HIVE-707:


[~ph4t]
I used the concat_ws(' ', map_keys(UNION_MAP(MAP(your_column, 'dummy' 
approach instead of group_concat, but I got an error like this:
{code}
FAILED: SemanticException [Error 10011]: Line 172:30 Invalid function 
'UNION_MAP'
{code}
Should I add some jars?
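
For what it's worth: UNION_MAP is not a built-in Hive UDAF (it ships with the third-party Brickhouse UDF library), so it has to be registered with ADD JAR plus CREATE TEMPORARY FUNCTION before use. A group_concat-style result can also be approximated with built-ins only; a hedged sketch with hypothetical table and column names:

{code}
-- hypothetical table/columns; collect_set drops duplicates,
-- collect_list (Hive 0.13.0+) keeps them, but neither guarantees ordering
SELECT user_id,
       concat_ws(' ', collect_set(page)) AS pages
FROM visits
GROUP BY user_id;
{code}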


> add group_concat
> 
>
> Key: HIVE-707
> URL: https://issues.apache.org/jira/browse/HIVE-707
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Min Zhou
>
> Moving the discussion to a new jira:
> I've implemented group_cat() in a rush, and found some things difficult to 
> solve:
> 1. function group_cat() has an internal order by clause; currently, we can't 
> implement such an aggregation in hive.
> 2. when the strings to be group-concatenated are too large (in other words, 
> if data skew appears), there is often not enough memory to store such a big 
> result.





[jira] [Commented] (HIVE-7326) Hive complains invalid column reference with 'having' aggregate predicates

2014-07-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052253#comment-14052253
 ] 

Hive QA commented on HIVE-7326:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654033/HIVE-7326.1.patch.txt

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5692 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/678/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/678/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-678/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654033

> Hive complains invalid column reference with 'having' aggregate predicates
> --
>
> Key: HIVE-7326
> URL: https://issues.apache.org/jira/browse/HIVE-7326
> Project: Hive
>  Issue Type: Bug
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-7326.1.patch.txt
>
>
> CREATE TABLE TestV1_Staples (
>   Item_Count INT,
>   Ship_Priority STRING,
>   Order_Priority STRING,
>   Order_Status STRING,
>   Order_Quantity DOUBLE,
>   Sales_Total DOUBLE,
>   Discount DOUBLE,
>   Tax_Rate DOUBLE,
>   Ship_Mode STRING,
>   Fill_Time DOUBLE,
>   Gross_Profit DOUBLE,
>   Price DOUBLE,
>   Ship_Handle_Cost DOUBLE,
>   Employee_Name STRING,
>   Employee_Dept STRING,
>   Manager_Name STRING,
>   Employee_Yrs_Exp DOUBLE,
>   Employee_Salary DOUBLE,
>   Customer_Name STRING,
>   Customer_State STRING,
>   Call_Center_Region STRING,
>   Customer_Balance DOUBLE,
>   Customer_Segment STRING,
>   Prod_Type1 STRING,
>   Prod_Type2 STRING,
>   Prod_Type3 STRING,
>   Prod_Type4 STRING,
>   Product_Name STRING,
>   Product_Container STRING,
>   Ship_Promo STRING,
>   Supplier_Name STRING,
>   Supplier_Balance DOUBLE,
>   Supplier_Region STRING,
>   Supplier_State STRING,
>   Order_ID STRING,
>   Order_Year INT,
>   Order_Month INT,
>   Order_Day INT,
>   Order_Date_ STRING,
>   Order_Quarter STRING,
>   Product_Base_Margin DOUBLE,
>   Product_ID STRING,
>   Receive_Time DOUBLE,
>   Received_Date_ STRING,
>   Ship_Date_ STRING,
>   Ship_Charge DOUBLE,
>   Total_Cycle_Time DOUBLE,
>   Product_In_Stock STRING,
>   PID INT,
>   Market_Segment STRING
>   );
> Query that works:
> SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
> default.testv1_staples s1 GROUP BY customer_name HAVING (
> (COUNT(s1.discount) <= 822) AND
> (SUM(customer_balance) <= 4074689.00041)
> );
> Query that fails:
> SELECT customer_name, SUM(customer_balance), SUM(order_quantity) FROM 
> default.testv1_staples s1 GROUP BY customer_name HAVING (
> (SUM(customer_balance) <= 4074689.00041)
> AND (COUNT(s1.discount) <= 822)
> );





[jira] [Updated] (HIVE-5692) Make VectorGroupByOperator parameters configurable

2014-07-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-5692:
-

Labels: TODOC13  (was: )

> Make VectorGroupByOperator parameters configurable
> --
>
> Key: HIVE-5692
> URL: https://issues.apache.org/jira/browse/HIVE-5692
> Project: Hive
>  Issue Type: Bug
>Reporter: Remus Rusanu
>Assignee: Remus Rusanu
>Priority: Minor
>  Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: HIVE-5692.1.patch, HIVE-5692.2.patch, HIVE-5692.3.patch, 
> HIVE-5692.4.patch, HIVE-5692.5.patch, HIVE-5692.6.patch
>
>
> The FLUSH_CHECK_THRESHOLD and PERCENT_ENTRIES_TO_FLUSH should be configurable.
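
The corresponding Hive 0.13 properties appear to be the following (a hedged sketch; the values shown are the defaults, and the names should be verified against Configuration Properties):

{code}
SET hive.vectorized.groupby.checkinterval=100000;  -- rows processed between memory checks (FLUSH_CHECK_THRESHOLD)
SET hive.vectorized.groupby.flush.percent=0.1;     -- fraction of hash-table entries flushed at the threshold (PERCENT_ENTRIES_TO_FLUSH)
{code}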





[jira] [Updated] (HIVE-3990) Provide input threshold for direct-fetcher (HIVE-2925)

2014-07-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-3990:
-

Labels: TODOC13  (was: )

> Provide input threshold for direct-fetcher (HIVE-2925)
> --
>
> Key: HIVE-3990
> URL: https://issues.apache.org/jira/browse/HIVE-3990
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
>  Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: D8415.2.patch, D8415.3.patch, HIVE-3990.D8415.1.patch
>
>
> As a followup of HIVE-2925, add input threshold for fetch task conversion.
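
A hedged sketch of using the threshold in practice (property names as documented for Hive 0.13; the byte value is only an example):

{code}
SET hive.fetch.task.conversion=more;                  -- allow fetch-task conversion for simple queries
SET hive.fetch.task.conversion.threshold=1073741824;  -- example: only convert when the input is under 1 GB; -1 disables the size check
{code}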





[jira] [Updated] (HIVE-7220) Empty dir in external table causes issue (root_dir_external_table.q failure)

2014-07-04 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-7220:


Attachment: HIVE-7220.5.patch

I couldn't reproduce the dynpart_sort_optimization failure with this patch.  As 
the logs are gone, attaching again to see if I can get some idea.

> Empty dir in external table causes issue (root_dir_external_table.q failure)
> 
>
> Key: HIVE-7220
> URL: https://issues.apache.org/jira/browse/HIVE-7220
> Project: Hive
>  Issue Type: Bug
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7220.2.patch, HIVE-7220.3.patch, HIVE-7220.4.patch, 
> HIVE-7220.5.patch, HIVE-7220.5.patch, HIVE-7220.patch
>
>
> While looking at the root_dir_external_table.q failure, which runs a query on 
> an external table located at root ('/'), I noticed that the latest Hadoop 2 
> CombineFileInputFormat returns splits representing empty directories (like 
> '/Users'), which leads to a failure in Hive's CombineFileRecordReader as it 
> tries to open the directory for processing.
> Tried with an external table in a normal HDFS directory, and it also returns 
> the same error.  Looks like a real bug.





[jira] [Updated] (HIVE-1511) Hive plan serialization is slow

2014-07-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-1511:
-

Labels: TODOC13  (was: )

> Hive plan serialization is slow
> ---
>
> Key: HIVE-1511
> URL: https://issues.apache.org/jira/browse/HIVE-1511
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.7.0, 0.11.0
>Reporter: Ning Zhang
>Assignee: Mohammad Kamrul Islam
>  Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: HIVE-1511-wip.patch, HIVE-1511-wip2.patch, 
> HIVE-1511-wip3.patch, HIVE-1511-wip4.patch, HIVE-1511.10.patch, 
> HIVE-1511.11.patch, HIVE-1511.12.patch, HIVE-1511.13.patch, 
> HIVE-1511.14.patch, HIVE-1511.16.patch, HIVE-1511.17.patch, 
> HIVE-1511.4.patch, HIVE-1511.5.patch, HIVE-1511.6.patch, HIVE-1511.7.patch, 
> HIVE-1511.8.patch, HIVE-1511.9.patch, HIVE-1511.patch, HIVE-1511.wip.9.patch, 
> KryoHiveTest.java, failedPlan.xml, generated_plan.xml, run.sh
>
>
> As reported by Edward Capriolo:
> For reference I did this as a test case
> SELECT * FROM src where
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> key=0 OR key=0 OR key=0 OR  key=0 OR key=0 OR key=0 OR key=0 OR key=0
> OR key=0 OR key=0 OR key=0 OR
> ...(100 more of these)
> No OOM but I gave up after the test case did not go anywhere for about
> 2 minutes.





[jira] [Updated] (HIVE-5843) Transaction manager for Hive

2014-07-04 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-5843:
-

Labels: TODOC13  (was: )

> Transaction manager for Hive
> 
>
> Key: HIVE-5843
> URL: https://issues.apache.org/jira/browse/HIVE-5843
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 0.12.0
>Reporter: Alan Gates
>Assignee: Alan Gates
>  Labels: TODOC13
> Fix For: 0.13.0
>
> Attachments: 5843.5-wip.patch, HIVE-5843-src-only.6.patch, 
> HIVE-5843-src-only.patch, HIVE-5843.10.patch, HIVE-5843.2.patch, 
> HIVE-5843.3-src.path, HIVE-5843.3.patch, HIVE-5843.4-src.patch, 
> HIVE-5843.4.patch, HIVE-5843.6.patch, HIVE-5843.7.patch, HIVE-5843.8.patch, 
> HIVE-5843.8.src-only.patch, HIVE-5843.9.patch, HIVE-5843.patch, 
> HiveTransactionManagerDetailedDesign (1).pdf
>
>
> As part of the ACID work proposed in HIVE-5317 a transaction manager is 
> required.
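
For readers following along, a hedged sketch of the configuration that enables the transaction manager added here (property names per the Hive 0.13 transactions documentation; the compactor settings belong on the metastore side):

{code}
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;  -- replaces the default DummyTxnManager
SET hive.compactor.initiator.on=true;   -- metastore: start the compaction initiator thread
SET hive.compactor.worker.threads=1;    -- metastore: number of compactor worker threads
{code}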





[jira] [Commented] (HIVE-6981) Remove old website from SVN

2014-07-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052230#comment-14052230
 ] 

Lefty Leverenz commented on HIVE-6981:
--

Good catch, [~szehon].  The whole section needs to be rewritten because we're 
not using versioned xml docs anymore.  Wikidocs are now the official Hive 
documentation and they don't need to be committed.

Following the link to How to Release, I see we also need to delete steps 6, 7, 
and 8 in the Publishing section 
(https://cwiki.apache.org/confluence/display/Hive/HowToRelease#HowToRelease-Publishing).
  Or do those steps cover docs besides the old xml documentation, such as 
release notes?  Maybe we should ask the release managers for recent releases 
about that.

> Remove old website from SVN
> ---
>
> Key: HIVE-6981
> URL: https://issues.apache.org/jira/browse/HIVE-6981
> Project: Hive
>  Issue Type: Task
>Reporter: Brock Noland
>Assignee: Brock Noland
>
> Command to do removal:
> {noformat}
> svn delete https://svn.apache.org/repos/asf/hive/site/ --message "HIVE-6981 - 
> Remove old website from SVN"
> {noformat}





[jira] [Commented] (HIVE-7325) Support non-constant expressions for MAP type indices.

2014-07-04 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052226#comment-14052226
 ] 

Hive QA commented on HIVE-7325:
---



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12654029/HIVE-7325.1.patch.txt

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5676 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.ql.parse.TestParseNegative.testParseNegative_invalid_list_index
org.apache.hadoop.hive.ql.parse.TestParseNegative.testParseNegative_invalid_list_index2
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/676/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/676/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-676/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12654029

> Support non-constant expressions for MAP type indices.
> --
>
> Key: HIVE-7325
> URL: https://issues.apache.org/jira/browse/HIVE-7325
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.1
>Reporter: Mala Chikka Kempanna
>Assignee: Navis
> Fix For: 0.14.0
>
> Attachments: HIVE-7325.1.patch.txt
>
>
> Here is my sample:
> {code}
> CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
> TBLPROPERTIES ("hbase.table.name" = "RECORD"); 
> CREATE TABLE KEY_RECORD(KeyValue String, RecordId map) 
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
> TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 
> {code}
> The following join statement doesn't work. 
> {code}
> SELECT a.*, b.* from KEY_RECORD a join RECORD b 
> WHERE a.RecordId[b.RecordID] is not null;
> {code}
> FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
> supported. Error encountered near token 'RecordID' 
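
Until the fix is available, one hedged workaround sketch is to explode the map and join on its keys instead of indexing it with a column reference (RecordId is presumably a map whose type parameters were stripped in transit; this is not necessarily a drop-in replacement for every query shape):

{code}
-- uses the tables from the description above
SELECT a.*, b.*
FROM (
  SELECT k.KeyValue, rid_key, rid_val
  FROM KEY_RECORD k
  LATERAL VIEW explode(k.RecordId) r AS rid_key, rid_val
) a
JOIN RECORD b ON (a.rid_key = b.RecordID)
WHERE a.rid_val IS NOT NULL;
{code}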





[jira] [Commented] (HIVE-3227) Implement data loading from user provided string directly for test

2014-07-04 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052224#comment-14052224
 ] 

Navis commented on HIVE-3227:
-

[~appodictic] Restricted this to be used only in "hive.in.test=true". How about 
that?

> Implement data loading from user provided string directly for test
> --
>
> Key: HIVE-3227
> URL: https://issues.apache.org/jira/browse/HIVE-3227
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor, Testing Infrastructure
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3227.1.patch.txt
>
>
> {code}
> load data instream 'key value\nkey2 value2' into table test;
> {code}
> This will make test easier and also can reduce test time. For example,
> {code}
> -- ppr_pushdown.q
> create table ppr_test (key string) partitioned by (ds string);
> alter table ppr_test add partition (ds = '1234');
> insert overwrite table ppr_test partition(ds = '1234') select * from (select 
> '1234' from src limit 1 union all select 'abcd' from src limit 1) s;
> {code}
> last query is 4MR job. But can be replaced by
> {code}
> create table ppr_test (key string) partitioned by (ds string) ROW FORMAT 
> delimited fields terminated by ' ';
> alter table ppr_test add partition (ds = '1234');
> load data local instream '1234\nabcd' overwrite into table ppr_test 
> partition(ds = '1234');
> {code}





[jira] [Updated] (HIVE-3227) Implement data loading from user provided string directly for test

2014-07-04 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3227:


Attachment: HIVE-3227.1.patch.txt

> Implement data loading from user provided string directly for test
> --
>
> Key: HIVE-3227
> URL: https://issues.apache.org/jira/browse/HIVE-3227
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor, Testing Infrastructure
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3227.1.patch.txt
>
>
> {code}
> load data instream 'key value\nkey2 value2' into table test;
> {code}
> This will make test easier and also can reduce test time. For example,
> {code}
> -- ppr_pushdown.q
> create table ppr_test (key string) partitioned by (ds string);
> alter table ppr_test add partition (ds = '1234');
> insert overwrite table ppr_test partition(ds = '1234') select * from (select 
> '1234' from src limit 1 union all select 'abcd' from src limit 1) s;
> {code}
> last query is 4MR job. But can be replaced by
> {code}
> create table ppr_test (key string) partitioned by (ds string) ROW FORMAT 
> delimited fields terminated by ' ';
> alter table ppr_test add partition (ds = '1234');
> load data local instream '1234\nabcd' overwrite into table ppr_test 
> partition(ds = '1234');
> {code}





[jira] [Commented] (HIVE-7343) Update committer list

2014-07-04 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052217#comment-14052217
 ] 

Szehon Ho commented on HIVE-7343:
-

Ah thanks for catching that, +1.  I was going to verify with the hive-staging 
site after commit and before pushing as mentioned in the wiki, but looks good 
to me.

> Update committer list
> -
>
> Key: HIVE-7343
> URL: https://issues.apache.org/jira/browse/HIVE-7343
> Project: Hive
>  Issue Type: Test
>  Components: Documentation
>Reporter: Szehon Ho
>Assignee: Szehon Ho
> Attachments: HIVE-7343.2.patch, HIVE-7343.patch
>
>
> NO PRECOMMIT TESTS





[jira] [Commented] (HIVE-7257) UDF format_number() does not work on FLOAT types

2014-07-04 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052216#comment-14052216
 ] 

Lefty Leverenz commented on HIVE-7257:
--

I mentioned this bug fix in the wiki here:

* [UDFs -- String Functions | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-StringFunctions]

> UDF format_number() does not work on FLOAT types
> 
>
> Key: HIVE-7257
> URL: https://issues.apache.org/jira/browse/HIVE-7257
> Project: Hive
>  Issue Type: Bug
>Reporter: Wilbur Yang
>Assignee: Wilbur Yang
> Fix For: 0.14.0
>
> Attachments: HIVE-7257.1.patch
>
>
> #1 Show the table:
> hive> describe ssga3; 
> OK
> source    string
> test      float
> dt        timestamp
> Time taken: 0.243 seconds
> #2 Run format_number on double and it works:
> hive> select format_number(cast(test as double),2) from ssga3;
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201403131616_0009, Tracking URL = 
> http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0009
> Kill Command = 
> /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop/bin/hadoop job 
> -kill job_201403131616_0009
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2014-03-13 17:14:53,992 Stage-1 map = 0%, reduce = 0%
> 2014-03-13 17:14:59,032 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 
> sec
> 2014-03-13 17:15:00,046 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 
> sec
> 2014-03-13 17:15:01,056 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.47 
> sec
> 2014-03-13 17:15:02,067 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 
> 1.47 sec
> MapReduce Total cumulative CPU time: 1 seconds 470 msec
> Ended Job = job_201403131616_0009
> MapReduce Jobs Launched: 
> Job 0: Map: 1 Cumulative CPU: 1.47 sec HDFS Read: 299 HDFS Write: 10 SUCCESS
> Total MapReduce CPU Time Spent: 1 seconds 470 msec
> OK
> 1.00
> 2.00
> Time taken: 16.563 seconds
> #3 Run format_number on float and it does not work
> hive> select format_number(test,2) from ssga3; 
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_201403131616_0010, Tracking URL = 
> http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0010
> Kill Command = 
> /opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop/bin/hadoop job 
> -kill job_201403131616_0010
> Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
> 2014-03-13 17:20:21,158 Stage-1 map = 0%, reduce = 0%
> 2014-03-13 17:21:00,453 Stage-1 map = 100%, reduce = 100%
> Ended Job = job_201403131616_0010 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: 
> http://cdh5-1:50030/jobdetails.jsp?jobid=job_201403131616_0010
> Examining task ID: task_201403131616_0010_m_02 (and more) from job 
> job_201403131616_0010
> Unable to retrieve URL for Hadoop Task logs. Does not contain a valid 
> host:port authority: logicaljt
> Task with the most failures(4):
> Task ID:
> task_201403131616_0010_m_00
> Diagnostic Messages for this Task:
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
> Hive Runtime Error while processing row {"source":null,"test":1.0,"dt":null}
> at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:159)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
> at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
> Error while processing row {"source":null,"test":1.0,"dt":null}
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:675)
> at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:141)
> ..
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask
> MapReduce Jobs Launched: 
> Job 0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec


