[jira] [Created] (HIVE-3652) Join optimization for star schema

2012-11-01 Thread Amareshwari Sriramadasu (JIRA)
Amareshwari Sriramadasu created HIVE-3652:
-

 Summary: Join optimization for star schema
 Key: HIVE-3652
 URL: https://issues.apache.org/jira/browse/HIVE-3652
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu


Currently, if we join one fact table with multiple dimension tables, the query 
results in a separate mapreduce job for each join with a dimension table, because 
the join is on a different key for each dimension. 
Usually all the dimension tables are small and can fit into memory, so a 
map-side join can be used to join them with the fact table.

In this issue I want to look at optimizing such a query to generate a single 
mapreduce job, so that the mapper loads the dimension tables into memory and joins 
them with the fact table on the different keys as well.
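
A minimal sketch of the idea described above, not Hive's implementation: each 
small dimension table is loaded into an in-memory hash map, and every fact row 
is then joined against all of them in a single pass, each on its own key. The 
function names and the row layout (dicts with column names as keys) are 
assumptions made for illustration only.

```python
def build_hash_table(dim_rows, key):
    """Index a small dimension table by its join key (assumed unique)."""
    return {row[key]: row for row in dim_rows}


def star_join(fact_rows, dimensions):
    """Inner-join each fact row with every dimension table in one pass.

    `dimensions` is a list of (fact_key, dim_key, dim_rows) tuples, so each
    dimension may be joined on a different fact column.
    """
    # Load all dimension tables into memory once, before streaming facts.
    tables = [(fact_key, build_hash_table(rows, dim_key))
              for fact_key, dim_key, rows in dimensions]
    for fact in fact_rows:
        joined = dict(fact)
        for fact_key, table in tables:
            match = table.get(fact[fact_key])
            if match is None:
                break  # inner join: no match in this dimension, drop the row
            joined.update(match)
        else:
            yield joined
```

Because the hash tables are built once and the fact table is streamed, the 
work that would otherwise be spread over one job per dimension happens in a 
single scan of the fact rows.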

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Issues building hive

2012-11-01 Thread amareshwari sriramdasu
Which machine are you building Hive on? When I built on a Mac and tried to
execute, I got the same problem. It works fine on Linux.


On Wed, Oct 31, 2012 at 11:29 PM, Arnab Guin  wrote:

> Hi,
>
> I executed the following command on the root directory of the installation
> (checked out from SVN).
>
> ant clean package
>
> Hive built successfully but when I invoke Hive and execute a command, it
> gives the following error:
>
> bin/hive
>
> hive> CREATE TABLE X(S STRING);
> FAILED: ParseException line 1:13 cannot recognize input near 'X' '(' ')'
>
> Can anybody help out? I see the libraries have been built successfully in
> build/dist.
>
> Also, when I run the following command on the root directory:
>
> ant test -Dtestcase=TestCliDriver, I keep getting the following error:
>   [for] Cause: the class
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTask was not found.
>   [for] This looks like one of Ant's optional components.
>   [for] Action: Check that the appropriate optional JAR exists in
>   [for] -/usr/share/ant/lib
>   [for] -/root/.ant/lib
>   [for] -a directory added on the command line with the -lib
> argument
>
> I did download the junit.jar and ant-junit.jar files and put them in the
> /usr/share/ant/lib directory.
>
> Any help appreciated.
>
> Thanks.
>


[jira] [Commented] (HIVE-1362) column level statistics

2012-11-01 Thread Shreepadma Venugopalan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489276#comment-13489276
 ] 

Shreepadma Venugopalan commented on HIVE-1362:
--

@Namit: Thanks for your comments! I responded to most of your comments on 
phabricator. Thanks!

> column level statistics
> ---
>
> Key: HIVE-1362
> URL: https://issues.apache.org/jira/browse/HIVE-1362
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Ning Zhang
>Assignee: Shreepadma Venugopalan
> Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
> HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, 
> HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, 
> HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, 
> HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, 
> HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, 
> HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, 
> HIVE-1362_gen-thrift.8.patch.txt
>
>




[jira] [Commented] (HIVE-1362) column level statistics

2012-11-01 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489275#comment-13489275
 ] 

Phabricator commented on HIVE-1362:
---

shreepadma has commented on the revision "HIVE-1362 [jira] column level 
statistics".

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsWork.java:33 Can you 
please let me know what you would like to see dumped in the explain extended?
  ql/src/test/queries/clientnegative/columnstats_partlvl.q:13 Will add a 
negative test for a wrong column name and move each of the queries to a 
separate q file.
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsDesc.java:27 Can you 
please let me know what you would like to include in the explain output?
  ql/src/java/org/apache/hadoop/hive/ql/parse/StatsSemanticAnalyzer.java:216 
Can you please explain the purpose of creating an error message entry in 
addition to raising an exception? Throwing an exception should present all the 
error information that is needed.
  ql/src/test/queries/clientpositive/columnstats_partlvl.q:10 Will add tests 
with explain extended. However, I'm not sure what needs to be printed as part of 
the explain extended output for ColumnStatsWork. Please refer to my earlier comment 
on it.

REVISION DETAIL
  https://reviews.facebook.net/D6339

To: JIRA, njain, cwsteinbach
Cc: shreepadma


> column level statistics
> ---
>
> Key: HIVE-1362
> URL: https://issues.apache.org/jira/browse/HIVE-1362
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Ning Zhang
>Assignee: Shreepadma Venugopalan
> Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
> HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, 
> HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, 
> HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, 
> HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, 
> HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, 
> HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, 
> HIVE-1362_gen-thrift.8.patch.txt
>
>




[jira] [Updated] (HIVE-3554) Hive List Bucketing - Query logic

2012-11-01 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3554:
---

Attachment: HIVE-3554.patch.11

> Hive List Bucketing - Query logic
> -
>
> Key: HIVE-3554
> URL: https://issues.apache.org/jira/browse/HIVE-3554
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.10.0
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3554.patch.1, HIVE-3554.patch.10, 
> HIVE-3554.patch.11, HIVE-3554.patch.2, HIVE-3554.patch.3, HIVE-3554.patch.4, 
> HIVE-3554.patch.5, HIVE-3554.patch.7, HIVE-3554.patch.8, HIVE-3554.patch.9
>
>
> This is part of the effort for the list bucketing feature: 
> https://cwiki.apache.org/Hive/listbucketing.html
> This patch includes:
> 1. Query logic: Hive chooses the right sub-directory instead of the partition 
> directory.
> 2. The alter table grammar that is required to support the query logic.
> This patch doesn't include list bucketing DML. Main reasons:
> 1. Risk. Without DML, this patch doesn't touch any data manipulation, so it 
> won't impact any existing Hive features and is very low risk.
> 2. Manageability. With DML, the patch gets bigger and harder to review. 
> Without DML, it is easy to review.
> The feature is still disabled by default since DML is not in yet.
> DML will be in a follow-up patch. 
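
The query-logic item in the description above can be illustrated with a small 
sketch. This is not code from the patch: the function name, the sub-directory 
names and the shape of the skew mapping are all hypothetical. It only shows the 
idea of scanning one sub-directory of a partition when the query has an 
equality predicate on the skewed column, instead of scanning the whole 
partition directory.

```python
import posixpath


def choose_input_dirs(partition_dir, skewed_subdirs, default_subdir,
                      predicate_value=None):
    """Return the directories to scan for one partition.

    `skewed_subdirs` maps a skewed column value to the (hypothetical) name of
    the sub-directory holding its rows; all other rows are assumed to live in
    `default_subdir`.
    """
    if predicate_value is None:
        # No usable equality predicate: every sub-directory must be scanned.
        names = list(skewed_subdirs.values()) + [default_subdir]
    elif predicate_value in skewed_subdirs:
        # Predicate hits a skewed value: only its own sub-directory is needed.
        names = [skewed_subdirs[predicate_value]]
    else:
        # Non-skewed value: its rows can only be in the default sub-directory.
        names = [default_subdir]
    return [posixpath.join(partition_dir, name) for name in names]
```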



[jira] [Commented] (HIVE-3554) Hive List Bucketing - Query logic

2012-11-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489261#comment-13489261
 ] 

Namit Jain commented on HIVE-3554:
--

+1

will commit if tests pass

> Hive List Bucketing - Query logic
> -
>
> Key: HIVE-3554
> URL: https://issues.apache.org/jira/browse/HIVE-3554
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.10.0
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3554.patch.1, HIVE-3554.patch.10, 
> HIVE-3554.patch.2, HIVE-3554.patch.3, HIVE-3554.patch.4, HIVE-3554.patch.5, 
> HIVE-3554.patch.7, HIVE-3554.patch.8, HIVE-3554.patch.9
>
>



[jira] [Updated] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table

2012-11-01 Thread Ajesh Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajesh Kumar updated HIVE-3392:
--

Attachment: HIVE-3392.Test Case - with_trunk_version.txt

Hi Edward Capriolo,

I checked the latest version of Driver.java in 
trunk (org/apache/hadoop/hive/ql/Driver.java).
A lot of patches have already been applied to this file as part of fixes for 
other issues. 
The change we discussed, to display a clean error message, is taken care of by 
those changes, so we do not need any more code changes for this issue.

Attaching a test case for the same: "HIVE-3392.Test Case - 
with_trunk_version.txt".

> Hive unnecessarily validates table SerDes when dropping a table
> ---
>
> Key: HIVE-3392
> URL: https://issues.apache.org/jira/browse/HIVE-3392
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Jonathan Natkins
>Assignee: Ajesh Kumar
>  Labels: patch
> Attachments: HIVE-3392.2.patch.txt, HIVE-3392.Test Case - 
> with_trunk_version.txt
>
>
> natty@hadoop1:~$ hive
> hive> add jar 
> /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
> Added 
> /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
>  to class path
> Added resource: 
> /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
> hive> create table test (a int) row format serde 'hive.serde.JSONSerDe';  
>   
> OK
> Time taken: 2.399 seconds
> natty@hadoop1:~$ hive
> hive> drop table test;
>
> FAILED: Hive Internal Error: 
> java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException
>  SerDe hive.serde.JSONSerDe does not exist))
> java.lang.RuntimeException: 
> MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe 
> hive.serde.JSONSerDe does not exist)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
>   at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
>   at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943)
>   at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700)
>   at 
> org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:210)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
> Caused by: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException 
> SerDe com.cloudera.hive.serde.JSONSerDe does not exist)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:211)
>   at 
> org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260)
>   ... 20 more
> hive> add jar 
> /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar;
> Added 
> /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
>  to class path
> Added resource: 
> /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar
> hive> drop table test;
> OK
> Time taken: 0.658 seconds
> hive> 



[jira] [Updated] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table

2012-11-01 Thread Ajesh Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajesh Kumar updated HIVE-3392:
--

Attachment: (was: HIVE-3392.Test Case - After Patch.txt)

> Hive unnecessarily validates table SerDes when dropping a table
> ---
>
> Key: HIVE-3392
> URL: https://issues.apache.org/jira/browse/HIVE-3392
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Jonathan Natkins
>Assignee: Ajesh Kumar
>  Labels: patch
> Attachments: HIVE-3392.2.patch.txt
>
>



[jira] [Updated] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table

2012-11-01 Thread Ajesh Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajesh Kumar updated HIVE-3392:
--

Attachment: (was: HIVE-3392.Test Case - Before Patch.txt)

> Hive unnecessarily validates table SerDes when dropping a table
> ---
>
> Key: HIVE-3392
> URL: https://issues.apache.org/jira/browse/HIVE-3392
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Jonathan Natkins
>Assignee: Ajesh Kumar
>  Labels: patch
> Attachments: HIVE-3392.2.patch.txt
>
>



[jira] [Updated] (HIVE-3646) Add 'IGNORE PROTECTION' predicate for dropping partitions

2012-11-01 Thread Andrew Chalfant (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Chalfant updated HIVE-3646:
--

Attachment: HIVE-3646.1.patch.txt

Patch file

> Add 'IGNORE PROTECTION' predicate for dropping partitions
> -
>
> Key: HIVE-3646
> URL: https://issues.apache.org/jira/browse/HIVE-3646
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.9.0
>Reporter: Andrew Chalfant
>Assignee: Andrew Chalfant
>Priority: Minor
> Attachments: HIVE-3646.1.patch.txt
>
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>
> There are cases where it is desirable to move partitions between clusters. 
> Having to undo protection and then re-protect tables in order to delete 
> partitions from a source is a multi-step process and can leave us in a 
> fail-open state where partition and table metadata are dirty. By implementing 
> 'rm -rf'-like functionality, we can perform these operations atomically.
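
To make the motivation above concrete, here is a minimal sketch, assuming a 
hypothetical in-memory stand-in for metastore state (a dict mapping partition 
name to a `protected` flag). It is not the patch's code. The point is that the 
protection check and the drop happen in one step, so there is no window in 
which the table sits unprotected between an "unprotect" and a "re-protect".

```python
class ProtectedError(Exception):
    """Raised when a protected partition is dropped without the override."""


def drop_partition(partitions, name, ignore_protection=False):
    """Drop partition `name` from `partitions` in a single step.

    `partitions` maps partition name -> {"protected": bool}. The protection
    flags of all other partitions are never modified.
    """
    if partitions[name]["protected"] and not ignore_protection:
        raise ProtectedError(name)
    del partitions[name]
```

With the three-step alternative (unprotect, drop, re-protect), a failure after 
the first step would leave the remaining partitions unprotected; here a 
failure leaves the state exactly as it was.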



[jira] [Updated] (HIVE-3646) Add 'IGNORE PROTECTION' predicate for dropping partitions

2012-11-01 Thread Andrew Chalfant (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Chalfant updated HIVE-3646:
--

Release Note: This is a feature which enables us to ignore protection for 
partitions when dropping them, if specified. The use case is moving partitions 
between clusters. Unprotecting and reprotecting tables is non-atomic and thus 
prone to leaving table metadata in a dirty state.
  Status: Patch Available  (was: Open)

https://reviews.facebook.net/D6405

> Add 'IGNORE PROTECTION' predicate for dropping partitions
> -
>
> Key: HIVE-3646
> URL: https://issues.apache.org/jira/browse/HIVE-3646
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Affects Versions: 0.9.0
>Reporter: Andrew Chalfant
>Assignee: Andrew Chalfant
>Priority: Minor
>   Original Estimate: 1m
>  Remaining Estimate: 1m
>



[jira] [Updated] (HIVE-3384) HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification

2012-11-01 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-3384:
--

Assignee: (was: Chris Drome)

> HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC 
> specification
> --
>
> Key: HIVE-3384
> URL: https://issues.apache.org/jira/browse/HIVE-3384
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Weidong Bian
>Priority: Minor
> Attachments: HIVE-3384.patch
>
>
> The jdbc module cannot be compiled with JDK 7 because JDK 7 adds some abstract 
> methods to the JDBC specification. 
> Some error info:
>  error: HiveCallableStatement is not abstract and does not override abstract
> method getObject(String,Class) in CallableStatement
> .
> .
> .



[jira] [Updated] (HIVE-3554) Hive List Bucketing - Query logic

2012-11-01 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3554:
---

Status: Patch Available  (was: Open)

patch available in attachment and D5955

> Hive List Bucketing - Query logic
> -
>
> Key: HIVE-3554
> URL: https://issues.apache.org/jira/browse/HIVE-3554
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.10.0
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3554.patch.1, HIVE-3554.patch.10, 
> HIVE-3554.patch.2, HIVE-3554.patch.3, HIVE-3554.patch.4, HIVE-3554.patch.5, 
> HIVE-3554.patch.7, HIVE-3554.patch.8, HIVE-3554.patch.9
>
>



[jira] [Updated] (HIVE-3554) Hive List Bucketing - Query logic

2012-11-01 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3554:
---

Attachment: HIVE-3554.patch.10

> Hive List Bucketing - Query logic
> -
>
> Key: HIVE-3554
> URL: https://issues.apache.org/jira/browse/HIVE-3554
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.10.0
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3554.patch.1, HIVE-3554.patch.10, 
> HIVE-3554.patch.2, HIVE-3554.patch.3, HIVE-3554.patch.4, HIVE-3554.patch.5, 
> HIVE-3554.patch.7, HIVE-3554.patch.8, HIVE-3554.patch.9
>
>



[jira] [Updated] (HIVE-3651) bucketmapjoin?.q tests fail with hadoop 0.23

2012-11-01 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-3651:
--

Attachment: HIVE-3651-1.patch

> bucketmapjoin?.q  tests fail with hadoop 0.23
> -
>
> Key: HIVE-3651
> URL: https://issues.apache.org/jira/browse/HIVE-3651
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Attachments: HIVE-3651-1.patch
>
>
> The hive.log shows an error in the MR job:
> Task failed!
> Task ID:
>   Stage-1
> The job log has the following error:
> 2012-11-01 15:51:20,253 WARN  mapred.LocalJobRunner 
> (LocalJobRunner.java:run(479)) - job_local_0001
> java.lang.Exception: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable
>  (No such file or directory)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> Caused by: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable
>  (No such file or directory)
> at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> at java.lang.Thread.run(Thread.java:679)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3651) bucketmapjoin?.q tests fail with hadoop 0.23

2012-11-01 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-3651:
--

Status: Patch Available  (was: Open)

Review request on https://reviews.apache.org/r/7829/

> bucketmapjoin?.q  tests fail with hadoop 0.23
> -
>
> Key: HIVE-3651
> URL: https://issues.apache.org/jira/browse/HIVE-3651
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
> Attachments: HIVE-3651-1.patch
>
>
> The hive.log shows an error in the MR job:
> Task failed!
> Task ID:
>   Stage-1
> The job log has the following error:
> 2012-11-01 15:51:20,253 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(479)) - job_local_0001
> java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
>     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:679)



[jira] [Commented] (HIVE-3651) bucketmapjoin? tests fail with hadoop 0.23

2012-11-01 Thread Prasad Mujumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489146#comment-13489146
 ] 

Prasad Mujumdar commented on HIVE-3651:
---

The problem is similar to HIVE-3257. The hashtable file prefix is causing the 
file name comparison to fail.

> bucketmapjoin? tests fail with hadoop 0.23
> --
>
> Key: HIVE-3651
> URL: https://issues.apache.org/jira/browse/HIVE-3651
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
>
> The hive.log shows an error in the MR job:
> Task failed!
> Task ID:
>   Stage-1
> The job log has the following error:
> 2012-11-01 15:51:20,253 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(479)) - job_local_0001
> java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
>     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:679)



[jira] [Updated] (HIVE-3651) bucketmapjoin?.q tests fail with hadoop 0.23

2012-11-01 Thread Prasad Mujumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasad Mujumdar updated HIVE-3651:
--

Summary: bucketmapjoin?.q  tests fail with hadoop 0.23  (was: 
bucketmapjoin? tests fail with hadoop 0.23)

> bucketmapjoin?.q  tests fail with hadoop 0.23
> -
>
> Key: HIVE-3651
> URL: https://issues.apache.org/jira/browse/HIVE-3651
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Prasad Mujumdar
>Assignee: Prasad Mujumdar
>
> The hive.log shows an error in the MR job:
> Task failed!
> Task ID:
>   Stage-1
> The job log has the following error:
> 2012-11-01 15:51:20,253 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(479)) - job_local_0001
> java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
> Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
>     at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:679)



[jira] [Created] (HIVE-3651) bucketmapjoin? tests fail with hadoop 0.23

2012-11-01 Thread Prasad Mujumdar (JIRA)
Prasad Mujumdar created HIVE-3651:
-

 Summary: bucketmapjoin? tests fail with hadoop 0.23
 Key: HIVE-3651
 URL: https://issues.apache.org/jira/browse/HIVE-3651
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Prasad Mujumdar
Assignee: Prasad Mujumdar


The hive.log shows an error in the MR job:
Task failed!
Task ID:
  Stage-1

The job log has the following error:
2012-11-01 15:51:20,253 WARN  mapred.LocalJobRunner (LocalJobRunner.java:run(479)) - job_local_0001
java.lang.Exception: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:400)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: /home/prasadm/repos/apache/hive-patches/build/ql/scratchdir/local/hive_2012-11-01_15-51-06_176_6704298995984162430/-local-10003/HashTable-Stage-1/MapJoin-b-11-srcbucket21.txt.hashtable (No such file or directory)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:232)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)





[jira] [Updated] (HIVE-3649) Hive List Bucketing - enhance DDL to specify list bucketing table

2012-11-01 Thread Gang Tim Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3649:
---

Affects Version/s: 0.10.0
  Summary: Hive List Bucketing - enhance DDL to specify list 
bucketing table  (was: Hive List Bucketing - specify list bucketing table)

> Hive List Bucketing - enhance DDL to specify list bucketing table
> -
>
> Key: HIVE-3649
> URL: https://issues.apache.org/jira/browse/HIVE-3649
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.10.0
>Reporter: Gang Tim Liu
>
> We need to differentiate a normal skewed table from a list bucketing table. We
> use an optional parameter "store as DIRECTORIES":
> create table  (schema) skewed by (keys) on ('c1', 'c2') [store as DIRECTORIES];
> Details in
> https://cwiki.apache.org/confluence/display/Hive/ListBucketing



Hive-trunk-h0.21 - Build # 1769 - Still Failing

2012-11-01 Thread Apache Jenkins Server
Changes for Build #1764
[kevinwilfong] HIVE-3610. Add a command "Explain dependency ..." (Sambavi 
Muthukrishnan via kevinwilfong)


Changes for Build #1765

Changes for Build #1766
[hashutosh] HIVE-3441 : testcases escape1,escape2 fail on windows (Thejas Nair 
via Ashutosh Chauhan)

[kevinwilfong] HIVE-3499. add tests to use bucketing metadata for partitions. 
(njain via kevinwilfong)


Changes for Build #1767
[kevinwilfong] HIVE-3276. optimize union sub-queries. (njain via kevinwilfong)


Changes for Build #1768

Changes for Build #1769



1 tests failed.
FAILED:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try "ant test ... 
-Dtest.silent=false" to get more logs.

Stack Trace:
junit.framework.AssertionFailedError: Unexpected exception
See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" to get 
more logs.
at junit.framework.Assert.fail(Assert.java:47)
at 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_script_broken_pipe1(TestNegativeCliDriver.java:11553)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:422)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:931)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:785)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1769)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1769/ to 
view the results.

[jira] [Created] (HIVE-3650) Hive List Bucketing - validation

2012-11-01 Thread Gang Tim Liu (JIRA)
Gang Tim Liu created HIVE-3650:
--

 Summary: Hive List Bucketing - validation
 Key: HIVE-3650
 URL: https://issues.apache.org/jira/browse/HIVE-3650
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Gang Tim Liu
Priority: Minor


Many validations are done in each patch. This issue tracks the leftovers from 
the complete list:

https://cwiki.apache.org/confluence/display/Hive/ListBucketing



[jira] [Created] (HIVE-3649) Hive List Bucketing - specify list bucketing table

2012-11-01 Thread Gang Tim Liu (JIRA)
Gang Tim Liu created HIVE-3649:
--

 Summary: Hive List Bucketing - specify list bucketing table
 Key: HIVE-3649
 URL: https://issues.apache.org/jira/browse/HIVE-3649
 Project: Hive
  Issue Type: New Feature
Reporter: Gang Tim Liu


We need to differentiate a normal skewed table from a list bucketing table. We 
use an optional parameter "store as DIRECTORIES":

create table  (schema) skewed by (keys) on ('c1', 'c2') [store as DIRECTORIES];

Details in

https://cwiki.apache.org/confluence/display/Hive/ListBucketing





[jira] [Updated] (HIVE-3384) HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification

2012-11-01 Thread Chris Drome (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Drome updated HIVE-3384:
--

Assignee: Chris Drome

> HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC 
> specification
> --
>
> Key: HIVE-3384
> URL: https://issues.apache.org/jira/browse/HIVE-3384
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Reporter: Weidong Bian
>Assignee: Chris Drome
>Priority: Minor
> Attachments: HIVE-3384.patch
>
>
> The JDBC module can't be compiled with JDK 7, because JDK 7 adds some abstract
> methods to the JDBC specification.
> Sample error:
>  error: HiveCallableStatement is not abstract and does not override abstract method getObject(String,Class) in CallableStatement
> .
> .
> .
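
For illustration, the compile error above can be avoided by giving the class an override for each method that JDBC 4.1 (Java 7) added to the interface. The sketch below is not the HIVE-3384 patch. The class name is hypothetical, the choice to throw SQLFeatureNotSupportedException is an assumption, and the class is declared abstract only so the example compiles without the rest of CallableStatement.

```java
import java.sql.CallableStatement;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

// Hypothetical stand-in for a driver class such as HiveCallableStatement.
// Declared abstract only so this sketch compiles on its own; a real driver
// class would implement every method of the interface.
abstract class Jdbc41CallableStatementSketch implements CallableStatement {

    // Added to CallableStatement in JDBC 4.1 (Java 7).
    @Override
    public <T> T getObject(int parameterIndex, Class<T> type) throws SQLException {
        // Assumption: the driver does not support this call and says so.
        throw new SQLFeatureNotSupportedException("Method not supported");
    }

    // Added to CallableStatement in JDBC 4.1 (Java 7); this is the overload
    // named in the compiler error quoted above.
    @Override
    public <T> T getObject(String parameterName, Class<T> type) throws SQLException {
        throw new SQLFeatureNotSupportedException("Method not supported");
    }
}
```

Other JDBC interfaces gained methods in Java 7 as well (for example Statement.closeOnCompletion), so a full fix would likely need similar stubs in more than one class.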



[jira] [Updated] (HIVE-3640) Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks are both set

2012-11-01 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3640:


Affects Version/s: 0.10.0

> Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks 
> are both set
> -
>
> Key: HIVE-3640
> URL: https://issues.apache.org/jira/browse/HIVE-3640
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Vighnesh Avadhani
>Assignee: Vighnesh Avadhani
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3640.1.patch.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When I enforce bucketing and fix the number of reducers via 
> mapred.reduce.tasks Hive ignores my input and instead takes the largest value 
> <= hive.exec.reducers.max that is also an even divisor of num_buckets. In 
> other words, if I set 1024 buckets and set mapred.reduce.tasks=1024, I'll
> get... 256 reducers. If I set 1997 buckets and set mapred.reduce.tasks=1997,
> I'll get... 1 reducer.
> This is totally crazy, and it's far, far crazier when the data inputs get 
> large. In the latter case the bucketing job will almost certainly fail 
> because we'll most likely try to stuff several TB of input through a single 
> reducer. We'll also drastically reduce the effectiveness of bucketing, since 
> the buckets themselves will be larger.
> If the user sets mapred.reduce.tasks in a query that inserts into a bucketed 
> table we should either accept that value or raise an exception if it's 
> invalid relative to the number of buckets. We should absolutely NOT override 
> the user's direction and fall back on automatically allocating reducers based 
> on some obscure logic dictated by a completely different setting.
> I have yet to encounter a single person who expected this the first time, so 
> it's clearly a bug.
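
The behavior described above can be sketched as follows. This is a minimal illustration of the rule as the report states it, not Hive's actual code. The class and method names are hypothetical, the cap of 500 is an assumed value for hive.exec.reducers.max (the report does not give one; any cap from 256 to 511 reproduces the 256 figure), and treating "invalid" as "does not divide the bucket count evenly" is also an assumption.

```java
// Illustrative sketch only; names and the cap value are assumptions.
final class ReducerAllocationSketch {

    /**
     * Reported behavior: pick the largest divisor of numBuckets
     * that does not exceed maxReducers.
     */
    static int largestDivisorAtMost(int numBuckets, int maxReducers) {
        if (numBuckets < 1 || maxReducers < 1) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        for (int candidate = Math.min(numBuckets, maxReducers); candidate > 1; candidate--) {
            if (numBuckets % candidate == 0) {
                return candidate;
            }
        }
        return 1; // 1 divides every bucket count
    }

    /**
     * Requested behavior: accept the user's value, or fail loudly
     * when it does not fit the bucket count.
     */
    static int honorRequestedReducers(int numBuckets, int requestedReducers) {
        if (requestedReducers < 1 || numBuckets % requestedReducers != 0) {
            throw new IllegalArgumentException(
                "mapred.reduce.tasks=" + requestedReducers
                + " does not evenly divide " + numBuckets + " buckets");
        }
        return requestedReducers;
    }

    public static void main(String[] args) {
        int assumedMax = 500; // hypothetical hive.exec.reducers.max
        System.out.println(largestDivisorAtMost(1024, assumedMax)); // 256
        System.out.println(largestDivisorAtMost(1997, assumedMax)); // 1, since 1997 is prime
        System.out.println(honorRequestedReducers(1024, 1024));     // 1024
    }
}
```

With a prime bucket count such as 1997, the only divisor under the cap is 1, which is why the whole job collapses onto a single reducer.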



[jira] [Updated] (HIVE-3640) Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks are both set

2012-11-01 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3640:


Component/s: Query Processor

> Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks 
> are both set
> -
>
> Key: HIVE-3640
> URL: https://issues.apache.org/jira/browse/HIVE-3640
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Vighnesh Avadhani
>Assignee: Vighnesh Avadhani
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3640.1.patch.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When I enforce bucketing and fix the number of reducers via 
> mapred.reduce.tasks Hive ignores my input and instead takes the largest value 
> <= hive.exec.reducers.max that is also an even divisor of num_buckets. In 
> other words, if I set 1024 buckets and set mapred.reduce.tasks=1024, I'll
> get... 256 reducers. If I set 1997 buckets and set mapred.reduce.tasks=1997,
> I'll get... 1 reducer.
> This is totally crazy, and it's far, far crazier when the data inputs get 
> large. In the latter case the bucketing job will almost certainly fail 
> because we'll most likely try to stuff several TB of input through a single 
> reducer. We'll also drastically reduce the effectiveness of bucketing, since 
> the buckets themselves will be larger.
> If the user sets mapred.reduce.tasks in a query that inserts into a bucketed 
> table we should either accept that value or raise an exception if it's 
> invalid relative to the number of buckets. We should absolutely NOT override 
> the user's direction and fall back on automatically allocating reducers based 
> on some obscure logic dictated by a completely different setting.
> I have yet to encounter a single person who expected this the first time, so 
> it's clearly a bug.



[jira] [Updated] (HIVE-3640) Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks are both set

2012-11-01 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3640:


Fix Version/s: 0.10.0

> Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks 
> are both set
> -
>
> Key: HIVE-3640
> URL: https://issues.apache.org/jira/browse/HIVE-3640
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Vighnesh Avadhani
>Assignee: Vighnesh Avadhani
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3640.1.patch.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When I enforce bucketing and fix the number of reducers via 
> mapred.reduce.tasks Hive ignores my input and instead takes the largest value 
> <= hive.exec.reducers.max that is also an even divisor of num_buckets. In 
> other words, if I set 1024 buckets and set mapred.reduce.tasks=1024, I'll
> get... 256 reducers. If I set 1997 buckets and set mapred.reduce.tasks=1997,
> I'll get... 1 reducer.
> This is totally crazy, and it's far, far crazier when the data inputs get 
> large. In the latter case the bucketing job will almost certainly fail 
> because we'll most likely try to stuff several TB of input through a single 
> reducer. We'll also drastically reduce the effectiveness of bucketing, since 
> the buckets themselves will be larger.
> If the user sets mapred.reduce.tasks in a query that inserts into a bucketed 
> table we should either accept that value or raise an exception if it's 
> invalid relative to the number of buckets. We should absolutely NOT override 
> the user's direction and fall back on automatically allocating reducers based 
> on some obscure logic dictated by a completely different setting.
> I have yet to encounter a single person who expected this the first time, so 
> it's clearly a bug.



[jira] [Commented] (HIVE-3640) Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks are both set

2012-11-01 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488962#comment-13488962
 ] 

Kevin Wilfong commented on HIVE-3640:
-

+1

> Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks 
> are both set
> -
>
> Key: HIVE-3640
> URL: https://issues.apache.org/jira/browse/HIVE-3640
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Vighnesh Avadhani
>Assignee: Vighnesh Avadhani
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3640.1.patch.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When I enforce bucketing and fix the number of reducers via 
> mapred.reduce.tasks Hive ignores my input and instead takes the largest value 
> <= hive.exec.reducers.max that is also an even divisor of num_buckets. In 
> other words, if I set 1024 buckets and set mapred.reduce.tasks=1024, I'll
> get... 256 reducers. If I set 1997 buckets and set mapred.reduce.tasks=1997,
> I'll get... 1 reducer.
> This is totally crazy, and it's far, far crazier when the data inputs get 
> large. In the latter case the bucketing job will almost certainly fail 
> because we'll most likely try to stuff several TB of input through a single 
> reducer. We'll also drastically reduce the effectiveness of bucketing, since 
> the buckets themselves will be larger.
> If the user sets mapred.reduce.tasks in a query that inserts into a bucketed 
> table we should either accept that value or raise an exception if it's 
> invalid relative to the number of buckets. We should absolutely NOT override 
> the user's direction and fall back on automatically allocating reducers based 
> on some obscure logic dictated by a completely different setting.
> I have yet to encounter a single person who expected this the first time, so 
> it's clearly a bug.



[jira] [Commented] (HIVE-3554) Hive List Bucketing - Query logic

2012-11-01 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488954#comment-13488954
 ] 

Gang Tim Liu commented on HIVE-3554:


@Mark,

Changing a skewed column's type is also not allowed. The code already enforces this.

I am going to add a test case to cover it.

thanks

Tim

> Hive List Bucketing - Query logic
> -
>
> Key: HIVE-3554
> URL: https://issues.apache.org/jira/browse/HIVE-3554
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.10.0
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3554.patch.1, HIVE-3554.patch.2, HIVE-3554.patch.3, 
> HIVE-3554.patch.4, HIVE-3554.patch.5, HIVE-3554.patch.7, HIVE-3554.patch.8, 
> HIVE-3554.patch.9
>
>
> This is part of the effort for the list bucketing feature: 
> https://cwiki.apache.org/Hive/listbucketing.html
> This patch includes:
> 1. Query logic: Hive chooses the right sub-directory instead of the partition 
> directory.
> 2. The alter table grammar that is required to support the query logic.
> This patch doesn't include list bucketing DML. Main reasons:
> 1. Risk. Without DML, this patch won't impact any existing Hive features, 
> since it doesn't touch any data manipulation, so the risk is very low.
> 2. Manageability. With DML, the patch gets bigger and harder to review. 
> Without DML, it's easy to review.
> We still disable the feature by default since DML is not in yet.
> DML will be in a follow-up patch.



[jira] [Commented] (HIVE-2693) Add DECIMAL data type

2012-11-01 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488946#comment-13488946
 ] 

Vikram Dixit K commented on HIVE-2693:
--

Ok. I will look into this error.

> Add DECIMAL data type
> -
>
> Key: HIVE-2693
> URL: https://issues.apache.org/jira/browse/HIVE-2693
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor, Types
>Reporter: Carl Steinbach
>Assignee: Prasad Mujumdar
> Attachments: HIVE-2693-all.patch, HIVE-2693-fix.patch, 
> HIVE-2693.patch, HIVE-2693-take3.patch, HIVE-2693-take4.patch
>
>
> Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
> template for how to do this.



[jira] [Updated] (HIVE-3621) Make prompt in Hive CLI configurable

2012-11-01 Thread Jingwei Lu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingwei Lu updated HIVE-3621:
-

Attachment: HIVE-3621.patch.1.txt

> Make prompt in Hive CLI configurable
> 
>
> Key: HIVE-3621
> URL: https://issues.apache.org/jira/browse/HIVE-3621
> Project: Hive
>  Issue Type: Improvement
>  Components: CLI
>Affects Versions: 0.9.0
>Reporter: Jingwei Lu
>Assignee: Jingwei Lu
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3621.patch.1.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Right now the Hive CLI prompt just says "hive>". For users (primarily power 
> users) who run in different clusters, it can be easy to forget which cluster 
> the Hive CLI is pointing to.  If we change the Hive CLI prompt to be 
> something like "hive(silver)>" it would be much clearer.  We could 
> potentially extend this to namespaces as well.



[jira] [Commented] (HIVE-3554) Hive List Bucketing - Query logic

2012-11-01 Thread Gang Tim Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488920#comment-13488920
 ] 

Gang Tim Liu commented on HIVE-3554:


@Mark,

Renaming a skewed column is not allowed. This is covered by column_rename5.q.

thanks

Tim

> Hive List Bucketing - Query logic
> -
>
> Key: HIVE-3554
> URL: https://issues.apache.org/jira/browse/HIVE-3554
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.10.0
>Reporter: Gang Tim Liu
>Assignee: Gang Tim Liu
> Attachments: HIVE-3554.patch.1, HIVE-3554.patch.2, HIVE-3554.patch.3, 
> HIVE-3554.patch.4, HIVE-3554.patch.5, HIVE-3554.patch.7, HIVE-3554.patch.8, 
> HIVE-3554.patch.9
>
>
> This is part of the effort for the list bucketing feature: 
> https://cwiki.apache.org/Hive/listbucketing.html
> This patch includes:
> 1. Query logic: Hive chooses the right sub-directory instead of the partition 
> directory.
> 2. The alter table grammar that is required to support the query logic.
> This patch doesn't include list bucketing DML. Main reasons:
> 1. Risk. Without DML, this patch won't impact any existing Hive features, 
> since it doesn't touch any data manipulation, so the risk is very low.
> 2. Manageability. With DML, the patch gets bigger and harder to review. 
> Without DML, it's easy to review.
> We still disable the feature by default since DML is not in yet.
> DML will be in a follow-up patch.



[jira] [Updated] (HIVE-3640) Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks are both set

2012-11-01 Thread Vighnesh Avadhani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vighnesh Avadhani updated HIVE-3640:


Status: Patch Available  (was: Open)

> Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks 
> are both set
> -
>
> Key: HIVE-3640
> URL: https://issues.apache.org/jira/browse/HIVE-3640
> Project: Hive
>  Issue Type: Bug
>Reporter: Vighnesh Avadhani
>Assignee: Vighnesh Avadhani
>Priority: Minor
> Attachments: HIVE-3640.1.patch.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When I enforce bucketing and fix the number of reducers via 
> mapred.reduce.tasks, Hive ignores my input and instead takes the largest value 
> <= hive.exec.reducers.max that is also an even divisor of num_buckets. In 
> other words, if I set 1024 buckets and set mapred.reduce.tasks=1024 I'll 
> get... 256 reducers. If I set 1997 buckets and set mapred.reduce.tasks=1997 
> I'll get... 1 reducer. 
> This is totally crazy, and it's far, far crazier when the data inputs get 
> large. In the latter case the bucketing job will almost certainly fail 
> because we'll most likely try to stuff several TB of input through a single 
> reducer. We'll also drastically reduce the effectiveness of bucketing, since 
> the buckets themselves will be larger.
> If the user sets mapred.reduce.tasks in a query that inserts into a bucketed 
> table, we should either accept that value or raise an exception if it's 
> invalid relative to the number of buckets. We should absolutely NOT override 
> the user's direction and fall back on automatically allocating reducers based 
> on some obscure logic dictated by a completely different setting. 
> I have yet to encounter a single person who expected this the first time, so 
> it's clearly a bug.
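
The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the rule as the report states it ("largest value <= hive.exec.reducers.max that is also an even divisor of num_buckets"), not Hive's actual code. The function name is hypothetical, and the report does not give the reporter's hive.exec.reducers.max, so the value 300 below is an assumption chosen because it is consistent with the 256 figure.

```python
def largest_divisor_at_most(num_buckets, max_reducers):
    """Largest d <= max_reducers that divides num_buckets evenly."""
    for d in range(min(num_buckets, max_reducers), 0, -1):
        if num_buckets % d == 0:
            return d
    raise ValueError("num_buckets and max_reducers must be positive")


# Assumed stand-in for hive.exec.reducers.max (not stated in the report).
ASSUMED_MAX_REDUCERS = 300

print(largest_divisor_at_most(1024, ASSUMED_MAX_REDUCERS))  # 256
print(largest_divisor_at_most(1997, ASSUMED_MAX_REDUCERS))  # 1
```

1997 is prime, so its only divisor below any cap smaller than 1997 is 1, which is why the second case collapses to a single reducer regardless of the exact cap.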



[jira] [Commented] (HIVE-3640) Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks are both set

2012-11-01 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488862#comment-13488862
 ] 

Kevin Wilfong commented on HIVE-3640:
-

A couple of minor comments on the diff.

Could you also hit the "Submit Patch" button to mark the JIRA "Patch Available" 
so reviewers know to review the diff?

> Reducer allocation is incorrect if enforce bucketing and mapred.reduce.tasks 
> are both set
> -
>
> Key: HIVE-3640
> URL: https://issues.apache.org/jira/browse/HIVE-3640
> Project: Hive
>  Issue Type: Bug
>Reporter: Vighnesh Avadhani
>Assignee: Vighnesh Avadhani
>Priority: Minor
> Attachments: HIVE-3640.1.patch.txt
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When I enforce bucketing and fix the number of reducers via 
> mapred.reduce.tasks, Hive ignores my input and instead takes the largest value 
> <= hive.exec.reducers.max that is also an even divisor of num_buckets. In 
> other words, if I set 1024 buckets and set mapred.reduce.tasks=1024 I'll 
> get... 256 reducers. If I set 1997 buckets and set mapred.reduce.tasks=1997 
> I'll get... 1 reducer. 
> This is totally crazy, and it's far, far crazier when the data inputs get 
> large. In the latter case the bucketing job will almost certainly fail 
> because we'll most likely try to stuff several TB of input through a single 
> reducer. We'll also drastically reduce the effectiveness of bucketing, since 
> the buckets themselves will be larger.
> If the user sets mapred.reduce.tasks in a query that inserts into a bucketed 
> table, we should either accept that value or raise an exception if it's 
> invalid relative to the number of buckets. We should absolutely NOT override 
> the user's direction and fall back on automatically allocating reducers based 
> on some obscure logic dictated by a completely different setting. 
> I have yet to encounter a single person who expected this the first time, so 
> it's clearly a bug.



[jira] [Updated] (HIVE-3613) Implement grouping_id function

2012-11-01 Thread Ivan Gorbachev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Gorbachev updated HIVE-3613:
-

Status: Patch Available  (was: Open)

> Implement grouping_id function
> --
>
> Key: HIVE-3613
> URL: https://issues.apache.org/jira/browse/HIVE-3613
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Ivan Gorbachev
>Assignee: Ivan Gorbachev
> Attachments: jira-3613.0.patch
>
>




[jira] [Updated] (HIVE-3613) Implement grouping_id function

2012-11-01 Thread Ivan Gorbachev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Gorbachev updated HIVE-3613:
-

Attachment: jira-3613.0.patch

> Implement grouping_id function
> --
>
> Key: HIVE-3613
> URL: https://issues.apache.org/jira/browse/HIVE-3613
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Ivan Gorbachev
>Assignee: Ivan Gorbachev
> Attachments: jira-3613.0.patch
>
>




[jira] [Commented] (HIVE-3613) Implement grouping_id function

2012-11-01 Thread Ivan Gorbachev (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488850#comment-13488850
 ] 

Ivan Gorbachev commented on HIVE-3613:
--

https://reviews.facebook.net/D6375

> Implement grouping_id function
> --
>
> Key: HIVE-3613
> URL: https://issues.apache.org/jira/browse/HIVE-3613
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Ivan Gorbachev
>Assignee: Ivan Gorbachev
>




Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #185

2012-11-01 Thread Apache Jenkins Server
See 


--
[...truncated 10125 lines...]
 [echo] Project: odbc
 [copy] Warning: 

 does not exist.

ivy-resolve-test:
 [echo] Project: odbc

ivy-retrieve-test:
 [echo] Project: odbc

compile-test:
 [echo] Project: odbc

create-dirs:
 [echo] Project: serde
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: serde

ivy-init-settings:
 [echo] Project: serde

ivy-resolve:
 [echo] Project: serde
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: serde

dynamic-serde:

compile:
 [echo] Project: serde

ivy-resolve-test:
 [echo] Project: serde

ivy-retrieve-test:
 [echo] Project: serde

compile-test:
 [echo] Project: serde
[javac] Compiling 26 source files to 

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
 [echo] Project: service
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: service

ivy-init-settings:
 [echo] Project: service

ivy-resolve:
 [echo] Project: service
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: service

compile:
 [echo] Project: service

ivy-resolve-test:
 [echo] Project: service

ivy-retrieve-test:
 [echo] Project: service

compile-test:
 [echo] Project: service
[javac] Compiling 2 source files to 


test:
 [echo] Project: hive

test-shims:
 [echo] Project: hive

test-conditions:
 [echo] Project: shims

gen-test:
 [echo] Project: shims

create-dirs:
 [echo] Project: shims
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 

 against hadoop 0.20.2 
(

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 


ivy-retrieve-hadoop-shim:
 [echo] Project: shims
 [echo] Building shims 0.20S

build_shims:
 [echo] Project: shims
 [echo] Compiling 


Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #185

2012-11-01 Thread Apache Jenkins Server
See 

--
[...truncated 10092 lines...]

compile-test:
 [echo] Project: metastore
[javac] Compiling 18 source files to 


create-dirs:
 [echo] Project: odbc
[mkdir] Created dir: 

[mkdir] Created dir: 

[mkdir] Created dir: 

[mkdir] Created dir: 

[mkdir] Created dir: 

[mkdir] Created dir: 

 [copy] Warning: 

 does not exist.

init:
 [echo] Project: odbc

setup:
 [echo] Project: odbc

ivy-init-settings:
 [echo] Project: odbc

ivy-resolve:
 [echo] Project: odbc
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: odbc

compile:
 [echo] Project: odbc
 [copy] Warning: 
 
does not exist.

ivy-resolve-test:
 [echo] Project: odbc

ivy-retrieve-test:
 [echo] Project: odbc

compile-test:
 [echo] Project: odbc

create-dirs:
 [echo] Project: serde
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: serde

ivy-init-settings:
 [echo] Project: serde

ivy-resolve:
 [echo] Project: serde
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: serde

dynamic-serde:

compile:
 [echo] Project: serde

ivy-resolve-test:
 [echo] Project: serde

ivy-retrieve-test:
 [echo] Project: serde

compile-test:
 [echo] Project: serde
[javac] Compiling 26 source files to 

[javac] Note: Some input files use or override a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
 [echo] Project: service
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: service

ivy-init-settings:
 [echo] Project: service

ivy-resolve:
 [echo] Project: service
[ivy:resolve] :: loading settings :: file = 

[ivy:report] Processing 

 to 


ivy-retrieve:
 [echo] Project: service

compile:
 [echo] Project: service

ivy-resolve-test:
 [echo] Project: service

ivy-retrieve-test:
 [echo] Project: service

compile-test:
 [echo] Project: service
[javac] Compiling 2 source files to 


test:
 [echo] Project: hive

test-shims:
 [echo] Project: hive

test-conditions:
 [echo] Project: shims

gen-test:
 [echo] Project: shims

create-dirs:
 [echo] Project: shims
 [copy] Warning: 

 does not exist.

init:
 [echo] Project: shims

ivy-init-se

[jira] [Updated] (HIVE-33) [Hive]: Add ability to compute statistics on hive tables

2012-11-01 Thread Tianxu Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianxu Wang updated HIVE-33:


Summary: [Hive]: Add ability to compute statistics on hive tables  (was: 
[Hive]: Add ability to compute statistics on hive table)

> [Hive]: Add ability to compute statistics on hive tables
> 
>
> Key: HIVE-33
> URL: https://issues.apache.org/jira/browse/HIVE-33
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor, Statistics
>Reporter: Ashish Thusoo
>  Labels: statistics
>
> Add commands to collect partition and column level statistics in hive.



[jira] [Updated] (HIVE-33) [Hive]: Add ability to compute statistics on hive table

2012-11-01 Thread Tianxu Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tianxu Wang updated HIVE-33:


Summary: [Hive]: Add ability to compute statistics on hive table  (was: 
[Hive]: Add ability to compute statistics on hive tables)

> [Hive]: Add ability to compute statistics on hive table
> ---
>
> Key: HIVE-33
> URL: https://issues.apache.org/jira/browse/HIVE-33
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor, Statistics
>Reporter: Ashish Thusoo
>  Labels: statistics
>
> Add commands to collect partition and column level statistics in hive.



[jira] [Created] (HIVE-3648) HiveMetaStoreFsImpl is not compatible with hadoop viewfs

2012-11-01 Thread Kihwal Lee (JIRA)
Kihwal Lee created HIVE-3648:


 Summary: HiveMetaStoreFsImpl is not compatible with hadoop viewfs
 Key: HIVE-3648
 URL: https://issues.apache.org/jira/browse/HIVE-3648
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.9.0, 0.10.0
Reporter: Kihwal Lee


HiveMetaStoreFsImpl#deleteDir() method calls Trash#moveToTrash(). This may not 
work when viewfs is used. It needs to call Trash#moveToAppropriateTrash() 
instead.  Please note that this method is not available in hadoop versions 
earlier than 0.23.



Re: number of buckets in bucket map join

2012-11-01 Thread Mark Grover
Hi Mahsa,
Just to elaborate a little further.
Bucket joins are quicker than regular joins because they lessen the number
of logical multiplications that need to happen between the records from two
tables being joined.

A regular join would logically multiply each row from one table with
each row of another table, leading to O(n^2) logical multiplications. A
bucket map join reduces the number of multiplications by only joining the
corresponding buckets. The answer to your question lies in the explanation
of how Hive figures out which buckets from the tables correspond to each
other.

The default method by which Hive distributes rows across buckets is by
hash_function(bucketing_column) mod num_buckets (reference:
http://hive.apache.org/docs/r0.8.1/language_manual/working_with_bucketed_tables.html).
Given that the hash function is deterministic, it gives the same result for
the same argument every single time it's called.

Let's say table t1 has n1 buckets and t2 has n2 buckets. Therefore a record
with key some_key would go in bucket number hash_function(some_key) mod n1
in t1 and bucket number hash_function(some_key) mod n2 in t2.

If n1 and n2 are equal, the bucket number would be the same in both tables.
Consequently, Hive only needs to join same bucket numbers with each other,
because if a record with some join key is present in bucket i of table t1,
all records in table t2 with the same join key must be present in bucket i
of table t2.

Now, let's come to the case where the numbers of buckets of the 2 tables are
not equal. Without loss of generality, we can assume that t1 has n1 buckets
and t2 has n2 buckets where n1 < n2. For a bucket join to work, Hive has
to be able to deterministically figure out which buckets of t2
bucket i of t1 should be joined with.

If n2 is a multiple of n1, then it can be proved that
hash_function(some_key) mod n2 is one of the following n2/n1 values:
hash_function(some_key) mod n1, hash_function(some_key) mod n1 + n1,
hash_function(some_key) mod n1 + 2 * n1, ...,
hash_function(some_key) mod n1 + (n2/n1 - 1) * n1. In other words, if a record
with a given join key exists in bucket i of t1, a record with the same
join key can only exist in bucket i, i + n1, i + 2*n1, ..., or
i + (n2/n1 - 1) * n1 of t2. Consequently, Hive only needs to logically multiply
records from bucket i of t1 with the above n2/n1 buckets from t2 to
perform a join.

If n2 is not a multiple of n1, no nice relationship between the mods of n1
and n2 exists, so it can't be determined which bucket from table1 should be
joined with which bucket from table2. So we are back to joining the entire
table instead of individual buckets.

Therefore, the number of buckets in one table has to be a multiple of the
number of buckets in the other in order for a bucketed join to work.
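
To make the "n2 is a multiple of n1" case concrete, here is a small Python sketch. It is not Hive code: the function names are made up for illustration, and plain non-negative integers stand in for hash_function(some_key).

```python
def bucket_of(key_hash, num_buckets):
    """Bucket number for a record: hash value mod number of buckets."""
    return key_hash % num_buckets


def matching_buckets(i, n1, n2):
    """Buckets of t2 (n2 buckets) that can hold keys found in bucket i of
    t1 (n1 buckets). Only defined when n2 is a multiple of n1."""
    if n2 % n1 != 0:
        raise ValueError("n2 must be a multiple of n1")
    return [i + j * n1 for j in range(n2 // n1)]


# Example: t1 has 4 buckets, t2 has 12 buckets.
n1, n2 = 4, 12
for key_hash in range(1000):
    i = bucket_of(key_hash, n1)
    # The key's bucket in t2 is always one of the n2/n1 candidates.
    assert bucket_of(key_hash, n2) in matching_buckets(i, n1, n2)

print(matching_buckets(1, n1, n2))  # [1, 5, 9]
```

So bucket 1 of the 4-bucket table only ever needs to be joined with buckets 1, 5 and 9 of the 12-bucket table.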

Hope that helps. BTW, the above is just my understanding of bucketed joins,
I haven't verified that this is in fact what the present code does.

Mark

On Wed, Oct 31, 2012 at 4:50 PM, Mahsa Mofidpoor wrote:

> Hi,
>
> Hive summit 2011 says for performing a bucket join the number of buckets of
> all tables has to be a multiple of each other.
> Does anybody know the reason?
>
> Thank you.
> Mahsa
>


[jira] [Updated] (HIVE-3647) map-side groupby wrongly due to HIVE-3432

2012-11-01 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3647:
-

Status: Patch Available  (was: Open)

> map-side groupby wrongly due to HIVE-3432
> -
>
> Key: HIVE-3647
> URL: https://issues.apache.org/jira/browse/HIVE-3647
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.3647.1.patch
>
>
> There seems to be a bug due to HIVE-3432.
> We are converting the group by to a map side group by after only looking at
> sorting columns. This can give wrong results if the data is sorted and
> bucketed by different columns.
> Add some tests for that scenario, verify and fix any issues.



[jira] [Updated] (HIVE-3570) Add/fix facility to collect operator specific statisticsin hive + add hash-in/hash-out counter for GroupBy Optr

2012-11-01 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3570:
-

Status: Open  (was: Patch Available)

minor comments

> Add/fix facility to collect operator specific statisticsin hive + add 
> hash-in/hash-out counter for GroupBy Optr
> ---
>
> Key: HIVE-3570
> URL: https://issues.apache.org/jira/browse/HIVE-3570
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 0.10.0
>Reporter: Satadru Pan
>Assignee: Satadru Pan
>Priority: Minor
> Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch, 
> HIVE-3570.D5985.2.patch, HIVE-3570.D5985.3.patch, HIVE-3570.D5985.4.patch, 
> HIVE-3570.D5985.5.patch, HIVE-3570.D5985.6.patch, HIVE-3570.D5985.7.patch
>
>
> Requirement: Collect Operator specific stats for hive queries. Use the 
> counter framework available in Hive Operator.java to accomplish that.



[jira] [Commented] (HIVE-3570) Add/fix facility to collect operator specific statisticsin hive + add hash-in/hash-out counter for GroupBy Optr

2012-11-01 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488552#comment-13488552
 ] 

Phabricator commented on HIVE-3570:
---

njain has requested changes to the revision "HIVE-3570 [jira] Hive changes for 
Optr level stats".

  ignore my accept, can you try putting a comment

INLINE COMMENTS
  ql/src/test/queries/clientpositive/OptrStatGroupBy.q:1 There is a bug where 
comments cannot come before a SET statement.

  Can you try putting a comment before line 4 (select count(1) from src) ?

REVISION DETAIL
  https://reviews.facebook.net/D5985

BRANCH
  svn

To: njain, sambavim, kevinwilfong, satadru
Cc: JIRA, adobriyal


> Add/fix facility to collect operator specific statisticsin hive + add 
> hash-in/hash-out counter for GroupBy Optr
> ---
>
> Key: HIVE-3570
> URL: https://issues.apache.org/jira/browse/HIVE-3570
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 0.10.0
>Reporter: Satadru Pan
>Assignee: Satadru Pan
>Priority: Minor
> Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch, 
> HIVE-3570.D5985.2.patch, HIVE-3570.D5985.3.patch, HIVE-3570.D5985.4.patch, 
> HIVE-3570.D5985.5.patch, HIVE-3570.D5985.6.patch, HIVE-3570.D5985.7.patch
>
>
> Requirement: Collect Operator specific stats for hive queries. Use the 
> counter framework available in Hive Operator.java to accomplish that.



[jira] [Commented] (HIVE-3570) Add/fix facility to collect operator specific statisticsin hive + add hash-in/hash-out counter for GroupBy Optr

2012-11-01 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488550#comment-13488550
 ] 

Phabricator commented on HIVE-3570:
---

njain has accepted the revision "HIVE-3570 [jira] Hive changes for Optr level 
stats".

INLINE COMMENTS
  ql/src/test/results/clientpositive/optrstat_groupby.q.out:5 Outside the scope 
of this patch, but there seems to be a bug in num output rows.
  It should be 1, not 0.
  Can you file a new jira with the details ?

REVISION DETAIL
  https://reviews.facebook.net/D5985

BRANCH
  svn

To: njain, sambavim, kevinwilfong, satadru
Cc: JIRA, adobriyal


> Add/fix facility to collect operator specific statisticsin hive + add 
> hash-in/hash-out counter for GroupBy Optr
> ---
>
> Key: HIVE-3570
> URL: https://issues.apache.org/jira/browse/HIVE-3570
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 0.10.0
>Reporter: Satadru Pan
>Assignee: Satadru Pan
>Priority: Minor
> Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch, 
> HIVE-3570.D5985.2.patch, HIVE-3570.D5985.3.patch, HIVE-3570.D5985.4.patch, 
> HIVE-3570.D5985.5.patch, HIVE-3570.D5985.6.patch, HIVE-3570.D5985.7.patch
>
>
> Requirement: Collect Operator specific stats for hive queries. Use the 
> counter framework available in Hive Operator.java to accomplish that.



[jira] [Updated] (HIVE-1362) column level statistics

2012-11-01 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1362:
-

Status: Open  (was: Patch Available)

minor comments on phabricator

> column level statistics
> ---
>
> Key: HIVE-1362
> URL: https://issues.apache.org/jira/browse/HIVE-1362
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Ning Zhang
>Assignee: Shreepadma Venugopalan
> Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
> HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, 
> HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, 
> HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, 
> HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, 
> HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, 
> HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, 
> HIVE-1362_gen-thrift.8.patch.txt
>
>




[jira] [Commented] (HIVE-1362) column level statistics

2012-11-01 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488546#comment-13488546
 ] 

Phabricator commented on HIVE-1362:
---

njain has commented on the revision "HIVE-1362 [jira] column level statistics".

  If you haven't done so already, file a JIRA for upgrade scripts for other DBs.

INLINE COMMENTS
  ql/src/test/queries/clientpositive/columnstats_partlvl.q:10 add some tests 
with explain extended.
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsWork.java:33 More stuff 
should be dumped in explain(extended)
  ql/src/java/org/apache/hadoop/hive/ql/plan/ColumnStatsDesc.java:27 explain 
should output this

REVISION DETAIL
  https://reviews.facebook.net/D6339

To: JIRA, njain, cwsteinbach


> column level statistics
> ---
>
> Key: HIVE-1362
> URL: https://issues.apache.org/jira/browse/HIVE-1362
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Ning Zhang
>Assignee: Shreepadma Venugopalan
> Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
> HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, 
> HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, 
> HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, 
> HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, 
> HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, 
> HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, 
> HIVE-1362_gen-thrift.8.patch.txt
>
>




[jira] [Commented] (HIVE-1362) column level statistics

2012-11-01 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488543#comment-13488543
 ] 

Phabricator commented on HIVE-1362:
---

njain has commented on the revision "HIVE-1362 [jira] column level statistics".

  you have a columnstatsdesc/columnstatswork etc.
  Rename the file as columnstatssemanticanalyzer

  I did not look into too much detail in the UDAF - I am assuming Carl has 
already reviewed that part

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/parse/StatsSemanticAnalyzer.java:216 
Create an entry in ErrorMsg, as is done for all other exceptions.

REVISION DETAIL
  https://reviews.facebook.net/D6339

To: JIRA, njain, cwsteinbach


> column level statistics
> ---
>
> Key: HIVE-1362
> URL: https://issues.apache.org/jira/browse/HIVE-1362
> Project: Hive
>  Issue Type: Sub-task
>  Components: Statistics
>Reporter: Ning Zhang
>Assignee: Shreepadma Venugopalan
> Attachments: HIVE-1362.1.patch.txt, HIVE-1362.2.patch.txt, 
> HIVE-1362.3.patch.txt, HIVE-1362.4.patch.txt, HIVE-1362.5.patch.txt, 
> HIVE-1362.6.patch.txt, HIVE-1362.7.patch.txt, HIVE-1362.8.patch.txt, 
> HIVE-1362.D6339.1.patch, HIVE-1362-gen_thrift.1.patch.txt, 
> HIVE-1362-gen_thrift.2.patch.txt, HIVE-1362-gen_thrift.3.patch.txt, 
> HIVE-1362-gen_thrift.4.patch.txt, HIVE-1362-gen_thrift.5.patch.txt, 
> HIVE-1362-gen_thrift.6.patch.txt, HIVE-1362_gen-thrift.7.patch.txt, 
> HIVE-1362_gen-thrift.8.patch.txt
>
>




[jira] [Commented] (HIVE-3570) Add/fix facility to collect operator specific statisticsin hive + add hash-in/hash-out counter for GroupBy Optr

2012-11-01 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488530#comment-13488530
 ] 

Phabricator commented on HIVE-3570:
---

satadru has commented on the revision "HIVE-3570 [jira] Hive changes for Optr 
level stats".

INLINE COMMENTS
  ql/src/test/queries/clientpositive/OptrStatGroupBy.q:1 I get an ant test 
failure whenever I add a comment line (-- comment) to the .q file. This 
seems to be a problem with the minimr test case.

REVISION DETAIL
  https://reviews.facebook.net/D5985

To: njain, sambavim, kevinwilfong, satadru
Cc: JIRA, adobriyal


> Add/fix facility to collect operator specific statisticsin hive + add 
> hash-in/hash-out counter for GroupBy Optr
> ---
>
> Key: HIVE-3570
> URL: https://issues.apache.org/jira/browse/HIVE-3570
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 0.10.0
>Reporter: Satadru Pan
>Assignee: Satadru Pan
>Priority: Minor
> Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch, 
> HIVE-3570.D5985.2.patch, HIVE-3570.D5985.3.patch, HIVE-3570.D5985.4.patch, 
> HIVE-3570.D5985.5.patch, HIVE-3570.D5985.6.patch, HIVE-3570.D5985.7.patch
>
>
> Requirement: Collect Operator specific stats for hive queries. Use the 
> counter framework available in Hive Operator.java to accomplish that.



[jira] [Updated] (HIVE-3570) Add/fix facility to collect operator specific statisticsin hive + add hash-in/hash-out counter for GroupBy Optr

2012-11-01 Thread Satadru Pan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satadru Pan updated HIVE-3570:
--

Status: Patch Available  (was: Open)

> Add/fix facility to collect operator specific statisticsin hive + add 
> hash-in/hash-out counter for GroupBy Optr
> ---
>
> Key: HIVE-3570
> URL: https://issues.apache.org/jira/browse/HIVE-3570
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Affects Versions: 0.10.0
>Reporter: Satadru Pan
>Assignee: Satadru Pan
>Priority: Minor
> Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch, 
> HIVE-3570.D5985.2.patch, HIVE-3570.D5985.3.patch, HIVE-3570.D5985.4.patch, 
> HIVE-3570.D5985.5.patch, HIVE-3570.D5985.6.patch, HIVE-3570.D5985.7.patch
>
>
> Requirement: Collect Operator specific stats for hive queries. Use the 
> counter framework available in Hive Operator.java to accomplish that.



[jira] [Updated] (HIVE-3570) Add/fix facility to collect operator specific statisticsin hive + add hash-in/hash-out counter for GroupBy Optr

2012-11-01 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3570:
--

Attachment: HIVE-3570.D5985.7.patch

satadru updated the revision "HIVE-3570 [jira] Hive changes for Optr level 
stats".
Reviewers: njain, sambavim, kevinwilfong

  Removing one unintended change build-common.


REVISION DETAIL
  https://reviews.facebook.net/D5985

AFFECTED FILES
  build-common.xml
  ql/src/test/results/clientpositive/optrstat_groupby.q.out
  ql/src/test/org/apache/hadoop/hive/ql/hooks/OptrStatGroupByHook.java
  ql/src/test/queries/clientpositive/optrstat_groupby.q
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java

To: njain, sambavim, kevinwilfong, satadru
Cc: JIRA, adobriyal





[jira] [Updated] (HIVE-3570) Add/fix facility to collect operator specific statistics in hive + add hash-in/hash-out counter for GroupBy Optr

2012-11-01 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3570:
--

Attachment: HIVE-3570.D5985.6.patch

satadru updated the revision "HIVE-3570 [jira] Hive changes for Optr level 
stats".
Reviewers: njain, sambavim, kevinwilfong

  Updated the code according to the review comments.


REVISION DETAIL
  https://reviews.facebook.net/D5985

AFFECTED FILES
  build-common.xml
  ql/src/test/results/clientpositive/optrstat_groupby.q.out
  ql/src/test/org/apache/hadoop/hive/ql/hooks/OptrStatGroupByHook.java
  ql/src/test/queries/clientpositive/optrstat_groupby.q
  ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java

To: njain, sambavim, kevinwilfong, satadru
Cc: JIRA, adobriyal


> Add/fix facility to collect operator specific statistics in hive + add 
> hash-in/hash-out counter for GroupBy Optr
> ---
>
> Key: HIVE-3570
> URL: https://issues.apache.org/jira/browse/HIVE-3570
> Project: Hive
> Issue Type: Improvement
> Components: Statistics
> Affects Versions: 0.10.0
> Reporter: Satadru Pan
> Assignee: Satadru Pan
> Priority: Minor
> Attachments: HIVE-3570.1.patch.txt, HIVE-3570.D5985.1.patch, 
> HIVE-3570.D5985.2.patch, HIVE-3570.D5985.3.patch, HIVE-3570.D5985.4.patch, 
> HIVE-3570.D5985.5.patch, HIVE-3570.D5985.6.patch
>
>
> Requirement: Collect Operator specific stats for hive queries. Use the 
> counter framework available in Hive Operator.java to accomplish that.
