[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly

2013-02-04 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2839:
--

Attachment: HIVE-2839.D2079.4.patch

navis updated the revision HIVE-2839 [jira] Filters on outer join with mapjoin 
hint is not applied correctly.

  Addressed comments

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D2079

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D2079?vs=27147&id=27171#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
  ql/src/test/queries/clientpositive/mapjoin1.q
  ql/src/test/results/clientpositive/mapjoin1.q.out

To: JIRA, navis
Cc: njain


 Filters on outer join with mapjoin hint is not applied correctly
 

 Key: HIVE-2839
 URL: https://issues.apache.org/jira/browse/HIVE-2839
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, 
 HIVE-2839.D2079.4.patch


 While testing HIVE-2820, I found that some queries with a mapjoin hint throw exceptions.
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10;
 FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
 java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689)
   at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 and 
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10  '1000' limit 10;
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at 
 

[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly

2013-02-04 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2839:
--

Attachment: HIVE-2839.D2079.5.patch

navis updated the revision HIVE-2839 [jira] Filters on outer join with mapjoin 
hint is not applied correctly.

  Typos, sorry.

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D2079

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D2079?vs=27171&id=27177#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
  ql/src/test/queries/clientpositive/mapjoin1.q
  ql/src/test/results/clientpositive/mapjoin1.q.out

To: JIRA, navis
Cc: njain


 Filters on outer join with mapjoin hint is not applied correctly
 

 Key: HIVE-2839
 URL: https://issues.apache.org/jira/browse/HIVE-2839
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, 
 HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch



[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly

2013-02-04 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2839:


Affects Version/s: (was: 0.9.0)
   Status: Patch Available  (was: Open)

 Filters on outer join with mapjoin hint is not applied correctly
 

 Key: HIVE-2839
 URL: https://issues.apache.org/jira/browse/HIVE-2839
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, 
 HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch


 While testing HIVE-2820, I found that some queries with a mapjoin hint throw exceptions.
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND true limit 10;
 FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
 java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689)
   at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 and 
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key AND b.key * 10  '1000' limit 10;
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
   ... 8 more
 {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: branch for ptf and windowing function

2013-02-04 Thread Arvind Prabhakar
Hi Ashutosh,

My +1 for the proposal for creating a separate branch for feature
development.

I do have one question in this regard: how do you plan on keeping this
branch in sync with the trunk? If the branch is allowed to diverge
indefinitely, it is likely that the build from it will lag in features and
fixes that are otherwise available on the trunk. It will be great if you
could get the branch to first synchronize with the trunk and then follow a
policy where there are periodic merges from the trunk into the development
branch.

Regards,
Arvind Prabhakar

On Fri, Feb 1, 2013 at 10:11 AM, Ashutosh Chauhan <hashut...@apache.org> wrote:

 Hi all,

 Harish and Prajkta are doing some cool work over at
 https://issues.apache.org/jira/browse/HIVE-896. IMO it's a very useful
 feature for the community and our user base. Harish and Prajkta have been
 making steady progress on this for much of the last year in their GitHub repo,
 https://github.com/hbutani/hive, and much of the feature is now functional.
 However, it's quite a bit of work and new code, which will take some time
 before being ready for trunk. I propose that we create a new branch so that
 further development of this happens in the Apache repo instead of the GitHub
 repo. This gets us a few benefits:
 a) It will avoid the situation we ended up in with HiveServer2, where useful
 new functionality arrived in one big patch, which made its review and thus
 inclusion in mainline harder than it should have been.
 b) The obvious advantages of development being done in Apache as opposed to
 GitHub, which are:
  i) It will make it easier for Apache Hive community members interested
 in this work (like me) to follow progress.
 ii) It will make it easier for Apache Hive community members interested
 in this work to contribute.
 iii) It will make it easier for Apache community members to review the
 work and provide feedback.

 I further propose that we follow a commit-then-review policy for this feature
 branch, which will enable contributors to make rapid progress without
 waiting for lengthy review cycles. Hive committers interested in the work can
 either review the branch any time they want to provide feedback, or wait
 until contributors declare the work complete and propose merging it into
 trunk, and review it then. This is in any case a throwaway branch, not
 intended to have releases made from it.

 Unless I hear any objections, I will create a branch over the weekend.

 Thanks,
 Ashutosh



[jira] [Updated] (HIVE-2379) Hive/HBase integration could be improved

2013-02-04 Thread Guido Serra aka Zeph (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guido Serra aka Zeph updated HIVE-2379:
---

  Priority: Critical  (was: Minor)
Issue Type: Bug  (was: Improvement)

ahemm [~namitjain], sorry if I dare to raise this to Critical and switch it to 
a Bug, but it is preventing one of the major use cases to work at all

from hive's console I can sort of circumvent it, but from Hue or from a JDBC 
connector I have no easy way (even totally impossible from an end user's 
perspective)

is anyone reviewing the patch? I tried to compile and apply it, but the unit 
tests against 974918c Hive 0.9.0-rc0 release are failing, and I can't judge 
if I get to a consistent state applying it

 Hive/HBase integration could be improved
 

 Key: HIVE-2379
 URL: https://issues.apache.org/jira/browse/HIVE-2379
 Project: Hive
  Issue Type: Bug
  Components: CLI, Clients, HBase Handler
Affects Versions: 0.7.1, 0.8.0, 0.9.0
Reporter: Roman Shaposhnik
Assignee: Navis
Priority: Critical
 Attachments: HIVE-2379.D7347.1.patch


 For now, any Hive/HBase query requires the following jars to be 
 explicitly added via Hive's add jar command:
 add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar;
 add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar;
 add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar;
 add jar /usr/lib/hive/lib/guava-r06.jar;
 The longer-term solution, perhaps, should be to have the code at submit time 
 call HBase's 
 TableMapReduceUtil.addDependencyJars(job, HBaseStorageHandler.class) to ship 
 them in the distributed cache.



[jira] [Commented] (HIVE-2379) Hive/HBase integration could be improved

2013-02-04 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570077#comment-13570077
 ] 

Guido Serra aka Zeph commented on HIVE-2379:


(uhmm... looks like I stepped into HIVE-2937)
{code}
[junit] Running org.apache.hadoop.hive.service.TestHiveServerSessions
[junit] Running org.apache.hadoop.hive.service.TestHiveServerSessions
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
{code}



[jira] [Commented] (HIVE-2379) Hive/HBase integration could be improved

2013-02-04 Thread Guido Serra aka Zeph (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570078#comment-13570078
 ] 

Guido Serra aka Zeph commented on HIVE-2379:


and...
{code}
[junit] Running org.apache.hadoop.hive.cli.TestCliDriver
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
{code}
but not much output though




[jira] [Updated] (HIVE-3405) UDF to obtain a string with the first letter of each word in uppercase

2013-02-04 Thread Padma Ravindran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Padma Ravindran updated HIVE-3405:
--

   Labels: patch  (was: )
Affects Version/s: 0.11.0
   0.9.1
   0.8.1
   0.9.0
   0.10.0
 Release Note: Initcap method tested. Please verify.
   Status: Patch Available  (was: Open)

Initcap method tested. Please verify.

 UDF to obtain a string with the first letter of each word in uppercase
 --

 Key: HIVE-3405
 URL: https://issues.apache.org/jira/browse/HIVE-3405
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.10.0, 0.9.0, 0.8.1, 0.9.1, 0.11.0
Reporter: Archana Nair
  Labels: patch

 Current Hive releases lack an INITCAP function. INITCAP returns a string with 
 the first letter of each word in uppercase and all other letters in the same 
 case. Words are delimited by white space. This will be useful for report 
 generation.
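The behaviour described above can be sketched in a few lines. This is only an illustration of the stated semantics in Python, not the code in the attached patch (a Hive UDF would be written in Java). The function name `initcap` and the NULL handling are assumptions, and "all other letters in same case" is read here as "left unchanged"; Oracle-style INITCAP lowercases the remaining letters instead.

```python
import re


def initcap(text):
    """Uppercase the first character of each whitespace-delimited word.

    Assumptions (not confirmed by the issue text): the remaining
    characters are left untouched, runs of whitespace are preserved,
    and a None input (SQL NULL) yields None.
    """
    if text is None:
        return None
    # Match a non-whitespace character that sits at the start of the
    # string or directly after a whitespace character.
    return re.sub(r"(?:^|(?<=\s))\S", lambda m: m.group(0).upper(), text)
```

For example, `initcap("hello wORLD")` returns `"Hello WORLD"` under this reading.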



[jira] [Updated] (HIVE-3405) UDF to obtain a string with the first letter of each word in uppercase

2013-02-04 Thread Padma Ravindran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Padma Ravindran updated HIVE-3405:
--

Attachment: HIVE-3405.1.patch.txt

Patch

 UDF to obtain a string with the first letter of each word in uppercase
 --

 Key: HIVE-3405
 URL: https://issues.apache.org/jira/browse/HIVE-3405
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.8.1, 0.9.0, 0.10.0, 0.9.1, 0.11.0
Reporter: Archana Nair
  Labels: patch
 Attachments: HIVE-3405.1.patch.txt




[jira] [Updated] (HIVE-3937) Hive Profiler

2013-02-04 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3937:
-

Status: Patch Available  (was: Open)

 Hive Profiler
 -

 Key: HIVE-3937
 URL: https://issues.apache.org/jira/browse/HIVE-3937
 Project: Hive
  Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, 
 HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt


 Adding a Hive Profiler implementation which tracks inclusive wall times and 
 call counts of the operators
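To make "inclusive wall times and call counts" concrete, here is a minimal sketch in Python. It is not the HiveProfiler code from the patch; the class `OperatorProfiler`, its `track` method, and the injectable `clock` parameter are all hypothetical names chosen for illustration. "Inclusive" is taken to mean that time spent in nested child operators also counts toward the parent.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class OperatorProfiler:
    """Hypothetical profiler: per-operator call counts and inclusive time."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so tests can use a fake clock
        self.call_counts = defaultdict(int)
        self.inclusive_seconds = defaultdict(float)

    @contextmanager
    def track(self, operator_name):
        start = self._clock()
        try:
            yield
        finally:
            # The elapsed time includes everything run inside the block,
            # nested operators included, which makes it "inclusive".
            self.call_counts[operator_name] += 1
            self.inclusive_seconds[operator_name] += self._clock() - start
```

Wrapping a parent operator's work in `with profiler.track("parent"):` and a child's in a nested `with profiler.track("child"):` charges the child's time to both entries.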



[jira] [Updated] (HIVE-3937) Hive Profiler

2013-02-04 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3937:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Pamela

 Hive Profiler
 -

 Key: HIVE-3937
 URL: https://issues.apache.org/jira/browse/HIVE-3937
 Project: Hive
  Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
 Fix For: 0.11.0

 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, 
 HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt




[jira] [Commented] (HIVE-3937) Hive Profiler

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570126#comment-13570126
 ] 

Hudson commented on HIVE-3937:
--

Integrated in hive-trunk-hadoop1 #67 (See 
[https://builds.apache.org/job/hive-trunk-hadoop1/67/])
HIVE-3937 Hive Profiler
(Pamela Vagata via namit) (Revision 1442062)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442062
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisher.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisherInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfiler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerAggregateStat.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerConnectionInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStats.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStatsAggregator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerUtils.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/HiveProfilerResultsHook.java
* /hive/trunk/ql/src/test/queries/clientpositive/hiveprofiler0.q
* /hive/trunk/ql/src/test/results/clientpositive/hiveprofiler0.q.out




[jira] [Commented] (HIVE-3571) add a way to run a small unit quickly

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570127#comment-13570127
 ] 

Hudson commented on HIVE-3571:
--

Integrated in hive-trunk-hadoop1 #67 (See 
[https://builds.apache.org/job/hive-trunk-hadoop1/67/])
HIVE-3571 : add a way to run a small unit quickly (Navis via Ashutosh 
Chauhan) (Revision 1442043)

 Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442043
Files : 
* /hive/trunk/build.properties
* /hive/trunk/build.xml


 add a way to run a small unit quickly
 -

 Key: HIVE-3571
 URL: https://issues.apache.org/jira/browse/HIVE-3571
 Project: Hive
  Issue Type: Test
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Navis
 Fix For: 0.11.0

 Attachments: HIVE-3571.1.patch.txt, HIVE-3571.D7695.1.patch, 
 HIVE-3571.D7695.2.patch, HIVE-3571.D7695.3.patch


 A simple unit test:
 ant test -Dtestcase=TestCliDriver -Dqfile=groupby2.q
 takes a long time.
 There should be a quick way to achieve that for debugging.



[jira] [Commented] (HIVE-3956) TestMetaStoreAuthorization always uses the same port

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570128#comment-13570128
 ] 

Hudson commented on HIVE-3956:
--

Integrated in hive-trunk-hadoop1 #67 (See 
[https://builds.apache.org/job/hive-trunk-hadoop1/67/])
HIVE-3956 : TestMetaStoreAuthorization always uses the same port (Navis via 
Ashutosh Chauhan) (Revision 1442038)

 Result = ABORTED
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442038
Files : 
* /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java


 TestMetaStoreAuthorization always uses the same port
 

 Key: HIVE-3956
 URL: https://issues.apache.org/jira/browse/HIVE-3956
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-3956.D8253.1.patch


 Similar issue with HIVE-2959 and HIVE-3052. Using fixed port(1) for test.
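A common way to avoid a fixed test port is to let the operating system pick a free one. The sketch below shows that general technique in Python; it is not the change made in the attached patch, and the helper name `find_free_port` is hypothetical.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 means "any free port"
        return sock.getsockname()[1]
```

The port is released when the socket closes, so another process could in principle take it before the test server binds to it. Where the server API allows it, binding the server itself to port 0 avoids that window.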



[jira] [Commented] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly

2013-02-04 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570130#comment-13570130
 ] 

Phabricator commented on HIVE-2839:
---

njain has commented on the revision HIVE-2839 [jira] Filters on outer join 
with mapjoin hint is not applied correctly.

INLINE COMMENTS
  ql/src/test/queries/clientpositive/mapjoin1.q:15 Can you add tests with 
hive.outerjoin.supports.filter also ?

REVISION DETAIL
  https://reviews.facebook.net/D2079

To: JIRA, navis
Cc: njain


 Filters on outer join with mapjoin hint is not applied correctly
 

 Key: HIVE-2839
 URL: https://issues.apache.org/jira/browse/HIVE-2839
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, 
 HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch


 Testing HIVE-2820, I've found some queries with mapjoin hint makes exceptions.
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND true limit 10;
 FAILED: Hive Internal Error: 
 java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc
  cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
 java.lang.ClassCastException: 
 org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to 
 org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689)
   at 
 org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 and 
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND b.key * 10 < '1000' limit 10;
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
   ... 8 more
 {code}


[jira] [Commented] (HIVE-3559) UDF RIGHT(string,position) to HIVE

2013-02-04 Thread Arun A K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570282#comment-13570282
 ] 

Arun A K commented on HIVE-3559:


[~meenu]
Please review the patch submitted. I have seen a number of bugs in the 
current version of the patch.

For example:
 
String null, length 3
String teststring, length 200
String test, length -1
...

Please take the time to fix these bugs and attach a new patch file. 
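
The edge cases listed above can be pinned down with a small reference model. This is a sketch in Python, not the patch's code, and the expected results are assumptions borrowed from MySQL-style RIGHT semantics (the comment does not say what the correct output should be): NULL in gives NULL out, a length longer than the string returns the whole string, and a non-positive length returns an empty string.

```python
def right(s, length):
    """Reference model for RIGHT(string, length).

    Assumed semantics (not confirmed by the patch under review):
      - None (NULL) if either argument is None
      - "" if length is zero or negative
      - the whole string if length exceeds len(s)
    """
    if s is None or length is None:
        return None
    if length <= 0:
        return ""
    # Slicing clamps automatically when length > len(s).
    return s[-length:]


if __name__ == "__main__":
    for args in [(None, 3), ("teststring", 200), ("test", -1)]:
        print(args, "->", repr(right(*args)))
```

A model like this can double as the source of expected values for a .q test that exercises the same inputs.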

 UDF  RIGHT(string,position) to HIVE
 ---

 Key: HIVE-3559
 URL: https://issues.apache.org/jira/browse/HIVE-3559
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.9.0
Reporter: Vinaya Varghese
Assignee: Meenu K Chandran
Priority: Minor
 Attachments: HIVE-3559.1.patch.txt, udf_right.q, udf_right.q.out


 Introduction
   UDF (User Defined Function) to obtain the rightmost 'n' characters from 
 a string in Hive. 
 Relevance
   Current releases of Hive lack a function that returns the rightmost 
 'length' characters of a string. The function RIGHT(string,length) would 
 return the rightmost 'length' characters of the string 'string', or NULL if 
 any argument is NULL, which would be useful in HiveQL wherever strings are 
 processed.
 Functionality :-
 Function Name: RIGHT(string,length) 

 Returns the rightmost 'length' characters from the string  or NULL if any 
 argument is NULL.  
 Example: hive> SELECT RIGHT('https://www.irctc.com',3);
   -> 'com'
 Usage :-
 Case 1: To query a table to find details based on an https request
 Table :-Transaction
 Request_id|date|period_id|url_name
 0001|01/07/2012|110001|https://www.irctc.com
 0002|02/07/2012|110001|https://nextstep.tcs.com
 0003|03/07/2012|110001|https://www.hdfcbank.com
 0005|01/07/2012|110001|http://www.lmnm.org
 0006|08/07/2012|110001|http://nextstart.gov
 0007|10/07/2012|110001|https://netbanking.icicibank.com
 0012|21/07/2012|110001|http://www.people.nic
 0026|08/07/2012|110001|http://nextprobs.gov
 00023|25/07/2012|110001|https://netbanking.canarabank.com
 Query : select * from transaction where RIGHT(url_name,3)='com';
 Result :-
 0001|01/07/2012|110001|https://www.irctc.com
 0002|02/07/2012|110001|https://nextstep.tcs.com  
 0003|03/07/2012|110001|https://www.hdfcbank.com
 0007|10/07/2012|110001|https://netbanking.icicibank.com
 00023|25/07/2012|110001|https://netbanking.canarabank.com



[jira] [Updated] (HIVE-896) Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.

2013-02-04 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-896:
---

Attachment: hive-896.3.patch.txt

 Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
 ---

 Key: HIVE-896
 URL: https://issues.apache.org/jira/browse/HIVE-896
 Project: Hive
  Issue Type: New Feature
  Components: OLAP, UDF
Reporter: Amr Awadallah
Priority: Minor
 Attachments: DataStructs.pdf, HIVE-896.1.patch.txt, 
 Hive-896.2.patch.txt, hive-896.3.patch.txt


 Windowing functions are very useful for click stream processing and similar 
 time-series/sliding-window analytics.
 More details at:
 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1006709
 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007059
 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007032
 -- amr



[jira] [Commented] (HIVE-896) Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.

2013-02-04 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570306#comment-13570306
 ] 

Harish Butani commented on HIVE-896:


Attached patch to be used as starting point for hive branch. 
Has minor changes since the last patch.

 Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
 ---

 Key: HIVE-896
 URL: https://issues.apache.org/jira/browse/HIVE-896
 Project: Hive
  Issue Type: New Feature
  Components: OLAP, UDF
Reporter: Amr Awadallah
Priority: Minor
 Attachments: DataStructs.pdf, HIVE-896.1.patch.txt, 
 Hive-896.2.patch.txt, hive-896.3.patch.txt


 Windowing functions are very useful for click stream processing and similar 
 time-series/sliding-window analytics.
 More details at:
 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1006709
 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007059
 http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007032
 -- amr



Review Requests

2013-02-04 Thread kulkarni.swar...@gmail.com
Hello,

I opened two reviews for small issues, HIVE-3553[1] and HIVE-3725[2]. If
you get a chance to review them and provide feedback, I would really
appreciate it.

Thanks,

[1] https://reviews.apache.org/r/9275/
[2] https://reviews.apache.org/r/9276/

-- 
Swarnim


Re: branch for ptf and windowing function

2013-02-04 Thread Ashutosh Chauhan
Hi Arvind,

Yeah, that's the idea: do periodic merges to keep the branch in sync with
trunk; otherwise merging it into trunk later on will get unnecessarily
complicated.

Thanks,
Ashutosh


On Mon, Feb 4, 2013 at 12:56 AM, Arvind Prabhakar arv...@apache.org wrote:

 Hi Ashutosh,

 My +1 for the proposal for creating a separate branch for feature
 development.

 I do have one question in this regard: how do you plan on keeping this
 branch in sync with the trunk? If the branch is allowed to diverge
 indefinitely, it is likely that the build from it will lag in features and
 fixes that are otherwise available on the trunk. It will be great if you
 could get the branch to first synchronize with the trunk and then follow a
 policy where there are periodic merges from the trunk into the development
 branch.

 Regards,
 Arvind Prabhakar

 On Fri, Feb 1, 2013 at 10:11 AM, Ashutosh Chauhan hashut...@apache.org
 wrote:

  Hi all,
 
  Harish and Prajkta are doing some cool work over at
  https://issues.apache.org/jira/browse/HIVE-896. IMO it's a very useful
  feature for the community and our user base. Harish and Prajkta have been
  making steady progress on this for much of the last year in their GitHub
  repo, https://github.com/hbutani/hive, and much of the feature is now
  functional. However, it's quite a bit of work and new code, which will
  take some time before being ready for trunk. I propose that we create a
  new branch so that further development of this happens in the Apache repo
  instead of the GitHub repo. This gets us a few benefits:
  a) It will avoid the situation we ended up in with HiveServer2, where
  useful new functionality came in one big patch, which made its review and
  thus its inclusion in mainline harder than it should have been.
  b) The obvious advantages of development being done in Apache as opposed
  to GitHub, which are:
   i) It will make it easier for Apache Hive community members interested
  in this work (like me) to follow progress.
   ii) It will make it easier for Apache Hive community members interested
  in this work to contribute.
   iii) It will make it easier for Apache community members to review the
  work and provide feedback.
 
  I further propose that we follow a commit-then-review policy for this
  feature branch, which will enable contributors to make rapid progress
  without waiting for lengthy review cycles. Hive committers interested in
  the work can either review the branch any time they want to provide
  feedback, or wait until the contributors declare the work complete and
  propose merging it into trunk, and review it then. This is in any case a
  throwaway branch, not intended to make releases out of.
 
  Unless I hear any objections, I will create a branch over the weekend.
 
  Thanks,
  Ashutosh
 



Re: branch for ptf and windowing function

2013-02-04 Thread Ashutosh Chauhan
Hi all,

Cool. Seems like everyone is on board. I have created a new branch [1]
based off current trunk and have committed the latest patch attached to
HIVE-896 to it. Check it out. Feel free to open JIRAs for this work and put
up patches. I have added a new component called ptf-windowing on JIRA, which
you can use for issues related to this work.

[1] https://svn.apache.org/repos/asf/hive/branches/ptf-windowing/

Thanks,
Ashutosh







[jira] [Created] (HIVE-3981) Split up tests in ptf_general_queries.q

2013-02-04 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-3981:
--

 Summary: Split up tests in ptf_general_queries.q
 Key: HIVE-3981
 URL: https://issues.apache.org/jira/browse/HIVE-3981
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes 
on my laptop to finish. I think we should break it down into smaller .q files; 
otherwise adding a new query and debugging it will be a pain. I have split out 
the rcfile and seqfile tests from it to begin with. Also, this test currently 
fails because the original patch didn't have the .rc and .seq files (they were 
binary). 



[jira] [Commented] (HIVE-3981) Split up tests in ptf_general_queries.q

2013-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570400#comment-13570400
 ] 

Ashutosh Chauhan commented on HIVE-3981:


[~rhbutani] Can you attach the .rcfile and .seqfile here on JIRA? I can then 
generate the full patch.

 Split up tests in ptf_general_queries.q
 ---

 Key: HIVE-3981
 URL: https://issues.apache.org/jira/browse/HIVE-3981
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3981.patch


 ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes 
 on my laptop to finish. I think we should break it down into smaller .q files; 
 otherwise adding a new query and debugging it will be a pain. I have split 
 out the rcfile and seqfile tests from it to begin with. Also, this test 
 currently fails because the original patch didn't have the .rc and .seq files 
 (they were binary). 



[jira] [Commented] (HIVE-3937) Hive Profiler

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570402#comment-13570402
 ] 

Hudson commented on HIVE-3937:
--

Integrated in Hive-trunk-hadoop2 #106 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/106/])
HIVE-3937 Hive Profiler
(Pamela Vagata via namit) (Revision 1442062)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442062
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisher.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisherInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfiler.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerAggregateStat.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerConnectionInfo.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStats.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStatsAggregator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerUtils.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/HiveProfilerResultsHook.java
* /hive/trunk/ql/src/test/queries/clientpositive/hiveprofiler0.q
* /hive/trunk/ql/src/test/results/clientpositive/hiveprofiler0.q.out


 Hive Profiler
 -

 Key: HIVE-3937
 URL: https://issues.apache.org/jira/browse/HIVE-3937
 Project: Hive
  Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
 Fix For: 0.11.0

 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, 
 HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt


 Adding a Hive Profiler implementation which tracks inclusive wall times and 
 call counts of the operators



[jira] [Commented] (HIVE-3571) add a way to run a small unit quickly

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570403#comment-13570403
 ] 

Hudson commented on HIVE-3571:
--

Integrated in Hive-trunk-hadoop2 #106 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/106/])
HIVE-3571 : add a way to run a small unit quickly (Navis via Ashutosh 
Chauhan) (Revision 1442043)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442043
Files : 
* /hive/trunk/build.properties
* /hive/trunk/build.xml


 add a way to run a small unit quickly
 -

 Key: HIVE-3571
 URL: https://issues.apache.org/jira/browse/HIVE-3571
 Project: Hive
  Issue Type: Test
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Navis
 Fix For: 0.11.0

 Attachments: HIVE-3571.1.patch.txt, HIVE-3571.D7695.1.patch, 
 HIVE-3571.D7695.2.patch, HIVE-3571.D7695.3.patch


 A simple unit test:
 ant test -Dtestcase=TestCliDriver -Dqfile=groupby2.q
 takes a long time.
 There should be a quick way to achieve that for debugging.



[jira] [Commented] (HIVE-3956) TestMetaStoreAuthorization always uses the same port

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570404#comment-13570404
 ] 

Hudson commented on HIVE-3956:
--

Integrated in Hive-trunk-hadoop2 #106 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/106/])
HIVE-3956 : TestMetaStoreAuthorization always uses the same port (Navis via 
Ashutosh Chauhan) (Revision 1442038)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442038
Files : 
* 
/hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java


 TestMetaStoreAuthorization always uses the same port
 

 Key: HIVE-3956
 URL: https://issues.apache.org/jira/browse/HIVE-3956
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-3956.D8253.1.patch


 Similar issue with HIVE-2959 and HIVE-3052. Using fixed port(1) for test.



[jira] [Updated] (HIVE-3981) Split up tests in ptf_general_queries.q

2013-02-04 Thread Harish Butani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harish Butani updated HIVE-3981:


Attachment: part.seq
part.rc

 Split up tests in ptf_general_queries.q
 ---

 Key: HIVE-3981
 URL: https://issues.apache.org/jira/browse/HIVE-3981
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3981.patch, part.rc, part.seq


 ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes 
 on my laptop to finish. I think we should break it down into smaller .q files; 
 otherwise adding a new query and debugging it will be a pain. I have split 
 out the rcfile and seqfile tests from it to begin with. Also, this test 
 currently fails because the original patch didn't have the .rc and .seq files 
 (they were binary). 



[jira] [Updated] (HIVE-3978) HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH

2013-02-04 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3978:
---

Attachment: HIVE-3978_branch_0.10_0.patch
HIVE-3978_trunk_0.patch

 HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets 
 appended to HADOOP_CLASSPATH
 -

 Key: HIVE-3978
 URL: https://issues.apache.org/jira/browse/HIVE-3978
 Project: Hive
  Issue Type: Bug
 Environment: hive-0.10
 hcatalog-0.5
 hadoop 0.23
 hbase 0.94
Reporter: Arup Malakar
Assignee: Arup Malakar
 Attachments: HIVE-3978_branch_0.10_0.patch, HIVE-3978_trunk_0.patch


 The following code gets executed only in case of cygwin.
 HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`
 But since HIVE_AUX_JARS_PATH gets added to HADOOP_CLASSPATH, the comma should 
 get replaced by : for all cases.
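
To make the effect of that substitution concrete, here is a minimal sketch in Python of what the sed expression does and why it matters once the value is appended to a colon-separated classpath. The function names and sample paths are hypothetical and are not taken from the Hive launcher scripts or the attached patches.

```python
def normalize_aux_jars(aux_jars_path):
    """Mirror `sed 's/,/:/g'`: turn a comma-separated jar list into a
    colon-separated one, as a classpath expects."""
    return aux_jars_path.replace(",", ":")


def append_to_classpath(classpath, aux_jars_path):
    """Append the normalized jar list to an existing classpath string.

    Hypothetical helper: without the normalization, 'a.jar,b.jar' would be
    treated as ONE classpath entry with a comma in its name.
    """
    extra = normalize_aux_jars(aux_jars_path)
    return f"{classpath}:{extra}" if classpath else extra


if __name__ == "__main__":
    # Sample paths are made up for illustration.
    print(append_to_classpath("/opt/hadoop/conf", "/libs/a.jar,/libs/b.jar"))
```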



[jira] [Updated] (HIVE-3978) HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets appended to HADOOP_CLASSPATH

2013-02-04 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3978:
---

Fix Version/s: 0.10.0
   0.11.0
 Release Note: Use ':' in HIVE_AUX_JARS_PATH instead of ','
   Status: Patch Available  (was: Open)

Review: https://reviews.facebook.net/D8373

 HIVE_AUX_JARS_PATH should have : instead of , as separator since it gets 
 appended to HADOOP_CLASSPATH
 -

 Key: HIVE-3978
 URL: https://issues.apache.org/jira/browse/HIVE-3978
 Project: Hive
  Issue Type: Bug
 Environment: hive-0.10
 hcatalog-0.5
 hadoop 0.23
 hbase 0.94
Reporter: Arup Malakar
Assignee: Arup Malakar
 Fix For: 0.11.0, 0.10.0

 Attachments: HIVE-3978_branch_0.10_0.patch, HIVE-3978_trunk_0.patch


 The following code gets executed only in case of cygwin.
 HIVE_AUX_JARS_PATH=`echo $HIVE_AUX_JARS_PATH | sed 's/,/:/g'`
 But since HIVE_AUX_JARS_PATH gets added to HADOOP_CLASSPATH, the comma should 
 get replaced by : for all cases.



[jira] [Commented] (HIVE-3981) Split up tests in ptf_general_queries.q

2013-02-04 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570485#comment-13570485
 ] 

Harish Butani commented on HIVE-3981:
-

I attached the files. Several things:
- The rc & seq files were in the original patch, so why do you need them here?
- We have struggled with this issue. The reason to keep all tests in 1 file is 
to make sure all the ptf tests run before every commit. Can we have a JUnit 
suite for all these tests? I would request that developers continue to make 
sure all these tests run before a commit.
- A hack we use to speed up running all the tests is to explicitly set 
'runningViaChild = false' at line 133 of MapRedTask.java. The 70 tests run in 
under 5 minutes with this setting.

 Split up tests in ptf_general_queries.q
 ---

 Key: HIVE-3981
 URL: https://issues.apache.org/jira/browse/HIVE-3981
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3981.patch, part.rc, part.seq


 ptf_general_queries.q currently has 62 queries and takes nearly 20 minutes 
 on my laptop to finish. I think we should break it down into smaller .q files; 
 otherwise adding a new query and debugging it will be a pain. I have split 
 out the rcfile and seqfile tests from it to begin with. Also, this test 
 currently fails because the original patch didn't have the .rc and .seq files 
 (they were binary). 



[jira] [Commented] (HIVE-3252) Add environment context to metastore Thrift calls

2013-02-04 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570516#comment-13570516
 ] 

Kevin Wilfong commented on HIVE-3252:
-

+1

 Add environment context to metastore Thrift calls
 -

 Key: HIVE-3252
 URL: https://issues.apache.org/jira/browse/HIVE-3252
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: John Reese
Assignee: John Reese
Priority: Minor
 Attachments: HIVE-3252.1.patch.txt, HIVE-3252.2.patch.txt


 Currently, in the Hive Thrift metastore API, create_table, add_partition, 
 alter_table, and alter_partition have with_environment_context analogs. It 
 would be really useful to add similar methods for drop_partition, drop_table, 
 and append_partition.



[jira] [Created] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Ashutosh Chauhan (JIRA)
Ashutosh Chauhan created HIVE-3982:
--

 Summary: Merge PTFDesc and PTFDef classes
 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan


As discussed on 
https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271




[jira] [Updated] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3982:
---

Attachment: hive-3982.patch

This is a first stab at refactoring. There is more work to do to get rid of 
the antlr data structures. 
This patch just gets rid of the PTFDef class and uses PTFDesc everywhere.

 Merge PTFDesc and PTFDef classes
 

 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3982.patch


 As discussed on 
 https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271



[jira] [Updated] (HIVE-3983) Select on table with hbase storage handler fails with an SASL error

2013-02-04 Thread Arup Malakar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arup Malakar updated HIVE-3983:
---

Summary: Select on table with hbase storage handler fails with an SASL 
error  (was: Select on table with hbase storage handler fails with an SASL)

 Select on table with hbase storage handler fails with an SASL error
 ---

 Key: HIVE-3983
 URL: https://issues.apache.org/jira/browse/HIVE-3983
 Project: Hive
  Issue Type: Bug
 Environment: hive-0.10
 hbase-0.94.5.5
 hadoop-0.23.3.1
 hcatalog-0.5
Reporter: Arup Malakar

 The table is created using the following query:
 {code}
 CREATE TABLE hbase_table_1(key int, value string) 
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
 WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
 TBLPROPERTIES ("hbase.table.name" = "xyz");
 {code}
 Doing a select on the table launches a map-reduce job. But the job fails with 
 the following error:
 {code}
 2013-02-02 01:31:07,500 FATAL [IPC Server handler 3 on 40118] 
 org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
 attempt_1348093718159_1501_m_00_0 - exited : java.io.IOException: 
 java.lang.RuntimeException: SASL authentication failed. The most likely cause 
 is missing or invalid credentials. Consider 'kinit'.
   at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
   at 
 org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
   at 
 org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
   at 
 org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
   at 
 org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:160)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
 Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
 likely cause is missing or invalid credentials. Consider 'kinit'.
   at 
 org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:242)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37)
   at org.apache.hadoop.hbase.security.User.call(User.java:590)
   at org.apache.hadoop.hbase.security.User.access$700(User.java:51)
   at 
 org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444)
   at 
 org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:203)
   at 
 org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:291)
   at 
 org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
   at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
   at 
 org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104)
   at $Proxy12.getProtocolVersion(Unknown Source)
   at 
 org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:146)
   at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1291)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1278)
   at 
 org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:987)
   at 
 

[jira] [Created] (HIVE-3983) Select on table with hbase storage handler fails with an SASL

2013-02-04 Thread Arup Malakar (JIRA)
Arup Malakar created HIVE-3983:
--

 Summary: Select on table with hbase storage handler fails with an 
SASL
 Key: HIVE-3983
 URL: https://issues.apache.org/jira/browse/HIVE-3983
 Project: Hive
  Issue Type: Bug
 Environment: hive-0.10
hbase-0.94.5.5
hadoop-0.23.3.1
hcatalog-0.5
Reporter: Arup Malakar


The table is created using the following query:

{code}
CREATE TABLE hbase_table_1(key int, value string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "xyz");
{code}

Doing a select on the table launches a map-reduce job. But the job fails with 
the following error:

{code}
2013-02-02 01:31:07,500 FATAL [IPC Server handler 3 on 40118] 
org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: 
attempt_1348093718159_1501_m_00_0 - exited : java.io.IOException: 
java.lang.RuntimeException: SASL authentication failed. The most likely cause 
is missing or invalid credentials. Consider 'kinit'.
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:243)
at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:522)
at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.init(MapTask.java:160)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:381)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
Caused by: java.lang.RuntimeException: SASL authentication failed. The most 
likely cause is missing or invalid credentials. Consider 'kinit'.
at 
org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection$1.run(SecureClient.java:242)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hbase.util.Methods.call(Methods.java:37)
at org.apache.hadoop.hbase.security.User.call(User.java:590)
at org.apache.hadoop.hbase.security.User.access$700(User.java:51)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:444)
at 
org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.handleSaslConnectionFailure(SecureClient.java:203)
at 
org.apache.hadoop.hbase.ipc.SecureClient$SecureConnection.setupIOstreams(SecureClient.java:291)
at 
org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1124)
at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:974)
at 
org.apache.hadoop.hbase.ipc.SecureRpcEngine$Invoker.invoke(SecureRpcEngine.java:104)
at $Proxy12.getProtocolVersion(Unknown Source)
at 
org.apache.hadoop.hbase.ipc.SecureRpcEngine.getProxy(SecureRpcEngine.java:146)
at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:208)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1335)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1291)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:1278)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:987)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:882)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:984)
at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:886)
at 

[jira] [Assigned] (HIVE-3952) merge map-job followed by map-reduce job

2013-02-04 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reassigned HIVE-3952:
-

Assignee: Vinod Kumar Vavilapalli

I'd like to take a stab at it.

 merge map-job followed by map-reduce job
 

 Key: HIVE-3952
 URL: https://issues.apache.org/jira/browse/HIVE-3952
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Vinod Kumar Vavilapalli

 Consider a query like:
 select count(*) FROM
 ( select idOne, idTwo, value FROM
   bigTable
   JOIN
   smallTableOne on (bigTable.idOne = smallTableOne.idOne)
 ) firstjoin
 JOIN
 smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);
 where smallTableOne and smallTableTwo are smaller than
 hive.auto.convert.join.noconditionaltask.size and
 hive.auto.convert.join.noconditionaltask is set to true.
 The joins are collapsed into mapjoins, which leads to a map-only job
 (for the map-joins) followed by a map-reduce job (for the group by).
 Ideally, the map-only job should be merged with the following map-reduce job.



[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-02-04 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570645#comment-13570645
 ] 

Gunther Hagleitner commented on HIVE-2340:
--

[~navis]: I think in general the logic should be to copy numReducers from
parent to child, not the other way around. If Hive makes a decent estimate of
reducers for the parent, that's probably the number you want to carry into the
combined reduce stage, because it means each reducer is doing the desired
amount of work. Buckets and order by are the only special cases I can think of
where the number needs to be fixed.

For those special cases, without knowing the cardinalities of the join/group
by/tables, it's indeed difficult to guess whether the optimization should be on
or off. However, what do you think of using a max ratio of parent reducers to
child reducers instead of a fixed minimum number of reducers for the child?
With a default of 4, maybe. I.e., if there are fewer than 4 times as many
reducers in the parent as in the child, collapse (assuming another job will be
more expensive than the lower number of reducers); otherwise leave it alone.
The optimization is only good if the input sizes of the child and parent
reducers are similar, and expressing this as a ratio of the number of reducers
is probably the closest we can get right now.

This would enable the optimization for a larger body of queries (small tables,
single input split, empty group by expr, etc.).
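
The ratio rule proposed in this comment could be sketched roughly as follows.
This is only an illustration in Python, not Hive code: the name
should_collapse, its parameters, and the treatment of unknown (non-positive)
reducer counts are assumptions made for the example; only the default ratio
of 4 comes from the comment.

```python
def should_collapse(parent_reducers: int, child_reducers: int,
                    max_ratio: float = 4.0) -> bool:
    """Decide whether to merge the parent and child reduce stages.

    Hypothetical helper: collapse only when the parent has fewer than
    `max_ratio` times as many reducers as the child.
    """
    if parent_reducers <= 0 or child_reducers <= 0:
        # Reducer counts are unknown, so the ratio is undefined.
        # Assumption of this sketch: leave the plan alone in that case.
        return False
    return parent_reducers < max_ratio * child_reducers


print(should_collapse(6, 2))   # ratio 3  -> True (collapse)
print(should_collapse(40, 2))  # ratio 20 -> False (leave alone)
```

Comparing against a ratio rather than a fixed minimum keeps the rule
independent of cluster size, which seems to be the point of the suggestion.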

 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt, 
 HIVE-2340.D1209.10.patch, HIVE-2340.D1209.6.patch, HIVE-2340.D1209.7.patch, 
 HIVE-2340.D1209.8.patch, HIVE-2340.D1209.9.patch, testclidriver.txt


 Before implementing an optimizer for JOIN-GBY, try to implement an RS-GBY
 optimizer (cluster-by following group-by).



Re: [VOTE] Amend Hive Bylaws + Add HCatalog Submodule

2013-02-04 Thread Carl Steinbach
The following active Hive PMC members have cast votes:

Carl Steinbach: +1, +1
Ashutosh Chauhan: +1, +1
Edward Capriolo: +1, +1
Ashish Thusoo: +1, +1
Yongqiang He: +1, +1
Namit Jain: +1, +1

Three active PMC members have abstained from voting.

Over the last week the following four Hive PMC members requested
that their status be changed from active to emeritus member:
jvs, prasadc, zhao, pauly.

Voting on these measures is now closed. Both measures have been approved
with the required 2/3 majority of active Hive PMC members.

Thanks.

Carl

On Thu, Jan 31, 2013 at 2:04 PM, Vinod Kumar Vavilapalli 
vino...@hortonworks.com wrote:


 +1 and +1 non-binding.

 Great to see this happen!

 Thanks,
 +Vinod


 On Thu, Jan 31, 2013 at 12:14 AM, Namit Jain nj...@fb.com wrote:

 +1 and +1

 On 1/30/13 6:53 AM, Gunther Hagleitner ghagleit...@hortonworks.com
 wrote:

 +1 and +1
 
 Thanks,
 Gunther.
 
 
 On Tue, Jan 29, 2013 at 5:18 PM, Edward Capriolo
 edlinuxg...@gmail.com wrote:
 
  Measure 1: +1
  Measure 2: +1
 
  On Mon, Jan 28, 2013 at 2:47 PM, Carl Steinbach c...@apache.org
 wrote:
 
   I am calling a vote on the following two measures.
  
   Measure 1: Amend Hive Bylaws to Define Submodules and Submodule
  Committers
  
   If this measure passes the Apache Hive Project Bylaws will be
   amended with the following changes:
  
  
  
 
 
 https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committers
  
   The motivation for these changes is discussed in the following
   email thread which appeared on the hive-dev and hcatalog-dev
   mailing lists:
  
   http://markmail.org/thread/u5nap7ghvyo7euqa
  
  
   Measure 2: Create HCatalog Submodule and Adopt HCatalog Codebase
  
   This measure provides for 1) the establishment of an HCatalog
   submodule in the Apache Hive Project, 2) the adoption of the
   Apache HCatalog codebase into the Hive HCatalog submodule, and
   3) adding all currently active HCatalog committers as submodule
   committers on the Hive HCatalog submodule.
  
   Passage of this measure depends on the passage of Measure 1.
  
  
   Voting:
  
   Both measures require +1 votes from 2/3 of active Hive PMC
   members in order to pass. All participants in the Hive project
   are encouraged to vote on these measures, but only votes from
   active Hive PMC members are binding. The voting period
   commences immediately and shall last a minimum of six days.
  
   Voting is carried out by replying to this email thread. You must
   indicate which measure you are voting on in order for your vote
   to be counted.
  
   More details about the voting process can be found in the Apache
   Hive Project Bylaws:
  
   https://cwiki.apache.org/confluence/display/Hive/Bylaws
  
  
 




 --
 +Vinod
 Hortonworks Inc.
 http://hortonworks.com/



Jenkins build is back to normal : Hive-0.9.1-SNAPSHOT-h0.21 #282

2013-02-04 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/282/



[jira] [Updated] (HIVE-3981) Split up tests in ptf_general_queries.q

2013-02-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3981:
---

Attachment: hive-3981_2.patch

Patch with ptf_seqfile.q.out and ptf_rcfile.q.out.

Harish,
* Patch files are text files and don't usually contain binary data, so the
part.rc and part.seq data are not contained in your patch file.
* -Dqfile can handle multiple .q files in one run, so contributors can specify
-Dqfile=ptf_general_queries.q,ptf_rcfile.q,ptf_seqfile.q; it shouldn't be too
hard for contributors to ensure they have run all tests.
* Running tests in-process is a good idea, but it has implications for the
whole Hive project, so it probably needs to be discussed in a separate JIRA.
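
As a small illustration of the -Dqfile point above, the combined command line
could be assembled like this. This is a sketch only: build_qfile_command is a
made-up helper, and it assumes these .q files are run through TestCliDriver
with the usual "ant test -Dtestcase=... -Dqfile=..." invocation.

```python
from typing import List


def build_qfile_command(qfiles: List[str],
                        testcase: str = "TestCliDriver") -> List[str]:
    """Build an `ant test` argument list that runs several .q files in one run.

    Hypothetical helper; the testcase default is an assumption.
    """
    if not qfiles:
        raise ValueError("at least one .q file is required")
    # -Dqfile takes a comma-separated list with no spaces between entries.
    return ["ant", "test", f"-Dtestcase={testcase}",
            "-Dqfile=" + ",".join(qfiles)]


cmd = build_qfile_command(
    ["ptf_general_queries.q", "ptf_rcfile.q", "ptf_seqfile.q"])
print(" ".join(cmd))
# ant test -Dtestcase=TestCliDriver -Dqfile=ptf_general_queries.q,ptf_rcfile.q,ptf_seqfile.q
```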

 Split up tests in ptf_general_queries.q
 ---

 Key: HIVE-3981
 URL: https://issues.apache.org/jira/browse/HIVE-3981
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3981_2.patch, hive-3981.patch, part.rc, part.seq


 ptf_general_queries.q has 62 queries currently and it takes nearly 20 minutes
 on my laptop to finish. I think we should break it down into smaller .q files;
 otherwise adding a new query and debugging it will be a pain. I have split
 out the rcfile and seqfile tests from it to begin with. Also, this test
 currently fails because the original patch didn't have the .rc and .seq files
 (they were binary).



[jira] [Commented] (HIVE-3937) Hive Profiler

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570778#comment-13570778
 ] 

Hudson commented on HIVE-3937:
--

Integrated in Hive-trunk-h0.21 #1955 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1955/])
HIVE-3937 Hive Profiler
(Pamela Vagata via namit) (Revision 1442062)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442062
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisher.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilePublisherInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfiler.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerAggregateStat.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerConnectionInfo.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStats.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerStatsAggregator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/profiler/HiveProfilerUtils.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/HiveProfilerResultsHook.java
* /hive/trunk/ql/src/test/queries/clientpositive/hiveprofiler0.q
* /hive/trunk/ql/src/test/results/clientpositive/hiveprofiler0.q.out


 Hive Profiler
 -

 Key: HIVE-3937
 URL: https://issues.apache.org/jira/browse/HIVE-3937
 Project: Hive
  Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
 Fix For: 0.11.0

 Attachments: HIVE-3937.1.patch.txt, HIVE-3937.patch.2.txt, 
 HIVE-3937.patch.3.txt, HIVE-3937.patch.4.txt, HIVE-3937.patch.5.txt


 Adding a Hive Profiler implementation which tracks inclusive wall times and 
 call counts of the operators



[jira] [Commented] (HIVE-3571) add a way to run a small unit quickly

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570779#comment-13570779
 ] 

Hudson commented on HIVE-3571:
--

Integrated in Hive-trunk-h0.21 #1955 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1955/])
HIVE-3571 : add a way to run a small unit quickly (Navis via Ashutosh 
Chauhan) (Revision 1442043)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442043
Files : 
* /hive/trunk/build.properties
* /hive/trunk/build.xml


 add a way to run a small unit quickly
 -

 Key: HIVE-3571
 URL: https://issues.apache.org/jira/browse/HIVE-3571
 Project: Hive
  Issue Type: Test
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Navis
 Fix For: 0.11.0

 Attachments: HIVE-3571.1.patch.txt, HIVE-3571.D7695.1.patch, 
 HIVE-3571.D7695.2.patch, HIVE-3571.D7695.3.patch


 A simple unit test:
 ant test -Dtestcase=TestCliDriver -Dqfile=groupby2.q
 takes a long time.
 There should be a quick way to achieve that for debugging.



[jira] [Commented] (HIVE-3956) TestMetaStoreAuthorization always uses the same port

2013-02-04 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570780#comment-13570780
 ] 

Hudson commented on HIVE-3956:
--

Integrated in Hive-trunk-h0.21 #1955 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1955/])
HIVE-3956 : TestMetaStoreAuthorization always uses the same port (Navis via 
Ashutosh Chauhan) (Revision 1442038)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1442038
Files : 
* /hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java


 TestMetaStoreAuthorization always uses the same port
 

 Key: HIVE-3956
 URL: https://issues.apache.org/jira/browse/HIVE-3956
 Project: Hive
  Issue Type: Test
  Components: Tests
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-3956.D8253.1.patch


 Similar issue to HIVE-2959 and HIVE-3052: the test uses a fixed port(1).
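
 A common way to avoid a hard-coded test port is to let the operating system
 pick an unused one. The sketch below shows that general technique in Python;
 it is not the change made in the attached patch, and find_free_port is a name
 invented for this example.

```python
import socket


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the kernel to choose any free ephemeral port.
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


port = find_free_port()
print(port)
```

 Note that the port is released when the socket closes, so another process
 could in principle grab it before the test server binds to it; for test
 setups that small race is usually accepted.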



Hive-trunk-h0.21 - Build # 1955 - Failure

2013-02-04 Thread Apache Jenkins Server
Changes for Build #1955
[namit] HIVE-3937 Hive Profiler
(Pamela Vagata via namit)

[hashutosh] HIVE-3571 : add a way to run a small unit quickly (Navis via 
Ashutosh Chauhan)

[hashutosh] HIVE-3956 : TestMetaStoreAuthorization always uses the same port 
(Navis via Ashutosh Chauhan)




1 tests failed.
REGRESSION:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1

Error Message:
Forked Java VM exited abnormally. Please note the time in the report does not 
reflect the time until the VM exit.

Stack Trace:
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please 
note the time in the report does not reflect the time until the VM exit.
at 
net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259)
at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268)
at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:299)
at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1955)

Status: Failure

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1955/ to 
view the results.

[jira] [Updated] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3982:
---

Attachment: hive-3982_2.patch

The previous patch wasn't applying correctly. Updated patch.

 Merge PTFDesc and PTFDef classes
 

 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3982_2.patch, hive-3982.patch


 As discussed on 
 https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271



[jira] [Commented] (HIVE-3981) Split up tests in ptf_general_queries.q

2013-02-04 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570809#comment-13570809
 ] 

Harish Butani commented on HIVE-3981:
-

Oh yes, I forgot.
+1



[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-02-04 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570813#comment-13570813
 ] 

Navis commented on HIVE-2340:
-

@Gunther Hagleitner: I also considered the ratio approach, but the number of
reducers is calculated from the input size just before the job is submitted to
Hadoop and cannot be known in the optimizer layer.
Except for the special cases with order by and bucketing, the number of
reducers for both RS operators is -1. So generally speaking, it's safe.



[jira] [Resolved] (HIVE-3981) Split up tests in ptf_general_queries.q

2013-02-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-3981.


Resolution: Fixed

Committed to branch.



[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly

2013-02-04 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-2839:
--

Attachment: HIVE-2839.D2079.6.patch

navis updated the revision HIVE-2839 [jira] Filters on outer join with mapjoin 
hint is not applied correctly.

  Added tests

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D2079

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D2079?vs=27177&id=27231#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java
  ql/src/test/queries/clientpositive/mapjoin1.q
  ql/src/test/results/clientpositive/mapjoin1.q.out

To: JIRA, navis
Cc: njain


 Filters on outer join with mapjoin hint is not applied correctly
 

 Key: HIVE-2839
 URL: https://issues.apache.org/jira/browse/HIVE-2839
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, 
 HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch, HIVE-2839.D2079.6.patch


 While testing HIVE-2820, I found that some queries with a mapjoin hint throw exceptions.
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND true limit 10;
 FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
 java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689)
   at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
 {code}
 and 
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND b.key * 10 < '1000' limit 10;
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at 
 

[jira] [Commented] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570832#comment-13570832
 ] 

Harish Butani commented on HIVE-3982:
-

Hi Ashutosh,

Thanks for doing this. It looks like using the PTFDesc as the conf object on 
the PTFOperator let the regular Hive serialization of the work object just work.

I am fine with this patch, but we are planning to completely change the Spec 
and Def (Desc) classes to handle these things:
- make the Windowing & PTF information explicit after Phase 1, so that it can 
be used by a Windowing Operator in the future.
- remove references from Desc classes to Spec classes and clean up the Desc 
classes. It is a good time to simplify the translation from Spec to Desc and how 
the Desc is reinitialized at runtime.

I hope you are ok with this.

 Merge PTFDesc and PTFDef classes
 

 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3982_2.patch, hive-3982.patch


 As discussed on 
 https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3984) Maintain a clear separation between Windowing & PTF at the specification level.

2013-02-04 Thread Harish Butani (JIRA)
Harish Butani created HIVE-3984:
---

 Summary: Maintain a clear separation between Windowing & PTF at 
the specification level. 
 Key: HIVE-3984
 URL: https://issues.apache.org/jira/browse/HIVE-3984
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani


This has multiple pieces:
- redefine the PTF Spec classes, as described in the Data Structs doc in 
Hive-896
- refactor PTFDesc classes based on this design
- refactor translation: both Phase 1 & GenPlan phases




[jira] [Created] (HIVE-3985) Update new UDAFs introduced for Windowing to work with new Decimal Type

2013-02-04 Thread Harish Butani (JIRA)
Harish Butani created HIVE-3985:
---

 Summary: Update new UDAFs introduced for Windowing to work with 
new Decimal Type
 Key: HIVE-3985
 URL: https://issues.apache.org/jira/browse/HIVE-3985
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani






[jira] [Created] (HIVE-3986) Fix select expr processing in PTF Operator

2013-02-04 Thread Harish Butani (JIRA)
Harish Butani created HIVE-3986:
---

 Summary: Fix select expr processing in PTF Operator
 Key: HIVE-3986
 URL: https://issues.apache.org/jira/browse/HIVE-3986
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani


Select expressions that contain Lead/Lag functions are handled by the PTF 
Operator as a post-processing step after the output Partition is computed.
The Select Expr Node Descs for these are incorrectly created using the Input 
RR. They should be created using the Output RR of the PTF, just like the 
having expression.



[jira] [Commented] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570870#comment-13570870
 ] 

Ashutosh Chauhan commented on HIVE-3982:


Yeah, I agree we need to do both of these things. That's the goal, and that's 
why my comment said first stab, with more work to do. :) This is the first step. 
If you are OK with this change, then I will go ahead and commit, since it 
simplifies the code a bit by getting rid of an unnecessary class.

 Merge PTFDesc and PTFDef classes
 

 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3982_2.patch, hive-3982.patch


 As discussed on 
 https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271



[jira] [Commented] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570883#comment-13570883
 ] 

Harish Butani commented on HIVE-3982:
-

+1

 Merge PTFDesc and PTFDef classes
 

 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3982_2.patch, hive-3982.patch


 As discussed on 
 https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271



[jira] [Created] (HIVE-3987) Update PTF invocation and windowing grammar

2013-02-04 Thread Harish Butani (JIRA)
Harish Butani created HIVE-3987:
---

 Summary: Update PTF invocation and windowing grammar
 Key: HIVE-3987
 URL: https://issues.apache.org/jira/browse/HIVE-3987
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Harish Butani


Changes to the grammar to make it more standards-based:
- support Partition & Order style along with Hive-specific Distribute/Cluster 
and Sort in the windowing specification.
- PTF args should come after Input details, like in Aster.
- tbd: do we need to support named parameters?



[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-02-04 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570888#comment-13570888
 ] 

Kevin Wilfong commented on HIVE-3874:
-

@Owen: Here are a couple more issues I ran into; again, I can file JIRAs for 
these later once the code is checked in.

Incorrect deserialization of doubles (leads to a lot of NaNs)
https://reviews.facebook.net/D8379

Strings are written incorrectly when they span two chunks of a DynamicByteArray.
E.g., if the original string is 'abcdefghi', the string written in the ORC file 
may be 'abcdefabc'.
https://reviews.facebook.net/D8385
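
To make the symptom concrete, here is a minimal Python sketch of a chunked byte buffer. It is not the ORC DynamicByteArray code, and the actual cause is whatever the linked diff fixes; the class ChunkedBytes, its methods, and the chunk size of 6 are all hypothetical. The sketch only shows how a read that fails to advance to the next chunk would turn 'abcdefghi' into 'abcdefabc'.

```python
class ChunkedBytes:
    """Append-only byte buffer stored as fixed-size chunks (hypothetical stand-in)."""

    def __init__(self, chunk_size):
        self.chunk_size = chunk_size
        self.chunks = []
        self.length = 0

    def append(self, data):
        for b in data:
            # Start a new chunk whenever the previous one is exactly full.
            if self.length % self.chunk_size == 0:
                self.chunks.append(bytearray())
            self.chunks[-1].append(b)
            self.length += 1

    def read(self, offset, length):
        """Copy `length` bytes starting at `offset`, crossing chunk boundaries."""
        out = bytearray()
        while length > 0:
            chunk_index, chunk_offset = divmod(offset, self.chunk_size)
            take = min(length, self.chunk_size - chunk_offset)
            out += self.chunks[chunk_index][chunk_offset:chunk_offset + take]
            offset += take
            length -= take
        return bytes(out)

    def read_buggy(self, offset, length):
        """Illustrative bug: the tail is re-read from the start of the same chunk."""
        chunk_index, chunk_offset = divmod(offset, self.chunk_size)
        first = min(length, self.chunk_size - chunk_offset)
        out = bytearray(self.chunks[chunk_index][chunk_offset:chunk_offset + first])
        remaining = length - first
        # A correct version would continue in chunks[chunk_index + 1].
        out += self.chunks[chunk_index][0:remaining]
        return bytes(out)


buf = ChunkedBytes(chunk_size=6)
buf.append(b"abcdefghi")
print(buf.read(0, 9))
print(buf.read_buggy(0, 9))
```

The correct read recomputes the chunk index and the offset within the chunk on every iteration, which is the step the buggy variant skips.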

 Create a new Optimized Row Columnar file format for Hive
 

 Key: HIVE-3874
 URL: https://issues.apache.org/jira/browse/HIVE-3874
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: hive.3874.2.patch, OrcFileIntro.pptx, orc.tgz


 There are several limitations of the current RC File format that I'd like to 
 address by creating a new format:
 * each column value is stored as a binary blob, which means:
 ** the entire column value must be read, decompressed, and deserialized
 ** the file format can't use smarter type-specific compression
 ** push down filters can't be evaluated
 * the start of each row group needs to be found by scanning
 * user metadata can only be added to the file when the file is created
 * the file doesn't store the number of rows per file or row group
 * there is no mechanism for seeking to a particular row number, which is 
 required for external indexes.
 * there is no mechanism for storing light weight indexes within the file to 
 enable push-down filters to skip entire row groups.
 * the type of the rows isn't stored in the file
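
The point about lightweight indexes can be sketched as follows. This is a minimal Python illustration of the general idea (per-row-group min/max statistics that let a filter skip whole groups), not the ORC file layout; RowGroup, build_row_groups, and scan_greater_than are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    min_value: int
    max_value: int
    rows: list


def build_row_groups(values, group_size):
    """Split values into row groups and record min/max for each group."""
    groups = []
    for start in range(0, len(values), group_size):
        rows = values[start:start + group_size]
        groups.append(RowGroup(min(rows), max(rows), rows))
    return groups


def scan_greater_than(groups, threshold):
    """Evaluate `value > threshold`, skipping groups whose max rules them out."""
    matches = []
    groups_read = 0
    for group in groups:
        if group.max_value <= threshold:
            continue  # no row in this group can match, so it is never read
        groups_read += 1
        matches.extend(v for v in group.rows if v > threshold)
    return matches, groups_read


groups = build_row_groups([1, 2, 3, 10, 11, 12, 4, 5, 20], group_size=3)
print(scan_greater_than(groups, 9))
```

Only the index entry is consulted for a skipped group, which is what makes the index "lightweight" compared with decompressing and deserializing the column values.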



Re: [VOTE] Amend Hive Bylaws + Add HCatalog Submodule

2013-02-04 Thread Alan Gates
Most excellent.  I'll start the vote in the HCatalog PPMC to approve this, and 
assuming that passes I'll then start a vote in the IPMC per the guidelines at 
http://incubator.apache.org/guides/graduation.html#subproject

Alan.

On Feb 4, 2013, at 2:27 PM, Carl Steinbach wrote:

 The following active Hive PMC members have cast votes:
 
 Carl Steinbach: +1, +1
 Ashutosh Chauhan: +1, +1
 Edward Capriolo: +1, +1
 Ashish Thusoo: +1, +1
 Yongqiang He: +1, +1
 Namit Jain: +1, +1
 
 Three active PMC members have abstained from voting.
 
 Over the last week the following four Hive PMC members requested
 that their status be changed from active to emeritus member:
 jvs, prasadc, zhao, pauly.
 
 Voting on these measures is now closed. Both measures have been approved
 with the required 2/3 majority of active Hive PMC members.
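
As a quick check of the arithmetic in the tally above, here is a small Python sketch. It assumes the active PMC membership is exactly the six voters plus the three abstainers, and that "2/3 of active members" means at least two thirds; measure_passes is a hypothetical helper, not anything from the bylaws.

```python
from fractions import Fraction


def measure_passes(plus_one_votes, active_members, threshold=Fraction(2, 3)):
    """Return True when the +1 votes reach the threshold share of active members."""
    return Fraction(plus_one_votes, active_members) >= threshold


plus_one = 6       # binding +1 votes listed in the tally, per measure
abstained = 3      # active PMC members who did not vote
active = plus_one + abstained  # assumption: no other active members

print(Fraction(plus_one, active), measure_passes(plus_one, active))
```

Under these assumptions each measure lands exactly on the 2/3 threshold, so one fewer +1 vote would not have been enough.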
 
 Thanks.
 
 Carl
 
 On Thu, Jan 31, 2013 at 2:04 PM, Vinod Kumar Vavilapalli 
 vino...@hortonworks.com wrote:
 
 
 +1 and +1 non-binding.
 
 Great to see this happen!
 
 Thanks,
 +Vinod
 
 
 On Thu, Jan 31, 2013 at 12:14 AM, Namit Jain nj...@fb.com wrote:
 
 +1 and +1
 
 On 1/30/13 6:53 AM, Gunther Hagleitner ghagleit...@hortonworks.com
 wrote:
 
 +1 and +1
 
 Thanks,
 Gunther.
 
 
 On Tue, Jan 29, 2013 at 5:18 PM, Edward Capriolo
 edlinuxg...@gmail.com wrote:
 
 Measure 1: +1
 Measure 2: +1
 
 On Mon, Jan 28, 2013 at 2:47 PM, Carl Steinbach c...@apache.org
 wrote:
 
 I am calling a vote on the following two measures.
 
 Measure 1: Amend Hive Bylaws to Define Submodules and Submodule
 Committers
 
 If this measure passes the Apache Hive Project Bylaws will be
 amended with the following changes:
 
 
 
 
 
 https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committers
 
 The motivation for these changes is discussed in the following
 email thread which appeared on the hive-dev and hcatalog-dev
 mailing lists:
 
 http://markmail.org/thread/u5nap7ghvyo7euqa
 
 
 Measure 2: Create HCatalog Submodule and Adopt HCatalog Codebase
 
 This measure provides for 1) the establishment of an HCatalog
 submodule in the Apache Hive Project, 2) the adoption of the
 Apache HCatalog codebase into the Hive HCatalog submodule, and
 3) adding all currently active HCatalog committers as submodule
 committers on the Hive HCatalog submodule.
 
 Passage of this measure depends on the passage of Measure 1.
 
 
 Voting:
 
 Both measures require +1 votes from 2/3 of active Hive PMC
 members in order to pass. All participants in the Hive project
 are encouraged to vote on these measures, but only votes from
 active Hive PMC members are binding. The voting period
 commences immediately and shall last a minimum of six days.
 
 Voting is carried out by replying to this email thread. You must
 indicate which measure you are voting on in order for your vote
 to be counted.
 
 More details about the voting process can be found in the Apache
 Hive Project Bylaws:
 
 https://cwiki.apache.org/confluence/display/Hive/Bylaws
 
 
 
 
 
 
 
 --
 +Vinod
 Hortonworks Inc.
 http://hortonworks.com/
 



[jira] [Updated] (HIVE-3849) Aliased column in where clause for multi-groupby single reducer cannot be resolved

2013-02-04 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3849:
--

Attachment: HIVE-3849.D7713.7.patch

navis updated the revision HIVE-3849 [jira] Columns are not extracted for 
multi-groupby single reducer case sometimes.

  To address the comment, I've generified some important entities and found 
that it affects many classes.
  @namit: I think this is not what you expected, or is it?

Reviewers: JIRA

REVISION DETAIL
  https://reviews.facebook.net/D7713

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D7713?vs=26973&id=27243#toc

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java
  ql/src/java/org/apache/hadoop/hive/ql/index/IndexPredicateAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/DefaultGraphWalker.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/DefaultRuleDispatcher.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/Dispatcher.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/GraphWalker.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/Node.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/NodeProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/PreOrderWalker.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/Rule.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleExactMatch.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/RuleRegExp.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/TaskGraphWalker.java
  ql/src/java/org/apache/hadoop/hive/ql/lib/Utils.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractBucketJoinProc.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketMapJoinOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPruner.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMROperator.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink2.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRRedSink3.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRTableScan1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRUnion1.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GroupByOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerExpressionOperatorFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/PrunerOperatorFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ReduceSinkDeDuplication.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SamplePruner.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SkewJoinOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedMergeBucketMapJoinOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyCtx.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteCanApplyProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndex.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/index/RewriteQueryUsingAggregateIndexCtx.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/ExprProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/lineage/OpProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrExprProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/pcr/PcrOpProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/BucketingSortingOpProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MetadataOnlyOptimizer.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SkewJoinResolver.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/unionproc/UnionProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/ASTNode.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/PrintOpTreeProcessor.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  

[jira] [Commented] (HIVE-3959) Update Partition Statistics in Metastore Layer

2013-02-04 Thread Bhushan Mandhani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570910#comment-13570910
 ] 

Bhushan Mandhani commented on HIVE-3959:


The diff is out at https://reviews.facebook.net/D8271. I still need to make some 
minor updates before I can submit the patch.

 Update Partition Statistics in Metastore Layer
 --

 Key: HIVE-3959
 URL: https://issues.apache.org/jira/browse/HIVE-3959
 Project: Hive
  Issue Type: Improvement
  Components: Metastore, Statistics
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 When partitions are created using queries (insert overwrite and insert 
 into) then the StatsTask updates all stats. However, when partitions are 
 added directly through metadata-only partitions (either CLI or direct calls 
 to Thrift Metastore) no stats are populated even if hive.stats.reliable is 
 set to true. This puts us in a situation where we can't decide if stats are 
 truly reliable or not.
 We propose that the fast stats (numFiles and totalSize) which don't require 
 a scan of the data should always be populated and be completely reliable. For 
 now we are still excluding rowCount and rawDataSize because that will make 
 these operations very expensive. Currently they are quick metadata-only ops.  
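
As a rough illustration of why numFiles and totalSize are cheap, the Python sketch below gathers both from file-system metadata alone, without opening any data file. It uses the local file system as a stand-in for HDFS, and fast_stats is a hypothetical helper, not the metastore code.

```python
import os


def fast_stats(partition_dir):
    """Count files and sum their sizes using directory metadata only."""
    num_files = 0
    total_size = 0
    for root, _dirs, files in os.walk(partition_dir):
        for name in files:
            num_files += 1
            # getsize reads the file's metadata; the contents are never scanned.
            total_size += os.path.getsize(os.path.join(root, name))
    return {"numFiles": num_files, "totalSize": total_size}
```

By contrast, rowCount and rawDataSize would require reading and deserializing every record, which is why the proposal leaves them out of the metadata-only path.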



Re: Review Request: Add support for pulling HBase columns with prefixes

2013-02-04 Thread Mark Grover

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9276/#review16080
---



hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java
https://reviews.apache.org/r/9276/#comment34401

This seems like a limited case of pattern matching. Swarnim, any way we can 
support generic regex matching instead?
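
To illustrate the question, here is a small Python sketch comparing a trailing-wildcard prefix match with a general regex match over column qualifiers. It is not the HBaseSerDe code from the review; prefix_matcher, regex_matcher, and the sample qualifiers are hypothetical.

```python
import re


def prefix_matcher(spec):
    """'col*' matches qualifiers starting with 'col'; anything else matches exactly."""
    if spec.endswith("*"):
        prefix = spec[:-1]
        return lambda qualifier: qualifier.startswith(prefix)
    return lambda qualifier: qualifier == spec


def regex_matcher(pattern):
    """Match the whole qualifier against an arbitrary regular expression."""
    compiled = re.compile(pattern)
    return lambda qualifier: compiled.fullmatch(qualifier) is not None


qualifiers = ["col1", "col2", "colour", "other"]

by_prefix = [q for q in qualifiers if prefix_matcher("col*")(q)]
by_regex = [q for q in qualifiers if regex_matcher(r"col.*")(q)]
digits_only = [q for q in qualifiers if regex_matcher(r"col\d+")(q)]
print(by_prefix, by_regex, digits_only)
```

The prefix form is equivalent to the regex `col.*`, while a regex can also express narrower selections such as `col\d+` that a single trailing wildcard cannot.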


- Mark Grover


On Feb. 3, 2013, 1:04 a.m., Swarnim Kulkarni wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/9276/
 ---
 
 (Updated Feb. 3, 2013, 1:04 a.m.)
 
 
 Review request for hive.
 
 
 Description
 ---
 
 Added support for pulling hbase columns just by providing prefixes and a 
 wildcard. So a query now could look something like this:
 
 CREATE EXTERNAL TABLE hive_hbase_test
 ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' 
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
 WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*") 
 TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE");
 
 This would pull in all columns under column family fam1 which start with 
 col. This gives a little more flexibility over pull all columns format.
 
 
 This addresses bug HIVE-3725.
 https://issues.apache.org/jira/browse/HIVE-3725
 
 
 Diffs
 -
 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java 
 a8ba9d9 
   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java 
 d35bb52 
   hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java 
 e821282 
 
 Diff: https://reviews.apache.org/r/9276/diff/
 
 
 Testing
 ---
 
 Added unit tests to demonstrate the new functionality. Also made sure that 
 all existing unit tests passed.
 
 
 Thanks,
 
 Swarnim Kulkarni
 




[jira] [Commented] (HIVE-3725) Add support for pulling HBase columns with prefixes

2013-02-04 Thread Mark Grover (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13570925#comment-13570925
 ] 

Mark Grover commented on HIVE-3725:
---

Comments on reviewboard

 Add support for pulling HBase columns with prefixes
 ---

 Key: HIVE-3725
 URL: https://issues.apache.org/jira/browse/HIVE-3725
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Affects Versions: 0.9.0
Reporter: Swarnim Kulkarni
Assignee: Swarnim Kulkarni
 Attachments: HIVE-3725.1.patch.txt


 Current HBase Hive integration supports reading many values from the same row 
 by specifying a column family. And specifying just the column family can pull 
 in all qualifiers within the family.
 We should add in support to be able to specify a prefix for the qualifier and 
 all columns that start with the prefix would automatically get pulled in. A 
 wildcard support would be ideal.



Re: Review Requests

2013-02-04 Thread Mark Grover
Swarnim,
I left some comments on  reviewboard.

On Mon, Feb 4, 2013 at 8:00 AM, kulkarni.swar...@gmail.com 
kulkarni.swar...@gmail.com wrote:

 Hello,

 I opened up two reviews for small issues, HIVE-3553[1] and HIVE-3725[2]. If
 you get a chance to review them and provide feedback, I would really
 appreciate it.

 Thanks,

 [1] https://reviews.apache.org/r/9275/
 [2] https://reviews.apache.org/r/9276/

 --
 Swarnim



[jira] [Comment Edited] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571030#comment-13571030
 ] 

Ashutosh Chauhan edited comment on HIVE-3982 at 2/5/13 4:26 AM:


Committed to branch.

  was (Author: ashutoshc):
Committed to trunk.
  
 Merge PTFDesc and PTFDef classes
 

 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3982_2.patch, hive-3982.patch


 As discussed on 
 https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271



[jira] [Resolved] (HIVE-3982) Merge PTFDesc and PTFDef classes

2013-02-04 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan resolved HIVE-3982.


Resolution: Fixed

Committed to trunk.

 Merge PTFDesc and PTFDef classes
 

 Key: HIVE-3982
 URL: https://issues.apache.org/jira/browse/HIVE-3982
 Project: Hive
  Issue Type: Task
  Components: PTF-Windowing
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: hive-3982_2.patch, hive-3982.patch


 As discussed on 
 https://issues.apache.org/jira/browse/HIVE-896?focusedCommentId=13567271&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13567271



Fwd: [VOTE] Graduate HCatalog from the incubator and become part of Hive

2013-02-04 Thread Alan Gates
FYI.

Alan.

Begin forwarded message:

 From: Alan Gates ga...@hortonworks.com
 Date: February 4, 2013 10:18:09 PM PST
 To: hcatalog-...@incubator.apache.org
 Subject: [VOTE] Graduate HCatalog from the incubator and become part of Hive
 
 The Hive PMC has voted to accept HCatalog as a submodule of Hive.  You can 
 see the vote thread at 
 http://mail-archives.apache.org/mod_mbox/hive-dev/201301.mbox/%3cCACf6RrzktBYD0suZxn3Pfv8XkR=vgwszrzyb_2qvesuj2vh...@mail.gmail.com%3e
  .  We now need to vote to graduate from the incubator and become a submodule 
 of Hive.  This entails the following:
 
 1) the establishment of an HCatalog submodule in the Apache Hive Project;
 2) the adoption of the Apache HCatalog codebase into the Hive HCatalog 
 submodule; and
 3) adding all currently active HCatalog committers as submodule committers on 
 the Hive HCatalog submodule.
 
 Definitions for all these can be found in the (now adopted) Hive bylaws at 
 https://cwiki.apache.org/confluence/display/Hive/Proposed+Changes+to+Hive+Bylaws+for+Submodule+Committer.
 
 This vote will stay open for at least 72 hours (thus 23:00 PST on 2/7/13).  
 PPMC members' votes are binding in this vote, though input from all is welcome.
 
 If this vote passes the next step will be to submit the graduation motion to 
 the Incubator PMC.
 
 Here's my +1.
 
 Alan.



[jira] [Updated] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly

2013-02-04 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2839:
-

Status: Open  (was: Patch Available)

comments

 Filters on outer join with mapjoin hint is not applied correctly
 

 Key: HIVE-2839
 URL: https://issues.apache.org/jira/browse/HIVE-2839
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, 
 HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch, HIVE-2839.D2079.6.patch


 While testing HIVE-2820, I found that some queries with a mapjoin hint throw exceptions.
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND true limit 10;
 FAILED: Hive Internal Error: java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
 java.lang.ClassCastException: org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483)
   at org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689)
   at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519)
   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 and 
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND b.key * 10 < '1000' limit 10;
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1321)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1325)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:495)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:143)
   ... 8 more
 {code}
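
The ClassCastException in the first trace above is the kind of failure an unchecked downcast produces when a join filter is a constant (such as `true`) rather than a function call. Below is a minimal, self-contained sketch of that pattern. The names `ExprNode`, `ConstantNode`, `FuncNode`, and `FilterCastSketch` are hypothetical stand-ins, not Hive's ExprNodeDesc classes, and the type check shown is only an illustration, not necessarily what the attached patch does.

```java
import java.util.List;

// Hypothetical stand-ins for an expression tree; these are NOT Hive's classes.
interface ExprNode {}

final class ConstantNode implements ExprNode {
    final Object value;
    ConstantNode(Object value) { this.value = value; }
}

final class FuncNode implements ExprNode {
    final String name;
    final List<ExprNode> children;
    FuncNode(String name, List<ExprNode> children) {
        this.name = name;
        this.children = children;
    }
}

final class FilterCastSketch {
    // Unsafe: assumes every filter is a function call, so a constant
    // filter fails with ClassCastException at the cast.
    static String unsafeName(ExprNode filter) {
        return ((FuncNode) filter).name;
    }

    // Safer: check the runtime type before downcasting.
    static String safeName(ExprNode filter) {
        if (filter instanceof FuncNode) {
            return ((FuncNode) filter).name;
        }
        return "<constant>";
    }

    public static void main(String[] args) {
        ExprNode constant = new ConstantNode(Boolean.TRUE);
        System.out.println(safeName(constant));
        try {
            unsafeName(constant);
        } catch (ClassCastException e) {
            System.out.println("unsafe cast failed: " + e.getClass().getSimpleName());
        }
    }
}
```

The sketch only shows why a constant filter and a function filter need separate handling; it says nothing about the cause of the second trace (the HiveException in loadHashTable).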

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2839) Filters on outer join with mapjoin hint is not applied correctly

2013-02-04 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13571117#comment-13571117
 ] 

Phabricator commented on HIVE-2839:
---

njain has commented on the revision HIVE-2839 [jira] Filters on outer join 
with mapjoin hint is not applied correctly.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:169 This should not be possible.

  We don't support

  join - mapjoin
  union - mapjoin

  I am not sure about LateralView - mapjoin,
  but if it is allowed, I will add a jira to stop that, and fix that

  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:181 Remove this function.

  assert that parents.size() == 1

  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:128 return a List instead of ArrayList

  ql/src/java/org/apache/hadoop/hive/ql/plan/ExprNodeDescUtils.java:130 result can be a List instead of ArrayList
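
The inline comments above ask for two common Java conventions: declare return types and locals as `List` rather than `ArrayList`, and check for exactly one parent instead of handling several. A minimal sketch of both follows. `Node`, `ReviewSketch`, `collectNames`, and `singleParent` are hypothetical names used for illustration only and do not reflect the code in ExprNodeDescUtils.java. The sketch throws an explicit exception rather than using the `assert` keyword, since Java assertions are disabled unless the JVM is started with `-ea`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical operator-like node; not a Hive class.
final class Node {
    final String name;
    final List<Node> parents = new ArrayList<>();
    Node(String name) { this.name = name; }
}

final class ReviewSketch {
    // The return type and the local are declared as the List interface;
    // only the constructor call names the concrete ArrayList.
    static List<String> collectNames(List<Node> nodes) {
        List<String> result = new ArrayList<>();
        for (Node n : nodes) {
            result.add(n.name);
        }
        return result;
    }

    // Fails fast when the single-parent assumption does not hold.
    static Node singleParent(Node node) {
        if (node.parents.size() != 1) {
            throw new IllegalStateException(
                "expected exactly one parent, found " + node.parents.size());
        }
        return node.parents.get(0);
    }

    public static void main(String[] args) {
        Node parent = new Node("parent");
        Node child = new Node("child");
        child.parents.add(parent);
        System.out.println(singleParent(child).name);
        System.out.println(collectNames(child.parents));
    }
}
```

Callers that depend only on `List` keep working if the implementation later switches to another list type.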

REVISION DETAIL
  https://reviews.facebook.net/D2079

To: JIRA, navis
Cc: njain


 Filters on outer join with mapjoin hint is not applied correctly
 

 Key: HIVE-2839
 URL: https://issues.apache.org/jira/browse/HIVE-2839
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2839.D2079.2.patch, HIVE-2839.D2079.3.patch, 
 HIVE-2839.D2079.4.patch, HIVE-2839.D2079.5.patch, HIVE-2839.D2079.6.patch


 While testing HIVE-2820, I found that some queries with a mapjoin hint throw exceptions.
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND true limit 10;
 FAILED: Hive Internal Error: 
 java.lang.ClassCastException(org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc
  cannot be cast to org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
 java.lang.ClassCastException: 
 org.apache.hadoop.hive.ql.plan.ExprNodeConstantDesc cannot be cast to 
 org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.convertMapJoin(MapJoinProcessor.java:363)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.generateMapJoinOperator(MapJoinProcessor.java:483)
   at 
 org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor.transform(MapJoinProcessor.java:689)
   at 
 org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7519)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:891)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 and 
 {code}
 SELECT /*+ MAPJOIN(a) */ * FROM src a RIGHT OUTER JOIN src b on a.key=b.key 
 AND b.key * 10 < '1000' limit 10;
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:161)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:198)
   at 
 org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:212)
   at 
 

[jira] [Updated] (HIVE-2991) Integrate Clover with Hive

2013-02-04 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2991:
--

Attachment: (was: hive.2991.2.trunk.patch)

 Integrate Clover with Hive
 --

 Key: HIVE-2991
 URL: https://issues.apache.org/jira/browse/HIVE-2991
 Project: Hive
  Issue Type: Test
  Components: Testing Infrastructure
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, 
 hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
 hive.2991.1.trunk.patch, hive.2991.2.branch-0.10.patch, 
 hive.2991.2.branch-0.9.patch, hive.2991.2.trunk.patch, 
 hive-trunk-clover-html-report.zip


 Atlassian has donated a license for its code coverage tool, Clover, to the ASF. 
 Let's use it to generate a code coverage report to find out which areas of 
 Hive are well tested and which are not. More information about the license 
 can be found in the Hadoop JIRA HADOOP-1718.



[jira] [Updated] (HIVE-2991) Integrate Clover with Hive

2013-02-04 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2991:
--

Attachment: hive.2991.2.trunk.patch

 Integrate Clover with Hive
 --

 Key: HIVE-2991
 URL: https://issues.apache.org/jira/browse/HIVE-2991
 Project: Hive
  Issue Type: Test
  Components: Testing Infrastructure
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, 
 hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
 hive.2991.1.trunk.patch, hive.2991.2.branch-0.10.patch, 
 hive.2991.2.branch-0.9.patch, hive.2991.2.trunk.patch, 
 hive-trunk-clover-html-report.zip


 Atlassian has donated a license for its code coverage tool, Clover, to the ASF. 
 Let's use it to generate a code coverage report to find out which areas of 
 Hive are well tested and which are not. More information about the license 
 can be found in the Hadoop JIRA HADOOP-1718.



[jira] [Updated] (HIVE-2991) Integrate Clover with Hive

2013-02-04 Thread Ilya Katsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ilya Katsov updated HIVE-2991:
--

Status: Patch Available  (was: Open)

 Integrate Clover with Hive
 --

 Key: HIVE-2991
 URL: https://issues.apache.org/jira/browse/HIVE-2991
 Project: Hive
  Issue Type: Test
  Components: Testing Infrastructure
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2991.D2985.1.patch, 
 hive.2991.1.branch-0.10.patch, hive.2991.1.branch-0.9.patch, 
 hive.2991.1.trunk.patch, hive.2991.2.branch-0.10.patch, 
 hive.2991.2.branch-0.9.patch, hive.2991.2.trunk.patch, 
 hive-trunk-clover-html-report.zip


 Atlassian has donated a license for its code coverage tool, Clover, to the ASF. 
 Let's use it to generate a code coverage report to find out which areas of 
 Hive are well tested and which are not. More information about the license 
 can be found in the Hadoop JIRA HADOOP-1718.
