[jira] [Commented] (HIVE-2229) Potentially Switch Build to Maven

2013-01-17 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555969#comment-13555969
 ] 

Brock Noland commented on HIVE-2229:


John,

The interesting parts are mostly code generation, I assume?

I am +1 on this. Maven has its issues as well, but most Hadoop projects are 
using Maven these days. Of course that doesn't mean Hive must move to Maven, 
but cross-project consistency is welcome when achievable.

 Potentially Switch Build to Maven
 -

 Key: HIVE-2229
 URL: https://issues.apache.org/jira/browse/HIVE-2229
 Project: Hive
  Issue Type: Improvement
Reporter: Ed Kohlwey
Assignee: David Phillips
Priority: Minor

 I want to propose this idea as gently as possible, because I know there's a 
 lot of passion around build tools these days.
 There should at least be some discussion around the merits of Maven vs. 
 Ant/IVY.
 If there's a lot of interest in switching Hive to Maven, I would be willing 
 to volunteer some time to put together a patch.
 The reasons to potentially look at Maven for the build system include:
 - Simplified build scripts/definitions
 - Getting features like publishing test artifacts automagically
 - Very good IDE integration using M2 eclipse
   - IDE integration also supports working on multiple projects at the same 
 time which may have dependencies on each other.
 - If you absolutely must, you can use the maven-antrun-plugin
 - Despite the fact that people have trouble thinking in maven at first, it 
 becomes easy to work with once you know it
  - This supports knowledge reuse
 Reasons for Ant/Ivy
 - There's more flexibility
 - The system's imperative style is familiar to all programmers, regardless of 
 their background in the tool
 Reasons not to go Maven
 - The build system is hard to learn for those not familiar with Maven due to 
 its unusual perspective on projects as objects
 - There's less flexibility
 - If you wind up dropping down to the maven ant plugin a lot, everything will 
 be a big mess
 Reasons not to continue Ant/Ivy
 - Despite the fact that the programming paradigm is familiar, the structure 
 of Ant scripts is not very standardized and must be re-learned on pretty much 
 every project
 - Ant/Ivy doesn't emphasize reuse very well
  - There's a constant need to continue long-running development cycles to add 
 desirable features to build scripts which would be simple using other build 
 systems

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2820) Invalid tag is used for MapJoinProcessor

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556001#comment-13556001
 ] 

Namit Jain commented on HIVE-2820:
--

[~navis], Sorry for the late response, but I had some questions on this patch.

I can understand that the tag is not needed for mapjoin, but aren't you mixing the data?
Let me try to reproduce.

 Invalid tag is used for MapJoinProcessor
 

 Key: HIVE-2820
 URL: https://issues.apache.org/jira/browse/HIVE-2820
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0
 Environment: ubuntu
Reporter: Navis
Assignee: Navis
 Fix For: 0.11.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2820.D1935.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2820.D1935.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2820.D1935.3.patch, HIVE-2820.D1935.4.patch


 Testing HIVE-2810, I've found that tag and alias are used in a very confusing 
 manner. For example, the query below fails:
 {code}
 hive> set hive.auto.convert.join=true;
  
 hive> select /*+ STREAMTABLE(a) */ * from myinput1 a join myinput1 b on 
 a.key=b.key join myinput1 c on a.key=c.key;
 Total MapReduce jobs = 4
 Ended Job = 1667415037, job is filtered out (removed at runtime).
 Ended Job = 1739566906, job is filtered out (removed at runtime).
 Ended Job = 1113337780, job is filtered out (removed at runtime).
 12/02/24 10:27:14 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml 
 found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
 Execution log at: 
 /tmp/navis/navis_20120224102727_cafe0d8d-9b21-441d-bd4e-b83303b31cdc.log
 2012-02-24 10:27:14   Starting to launch local task to process map join;  
 maximum memory = 932118528
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:312)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.startForward(MapredLocalTask.java:325)
   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:272)
   at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:685)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Execution failed with exit status: 2
 Obtaining error information
 {code}
 The failed task has a plan which doesn't make sense.
 {noformat}
   Stage: Stage-8
 Map Reduce Local Work
   Alias -> Map Local Tables:
 b 
   Fetch Operator
 limit: -1
 c 
   Fetch Operator
 limit: -1
   Alias -> Map Local Operator Tree:
 b 
   TableScan
 alias: b
 HashTable Sink Operator
   condition expressions:
 0 {key} {value}
 1 {key} {value}
 2 {key} {value}
   handleSkewJoin: false
   keys:
 0 [Column[key]]
 1 [Column[key]]
 2 [Column[key]]
   Position of Big Table: 0
 c 
   TableScan
 alias: c
 Map Join Operator
   condition map:
Inner Join 0 to 1
Inner Join 0 to 2
   condition expressions:
 0 {key} {value}
 1 {key} {value}
 2 {key} {value}
   handleSkewJoin: false
   keys:
 0 [Column[key]]
 1 [Column[key]]
 2 [Column[key]]
   outputColumnNames: _col0, _col1, _col4, _col5, _col8, _col9
   Position of Big Table: 0
   Select Operator
 expressions:
   expr: _col0
   type: int
   expr: _col1
   type: int
   expr: _col4
   type: int
   expr: _col5
   type: int
   expr: _col8
   type: int
   expr: _col9
 

[jira] [Created] (HIVE-3909) Wrong data due to HIVE-2820

2013-01-17 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3909:


 Summary: Wrong data due to HIVE-2820
 Key: HIVE-3909
 URL: https://issues.apache.org/jira/browse/HIVE-3909
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain


Consider the query:

~/hive/hive1$ more ql/src/test/queries/clientpositive/join_reorder4.q
CREATE TABLE T1(key1 STRING, val1 STRING) STORED AS TEXTFILE;
CREATE TABLE T2(key2 STRING, val2 STRING) STORED AS TEXTFILE;
CREATE TABLE T3(key3 STRING, val3 STRING) STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1;
LOAD DATA LOCAL INPATH '../data/files/T2.txt' INTO TABLE T2;
LOAD DATA LOCAL INPATH '../data/files/T3.txt' INTO TABLE T3;

set hive.auto.convert.join=true;

explain select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
a.key1=b.key2 join T3 c on a.key1=c.key3;
select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;

explain select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
a.key1=b.key2 join T3 c on a.key1=c.key3;
select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;

explain select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
a.key1=b.key2 join T3 c on a.key1=c.key3;
select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;



select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;

returns:
2   12  2   12  2   22

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-3909) Wrong data due to HIVE-2820

2013-01-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-3909:


Assignee: Navis

 Wrong data due to HIVE-2820
 ---

 Key: HIVE-3909
 URL: https://issues.apache.org/jira/browse/HIVE-3909
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Navis

 Consider the query:
 ~/hive/hive1$ more ql/src/test/queries/clientpositive/join_reorder4.q
 CREATE TABLE T1(key1 STRING, val1 STRING) STORED AS TEXTFILE;
 CREATE TABLE T2(key2 STRING, val2 STRING) STORED AS TEXTFILE;
 CREATE TABLE T3(key3 STRING, val3 STRING) STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1;
 LOAD DATA LOCAL INPATH '../data/files/T2.txt' INTO TABLE T2;
 LOAD DATA LOCAL INPATH '../data/files/T3.txt' INTO TABLE T3;
 set hive.auto.convert.join=true;
 explain select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 explain select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 explain select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 returns:
 2 12  2   12  2   22

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3872) MAP JOIN for VIEW throws NULL pointer exception error

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556006#comment-13556006
 ] 

Hudson commented on HIVE-3872:
--

Integrated in Hive-trunk-hadoop2 #69 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/69/])
HIVE-3872 MAP JOIN for VIEW throws NULL pointer exception error
(Navis via namit) (Revision 1433997)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1433997
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
* /hive/trunk/ql/src/test/queries/clientnegative/invalid_mapjoin1.q
* /hive/trunk/ql/src/test/results/clientnegative/invalid_mapjoin1.q.out


 MAP JOIN for VIEW throws NULL pointer exception error
 --

 Key: HIVE-3872
 URL: https://issues.apache.org/jira/browse/HIVE-3872
 Project: Hive
  Issue Type: Bug
  Components: Views
Reporter: Santosh Achhra
Assignee: Navis
Priority: Critical
  Labels: HINTS, MAPJOIN
 Fix For: 0.11.0

 Attachments: HIVE-3872.D7965.1.patch


 I have created a view  as shown below. 
 CREATE VIEW V1 AS
 select /*+ MAPJOIN(t1) ,MAPJOIN(t2)  */ t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, 
 t2.f2, t2.f3 from TABLE1 t1 join TABLE t2 on ( t1.f2= t2.f2 and t1.f3 = t2.f3 
 and t1.f4 = t2.f4 ) group by t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, t2.f2, t2.f3
 The view gets created successfully; however, when I execute the below mentioned SQL, or 
 any SQL on the view, I get a NullPointerException error:
 hive> select count (*) from V1;
 FAILED: NullPointerException null
 hive>
 Is there anything wrong with the view creation?
 Next I created the view without MAPJOIN hints:
 CREATE VIEW V1 AS
 select  t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, t2.f2, t2.f3 from TABLE1 t1 join 
 TABLE t2 on ( t1.f2= t2.f2 and t1.f3 = t2.f3 and t1.f4 = t2.f4 ) group by 
 t1.f1, t1.f2, t1.f3, t1.f4, t2.f1, t2.f2, t2.f3
 Before executing the select SQL I execute set hive.auto.convert.join=true;
 I am getting the below mentioned warnings:
 java.lang.InstantiationException: 
 org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
 Continuing ...
 java.lang.RuntimeException: failed to evaluate: unbound=Class.new();
 Continuing ...
 And I see from the log that a total of 5 mapreduce jobs are started; however, when I don't 
 set auto.convert.join to true, I see only 3 mapreduce jobs getting invoked.
 Total MapReduce jobs = 5
 Ended Job = 1116112419, job is filtered out (removed at runtime).
 Ended Job = -33256989, job is filtered out (removed at runtime).
 WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use 
 org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2820) Invalid tag is used for MapJoinProcessor

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556007#comment-13556007
 ] 

Hudson commented on HIVE-2820:
--

Integrated in Hive-trunk-hadoop2 #69 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/69/])
HIVE-2820 : Invalid tag is used for MapJoinProcessor (Navis via Ashutosh 
Chauhan) (Revision 1434012)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434012
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java
* /hive/trunk/ql/src/test/queries/clientpositive/join_reorder4.q
* /hive/trunk/ql/src/test/results/clientpositive/join_reorder4.q.out


 Invalid tag is used for MapJoinProcessor
 

 Key: HIVE-2820
 URL: https://issues.apache.org/jira/browse/HIVE-2820
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0
 Environment: ubuntu
Reporter: Navis
Assignee: Navis
 Fix For: 0.11.0

 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2820.D1935.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2820.D1935.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2820.D1935.3.patch, HIVE-2820.D1935.4.patch


 Testing HIVE-2810, I've found that tag and alias are used in a very confusing 
 manner. For example, the query below fails:
 {code}
 hive> set hive.auto.convert.join=true;
  
 hive> select /*+ STREAMTABLE(a) */ * from myinput1 a join myinput1 b on 
 a.key=b.key join myinput1 c on a.key=c.key;
 Total MapReduce jobs = 4
 Ended Job = 1667415037, job is filtered out (removed at runtime).
 Ended Job = 1739566906, job is filtered out (removed at runtime).
 Ended Job = 1113337780, job is filtered out (removed at runtime).
 12/02/24 10:27:14 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml 
 found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
 Execution log at: 
 /tmp/navis/navis_20120224102727_cafe0d8d-9b21-441d-bd4e-b83303b31cdc.log
 2012-02-24 10:27:14   Starting to launch local task to process map join;  
 maximum memory = 932118528
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:312)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.startForward(MapredLocalTask.java:325)
   at 
 org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:272)
   at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:685)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Execution failed with exit status: 2
 Obtaining error information
 {code}
 The failed task has a plan which doesn't make sense.
 {noformat}
   Stage: Stage-8
 Map Reduce Local Work
   Alias -> Map Local Tables:
 b 
   Fetch Operator
 limit: -1
 c 
   Fetch Operator
 limit: -1
   Alias -> Map Local Operator Tree:
 b 
   TableScan
 alias: b
 HashTable Sink Operator
   condition expressions:
 0 {key} {value}
 1 {key} {value}
 2 {key} {value}
   handleSkewJoin: false
   keys:
 0 [Column[key]]
 1 [Column[key]]
 2 [Column[key]]
   Position of Big Table: 0
 c 
   TableScan
 alias: c
 Map Join Operator
   condition map:
Inner Join 0 to 1
Inner Join 0 to 2
 

[jira] [Resolved] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

2013-01-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain resolved HIVE-3852.
--

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed

Committed. Thanks Navis

 Multi-groupby optimization fails when same distinct column is used twice or 
 more
 

 Key: HIVE-3852
 URL: https://issues.apache.org/jira/browse/HIVE-3852
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-3852.D7737.1.patch


 {code}
 FROM INPUT
 INSERT OVERWRITE TABLE dest1 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key
 INSERT OVERWRITE TABLE dest2 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key;
 {code}
 fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3910) Create a new DATE datatype

2013-01-17 Thread Namit Jain (JIRA)
Namit Jain created HIVE-3910:


 Summary: Create a new DATE datatype
 Key: HIVE-3910
 URL: https://issues.apache.org/jira/browse/HIVE-3910
 Project: Hive
  Issue Type: Task
Reporter: Namit Jain


It might be useful to have a DATE datatype along with timestamp.
This would only need to store the day (possibly as the number of days since 1970-01-01),
which would give space savings in binary format.
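
As a rough illustration of the space point (a sketch only, not part of any Hive patch; the class and method names are made up), a day can be packed into a 4-byte int counting days since the epoch, versus the 8 bytes a millisecond timestamp needs:

{code}
// Illustrative only: encode a date as days since 1970-01-01 (UTC). Assumes
// dates on or after the epoch; an int easily covers the usable date range.
import java.util.concurrent.TimeUnit;

public class DaysSinceEpoch {

  public static int toDays(long epochMillis) {
    return (int) TimeUnit.MILLISECONDS.toDays(epochMillis);
  }

  public static long toEpochMillis(int days) {
    return TimeUnit.DAYS.toMillis(days);
  }

  public static void main(String[] args) {
    int days = toDays(System.currentTimeMillis());
    System.out.println(days + " days since 1970-01-01");
  }
}
{code}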

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3537) release locks at the end of move tasks

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556017#comment-13556017
 ] 

Namit Jain commented on HIVE-3537:
--

[~ashutoshc], let me take a look? Thanks

 release locks at the end of move tasks
 --

 Key: HIVE-3537
 URL: https://issues.apache.org/jira/browse/HIVE-3537
 Project: Hive
  Issue Type: Bug
  Components: Locking, Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3537.1.patch, hive.3537.2.patch, hive.3537.3.patch, 
 hive.3537.4.patch, hive.3537.5.patch, hive.3537.6.patch, hive.3537.7.patch, 
 hive.3537.8.patch, hive.3537.9.patch


 Look at HIVE-3106 for details.
 In order to make sure that concurrency is not an issue for multi-table 
 inserts, the current option is to introduce a dependency task, which thereby
 delays the creation of all partitions. It would be desirable to release the
 locks for the outputs as soon as the move task is completed. That way, for
 multi-table inserts, the concurrency can be enabled without delaying any 
 table.
 Currently, the movetask contains an input/output, but they do not seem to be
 populated correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3537) release locks at the end of move tasks

2013-01-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3537:
-

Status: Open  (was: Patch Available)

 release locks at the end of move tasks
 --

 Key: HIVE-3537
 URL: https://issues.apache.org/jira/browse/HIVE-3537
 Project: Hive
  Issue Type: Bug
  Components: Locking, Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3537.1.patch, hive.3537.2.patch, hive.3537.3.patch, 
 hive.3537.4.patch, hive.3537.5.patch, hive.3537.6.patch, hive.3537.7.patch, 
 hive.3537.8.patch, hive.3537.9.patch


 Look at HIVE-3106 for details.
 In order to make sure that concurrency is not an issue for multi-table 
 inserts, the current option is to introduce a dependency task, which thereby
 delays the creation of all partitions. It would be desirable to release the
 locks for the outputs as soon as the move task is completed. That way, for
 multi-table inserts, the concurrency can be enabled without delaying any 
 table.
 Currently, the movetask contains an input/output, but they do not seem to be
 populated correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Hive-trunk-h0.21 - Build # 1918 - Still Failing

2013-01-17 Thread Apache Jenkins Server
Changes for Build #1917
[hashutosh] HIVE-2820 : Invalid tag is used for MapJoinProcessor (Navis via 
Ashutosh Chauhan)

[namit] HIVE-3872 MAP JOIN for VIEW throws NULL pointer exception error
(Navis via namit)


Changes for Build #1918
[cws] Add DECIMAL data type (Josh Wills, Vikram Dixit, Prasad Mujumdar, Mark 
Grover and Gunther Hagleitner via cws)




1 tests failed.
FAILED:  
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_stats_aggregator_error_1

Error Message:
Forked Java VM exited abnormally. Please note the time in the report does not 
reflect the time until the VM exit.

Stack Trace:
junit.framework.AssertionFailedError: Forked Java VM exited abnormally. Please 
note the time in the report does not reflect the time until the VM exit.
at 
net.sf.antcontrib.logic.ForTask.doSequentialIteration(ForTask.java:259)
at net.sf.antcontrib.logic.ForTask.doToken(ForTask.java:268)
at net.sf.antcontrib.logic.ForTask.doTheTasks(ForTask.java:324)
at net.sf.antcontrib.logic.ForTask.execute(ForTask.java:244)




The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1918)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1918/ to 
view the results.

[jira] [Commented] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556086#comment-13556086
 ] 

Hudson commented on HIVE-3852:
--

Integrated in hive-trunk-hadoop1 #20 (See 
[https://builds.apache.org/job/hive-trunk-hadoop1/20/])
HIVE-3852 Multi-groupby optimization fails when same distinct column is
used twice or more (Navis via namit) (Revision 1434600)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434600
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientpositive/groupby10.q
* /hive/trunk/ql/src/test/results/clientpositive/groupby10.q.out


 Multi-groupby optimization fails when same distinct column is used twice or 
 more
 

 Key: HIVE-3852
 URL: https://issues.apache.org/jira/browse/HIVE-3852
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-3852.D7737.1.patch


 {code}
 FROM INPUT
 INSERT OVERWRITE TABLE dest1 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key
 INSERT OVERWRITE TABLE dest2 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key;
 {code}
 fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3898) getReducersBucketing in SemanticAnalyzer may return more than the max number of reducers

2013-01-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3898:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks Kevin

 getReducersBucketing in SemanticAnalyzer may return more than the max number 
 of reducers
 

 Key: HIVE-3898
 URL: https://issues.apache.org/jira/browse/HIVE-3898
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-3898.1.patch.txt, HIVE-3898.2.patch.txt


 getReducersBucketing rounds totalFiles / maxReducers down, when it should be 
 rounded up to the nearest integer.
 E.g. if totalFiles = 60 and maxReducers = 21, 
 totalFiles / maxReducers = 2
 totalFiles / 2 = 30
 So the job will get 30 reducers, more than maxReducers.
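
To make the arithmetic concrete, here is a small standalone illustration (not the actual SemanticAnalyzer code; the names are made up) contrasting the floor division described above with rounding the divisor up:

{code}
// Illustrative arithmetic only; not the actual getReducersBucketing code.
public class ReducerRounding {

  static int reducers(int totalFiles, int maxReducers, boolean roundUp) {
    int divisor = roundUp
        ? (totalFiles + maxReducers - 1) / maxReducers  // ceiling: the proposed fix
        : totalFiles / maxReducers;                     // floor: the reported bug
    return totalFiles / divisor;
  }

  public static void main(String[] args) {
    System.out.println(reducers(60, 21, false)); // 30 -> exceeds maxReducers
    System.out.println(reducers(60, 21, true));  // 20 -> within maxReducers
  }
}
{code}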

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3898) getReducersBucketing in SemanticAnalyzer may return more than the max number of reducers

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556139#comment-13556139
 ] 

Hudson commented on HIVE-3898:
--

Integrated in hive-trunk-hadoop1 #21 (See 
[https://builds.apache.org/job/hive-trunk-hadoop1/21/])
HIVE-3898 getReducersBucketing in SemanticAnalyzer may return more than the
max number of reducers (Kevin Wilfong via namit) (Revision 1434623)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434623
Files : 
* /hive/trunk/build-common.xml
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyNumReducersForBucketsHook.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyNumReducersHook.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucket_num_reducers.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucket_num_reducers2.q
* /hive/trunk/ql/src/test/results/clientpositive/bucket_num_reducers2.q.out


 getReducersBucketing in SemanticAnalyzer may return more than the max number 
 of reducers
 

 Key: HIVE-3898
 URL: https://issues.apache.org/jira/browse/HIVE-3898
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-3898.1.patch.txt, HIVE-3898.2.patch.txt


 getReducersBucketing rounds totalFiles / maxReducers down, when it should be 
 rounded up to the nearest integer.
 E.g. if totalFiles = 60 and maxReducers = 21, 
 totalFiles / maxReducers = 2
 totalFiles / 2 = 30
 So the job will get 30 reducers, more than maxReducers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3628) Provide a way to use counters in Hive through UDF

2013-01-17 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3628:


 Assignee: Navis
Affects Version/s: (was: 0.7.0)
   Status: Patch Available  (was: Open)

 Provide a way to use counters in Hive through UDF
 -

 Key: HIVE-3628
 URL: https://issues.apache.org/jira/browse/HIVE-3628
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Viji
Assignee: Navis
Priority: Minor
 Attachments: HIVE-3628.D8007.1.patch


 Currently it is not possible to generate counters through UDF. We should 
 support this. 
 Pig currently allows this.
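
Purely for illustration (this is not the API added by the attached patch, and the injected Reporter is an assumption of the sketch), the capability being asked for amounts to a GenericUDF being able to reach a Hadoop Reporter and bump a counter:

{code}
// Sketch only: a UDF that could count rows if the runtime handed it a Reporter.
// The setReporter() hook is hypothetical; wiring it up is what this issue asks for.
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.mapred.Reporter;

public class CountingUDF extends GenericUDF {

  private Reporter reporter; // hypothetical: assumed to be injected by the framework

  public void setReporter(Reporter reporter) {
    this.reporter = reporter;
  }

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
    return PrimitiveObjectInspectorFactory.javaBooleanObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] arguments) throws HiveException {
    if (reporter != null) {
      reporter.incrCounter("MyUDF", "rows.seen", 1L);
    }
    return Boolean.TRUE;
  }

  @Override
  public String getDisplayString(String[] children) {
    return "counting_udf()";
  }
}
{code}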

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3628) Provide a way to use counters in Hive through UDF

2013-01-17 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3628:
--

Attachment: HIVE-3628.D8007.1.patch

navis requested code review of HIVE-3628 [jira] Provide a way to use counters 
in Hive through UDF.
Reviewers: JIRA

  DPAL-1964 Provide a way to use counters in Hive through UDF

  Currently it is not possible to generate counters through UDF. We should 
support this.

  Pig currently allows this.

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D8007

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapper.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecutionContext.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/ContextualUDF.java
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/DummyContextUDF.java
  ql/src/test/queries/clientpositive/udf_context_aware.q
  ql/src/test/results/clientpositive/udf_context_aware.q.out

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/19299/

To: JIRA, navis


 Provide a way to use counters in Hive through UDF
 -

 Key: HIVE-3628
 URL: https://issues.apache.org/jira/browse/HIVE-3628
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Viji
Assignee: Navis
Priority: Minor
 Attachments: HIVE-3628.D8007.1.patch


 Currently it is not possible to generate counters through UDF. We should 
 support this. 
 Pig currently allows this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3628) Provide a way to use counters in Hive through UDF

2013-01-17 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3628:
--

Attachment: HIVE-3628.D8007.2.patch

navis updated the revision HIVE-3628 [jira] Provide a way to use counters in 
Hive through UDF.
Reviewers: JIRA

  Removed dummy code


REVISION DETAIL
  https://reviews.facebook.net/D8007

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapper.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExecutionContext.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java
  ql/src/java/org/apache/hadoop/hive/ql/udf/generic/ContextualUDF.java
  ql/src/test/org/apache/hadoop/hive/ql/udf/generic/DummyContextUDF.java
  ql/src/test/queries/clientpositive/udf_context_aware.q
  ql/src/test/results/clientpositive/udf_context_aware.q.out

To: JIRA, navis


 Provide a way to use counters in Hive through UDF
 -

 Key: HIVE-3628
 URL: https://issues.apache.org/jira/browse/HIVE-3628
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Viji
Assignee: Navis
Priority: Minor
 Attachments: HIVE-3628.D8007.1.patch, HIVE-3628.D8007.2.patch


 Currently it is not possible to generate counters through UDF. We should 
 support this. 
 Pig currently allows this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3768) Document JDBC client configuration for secure clusters

2013-01-17 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-3768:
-

Attachment: HIVE-3768.2.patch

Patch 2 replaces patch 1, documenting JDBC client configuration for secure 
clusters in the Hive Client doc (converted from wiki).

Auxiliary files site.vsl and project.xml fix the copyright year and menu.

 Document JDBC client configuration for secure clusters
 --

 Key: HIVE-3768
 URL: https://issues.apache.org/jira/browse/HIVE-3768
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9.0
Reporter: Lefty Leverenz
Assignee: Lefty Leverenz
 Attachments: HIVE-3768.1.patch, HIVE-3768.2.patch


 Document the JDBC client configuration required for starting Hive on a secure 
 cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3768) Document JDBC client configuration for secure clusters

2013-01-17 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-3768:
-

Fix Version/s: 0.10.0
   Labels: documentation  (was: )
 Release Note: Document JDBC client config for secure clusters in Hive 
Client user doc and tidy up the menu.
   Status: Patch Available  (was: Open)

Patch 2 documents the need for hive-site.xml to be in the CLASSPATH of the JDBC 
client when configuring Hive on a secure cluster.
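
For context, a minimal sketch of a classic Hive JDBC client (host, port, and database are placeholders, and this code is not part of the patch); the point of the doc change is that such a client also needs hive-site.xml on its CLASSPATH when the cluster is secure:

{code}
// Minimal Hive JDBC client sketch. For a secure cluster, hive-site.xml with the
// cluster's security settings must also be on this client's CLASSPATH.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcClientSketch {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    Connection con =
        DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
    Statement stmt = con.createStatement();
    ResultSet rs = stmt.executeQuery("show tables");
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
    con.close();
  }
}
{code}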

This information is added to a Hive Client doc, which was converted from the 
wiki.

The patch adds one doc file and modifies two auxiliary files:

* docs/xdocs/hiveclient.xml is the source doc for Hive client setup

* project.xml adds a menu item for the client setup doc, fixes a misnamed menu 
item (DDL, not DML), adds a link to the javadocs, and changes the 
capitalization of a few menu items

* site.vsl removes menu indentation and changes the copyright year to 2013 in 
HTML footers

 Document JDBC client configuration for secure clusters
 --

 Key: HIVE-3768
 URL: https://issues.apache.org/jira/browse/HIVE-3768
 Project: Hive
  Issue Type: Bug
  Components: Documentation
Affects Versions: 0.9.0
Reporter: Lefty Leverenz
Assignee: Lefty Leverenz
  Labels: documentation
 Fix For: 0.10.0

 Attachments: HIVE-3768.1.patch, HIVE-3768.2.patch


 Document the JDBC client configuration required for starting Hive on a secure 
 cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3911) udaf_percentile_approx.q fails with Hadoop 0.23.5 when map-side aggr is disabled.

2013-01-17 Thread Thiruvel Thirumoolan (JIRA)
Thiruvel Thirumoolan created HIVE-3911:
--

 Summary: udaf_percentile_approx.q fails with Hadoop 0.23.5 when 
map-side aggr is disabled.
 Key: HIVE-3911
 URL: https://issues.apache.org/jira/browse/HIVE-3911
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Thiruvel Thirumoolan
 Fix For: 0.10.0, 0.11.0


I am running Hive 0.10 unit tests against Hadoop 0.23.5, and 
udaf_percentile_approx.q fails with a different value when map-side aggr is 
disabled, and only when the 3rd argument to this UDAF is 100. It matches the 
expected output when map-side aggr is enabled for the same arguments.

This test passes when hadoop.version is 1.1.1 and fails when it is 0.23.x, 
2.0.0-alpha, or 2.0.2-alpha.

[junit] 20c20
[junit] < 254.083331
[junit] ---
[junit] > 252.77
[junit] 47c47
[junit] < 254.083331
[junit] ---
[junit] > 252.77
[junit] 74c74
[junit] < [23.358,254.083331,477.0625,489.54667]
[junit] ---
[junit] > [24.07,252.77,476.9,487.82]
[junit] 101c101
[junit] < [23.358,254.083331,477.0625,489.54667]
[junit] ---
[junit] > [24.07,252.77,476.9,487.82]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread Hongjiang Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongjiang Zhang updated HIVE-3893:
--

Status: Patch Available  (was: Open)

./conf/hive-default.xml.template is not well-formed.
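
As a side note (not part of the attached patch; the class name is made up), the report is easy to reproduce with any standard XML parser, for example:

{code}
// Standalone check: a non-well-formed template makes the parser throw a
// SAXParseException that includes the offending line number.
import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;

public class CheckWellFormed {
  public static void main(String[] args) throws Exception {
    File f = new File(args.length > 0 ? args[0] : "conf/hive-default.xml.template");
    DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(f);
    System.out.println(f + " is well-formed XML");
  }
}
{code}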

 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng

 In line 482 of the hive-site.xml, there is no matching end-tag for the 
 element type "description".
 The same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread Hongjiang Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongjiang Zhang updated HIVE-3893:
--

Attachment: hive-3893.patch.txt

 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng
 Attachments: hive-3893.patch.txt


 In line 482 of the hive-site.xml, there is no matching end-tag for the 
 element type "description".
 The same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #264

2013-01-17 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/264/

--
[...truncated 5519 lines...]
 [echo] Project: hive

create-dirs:
 [echo] Project: shims
 [copy] Warning: 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/shims/src/test/resources
 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/ivy/ivysettings.xml
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.7.0/libthrift-0.7.0.jar
 ...
[ivy:resolve] . (294kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] org.apache.thrift#libthrift;0.7.0!libthrift.jar 
(133ms)
[ivy:resolve] 
[ivy:resolve] :: problems summary ::
[ivy:resolve]  ERRORS
[ivy:resolve]   Server access Error: No route to host 
url=http://mirror.facebook.net/facebook/hive-deps/hadoop/core/libthrift-0.7.0/libthrift-0.7.0.jar
[ivy:resolve]   Server access Error: No route to host 
url=http://mirror.facebook.net/facebook/hive-deps/hadoop/core/commons-logging-api-1.0.4/commons-logging-api-1.0.4.jar
[ivy:resolve]   Server access Error: No route to host 
url=http://mirror.facebook.net/facebook/hive-deps/hadoop/core/jackson-core-asl-1.8.8/jackson-core-asl-1.8.8.jar
[ivy:resolve] 
[ivy:resolve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
[ivy:report] Processing 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml
 to 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/shims/src/common/java;/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/shims/src/0.20/java
 against hadoop 0.20.2 
(/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21/hive/ivy/ivysettings.xml
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/com/google/guava/guava/r09/guava-r09.jar ...
[ivy:resolve] 
...
 (1117kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] com.google.guava#guava;r09!guava.jar (142ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-core/0.20.2/hadoop-core-0.20.2.jar
 ...
[ivy:resolve] 

 (2624kB)
[ivy:resolve] .. (0kB)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-tools/0.20.2/hadoop-tools-0.20.2.jar
 ...
[ivy:resolve] . (68kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.hadoop#hadoop-tools;0.20.2!hadoop-tools.jar (90ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-test/0.20.2/hadoop-test-0.20.2.jar
 ...
[ivy:resolve] 
.
 (1527kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
org.apache.hadoop#hadoop-test;0.20.2!hadoop-test.jar (90ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-cli/commons-cli/1.2/commons-cli-1.2.jar 
...
[ivy:resolve] .. (40kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] commons-cli#commons-cli;1.2!commons-cli.jar (51ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/xmlenc/xmlenc/0.52/xmlenc-0.52.jar ...
[ivy:resolve] ... (14kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] xmlenc#xmlenc;0.52!xmlenc.jar (31ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-httpclient/commons-httpclient/3.0.1/commons-httpclient-3.0.1.jar
 ...
[ivy:resolve]  (273kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
commons-httpclient#commons-httpclient;3.0.1!commons-httpclient.jar (37ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-codec/commons-codec/1.3/commons-codec-1.3.jar
 ...
[ivy:resolve]  (45kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] commons-codec#commons-codec;1.3!commons-codec.jar 
(32ms)
[ivy:resolve] downloading 

hive-trunk-hadoop1 - Build # 22 - Failure

2013-01-17 Thread Apache Jenkins Server
Changes for Build #1

Changes for Build #2

Changes for Build #3

Changes for Build #4
[kevinwilfong] HIVE-3552. performant manner for performing 
cubes/rollups/grouping sets for a high number of grouping set keys.


Changes for Build #5

Changes for Build #6
[cws] HIVE-3875. Negative value for hive.stats.ndv.error should be disallowed 
(Shreepadma Venugopalan via cws)


Changes for Build #7
[namit] HIVE-3888 wrong mapside groupby if no partition is being selected
(Namit Jain via Ashutosh and namit)


Changes for Build #8

Changes for Build #9

Changes for Build #10
[kevinwilfong] HIVE-3803. explain dependency should show the dependencies 
hierarchically in presence of views. (njain via kevinwilfong)


Changes for Build #11

Changes for Build #12
[namit] HIVE-3824 bug if different serdes are used for different partitions
(Namit Jain via Ashutosh and namit)


Changes for Build #13

Changes for Build #14
[hashutosh] HIVE-3004 : RegexSerDe should support other column types in 
addition to STRING (Shreepadma Venugoplan via Ashutosh Chauhan)


Changes for Build #15
[hashutosh] HIVE-2439 : Upgrade antlr version to 3.4 (Thiruvel Thirumoolan via 
Ashutosh Chauhan)


Changes for Build #16
[namit] HIVE-3897 Add a way to get the uncompressed/compressed sizes of columns
from an RC File (Kevin Wilfong via namit)


Changes for Build #17
[namit] HIVE-3899 Partition pruning fails on constant = constant expression
(Kevin Wilfong via namit)


Changes for Build #18
[hashutosh] HIVE-2820 : Invalid tag is used for MapJoinProcessor (Navis via 
Ashutosh Chauhan)

[namit] HIVE-3872 MAP JOIN for VIEW throws NULL pointer exception error
(Navis via namit)


Changes for Build #19
[cws] Add DECIMAL data type (Josh Wills, Vikram Dixit, Prasad Mujumdar, Mark 
Grover and Gunther Hagleitner via cws)


Changes for Build #20
[namit] HIVE-3852 Multi-groupby optimization fails when same distinct column is
used twice or more (Navis via namit)


Changes for Build #21
[namit] HIVE-3898 getReducersBucketing in SemanticAnalyzer may return more than 
the
max number of reducers (Kevin Wilfong via namit)


Changes for Build #22



No tests ran.

The Apache Jenkins build system has built hive-trunk-hadoop1 (build #22)

Status: Failure

Check console output at https://builds.apache.org/job/hive-trunk-hadoop1/22/ to 
view the results.

[jira] [Updated] (HIVE-3537) release locks at the end of move tasks

2013-01-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3537:
-

Attachment: hive.3537.10.patch

 release locks at the end of move tasks
 --

 Key: HIVE-3537
 URL: https://issues.apache.org/jira/browse/HIVE-3537
 Project: Hive
  Issue Type: Bug
  Components: Locking, Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3537.10.patch, hive.3537.1.patch, 
 hive.3537.2.patch, hive.3537.3.patch, hive.3537.4.patch, hive.3537.5.patch, 
 hive.3537.6.patch, hive.3537.7.patch, hive.3537.8.patch, hive.3537.9.patch


 Look at HIVE-3106 for details.
 In order to make sure that concurrency is not an issue for multi-table 
 inserts, the current option is to introduce a dependency task, which thereby
 delays the creation of all partitions. It would be desirable to release the
 locks for the outputs as soon as the move task is completed. That way, for
 multi-table inserts, the concurrency can be enabled without delaying any 
 table.
 Currently, the movetask contains an input/output, but they do not seem to be
 populated correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3537) release locks at the end of move tasks

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556343#comment-13556343
 ] 

Namit Jain commented on HIVE-3537:
--

Found the issue - there was a bug in the equals method for the lock, whereby the 
lock data was not getting fetched.
Due to that, although the lock was getting released, Hive thought that the lock was 
still present, and it repeatedly tried to release it again before giving up.

Manually verified that the above tests finish normally - running the whole suite now.
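
As a generic illustration of the pitfall described above (this is an invented class, not the actual Hive lock code), an equals() that quietly skips a field which was never fetched can make a stale copy of a released lock still compare equal, so the lock looks like it is still held:

{code}
// Invented example of the equals() pitfall; not the Hive lock classes.
public class LockHandle {

  private final String path;
  private String data; // only populated once the lock data has actually been fetched

  public LockHandle(String path, String data) {
    this.path = path;
    this.data = data;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof LockHandle)) {
      return false;
    }
    LockHandle other = (LockHandle) o;
    // Buggy pattern: when 'data' was never fetched (null), the check silently
    // degrades to path-only equality, so a stale in-memory copy of a lock that
    // has already been released still matches and appears to be held.
    return path.equals(other.path)
        && (data == null || other.data == null || data.equals(other.data));
  }

  @Override
  public int hashCode() {
    return path.hashCode();
  }
}
{code}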

 release locks at the end of move tasks
 --

 Key: HIVE-3537
 URL: https://issues.apache.org/jira/browse/HIVE-3537
 Project: Hive
  Issue Type: Bug
  Components: Locking, Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3537.10.patch, hive.3537.1.patch, 
 hive.3537.2.patch, hive.3537.3.patch, hive.3537.4.patch, hive.3537.5.patch, 
 hive.3537.6.patch, hive.3537.7.patch, hive.3537.8.patch, hive.3537.9.patch


 Look at HIVE-3106 for details.
 In order to make sure that concurrency is not an issue for multi-table 
 inserts, the current option is to introduce a dependency task, which thereby
 delays the creation of all partitions. It would be desirable to release the
 locks for the outputs as soon as the move task is completed. That way, for
 multi-table inserts, the concurrency can be enabled without delaying any 
 table.
 Currently, the movetask contains an input/output, but they do not seem to be
 populated correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556404#comment-13556404
 ] 

Namit Jain commented on HIVE-3893:
--

+1

 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng
 Attachments: hive-3893.patch.txt


 In line 482 of the hive-site.xml, there is no matching end-tag for the 
 element type "description".
 The same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3893:
-

   Resolution: Fixed
Fix Version/s: 0.11.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed. Thanks jet

 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng
 Fix For: 0.11.0

 Attachments: hive-3893.patch.txt


 In line 482 of the hive-site.xml, there is no matching end-tag for the 
 element type "description".
 The same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-3903) Allow updating bucketing/sorting metadata of a partition through the CLI

2013-01-17 Thread Samuel Yuan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Samuel Yuan reassigned HIVE-3903:
-

Assignee: Samuel Yuan

 Allow updating bucketing/sorting metadata of a partition through the CLI
 

 Key: HIVE-3903
 URL: https://issues.apache.org/jira/browse/HIVE-3903
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Samuel Yuan

 Right now users can update the bucketing/sorting metadata of a table through 
 the CLI, but not a partition.  
 Use case:
 Need to merge a partition's files, but it's bucketed/sorted, so want to mark 
 the partition as unbucketed/unsorted.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3537) release locks at the end of move tasks

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556447#comment-13556447
 ] 

Namit Jain commented on HIVE-3537:
--

[~ashutoshc], the tests finished in a reasonable time.
Can you take a look?

I have updated the phabricator entry and uploaded the new patch.

 release locks at the end of move tasks
 --

 Key: HIVE-3537
 URL: https://issues.apache.org/jira/browse/HIVE-3537
 Project: Hive
  Issue Type: Bug
  Components: Locking, Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Attachments: hive.3537.10.patch, hive.3537.1.patch, 
 hive.3537.2.patch, hive.3537.3.patch, hive.3537.4.patch, hive.3537.5.patch, 
 hive.3537.6.patch, hive.3537.7.patch, hive.3537.8.patch, hive.3537.9.patch


 Look at HIVE-3106 for details.
 In order to make sure that concurrency is not an issue for multi-table 
 inserts, the current option is to introduce a dependency task, which thereby
 delays the creation of all partitions. It would be desirable to release the
 locks for the outputs as soon as the move task is completed. That way, for
 multi-table inserts, the concurrency can be enabled without delaying any 
 table.
 Currently, the movetask contains a input/output, but they do not seem to be
 populated correctly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556479#comment-13556479
 ] 

Hudson commented on HIVE-3893:
--

Integrated in hive-trunk-hadoop1 #23 (See 
[https://builds.apache.org/job/hive-trunk-hadoop1/23/])
HIVE-3893 something wrong with the hive-default.xml
(jet cheng via namit) (Revision 1434811)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434811
Files : 
* /hive/trunk/conf/hive-default.xml.template


 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng
 Fix For: 0.11.0

 Attachments: hive-3893.patch.txt


 In line 482 of the hive-site.xml, there is no matching end-tag for the 
 element type "description".
 The same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-01-17 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556513#comment-13556513
 ] 

Ashutosh Chauhan commented on HIVE-2340:


Yeah, correct - JOIN-GBY and GBY-GBY are taken care of in ysmart as well. It's the 
group-by followed by order-by case which is also of interest to me, and which this 
patch already covers. 

Besides the scenario covered by these two patches, I am also comparing the 
approaches taken in the two. I have only briefly looked at this patch, but the 
fundamental difference I can make out between this approach and the ysmart approach 
is that here the RS is deduplicated, that is, completely removed from the operator 
pipeline wherever possible (i.e. when the keys of the subsequent RS are a superset 
of the earlier one), thus fusing multiple MR jobs. Ysmart, on the other hand, 
replaces the second RS with a new operator it introduces 
(LocalSimulatedReduceSink?) which fakes the RS but doesn't let the plan split into 
2 MR jobs, thus generating one MR job. I haven't thought this through completely, 
but on an initial pass it seems like the approach of this patch is better than 
ysmart because:
* Here you don't need a new operator.
* Here you are simplifying the plan by eliminating operators, as opposed to ysmart, 
which replaces the operator and thereby increases the complexity of the plan (by 
adding a new type of operator).
* In that new operator ysmart currently serializes and deserializes the data 
passing through it, thereby introducing an unnecessary performance penalty. 
Granted, this could be improved, but the problem doesn't exist in the patch 
proposed on this jira to begin with. 

Though there are certainly other scenarios which ysmart can cover (Yin, can you 
list those) which this patch is not covering, for the scenarios that are common 
this approach seems to be better. 

There might be other differences in the approach; please feel free to raise 
those.

 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt


 Before implementing optimizer for JOIN-GBY, try to implement RS-GBY 
 optimizer(cluster-by following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3825) Add Operator level Hooks

2013-01-17 Thread Pamela Vagata (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pamela Vagata updated HIVE-3825:


Attachment: HIVE-3825.patch.5.txt

Addressed comments from feedback https://reviews.facebook.net/D7821

 Add Operator level Hooks
 

 Key: HIVE-3825
 URL: https://issues.apache.org/jira/browse/HIVE-3825
 Project: Hive
  Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
 Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
 HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, HIVE-3825.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556585#comment-13556585
 ] 

Hudson commented on HIVE-3852:
--

Integrated in Hive-trunk-hadoop2 #70 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/70/])
HIVE-3852 Multi-groupby optimization fails when same distinct column is
used twice or more (Navis via namit) (Revision 1434600)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1434600
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientpositive/groupby10.q
* /hive/trunk/ql/src/test/results/clientpositive/groupby10.q.out


 Multi-groupby optimization fails when same distinct column is used twice or 
 more
 

 Key: HIVE-3852
 URL: https://issues.apache.org/jira/browse/HIVE-3852
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-3852.D7737.1.patch


 {code}
 FROM INPUT
 INSERT OVERWRITE TABLE dest1 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key
 INSERT OVERWRITE TABLE dest2 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key;
 {code}
 fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-01-17 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556608#comment-13556608
 ] 

Yin Huai commented on HIVE-2340:


Let me explain the reason I introduced the fake RS operator instead of just 
removing the original RS. When I was developing the patch for 2206, I found that 
the aggregation operator (GBY) and the join operator (JOIN) use different logic for 
processing the rows forwarded to them. Although they both buffer rows, a GBY 
determines whether it needs to forward results to its children in processOp, while 
a JOIN relies on endGroup to know when it should forward results. When we have 
plans like GBY-GBY or JOIN-GBY, that difference in processing logic is fine. 
However, when we have a plan like
{code}
GBY            GBY
    \              \
     JOIN    or     JOIN
    /              /
GBY            JOIN
{code}
We need operators between the child JOIN and the parent GBYs and JOINs to make 
sure the JOIN processes rows in a correct way. This is also the reason that 
CorrelationLocalSimulativeReduceSinkOperator determines when to start the group of 
its children in processOp and leaves an empty startGroup and endGroup.

Also, by replacing RSs with those fake RSs, I do not need to touch the GBYs and 
JOINs which will be merged into the same Reduce phase. Since the input of the first 
operator on the Reduce side is in the format of [key, value, tag], I use those fake 
RSs to generate rows in the same format.

But this part of the work was implemented almost 2 years ago. Definitely let me 
know if anything has changed and this fake RS is no longer needed.
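
A stand-alone toy sketch of that idea, with simplified names and signatures - not 
the real Hive Operator API or the real CorrelationLocalSimulativeReduceSinkOperator: 
its own startGroup/endGroup are empty, and the child's group boundaries are driven 
from processOp while rows are forwarded as [key, value, tag].
{code}
import java.util.Objects;

public class FakeReduceSinkSketch {
  interface Op {
    void startGroup();
    void process(Object key, Object value, int tag);
    void endGroup();
  }

  static class FakeRS {
    private final Op child;   // e.g. the JOIN merged into the same reduce phase
    private final int tag;
    private Object currentKey;

    FakeRS(Op child, int tag) { this.child = child; this.tag = tag; }

    // Own group boundaries are intentionally empty...
    void startGroup() {}
    void endGroup() {}

    // ...because the group handling happens here, per forwarded row.
    void processOp(Object key, Object value) {
      if (!Objects.equals(key, currentKey)) {
        if (currentKey != null) child.endGroup();  // close the previous key's group
        currentKey = key;
        child.startGroup();                        // open a group for the new key
      }
      child.process(key, value, tag);              // forward as [key, value, tag]
    }

    void close() { if (currentKey != null) child.endGroup(); }
  }

  public static void main(String[] args) {
    Op join = new Op() {
      public void startGroup() { System.out.println("startGroup"); }
      public void process(Object k, Object v, int t) { System.out.println(k + " " + v + " " + t); }
      public void endGroup() { System.out.println("endGroup"); }
    };
    FakeRS rs = new FakeRS(join, 0);
    rs.processOp("a", 1);
    rs.processOp("a", 2);
    rs.processOp("b", 3);
    rs.close();
  }
}
{code}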

 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt


 Before implementing optimizer for JOIN-GBY, try to implement RS-GBY 
 optimizer(cluster-by following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-01-17 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556620#comment-13556620
 ] 

Yin Huai commented on HIVE-2340:


The current implementation of the YSmart patch covers scenarios in which a join or 
aggregation operator shares the same partition keys with all of its parents (join 
or aggregation operators). 
For example, a single MR job will be generated if all operators in the 
following plan share the same partition keys.
{code}
        JOIN
       /    \
    JOIN     \
   /    \     \
 GBY     \     \
          JOIN  |
         /      |
      GBY-------/
{code}


Also, it requires that the bottom join or aggregation operators which will be 
processed in the same MR job take input tables rather than intermediate tables. 
In the future, it should be extended to cover scenarios that involve intermediate 
tables, scenarios in which correlated operators share common partition keys (not 
exactly the same keys), and scenarios in which a join or aggregation operator 
shares common keys with only some of its parents. 

 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt


 Before implementing optimizer for JOIN-GBY, try to implement RS-GBY 
 optimizer(cluster-by following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3912) table_access_keys_stats.q fails with hadoop 0.23

2013-01-17 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-3912:
--

 Summary: table_access_keys_stats.q fails with hadoop 0.23
 Key: HIVE-3912
 URL: https://issues.apache.org/jira/browse/HIVE-3912
 Project: Hive
  Issue Type: Bug
  Components: Tests
 Environment: Hadoop 0.23  (2.0.2-alpha)
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
Priority: Minor


CliDriver test table_access_keys_stats.q fails with hadoop 0.23 because a 
different order of results from the join is produced under 0.23. The data 
itself doesn't seem wrong, but the output does not match the golden output file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3852) Multi-groupby optimization fails when same distinct column is used twice or more

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556634#comment-13556634
 ] 

Hudson commented on HIVE-3852:
--

Integrated in Hive-trunk-h0.21 #1919 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1919/])
HIVE-3852 Multi-groupby optimization fails when same distinct column is
used twice or more (Navis via namit) (Revision 1434600)

 Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1434600
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* /hive/trunk/ql/src/test/queries/clientpositive/groupby10.q
* /hive/trunk/ql/src/test/results/clientpositive/groupby10.q.out


 Multi-groupby optimization fails when same distinct column is used twice or 
 more
 

 Key: HIVE-3852
 URL: https://issues.apache.org/jira/browse/HIVE-3852
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Fix For: 0.11.0

 Attachments: HIVE-3852.D7737.1.patch


 {code}
 FROM INPUT
 INSERT OVERWRITE TABLE dest1 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), count(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key
 INSERT OVERWRITE TABLE dest2 
 SELECT INPUT.key, sum(distinct substr(INPUT.value,5)), avg(distinct 
 substr(INPUT.value,5)) GROUP BY INPUT.key;
 {code}
 fails with exception FAILED: IndexOutOfBoundsException Index: 0,Size: 0

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Hive-trunk-h0.21 - Build # 1919 - Fixed

2013-01-17 Thread Apache Jenkins Server
Changes for Build #1917
[hashutosh] HIVE-2820 : Invalid tag is used for MapJoinProcessor (Navis via 
Ashutosh Chauhan)

[namit] HIVE-3872 MAP JOIN for VIEW thorws NULL pointer exception error
(Navis via namit)


Changes for Build #1918
[cws] Add DECIMAL data type (Josh Wills, Vikram Dixit, Prasad Mujumdar, Mark 
Grover and Gunther Hagleitner via cws)


Changes for Build #1919
[namit] HIVE-3852 Multi-groupby optimization fails when same distinct column is
used twice or more (Navis via namit)




All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1919)

Status: Fixed

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1919/ to 
view the results.

Re: [DISCUSS] HCatalog becoming a subproject of Hive

2013-01-17 Thread Carl Steinbach
Sounds like a good plan to me. Since Ashutosh is a member of both the Hive
and HCatalog PMCs it probably makes more sense for him to call the vote,
but I'm willing to do it too.

On Wed, Jan 16, 2013 at 8:24 AM, Alan Gates ga...@hortonworks.com wrote:

 If you think that's the best path forward, that's fine.  I don't think I can call 
 a vote, since I'm not part of the Hive PMC.  But I'm happy to draft a resolution 
 for you and then let you call the vote.  Should I do that?

 Alan.

 On Jan 11, 2013, at 4:34 PM, Carl Steinbach wrote:

  Hi Alan,
 
  I agree that submitting this for a vote is the best option.
 
  If anyone has additional proposed modifications please make them.
  Otherwise I propose that the Hive PMC vote on this proposal.
 
  In order for the Hive PMC to be able to vote on these changes they need
 to be expressed in terms of one or more of the actions listed at the end
 of the Hive project bylaws:
 
  https://cwiki.apache.org/confluence/display/Hive/Bylaws
 
   So I think we first need to amend the bylaws in order to define the
 rights and privileges of a submodule committer, and then separately vote
 the HCatalog committers in as Hive submodule committers. Does this make
 sense?
 
  Thanks.
 
  Carl
 




[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-01-17 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556651#comment-13556651
 ] 

Ashutosh Chauhan commented on HIVE-2340:


Thanks Yin for explaining. Your ASCII art helped in understanding the 
differences : ) I better understand the reason for the fake new operator now. I 
think in the cases you have pointed out, where there are such kinds of trees, this 
reduce-deduplication approach won't help, since it looks at a linear chain of RSs 
and eliminates the ones it can. You would need a fake operator in such cases 
because you don't want to modify the GBY or Join operators, which makes sense. I 
see the merits of Ysmart better now.

On the other hand, the patch on this jira is still useful and complementary to 
ysmart, since it will collapse linear RSs instead of adding fake ones. In addition 
to collapsing those operators, it will also make life easier for ysmart, because 
ysmart will then be dealing with simpler plans whose reduce sinks are already 
deduplicated. We need to make sure the reduce-dedup rule fires before ysmart for 
both optimizations to play nicely. So I think we should make progress on both of 
these patches.

[~navis] Would you like to refresh this patch?

 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt


 Before implementing optimizer for JOIN-GBY, try to implement RS-GBY 
 optimizer(cluster-by following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby

2013-01-17 Thread Yin Huai (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556660#comment-13556660
 ] 

Yin Huai commented on HIVE-2340:


Yes, I agree.

 optimize orderby followed by a groupby
 --

 Key: HIVE-2340
 URL: https://issues.apache.org/jira/browse/HIVE-2340
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
  Labels: perfomance
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt


 Before implementing optimizer for JOIN-GBY, try to implement RS-GBY 
 optimizer(cluster-by following group-by).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #37

2013-01-17 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/37/

--
[...truncated 41972 lines...]
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2013-01-17 14:31:41,911 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] Execution completed successfully
[junit] Mapred Local Task Succeeded . Convert the Join into MapJoin
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/37/artifact/hive/build/service/localscratchdir/hive_2013-01-17_14-31-38_572_6458477646857325659/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/37/artifact/hive/build/service/tmp/hive_job_log_jenkins_201301171431_1961600435.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] Copying file: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] Table default.testhivedrivertable stats: [num_partitions: 0, 
num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0]
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/37/artifact/hive/build/service/localscratchdir/hive_2013-01-17_14-31-43_486_6101434080818216137/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/37/artifact/hive/build/service/localscratchdir/hive_2013-01-17_14-31-43_486_6101434080818216137/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/37/artifact/hive/build/service/tmp/hive_job_log_jenkins_201301171431_1572550515.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] 

[jira] [Commented] (HIVE-2828) make timestamp accessible in the hbase KeyValue

2013-01-17 Thread John Shields (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556862#comment-13556862
 ] 

John Shields commented on HIVE-2828:


What is the right process for applying these patches? Also, to which SVN paths 
would they be applicable? Obviously the code has changed since these were created, 
so I'm trying to figure out how to apply them. We currently have a 0.8.0 deployment 
that it would be great to apply these patches against. I tried against 
branches/0.8.0 to no avail.

Thanks!

John


 make timestamp accessible in the hbase KeyValue 
 

 Key: HIVE-2828
 URL: https://issues.apache.org/jira/browse/HIVE-2828
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: Navis
Assignee: Navis
Priority: Trivial
 Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.1.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.2.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.3.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.4.patch, 
 ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.5.patch


 Originated from HIVE-2781 and not accepted, but I think this could be helpful 
 to someone.
 By using the special column notation ':timestamp' in HBASE_COLUMNS_MAPPING, the 
 user can access the timestamp value in the hbase KeyValue. 
  {code}
  CREATE TABLE hbase_table (key int, value string, time timestamp)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:string,:timestamp")
  {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3909) Wrong data due to HIVE-2820

2013-01-17 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-3909:
--

Attachment: HIVE-3909.D8013.1.patch

navis requested code review of HIVE-3909 [jira] Wrong data due to HIVE-2820.
Reviewers: JIRA

  Consider the query:

  ~/hive/hive1$ more ql/src/test/queries/clientpositive/join_reorder4.q
  CREATE TABLE T1(key1 STRING, val1 STRING) STORED AS TEXTFILE;
  CREATE TABLE T2(key2 STRING, val2 STRING) STORED AS TEXTFILE;
  CREATE TABLE T3(key3 STRING, val3 STRING) STORED AS TEXTFILE;

  LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1;
  LOAD DATA LOCAL INPATH '../data/files/T2.txt' INTO TABLE T2;
  LOAD DATA LOCAL INPATH '../data/files/T3.txt' INTO TABLE T3;

  set hive.auto.convert.join=true;

  explain select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
a.key1=b.key2 join T3 c on a.key1=c.key3;
  select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;

  explain select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
a.key1=b.key2 join T3 c on a.key1=c.key3;
  select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;

  explain select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
a.key1=b.key2 join T3 c on a.key1=c.key3;
  select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;

  select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on a.key1=b.key2 
join T3 c on a.key1=c.key3;

  returns:
  2 12  2   12  2   22

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D8013

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java
  ql/src/test/queries/clientpositive/join_reorder4.q
  ql/src/test/results/clientpositive/join_reorder4.q.out

MANAGE HERALD DIFFERENTIAL RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/19341/

To: JIRA, navis


 Wrong data due to HIVE-2820
 ---

 Key: HIVE-3909
 URL: https://issues.apache.org/jira/browse/HIVE-3909
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Navis
 Attachments: HIVE-3909.D8013.1.patch


 Consider the query:
 ~/hive/hive1$ more ql/src/test/queries/clientpositive/join_reorder4.q
 CREATE TABLE T1(key1 STRING, val1 STRING) STORED AS TEXTFILE;
 CREATE TABLE T2(key2 STRING, val2 STRING) STORED AS TEXTFILE;
 CREATE TABLE T3(key3 STRING, val3 STRING) STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1;
 LOAD DATA LOCAL INPATH '../data/files/T2.txt' INTO TABLE T2;
 LOAD DATA LOCAL INPATH '../data/files/T3.txt' INTO TABLE T3;
 set hive.auto.convert.join=true;
 explain select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 explain select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 explain select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 returns:
 2 12  2   12  2   22

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3909) Wrong data due to HIVE-2820

2013-01-17 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556893#comment-13556893
 ] 

Phabricator commented on HIVE-3909:
---

navis has commented on the revision HIVE-3909 [jira] Wrong data due to 
HIVE-2820.

  Removed all of the confusing order mapping from mapjoins.

REVISION DETAIL
  https://reviews.facebook.net/D8013

To: JIRA, navis


 Wrong data due to HIVE-2820
 ---

 Key: HIVE-3909
 URL: https://issues.apache.org/jira/browse/HIVE-3909
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Navis
 Attachments: HIVE-3909.D8013.1.patch


 Consider the query:
 ~/hive/hive1$ more ql/src/test/queries/clientpositive/join_reorder4.q
 CREATE TABLE T1(key1 STRING, val1 STRING) STORED AS TEXTFILE;
 CREATE TABLE T2(key2 STRING, val2 STRING) STORED AS TEXTFILE;
 CREATE TABLE T3(key3 STRING, val3 STRING) STORED AS TEXTFILE;
 LOAD DATA LOCAL INPATH '../data/files/T1.txt' INTO TABLE T1;
 LOAD DATA LOCAL INPATH '../data/files/T2.txt' INTO TABLE T2;
 LOAD DATA LOCAL INPATH '../data/files/T3.txt' INTO TABLE T3;
 set hive.auto.convert.join=true;
 explain select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(a) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 explain select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 explain select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(c) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 select /*+ STREAMTABLE(b) */ a.*, b.*, c.* from T1 a join T2 b on 
 a.key1=b.key2 join T3 c on a.key1=c.key3;
 returns:
 2 12  2   12  2   22

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3913) Possible deadlock in ZK lock manager

2013-01-17 Thread Mikhail Bautin (JIRA)
Mikhail Bautin created HIVE-3913:


 Summary: Possible deadlock in ZK lock manager
 Key: HIVE-3913
 URL: https://issues.apache.org/jira/browse/HIVE-3913
 Project: Hive
  Issue Type: Bug
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Critical


ZK Hive lock manager can get into a state where the connection is closed but no 
reconnection is attempted.
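
For illustration only - one common pattern for avoiding such a state is to rebuild 
the client when the ZooKeeper session expires. This is not the HIVE-3913 patch; the 
class and field names below are invented:
{code}
import java.io.IOException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ReconnectingZkClient implements Watcher {
  private final String connectString;
  private final int sessionTimeoutMs;
  private volatile ZooKeeper zk;

  public ReconnectingZkClient(String connectString, int sessionTimeoutMs) throws IOException {
    this.connectString = connectString;
    this.sessionTimeoutMs = sessionTimeoutMs;
    this.zk = new ZooKeeper(connectString, sessionTimeoutMs, this);
  }

  @Override
  public void process(WatchedEvent event) {
    // An expired session never comes back; the client must be recreated,
    // otherwise lock requests hang forever on a dead connection.
    if (event.getState() == Watcher.Event.KeeperState.Expired) {
      try {
        zk.close();
        zk = new ZooKeeper(connectString, sessionTimeoutMs, this);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
      } catch (IOException ioe) {
        // the sketch gives up here; a real lock manager would retry and log
      }
    }
  }

  public ZooKeeper get() { return zk; }
}
{code}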

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3825) Add Operator level Hooks

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556913#comment-13556913
 ] 

Namit Jain commented on HIVE-3825:
--

+1

 Add Operator level Hooks
 

 Key: HIVE-3825
 URL: https://issues.apache.org/jira/browse/HIVE-3825
 Project: Hive
  Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
 Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
 HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, HIVE-3825.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread jet cheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556927#comment-13556927
 ] 

jet cheng commented on HIVE-3893:
-

[~namit] You are welcome.

 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng
 Fix For: 0.11.0

 Attachments: hive-3893.patch.txt


 In line 482 of hive-site.xml, there is no matching end-tag for the element type 
 description;
 The same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3898) getReducersBucketing in SemanticAnalyzer may return more than the max number of reducers

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556963#comment-13556963
 ] 

Hudson commented on HIVE-3898:
--

Integrated in Hive-trunk-hadoop2 #71 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/71/])
HIVE-3898 getReducersBucketing in SemanticAnalyzer may return more than the
max number of reducers (Kevin Wilfong via namit) (Revision 1434623)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1434623
Files : 
* /hive/trunk/build-common.xml
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyNumReducersForBucketsHook.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyNumReducersHook.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucket_num_reducers.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucket_num_reducers2.q
* /hive/trunk/ql/src/test/results/clientpositive/bucket_num_reducers2.q.out


 getReducersBucketing in SemanticAnalyzer may return more than the max number 
 of reducers
 

 Key: HIVE-3898
 URL: https://issues.apache.org/jira/browse/HIVE-3898
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-3898.1.patch.txt, HIVE-3898.2.patch.txt


 getReducersBucketing rounds totalFiles / maxReducers down, when it should be 
 rounded up to the nearest integer.
 E.g. if totalFiles = 60 and maxReducers = 21, 
 totalFiles / maxReducers = 2 (integer division rounds down)
 totalFiles / 2 = 30
 So the job will get 30 reducers, more than maxReducers.
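
The arithmetic in a minimal, hypothetical form (the real logic lives in 
SemanticAnalyzer; the method names below are made up for illustration):
{code}
public class BucketReducers {
  static int reducersRoundedDown(int totalFiles, int maxReducers) {
    int filesPerReducer = totalFiles / maxReducers;   // 60 / 21 = 2 (floor)
    return totalFiles / filesPerReducer;              // 60 / 2  = 30 > 21
  }

  static int reducersRoundedUp(int totalFiles, int maxReducers) {
    int filesPerReducer = (totalFiles + maxReducers - 1) / maxReducers;  // ceil(60/21) = 3
    return totalFiles / filesPerReducer;              // 60 / 3  = 20 <= 21
  }

  public static void main(String[] args) {
    System.out.println(reducersRoundedDown(60, 21));  // 30: exceeds the max
    System.out.println(reducersRoundedUp(60, 21));    // 20: stays within the max
  }
}
{code}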

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556962#comment-13556962
 ] 

Hudson commented on HIVE-3893:
--

Integrated in Hive-trunk-hadoop2 #71 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/71/])
HIVE-3893 something wrong with the hive-default.xml
(jet cheng via namit) (Revision 1434811)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1434811
Files : 
* /hive/trunk/conf/hive-default.xml.template


 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng
 Fix For: 0.11.0

 Attachments: hive-3893.patch.txt


 In line 482 of hive-site.xml, there is no matching end-tag for the element type 
 description;
 The same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3871) show number of mappers/reducers as part of explain extended

2013-01-17 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3871:
---

Assignee: niraj rai

 show number of mappers/reducers as part of explain extended
 ---

 Key: HIVE-3871
 URL: https://issues.apache.org/jira/browse/HIVE-3871
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: niraj rai

 It would be useful to show the number of mappers/reducers as part of explain 
 extended.
 For the MR jobs referencing intermediate data, the number can be approximate.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3914) use Chinese in hive column comment and table comment

2013-01-17 Thread caofangkun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caofangkun updated HIVE-3914:
-

Attachment: HIVE-3914-1.patch

Use outStream.writeUTF instead of outStream.writeBytes.
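
A small demonstration of why the swap matters (illustration only, not the HIVE-3914 
patch itself): DataOutputStream.writeBytes keeps only the low byte of each char, so 
multi-byte characters are destroyed, while writeUTF preserves them.
{code}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CommentEncodingDemo {
  public static void main(String[] args) throws IOException {
    String comment = "\u4E2D\u6587\u6CE8\u91CA";  // a four-character Chinese comment

    ByteArrayOutputStream lossy = new ByteArrayOutputStream();
    new DataOutputStream(lossy).writeBytes(comment);  // one byte per char: high bytes dropped

    ByteArrayOutputStream kept = new ByteArrayOutputStream();
    new DataOutputStream(kept).writeUTF(comment);     // 2-byte length prefix + UTF-8 data

    System.out.println(lossy.size());  // 4  -> characters no longer recoverable
    System.out.println(kept.size());   // 14 -> 2 + 4*3 bytes, characters preserved
  }
}
{code}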

 use Chinese in hive column comment and table comment
 

 Key: HIVE-3914
 URL: https://issues.apache.org/jira/browse/HIVE-3914
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0
Reporter: caofangkun
Priority: Minor
 Attachments: HIVE-3914-1.patch


 When Chinese is used in a hive column comment or table comment, the metadata in 
 Mysql is fine: the charset of the 'COMMENT' column in the 'columns_v2' table and 
 of the 'PARAM_VALUE' column in the 'table_params' table are both 'utf8'.
 When I exec 'select * from columns_v2' with the mysql client, the Chinese comments 
 display normally. But when I execute 'describe table' with the hive cli, the 
 Chinese words are garbled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3914) use Chinese in hive column comment and table comment

2013-01-17 Thread caofangkun (JIRA)
caofangkun created HIVE-3914:


 Summary: use Chinese in hive column comment and table comment
 Key: HIVE-3914
 URL: https://issues.apache.org/jira/browse/HIVE-3914
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0, 0.10.0
Reporter: caofangkun
Priority: Minor
 Attachments: HIVE-3914-1.patch

When Chinese is used in a hive column comment or table comment, the metadata in 
Mysql is fine: the charset of the 'COMMENT' column in the 'columns_v2' table and of 
the 'PARAM_VALUE' column in the 'table_params' table are both 'utf8'.
When I exec 'select * from columns_v2' with the mysql client, the Chinese comments 
display normally. But when I execute 'describe table' with the hive cli, the 
Chinese words are garbled.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3874) Create a new Optimized Row Columnar file format for Hive

2013-01-17 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556977#comment-13556977
 ] 

Joydeep Sen Sarma commented on HIVE-3874:
-

A couple of observations:

- One use case mentioned is external indices, but in my experience secondary 
index pointers have little correlation with the primary key ordering. If the use 
case is to speed up secondary index lookups, then one will be forced to consider 
smaller row groups. At that point this starts breaking down - large row groups are 
good for scanning and compression, but poor for lookups.

  A possible way out is to do a two-level structure - stripes or chunks as the 
unit of compression (column dictionaries maintained at this level), but a smaller 
unit for row groups (a single 250MB chunk has many smaller row groups, all encoded 
using a common dictionary). This can give a good balance of compression and lookup 
capabilities.

  At this point - I believe - we are closer to an HFile data structure, and I 
think converging HFile* so it works well for Hive would be a great goal. A lot of 
people would benefit from letting HBase do the indexing and letting Hive/Hadoop 
chomp on HBase-produced HFiles.


- Another use case mentioned is pruning based on column ranges. Once again, these 
use cases typically only benefit columns whose values are correlated with the 
primary row order. Timestamps, and anything correlated with timestamps, do benefit 
- but others don't. In systems like Netezza this is used as a substitute for 
partitioning.

  The issue is that pruning at the block level is not enough, because one has 
already generated a large number of splits for MR to chomp on. And a large number 
of splits makes processing really slow, even if everything is pruned out inside 
each mapper. Unless that issue is addressed, most users would end up repartitioning 
their data (using Hive's dynamic partitioning) based on column values, and the 
whole column-range machinery would largely go unused.
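
A rough data-structure sketch of that two-level idea - class and field names are 
invented here, not ORC's actual layout: the stripe is the unit of compression and 
owns the column dictionaries, while much smaller row groups inside it carry only 
the index information needed for lookups and pruning.
{code}
import java.util.List;
import java.util.Map;

public class TwoLevelLayoutSketch {
  static class RowGroupIndex {
    long firstRow;            // first row number covered by this row group
    long rowCount;            // small (e.g. a few thousand rows) to keep lookups cheap
    Map<String, String> min;  // per-column min value, for range pruning
    Map<String, String> max;  // per-column max value
  }

  static class Stripe {
    long firstRow;
    long rowCount;                                  // large, e.g. ~250MB of data
    Map<String, List<String>> columnDictionaries;   // shared by all row groups in the stripe
    List<RowGroupIndex> rowGroups;                  // fine-grained index inside the stripe
  }
}
{code}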


 Create a new Optimized Row Columnar file format for Hive
 

 Key: HIVE-3874
 URL: https://issues.apache.org/jira/browse/HIVE-3874
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: OrcFileIntro.pptx


 There are several limitations of the current RC File format that I'd like to 
 address by creating a new format:
 * each column value is stored as a binary blob, which means:
 ** the entire column value must be read, decompressed, and deserialized
 ** the file format can't use smarter type-specific compression
 ** push down filters can't be evaluated
 * the start of each row group needs to be found by scanning
 * user metadata can only be added to the file when the file is created
 * the file doesn't store the number of rows per a file or row group
 * there is no mechanism for seeking to a particular row number, which is 
 required for external indexes.
 * there is no mechanism for storing light weight indexes within the file to 
 enable push-down filters to skip entire row groups.
 * the type of the rows aren't stored in the file

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

2013-01-17 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-3915:
---

 Summary: Union with map-only query on one side and two MR job 
query on the other produces wrong results
 Key: HIVE-3915
 URL: https://issues.apache.org/jira/browse/HIVE-3915
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


When a query contains a union with a map-only subquery on one side and a 
subquery involving two sequential map reduce jobs on the other, it can produce 
wrong results.  It appears that if the map-only query's table scan operator is 
processed first, the task involving the union is made a root task.  Then, when the 
other subquery is processed, the second map reduce job gains the task involving 
the union as a child and is itself made a root task.  This means that both the 
first and second map reduce jobs are root tasks, so the dependency between the two 
is ignored.  If they are run in parallel (i.e. the cluster has more than one node), 
no results will be produced for the side of the union with the two map reduce jobs 
and only the results of the other side of the union will be returned.

The order in which TableScan operators are processed is crucial to reproducing this 
bug, and it is determined by the order in which values are retrieved from a map, 
and hence hard to predict, so it doesn't always reproduce.
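
A toy model of the invariant being violated - illustration only, not Hive's actual 
Task API or the eventual fix: a task that is reachable as the child of another task 
should not also sit in the root-task list, otherwise the dependency can be ignored 
when the "roots" run in parallel.
{code}
import java.util.*;

public class RootTaskInvariant {
  static class Task {
    final String name;
    final List<Task> children = new ArrayList<>();
    Task(String name) { this.name = name; }
  }

  // Drop from the root list any task that is reachable as a child of another task.
  static List<Task> pruneFalseRoots(List<Task> rootTasks) {
    Set<Task> reachableAsChild = new HashSet<>();
    Deque<Task> stack = new ArrayDeque<>(rootTasks);
    while (!stack.isEmpty()) {
      Task t = stack.pop();
      for (Task c : t.children) {
        if (reachableAsChild.add(c)) stack.push(c);
      }
    }
    List<Task> roots = new ArrayList<>();
    for (Task t : rootTasks) {
      if (!reachableAsChild.contains(t)) roots.add(t);
    }
    return roots;
  }

  public static void main(String[] args) {
    Task mr1 = new Task("MR-1"), mr2 = new Task("MR-2"), union = new Task("Union");
    mr1.children.add(mr2);    // first MR job feeds the second
    mr2.children.add(union);  // second MR job feeds the union task
    // Bug scenario from the description: both MR jobs end up in the root list.
    System.out.println(pruneFalseRoots(Arrays.asList(mr1, mr2)).size());  // 1: only MR-1 remains
  }
}
{code}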

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

2013-01-17 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556994#comment-13556994
 ] 

Kevin Wilfong commented on HIVE-3915:
-

https://reviews.facebook.net/D8019

 Union with map-only query on one side and two MR job query on the other 
 produces wrong results
 --

 Key: HIVE-3915
 URL: https://issues.apache.org/jira/browse/HIVE-3915
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong

 When a query contains a union with a map only subquery on one side and a 
 subquery involving two sequential map reduce jobs on the other, it can 
 produce wrong results.  It appears that if the map only queries table scan 
 operator is processed first the task involving a union is made a root task.  
 Then when the other subquery is processed, the second map reduce job gains 
 the task involving the union as a child and it is made a root task.  This 
 means that both the first and second map reduce jobs are root tasks, so the 
 dependency between the two is ignored.  If they are run in parallel (i.e. the 
 cluster has more than one node) no results will be produced for the side of 
 the union with the two map reduce jobs and only the results of the other side 
 of the union will be returned.
 The order TableScan operators are processed is crucial to reproducing this 
 bug, and it is determined by the order values are retrieved from a map, and 
 hence hard to predict, so it doesn't always reproduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

2013-01-17 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3915:


Attachment: HIVE-3915.1.patch.txt

 Union with map-only query on one side and two MR job query on the other 
 produces wrong results
 --

 Key: HIVE-3915
 URL: https://issues.apache.org/jira/browse/HIVE-3915
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3915.1.patch.txt


 When a query contains a union with a map only subquery on one side and a 
 subquery involving two sequential map reduce jobs on the other, it can 
 produce wrong results.  It appears that if the map only queries table scan 
 operator is processed first the task involving a union is made a root task.  
 Then when the other subquery is processed, the second map reduce job gains 
 the task involving the union as a child and it is made a root task.  This 
 means that both the first and second map reduce jobs are root tasks, so the 
 dependency between the two is ignored.  If they are run in parallel (i.e. the 
 cluster has more than one node) no results will be produced for the side of 
 the union with the two map reduce jobs and only the results of the other side 
 of the union will be returned.
 The order TableScan operators are processed is crucial to reproducing this 
 bug, and it is determined by the order values are retrieved from a map, and 
 hence hard to predict, so it doesn't always reproduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: Add 'show version' command to Hive CLI

2013-01-17 Thread Zhuoluo Yang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8958/
---

(Updated Jan. 18, 2013, 6:28 a.m.)


Review request for hive, Carl Steinbach and Brock Noland.


Changes
---

Another patch addressing Brock's comments.


Description
---

We add a simple DDL grammar for a 'show version' command.
The version info is generated automatically at compile time.


This addresses bug HIVE-1151.
https://issues.apache.org/jira/browse/HIVE-1151
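
A self-contained sketch of the compile-time version pattern - the annotation and 
class names below are stand-ins, not necessarily what this patch uses: a script 
like saveVersion.sh would generate the annotated class during the build, and the 
CLI reads it back at runtime.
{code}
import java.lang.annotation.*;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface VersionAnnotation {
  String version();
  String revision();
  String user();
  String date();
}

// In the real build this class would be generated with the actual values.
@VersionAnnotation(version = "0.11.0-SNAPSHOT",
    revision = "34c95e9e6ab2110653af20e6d34a8fe02b04198d",
    user = "zhuoluo", date = "Wed Jan 16 12:26:12 CST 2013")
class GeneratedVersion {}

public class VersionInfoSketch {
  public static void main(String[] args) {
    VersionAnnotation v = GeneratedVersion.class.getAnnotation(VersionAnnotation.class);
    // 'show version' would print something along these lines
    System.out.println(v.version() + " from " + v.revision()
        + " by " + v.user() + " on " + v.date());
  }
}
{code}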


Diffs (updated)
-

  http://svn.apache.org/repos/asf/hive/trunk/bin/ext/version.sh PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/bin/hive 1435001 
  http://svn.apache.org/repos/asf/hive/trunk/build.xml 1435001 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/HiveVersionAnnotation.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
 1435001 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
 1435001 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
 1435001 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
 1435001 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java
 1435001 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ShowVersionDesc.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/util/HiveVersionInfo.java
 PRE-CREATION 
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/saveVersion.sh PRE-CREATION 

Diff: https://reviews.apache.org/r/8958/diff/


Testing
---

zhuoluo@zhuoluo-Latitude-E6420:~$ hive --version
Hive 0.11.0-SNAPSHOT
Subversion git://github.com/apache/hive.git on branch trunk -r 
34c95e9e6ab2110653af20e6d34a8fe02b04198d
Compiled by zhuoluo on Wed Jan 16 12:26:12 CST 2013
zhuoluo@zhuoluo-Latitude-E6420:~$ hive
Hive history file=/tmp/zhuoluo/hive_job_log_zhuoluo_201301161232_1201027344.txt
hive> show version;
OK
0.11.0-SNAPSHOT from 34c95e9e6ab2110653af20e6d34a8fe02b04198d by zhuoluo on Wed 
Jan 16 12:26:12 CST 2013
git://github.com/apache/hive.git on branch trunk
Time taken: 0.522 seconds, Fetched: 2 row(s)
hive> 


Thanks,

Zhuoluo Yang



[jira] [Commented] (HIVE-3915) Union with map-only query on one side and two MR job query on the other produces wrong results

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13556995#comment-13556995
 ] 

Namit Jain commented on HIVE-3915:
--

Great catch Kevin, this seems to have been around for a long time.

 Union with map-only query on one side and two MR job query on the other 
 produces wrong results
 --

 Key: HIVE-3915
 URL: https://issues.apache.org/jira/browse/HIVE-3915
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3915.1.patch.txt


 When a query contains a union with a map only subquery on one side and a 
 subquery involving two sequential map reduce jobs on the other, it can 
 produce wrong results.  It appears that if the map only queries table scan 
 operator is processed first the task involving a union is made a root task.  
 Then when the other subquery is processed, the second map reduce job gains 
 the task involving the union as a child and it is made a root task.  This 
 means that both the first and second map reduce jobs are root tasks, so the 
 dependency between the two is ignored.  If they are run in parallel (i.e. the 
 cluster has more than one node) no results will be produced for the side of 
 the union with the two map reduce jobs and only the results of the other side 
 of the union will be returned.
 The order TableScan operators are processed is crucial to reproducing this 
 bug, and it is determined by the order values are retrieved from a map, and 
 hence hard to predict, so it doesn't always reproduce.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: Add 'show version' command to Hive CLI

2013-01-17 Thread Zhuoluo Yang


 On Jan. 17, 2013, 1:55 a.m., Brock Noland wrote:
  Hi, Looks good!  A few more comments below.  Sorry for ignorance of git 
  and shamelessly cloning the code - no worries :) if you didn't copy this 
  I'd wonder why not!
  
  Also, FYI I am not a Hive committer.

Thank you very much for your comments, no matter whether you are a committer or 
not.


 On Jan. 17, 2013, 1:55 a.m., Brock Noland wrote:
  http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java,
   line 474
  https://reviews.apache.org/r/8958/diff/2/?file=248840#file248840line474
 
  What is the purpose of this cast? Also, this will be closed in the 
  finally block, no?
 

I think the cast is there to remind us (and make sure) that we call the correct 
method. And this kind of close was introduced by HIVE-1884 to avoid the potential 
risk of resource leaks in Hive, I think...


- Zhuoluo


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8958/#review15438
---


On Jan. 18, 2013, 6:28 a.m., Zhuoluo Yang wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/8958/
 ---
 
 (Updated Jan. 18, 2013, 6:28 a.m.)
 
 
 Review request for hive, Carl Steinbach and Brock Noland.
 
 
 Description
 ---
 
 We add a simple ddl grammar, called show version.
 The version info is generated automatically while compiling.
 
 
 This addresses bug HIVE-1151.
 https://issues.apache.org/jira/browse/HIVE-1151
 
 
 Diffs
 -
 
   http://svn.apache.org/repos/asf/hive/trunk/bin/ext/version.sh PRE-CREATION 
   http://svn.apache.org/repos/asf/hive/trunk/bin/hive 1435001 
   http://svn.apache.org/repos/asf/hive/trunk/build.xml 1435001 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/HiveVersionAnnotation.java
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
  1435001 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
  1435001 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  1435001 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
  1435001 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java
  1435001 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ShowVersionDesc.java
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/util/HiveVersionInfo.java
  PRE-CREATION 
   http://svn.apache.org/repos/asf/hive/trunk/ql/src/saveVersion.sh 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/8958/diff/
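
 The PRE-CREATION files above (HiveVersionAnnotation.java, HiveVersionInfo.java, 
 saveVersion.sh) suggest the compile-time version stamp follows the familiar 
 Hadoop-style package-annotation pattern: a build script generates a 
 package-info.java carrying the annotation, and a helper class reads it back by 
 reflection. A minimal sketch of that idea follows; the annotation members and 
 the reader class shown here are assumptions for illustration, not the patch's 
 actual contents.

    // File 1 (sketch): a package-level annotation whose values are filled in
    // by a build script that generates package-info.java before compilation.
    package org.apache.hadoop.hive;

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PACKAGE)
    public @interface HiveVersionAnnotation {
      String version();   // e.g. "0.11.0-SNAPSHOT"
      String revision();  // source-control revision the build was made from
      String branch();    // branch name
      String user();      // who ran the build
      String date();      // when the build ran
    }

    // File 2 (sketch): reading the stamp back at runtime.
    package org.apache.hadoop.hive.ql.util;

    import org.apache.hadoop.hive.HiveVersionAnnotation;

    public final class VersionStampReader {
      public static String describe() {
        Package pkg = HiveVersionAnnotation.class.getPackage();
        HiveVersionAnnotation ann = pkg.getAnnotation(HiveVersionAnnotation.class);
        if (ann == null) {
          // Happens when no generated package-info.java is on the classpath.
          return "Unknown";
        }
        return ann.version() + " from " + ann.revision()
            + " by " + ann.user() + " on " + ann.date();
      }
    }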
 
 
 Testing
 ---
 
 zhuoluo@zhuoluo-Latitude-E6420:~$ hive --version
 Hive 0.11.0-SNAPSHOT
 Subversion git://github.com/apache/hive.git on branch trunk -r 
 34c95e9e6ab2110653af20e6d34a8fe02b04198d
 Compiled by zhuoluo on Wed Jan 16 12:26:12 CST 2013
 zhuoluo@zhuoluo-Latitude-E6420:~$ hive
 Hive history 
 file=/tmp/zhuoluo/hive_job_log_zhuoluo_201301161232_1201027344.txt
 hive> show version;
 OK
 0.11.0-SNAPSHOT from 34c95e9e6ab2110653af20e6d34a8fe02b04198d by zhuoluo on 
 Wed Jan 16 12:26:12 CST 2013
 git://github.com/apache/hive.git on branch trunk
 Time taken: 0.522 seconds, Fetched: 2 row(s)
 hive> 
 
 
 Thanks,
 
 Zhuoluo Yang
 




[jira] [Updated] (HIVE-1151) Add 'show version' command to Hive CLI

2013-01-17 Thread Zhuoluo (Clark) Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhuoluo (Clark) Yang updated HIVE-1151:
---

Attachment: HIVE-1151.3.patch

Correct some comments

 Add 'show version' command to Hive CLI
 --

 Key: HIVE-1151
 URL: https://issues.apache.org/jira/browse/HIVE-1151
 Project: Hive
  Issue Type: New Feature
  Components: CLI, Clients
Affects Versions: 0.6.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-1151.1.patch, HIVE-1151.2.patch, HIVE-1151.3.patch


 At a minimum this command should return the version information obtained
 from the hive-cli jar. Ideally this command will also return version 
 information
 obtained from each of the hive jar files present in the CLASSPATH, which
 will allow us to quickly detect cases where people are using incompatible
 jars.
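
 Not part of the attached patches, but as a rough sketch of the "version of 
 every hive jar on the CLASSPATH" idea: assuming each jar exposes its version 
 through the manifest's Implementation-Version attribute, a scan could look 
 like this.

    import java.io.File;
    import java.io.IOException;
    import java.util.jar.JarFile;
    import java.util.jar.Manifest;

    public class ClasspathHiveVersions {
      public static void main(String[] args) throws IOException {
        String cp = System.getProperty("java.class.path");
        for (String entry : cp.split(File.pathSeparator)) {
          File f = new File(entry);
          // Only look at Hive jars; directories and other jars are skipped.
          if (!f.isFile() || !f.getName().startsWith("hive-")
              || !f.getName().endsWith(".jar")) {
            continue;
          }
          JarFile jar = new JarFile(f);
          try {
            Manifest mf = jar.getManifest();
            String version = (mf == null) ? null
                : mf.getMainAttributes().getValue("Implementation-Version");
            System.out.println(f.getName() + " -> "
                + (version == null ? "unknown" : version));
          } finally {
            jar.close();
          }
        }
      }
    }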

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3893) something wrong with the hive-default.xml

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557008#comment-13557008
 ] 

Hudson commented on HIVE-3893:
--

Integrated in Hive-trunk-h0.21 #1920 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1920/])
HIVE-3893 something wrong with the hive-default.xml
(jet cheng via namit) (Revision 1434811)

 Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434811
Files : 
* /hive/trunk/conf/hive-default.xml.template


 something wrong with the hive-default.xml
 -

 Key: HIVE-3893
 URL: https://issues.apache.org/jira/browse/HIVE-3893
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Affects Versions: 0.10.0
Reporter: jet cheng
 Fix For: 0.11.0

 Attachments: hive-3893.patch.txt


 In line 482 of the hive-site.xml, there is no matching end-tag for the 
 element type "description"; the same mistake also appears in line 561.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3898) getReducersBucketing in SemanticAnalyzer may return more than the max number of reducers

2013-01-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557009#comment-13557009
 ] 

Hudson commented on HIVE-3898:
--

Integrated in Hive-trunk-h0.21 #1920 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1920/])
HIVE-3898 getReducersBucketing in SemanticAnalyzer may return more than the
max number of reducers (Kevin Wilfong via namit) (Revision 1434623)

 Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1434623
Files : 
* /hive/trunk/build-common.xml
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyNumReducersForBucketsHook.java
* 
/hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifyNumReducersHook.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucket_num_reducers.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucket_num_reducers2.q
* /hive/trunk/ql/src/test/results/clientpositive/bucket_num_reducers2.q.out


 getReducersBucketing in SemanticAnalyzer may return more than the max number 
 of reducers
 

 Key: HIVE-3898
 URL: https://issues.apache.org/jira/browse/HIVE-3898
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.11.0

 Attachments: HIVE-3898.1.patch.txt, HIVE-3898.2.patch.txt


 getReducersBucketing rounds totalFiles / maxReducers down, when it should be 
 rounded up to the nearest integer.
 E.g. if totalFiles = 60 and maxReducers = 21, 
 totalFiles / maxReducers = 2
 totalFiles / 2 = 30
 So the job will get 30 reducers, more than maxReducers.
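
 To make the arithmetic concrete, here is a standalone sketch of the 
 rounded-down divisor versus the ceiling-division fix; it is illustrative only 
 and not the SemanticAnalyzer code from the patch.

    // Sketch of the reducer calculation described above; names are illustrative.
    public class BucketReducersSketch {
      // Buggy behaviour: the divisor is rounded down, so totalFiles / divisor
      // can exceed maxReducers (60 files, max 21 -> divisor 2 -> 30 reducers).
      static int reducersRoundingDown(int totalFiles, int maxReducers) {
        int divisor = totalFiles / maxReducers;          // 60 / 21 = 2
        return totalFiles / divisor;                     // 60 / 2  = 30 > 21
      }

      // Fixed behaviour: round the divisor up, so the result stays within the
      // cap (60 files, max 21 -> divisor 3 -> 20 reducers).
      static int reducersRoundingUp(int totalFiles, int maxReducers) {
        int divisor = (totalFiles + maxReducers - 1) / maxReducers;  // ceil(60/21) = 3
        return totalFiles / divisor;                                 // 60 / 3 = 20 <= 21
      }

      public static void main(String[] args) {
        System.out.println(reducersRoundingDown(60, 21));  // prints 30
        System.out.println(reducersRoundingUp(60, 21));    // prints 20
      }
    }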

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-3825) Add Operator level Hooks

2013-01-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557016#comment-13557016
 ] 

Namit Jain commented on HIVE-3825:
--

Can you recreate the patch?

For some reason, build-common.xml is not applying cleanly and so I cannot run 
parallel tests.

 Add Operator level Hooks
 

 Key: HIVE-3825
 URL: https://issues.apache.org/jira/browse/HIVE-3825
 Project: Hive
  Issue Type: New Feature
Reporter: Pamela Vagata
Assignee: Pamela Vagata
Priority: Minor
 Attachments: HIVE-3825.2.patch.txt, HIVE-3825.3.patch.txt, 
 HIVE-3825.patch.4.txt, HIVE-3825.patch.5.txt, HIVE-3825.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira