date:20120905


[ 
https://issues.apache.org/jira/browse/HIVE-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449461#comment-13449461
 ] 

Navis commented on HIVE-3427:
-

@Ashutosh,
You are right. The "build/ql/test/data/exports" directory is used by many 
tests (exim~, etc.). 
How about changing the test directory from "build/ql/test/data/exports" to 
"build/ql/test/data/exports/HIVE-3428" or something similar?


> Newly added test testCliDriver_metadata_export_drop is consistently failing 
> on trunk
> 
>
> Key: HIVE-3427
> URL: https://issues.apache.org/jira/browse/HIVE-3427
> Project: Hive
>  Issue Type: Test
>Affects Versions: 0.10.0
>Reporter: Ashutosh Chauhan
>Assignee: Navis
> Attachments: HIVE-3427.1.patch.txt
>
>
> I think it's a new test that was added via HIVE-3068.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3306) SMBJoin/BucketMapJoin should be allowed only when join key expression is exactly matches with sort/cluster key


[ 
https://issues.apache.org/jira/browse/HIVE-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449427#comment-13449427
 ] 

Namit Jain commented on HIVE-3306:
--

+1

Running tests

> SMBJoin/BucketMapJoin should be allowed only when join key expression is 
> exactly matches with sort/cluster key
> --
>
> Key: HIVE-3306
> URL: https://issues.apache.org/jira/browse/HIVE-3306
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3306.1.patch.txt
>
>
> CREATE TABLE bucket_small (key int, value string) CLUSTERED BY (key) SORTED 
> BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> load data local inpath 
> '/home/navis/apache/oss-hive/data/files/srcsortbucket1outof4.txt' INTO TABLE 
> bucket_small;
> load data local inpath 
> '/home/navis/apache/oss-hive/data/files/srcsortbucket2outof4.txt' INTO TABLE 
> bucket_small;
> CREATE TABLE bucket_big (key int, value string) CLUSTERED BY (key) SORTED BY 
> (key) INTO 4 BUCKETS STORED AS TEXTFILE;
> load data local inpath 
> '/home/navis/apache/oss-hive/data/files/srcsortbucket1outof4.txt' INTO TABLE 
> bucket_big;
> load data local inpath 
> '/home/navis/apache/oss-hive/data/files/srcsortbucket2outof4.txt' INTO TABLE 
> bucket_big;
> load data local inpath 
> '/home/navis/apache/oss-hive/data/files/srcsortbucket3outof4.txt' INTO TABLE 
> bucket_big;
> load data local inpath 
> '/home/navis/apache/oss-hive/data/files/srcsortbucket4outof4.txt' INTO TABLE 
> bucket_big;
> select count(*) FROM bucket_small a JOIN bucket_big b ON a.key + a.key = 
> b.key;
> select /* + MAPJOIN(a) */ count(*) FROM bucket_small a JOIN bucket_big b ON 
> a.key + a.key = b.key;
> Both queries return 116. 
> With BucketMapJoin or SMBJoin, however, the result is 61. This should not be 
> allowed, because hash(a.key) != hash(a.key + a.key). 
> The bucket context should be used only when the join expression exactly 
> matches the sort/cluster key.
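
The mismatch reported in the description can be illustrated with a small sketch. It is not Hive code: it assumes a toy bucketing function (`key % num_buckets`, which may differ from Hive's real hashing), made-up keys, and hypothetical helper names. It only shows why probing buckets by `a.key` loses matches when the join condition is `a.key + a.key = b.key`.

```python
# Toy model of a bucketed join; bucket() is an assumed stand-in for Hive's hashing.
def bucket(key, num_buckets):
    return key % num_buckets

def full_join_count(small, big):
    # Reference result: compare every pair of rows.
    return sum(1 for a in small for b in big if a + a == b)

def bucketed_join_count(small, big, num_buckets):
    # Small-table rows are grouped by the bucket of a.key (the cluster key).
    small_by_bucket = {}
    for a in small:
        small_by_bucket.setdefault(bucket(a, num_buckets), []).append(a)
    count = 0
    for b in big:
        # Each big-table row only sees the small bucket that b.key maps to,
        # which is correct for "a.key = b.key" but not for "a.key + a.key = b.key".
        candidates = small_by_bucket.get(bucket(b, num_buckets), [])
        count += sum(1 for a in candidates if a + a == b)
    return count

small_keys = [1, 2, 3, 4]   # hypothetical data
big_keys = [2, 4, 6, 8]

full = full_join_count(small_keys, big_keys)
bucketed = bucketed_join_count(small_keys, big_keys, 2)
print(full, bucketed)
```

In this toy data the bucketed variant finds fewer matches than the full join, which is the same kind of discrepancy as the 116 versus 61 reported in the description.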


[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias

2012-09-05 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449425#comment-13449425
 ] 

Carl Steinbach commented on HIVE-3171:
--

+1. Will commit if tests pass.

> Bucketed sort merge join doesn't work when multiple files exist for small 
> alias
> ---
>
> Key: HIVE-3171
> URL: https://issues.apache.org/jira/browse/HIVE-3171
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Navis
>  Labels: bucketing, joins, partitioning
> Attachments: HIVE-3171.1.patch.txt, HIVE-3171.2.patch.txt
>
>
> Executing a query with the MAPJOIN hint and the bucketed sort merge join 
> optimizations enabled:
> {noformat}
> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> {noformat}
> works fine with partitioned tables if there is only one partition in the 
> table. However, if you add a second partition, Hive attempts to do a regular 
> map-side join which can fail because the tables are too large. Hive ought to 
> be able to still do the bucketed sort merge join with partitions.


[jira] [Commented] (HIVE-3436) Difference in exception string from native method causes script_pipe.q to fail on windows


[ 
https://issues.apache.org/jira/browse/HIVE-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449421#comment-13449421
 ] 

Ashutosh Chauhan commented on HIVE-3436:


+1 will commit if tests pass

>  Difference in exception string from native method causes script_pipe.q to 
> fail on windows
> --
>
> Key: HIVE-3436
> URL: https://issues.apache.org/jira/browse/HIVE-3436
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
> Attachments: HIVE-3436.1.patch
>
>



[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions


[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449416#comment-13449416
 ] 

Ashutosh Chauhan commented on HIVE-3323:


+1

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch, 
> HIVE-3323_enum_to_string.6.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.


[jira] [Updated] (HIVE-2999) Offline build is not working


 [ 
https://issues.apache.org/jira/browse/HIVE-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-2999:
---

Assignee: Navis

> Offline build is not working
> 
>
> Key: HIVE-2999
> URL: https://issues.apache.org/jira/browse/HIVE-2999
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
> Attachments: HIVE-2999.1.patch.txt, HIVE-2999.2.patch.txt
>
>
> It's fine without -Doffline=true option. But with offline option (ant 
> -Doffline=true clean package), it's failing with error message like this.
> {noformat}
> ivy-retrieve:
>  [echo] Project: common
> [ivy:retrieve] :: loading settings :: file = 
> /home/navis/apache/oss-hive/ivy/ivysettings.xml
> [ivy:retrieve] 
> [ivy:retrieve] :: problems summary ::
> [ivy:retrieve]  WARNINGS
> [ivy:retrieve]module not found: 
> org.apache.hadoop#hadoop-common;0.20.2
> [ivy:retrieve] local: tried
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-common/0.20.2/ivys/ivy.xml
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-common/0.20.2/jars/hadoop-common.jar
> [ivy:retrieve] apache-snapshot: tried
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] maven2: tried
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] datanucleus-repo: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://www.datanucleus.org/downloads/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] hadoop-source: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://mirror.facebook.net/facebook/hive-deps/hadoop/core/hadoop-common-0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] hadoop-source2: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://archive.cloudera.com/hive-deps/hadoop/core/hadoop-common-0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve]module not found: 
> org.apache.hadoop#hadoop-auth;0.20.2
> [ivy:retrieve] local: tried
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-auth/0.20.2/ivys/ivy.xml
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-auth/0.20.2/jars/hadoop-auth.jar
> [ivy:retrieve] apache-snapshot: tried
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.jar
> [ivy:retrieve] maven2: tried
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.jar
> [ivy:retrieve] datanucleus-repo: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> http://www.datanucleus.org/downloads/maven2/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.jar
> [ivy:retrieve] hadoop-source: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> http://mirror.facebook.net/facebook/hive-deps/hadoop/core/hadoop-auth-0.20.2/hadoop-aut

[jira] [Commented] (HIVE-2999) Offline build is not working


[ 
https://issues.apache.org/jira/browse/HIVE-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449408#comment-13449408
 ] 

Ashutosh Chauhan commented on HIVE-2999:


+1. Will commit if tests pass. 

Zhenxiao,
No worries. It's a bit complicated w.r.t. hadoop-0.23 deps. I am using the following 
command to test and build against 0.23:
{code}
ant clean package test -Dhadoop.mr.rev=23 -Dtest.print.classpath=true 
-Dhadoop.version=2.0.0-alpha -Dhadoop.security.version=2.0.0-alpha
{code} 

> Offline build is not working
> 
>
> Key: HIVE-2999
> URL: https://issues.apache.org/jira/browse/HIVE-2999
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.10.0
>Reporter: Navis
> Attachments: HIVE-2999.1.patch.txt, HIVE-2999.2.patch.txt
>
>

[jira] [Updated] (HIVE-3421) Column Level Top K Values Statistics

2012-09-05 Thread Feng Lu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Lu updated HIVE-3421:
--

Attachment: HIVE-3421.patch.3.txt

> Column Level Top K Values Statistics
> 
>
> Key: HIVE-3421
> URL: https://issues.apache.org/jira/browse/HIVE-3421
> Project: Hive
>  Issue Type: New Feature
>Reporter: Feng Lu
>Assignee: Feng Lu
> Attachments: HIVE-3421.patch.1.txt, HIVE-3421.patch.2.txt, 
> HIVE-3421.patch.3.txt, HIVE-3421.patch.txt
>
>
> Compute (estimate) top k values for each column, and put the most skewed 
> column into skewed info, if user hasn't specified skew.
> This feature depends on ListBucketing (create table skewed on) 
> https://cwiki.apache.org/Hive/listbucketing.html.
> All column topk can be added to skewed info, if in the future skewed info 
> supports multiple independent columns.
> The TopK algorithm is based on this paper:
> http://www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf
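
As a rough illustration of estimating top k values for a column in one pass, here is a minimal sketch of a counter-based scheme in the Space-Saving style. Whether this matches the algorithm in the linked report or in the attached patch is an assumption on my part; the function names, the `capacity` parameter and the sample column are all hypothetical.

```python
def space_saving(stream, capacity):
    # Keep at most `capacity` counters. Counts may overestimate the true
    # frequency of an item but never underestimate it.
    counts = {}
    for item in stream:
        if item in counts:
            counts[item] += 1
        elif len(counts) < capacity:
            counts[item] = 1
        else:
            # Evict the item with the smallest count; the newcomer
            # inherits that count plus one.
            victim = min(counts, key=counts.get)
            counts[item] = counts.pop(victim) + 1
    return counts

def top_k(counts, k):
    # Highest estimated count first; ties broken by value for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

column_values = ["a", "b", "a", "c", "a", "b", "d", "a"]  # hypothetical column
estimates = space_saving(column_values, capacity=3)
print(top_k(estimates, 2))
```

The memory used is bounded by `capacity` rather than by the number of distinct values, which is why a scheme like this might suit per-column statistics; the price is that rarely seen values can end up with inflated counts.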


[jira] [Commented] (HIVE-3427) Newly added test testCliDriver_metadata_export_drop is consistently failing on trunk


[ 
https://issues.apache.org/jira/browse/HIVE-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449398#comment-13449398
 ] 

Ashutosh Chauhan commented on HIVE-3427:


Navis,
Nice catch, but I am not sure this will fix the issue, since the test fails 
while doing mkdir, not rmr. It looks like the dir already exists before the test 
begins to run (perhaps created by some other test that didn't rmr it). We 
can either hunt down the offending test that is not rmr'ing the dir and update 
it, or give this dir a unique name, like exports_3068, which makes it 
unlikely that the dir already exists by the time this test runs. 
I am also running the full suite with your current patch to test whether it 
fixes the build.
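
The failure mode and the unique-name workaround can be sketched on a local filesystem. This is only an analogy: the real test issues mkdir through Hive, and the helper name `make_export_dir` and the temporary base directory are assumptions made for illustration.

```python
import os
import tempfile

def make_export_dir(base, name):
    # os.mkdir raises FileExistsError when the directory is already there,
    # which mirrors a mkdir step failing on a leftover directory.
    path = os.path.join(base, name)
    os.mkdir(path)
    return path

with tempfile.TemporaryDirectory() as base:
    # An earlier test creates the shared directory and never removes it.
    make_export_dir(base, "exports")
    try:
        # A later test that reuses the same name fails at mkdir.
        make_export_dir(base, "exports")
        shared_ok = True
    except FileExistsError:
        shared_ok = False
    # A test-specific name avoids the collision.
    unique_path = make_export_dir(base, "exports_3068")
    unique_ok = os.path.isdir(unique_path)

print(shared_ok, unique_ok)
```

A unique name sidesteps the collision without finding the test that leaks the directory, so the leak itself remains.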



[jira] [Updated] (HIVE-3306) SMBJoin/BucketMapJoin should be allowed only when join key expression is exactly matches with sort/cluster key


 [ 
https://issues.apache.org/jira/browse/HIVE-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3306:


Attachment: HIVE-3306.1.patch.txt



[jira] [Commented] (HIVE-2999) Offline build is not working


[ 
https://issues.apache.org/jira/browse/HIVE-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449368#comment-13449368
 ] 

Zhenxiao Luo commented on HIVE-2999:


@Navis and Ashutosh:
I think some of my patches reference hadoop-23 libraries, which might 
break the offline build. Sorry about that.
Which commands are you using to build/test against hadoop-23? I will double 
check for further patches.

Also, +1 for the submitted patch.

Thanks,
Zhenxiao

> Offline build is not working
> 
>
> Key: HIVE-2999
> URL: https://issues.apache.org/jira/browse/HIVE-2999
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.10.0
>Reporter: Navis
> Attachments: HIVE-2999.1.patch.txt, HIVE-2999.2.patch.txt
>
>

[jira] [Updated] (HIVE-3306) SMBJoin/BucketMapJoin should be allowed only when join key expression is exactly matches with sort/cluster key


 [ 
https://issues.apache.org/jira/browse/HIVE-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3306:


Status: Patch Available  (was: Open)

> SMBJoin/BucketMapJoin should be allowed only when join key expression is 
> exactly matches with sort/cluster key
> --
>
> Key: HIVE-3306
> URL: https://issues.apache.org/jira/browse/HIVE-3306
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
>


[jira] [Updated] (HIVE-1173) Partition pruner cancels pruning if non-deterministic function present in filtering expression only in joins is present in query


 [ 
https://issues.apache.org/jira/browse/HIVE-1173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-1173:


Status: Patch Available  (was: Open)

> Partition pruner cancels pruning if non-deterministic function present in 
> filtering expression only in joins is present in query
> 
>
> Key: HIVE-1173
> URL: https://issues.apache.org/jira/browse/HIVE-1173
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.4.1, 0.4.0, 0.10.0
>Reporter: Vladimir Klimontovich
>Assignee: Navis
>
> Brief description:
> case 1) a non-deterministic function is present in the partition condition 
> and joins are present in the query => the partition pruner doesn't filter 
> partitions based on the condition
> case 2) a non-deterministic function is present in the partition condition 
> and joins aren't present in the query => the partition pruner does filter 
> partitions based on the condition
> It's quite illogical for pruning to depend on the presence of joins in the query.
> Example:
> Let's consider the following sequence of Hive queries:
> 1) Create non-deterministic function:
> create temporary function UDF2 as 'UDF2';
> {{
> import org.apache.hadoop.hive.ql.exec.UDF;
> import org.apache.hadoop.hive.ql.udf.UDFType;
> @UDFType(deterministic=false)
> public class UDF2 extends UDF {
>   public String evaluate(String val) {
>     return val;
>   }
> }
> }}
> 2) Create tables
> CREATE TABLE Main (
>   a STRING,
>   b INT
> )
> PARTITIONED BY(part STRING)
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> LINES TERMINATED BY '10'
> STORED AS TEXTFILE;
> ALTER TABLE Main ADD PARTITION (part="part1") LOCATION 
> "/hive-join-test/part1/";
> ALTER TABLE Main ADD PARTITION (part="part2") LOCATION 
> "/hive-join-test/part2/";
> CREATE TABLE Joined (
>   a STRING,
>   f STRING
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> LINES TERMINATED BY '10'
> STORED AS TEXTFILE
> LOCATION '/hive-join-test/join/';
> 3) Run first query:
> select 
>   m.a,
>   m.b
> from Main m
> where
>   part > UDF2('part0') AND part = 'part1';
> The pruner will work for this query: 
> mapred.input.dir=hdfs://localhost:9000/hive-join-test/part1
> 4) Run second query (with join):
> select 
>   m.a,
>   j.a,
>   m.b
> from Main m
> join Joined j on
>   j.a=m.a
> where
>   part > UDF2('part0') AND part = 'part1';
> Pruner doesn't work: 
> mapred.input.dir=hdfs://localhost:9000/hive-join-test/part1,hdfs://localhost:9000/hive-join-test/part2,hdfs://localhost:9000/hive-join-test/join
> 5) Also, let's try running the query with a MAPJOIN hint:
> select /*+MAPJOIN(j)*/ 
>   m.a,
>   j.a,
>   m.b
> from Main m
> join Joined j on
>   j.a=m.a
> where
>   part > UDF2('part0') AND part = 'part1';
> The result is the same, pruner doesn't work: 
> mapred.input.dir=hdfs://localhost:9000/hive-join-test/part1,hdfs://localhost:9000/hive-join-test/part2
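
The behaviour the reporter expects can be sketched as follows. This is not how Hive's partition pruner is implemented; it is a toy model in which the filter is a list of AND-ed conjuncts, each flagged as deterministic or not, and the function and variable names are hypothetical.

```python
def prune(partitions, conjuncts):
    # conjuncts: list of (predicate, is_deterministic), joined by AND.
    # Only deterministic conjuncts are evaluated at planning time. Ignoring
    # the others can keep too many partitions but never drops a needed one.
    usable = [pred for pred, deterministic in conjuncts if deterministic]
    return [p for p in partitions if all(pred(p) for pred in usable)]

partitions = ["part1", "part2"]
conjuncts = [
    (lambda part: part > "part0", False),   # stands in for part > UDF2('part0')
    (lambda part: part == "part1", True),   # part = 'part1'
]

kept = prune(partitions, conjuncts)
print(kept)
```

Under this model the deterministic conjunct `part = 'part1'` is enough to discard `part2`, and nothing in the decision depends on whether the query contains a join.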


[jira] [Created] (HIVE-3437) 0.23 compatibility: fix unit tests when building against 0.23

2012-09-05 Thread Chris Drome (JIRA)

Chris Drome created HIVE-3437:
-

 Summary: 0.23 compatibility: fix unit tests when building against 
0.23
 Key: HIVE-3437
 URL: https://issues.apache.org/jira/browse/HIVE-3437
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0, 0.9.1
Reporter: Chris Drome


Many unit tests fail as a result of building the code against hadoop 0.23. 
Initial focus will be to fix 0.9.


[jira] [Updated] (HIVE-2999) Offline build is not working


 [ 
https://issues.apache.org/jira/browse/HIVE-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2999:


Attachment: HIVE-2999.2.patch.txt

serde and builtins also started referencing hadoop-23 libraries. Updated the patch for that.

> Offline build is not working
> 
>
> Key: HIVE-2999
> URL: https://issues.apache.org/jira/browse/HIVE-2999
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.10.0
>Reporter: Navis
> Attachments: HIVE-2999.1.patch.txt, HIVE-2999.2.patch.txt
>
>
> The build works without the -Doffline=true option. With it (ant 
> -Doffline=true clean package), the build fails with an error like this:
> {noformat}
> ivy-retrieve:
>  [echo] Project: common
> [ivy:retrieve] :: loading settings :: file = 
> /home/navis/apache/oss-hive/ivy/ivysettings.xml
> [ivy:retrieve] 
> [ivy:retrieve] :: problems summary ::
> [ivy:retrieve]  WARNINGS
> [ivy:retrieve]module not found: 
> org.apache.hadoop#hadoop-common;0.20.2
> [ivy:retrieve] local: tried
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-common/0.20.2/ivys/ivy.xml
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-common/0.20.2/jars/hadoop-common.jar
> [ivy:retrieve] apache-snapshot: tried
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] maven2: tried
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] datanucleus-repo: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://www.datanucleus.org/downloads/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] hadoop-source: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://mirror.facebook.net/facebook/hive-deps/hadoop/core/hadoop-common-0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve] hadoop-source2: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar:
> [ivy:retrieve]  
> http://archive.cloudera.com/hive-deps/hadoop/core/hadoop-common-0.20.2/hadoop-common-0.20.2.jar
> [ivy:retrieve]module not found: 
> org.apache.hadoop#hadoop-auth;0.20.2
> [ivy:retrieve] local: tried
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-auth/0.20.2/ivys/ivy.xml
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> /home/navis/.ivy2/local/org.apache.hadoop/hadoop-auth/0.20.2/jars/hadoop-auth.jar
> [ivy:retrieve] apache-snapshot: tried
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.jar
> [ivy:retrieve] maven2: tried
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.pom
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.jar
> [ivy:retrieve] datanucleus-repo: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> http://www.datanucleus.org/downloads/maven2/org/apache/hadoop/hadoop-auth/0.20.2/hadoop-auth-0.20.2.jar
> [ivy:retrieve] hadoop-source: tried
> [ivy:retrieve]  -- artifact 
> org.apache.hadoop#hadoop-auth;0.20.2!hadoop-auth.jar:
> [ivy:retrieve]  
> http://mirror.facebook.net/fa

[jira] [Updated] (HIVE-3431) Resources on non-local file system should be downloaded to temporary directory sometimes


 [ 
https://issues.apache.org/jira/browse/HIVE-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3431:


Attachment: HIVE-3431.1.patch.txt

> Resources on non-local file system should be downloaded to temporary 
> directory sometimes
> 
>
> Key: HIVE-3431
> URL: https://issues.apache.org/jira/browse/HIVE-3431
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
> Attachments: HIVE-3431.1.patch.txt
>
>
> The "add resource " command downloads the resource file to the location 
> specified by the conf "hive.downloaded.resources.dir" on the local file system. But 
> when that command is executed concurrently against hive-server for the same file, 
> some clients fail with a VM crash, because the file is overwritten by other 
> requests.
> There should therefore be a configuration that provides a per-request location for the add 
> resource command, something like "set 
> hiveconf:hive.downloaded.resources.dir=temporary"
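The per-request location idea can be illustrated outside Hive. The sketch below is in Python and is only an assumption about the general approach, not the attached patch: each request gets its own freshly created subdirectory, so two concurrent downloads of the same file name can never overwrite each other. The function names `per_request_dir` and `download` are hypothetical.

```python
import os
import tempfile


def per_request_dir(base_dir):
    """Create and return a unique subdirectory of base_dir for one request."""
    os.makedirs(base_dir, exist_ok=True)
    # mkdtemp creates the directory atomically with a unique name.
    return tempfile.mkdtemp(prefix="resources_", dir=base_dir)


def download(resource_name, payload, base_dir):
    """Write payload under a fresh per-request directory and return the file path."""
    target = os.path.join(per_request_dir(base_dir), resource_name)
    with open(target, "wb") as out:
        out.write(payload)
    return target
```

Because the unique directory is created before the file is written, the same resource name requested twice ends up at two different paths.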


[jira] [Updated] (HIVE-3421) Column Level Top K Values Statistics

2012-09-05 Thread Feng Lu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Feng Lu updated HIVE-3421:
--

Attachment: HIVE-3421.patch.2.txt

> Column Level Top K Values Statistics
> 
>
> Key: HIVE-3421
> URL: https://issues.apache.org/jira/browse/HIVE-3421
> Project: Hive
>  Issue Type: New Feature
>Reporter: Feng Lu
>Assignee: Feng Lu
> Attachments: HIVE-3421.patch.1.txt, HIVE-3421.patch.2.txt, 
> HIVE-3421.patch.txt
>
>
> Compute (estimate) top k values for each column, and put the most skewed 
> column into skewed info, if user hasn't specified skew.
> This feature depends on ListBucketing (create table skewed on) 
> https://cwiki.apache.org/Hive/listbucketing.html.
> All column topk can be added to skewed info, if in the future skewed info 
> supports multiple independent columns.
> The TopK algorithm is based on this paper:
> http://www.cs.ucsb.edu/research/tech_reports/reports/2005-23.pdf
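To give a feel for one-pass, bounded-memory top-k estimation, here is a Python sketch of a Space-Saving style counter scheme, a well-known streaming estimator. Whether it matches the linked paper or the attached patch in detail is an assumption; the names `space_saving` and `top_k` are hypothetical.

```python
def space_saving(stream, capacity):
    """Estimate value frequencies using at most `capacity` counters.

    Returns a dict mapping value -> [estimated_count, max_overestimate].
    """
    counters = {}
    for value in stream:
        if value in counters:
            counters[value][0] += 1
        elif len(counters) < capacity:
            counters[value] = [1, 0]
        else:
            # Evict the value with the smallest estimate; the newcomer
            # inherits that count, which bounds its possible overestimate.
            victim = min(counters, key=lambda v: counters[v][0])
            floor = counters.pop(victim)[0]
            counters[value] = [floor + 1, floor]
    return counters


def top_k(counters, k):
    """Return the k values with the largest estimated counts."""
    ranked = sorted(counters.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(value, est) for value, (est, _err) in ranked[:k]]
```

The estimates never undercount a monitored value, and the second number of each counter bounds how much it may overcount, which helps when deciding whether a column is skewed enough to record.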


[jira] [Updated] (HIVE-3431) Resources on non-local file system should be downloaded to temporary directory sometimes


 [ 
https://issues.apache.org/jira/browse/HIVE-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3431:


Status: Patch Available  (was: Open)

https://reviews.facebook.net/D5199

> Resources on non-local file system should be downloaded to temporary 
> directory sometimes
> 
>
> Key: HIVE-3431
> URL: https://issues.apache.org/jira/browse/HIVE-3431
> Project: Hive
>  Issue Type: Improvement
>  Components: Configuration
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Trivial
>
> The "add resource " command downloads the resource file to the location 
> specified by the conf "hive.downloaded.resources.dir" on the local file system. But 
> when that command is executed concurrently against hive-server for the same file, 
> some clients fail with a VM crash, because the file is overwritten by other 
> requests.
> There should therefore be a configuration that provides a per-request location for the add 
> resource command, something like "set 
> hiveconf:hive.downloaded.resources.dir=temporary"


[jira] [Updated] (HIVE-3315) Propagate filers on inner join condition transitively


 [ 
https://issues.apache.org/jira/browse/HIVE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3315:


Attachment: HIVE-3315.3.patch.txt

> Propagate filers on inner join condition transitively 
> --
>
> Key: HIVE-3315
> URL: https://issues.apache.org/jira/browse/HIVE-3315
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3315.1.patch.txt, HIVE-3315.2.patch.txt, 
> HIVE-3315.3.patch.txt
>
>
> explain select src1.key from src src1 join src src2 on src1.key=src2.key and 
> src1.key < 100;
> In this case, the filter src1.key < 100 in the join condition can be propagated 
> transitively to src2 as src2.key < 100. 
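The propagation described above can be sketched independently of Hive's optimizer. The Python below is a simplified illustration under stated assumptions: filters are (column, operator, literal) triples, join conditions are equalities between columns of an inner join, and `propagate_filters` is a hypothetical name. It is not the code in the attached patch.

```python
def propagate_filters(join_equalities, filters):
    """Copy each single-column filter to every column equated with it by an inner join."""
    parent = {}

    def find(col):
        # Union-find lookup with path halving.
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]
            col = parent[col]
        return col

    for left, right in join_equalities:
        parent[find(left)] = find(right)

    # Group columns into equivalence classes.
    groups = {}
    for col in list(parent):
        groups.setdefault(find(col), set()).add(col)

    derived = set(filters)
    for col, op, literal in filters:
        if col in parent:
            for other in groups[find(col)]:
                derived.add((other, op, literal))
    return derived
```

This reasoning holds only for inner joins: with an outer join, rows from the preserved side must survive even when the other side is filtered out, so the copied filter would change the result.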


[jira] [Updated] (HIVE-3315) Propagate filers on inner join condition transitively


 [ 
https://issues.apache.org/jira/browse/HIVE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3315:


Status: Patch Available  (was: Open)

Fixed bug

> Propagate filers on inner join condition transitively 
> --
>
> Key: HIVE-3315
> URL: https://issues.apache.org/jira/browse/HIVE-3315
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3315.1.patch.txt, HIVE-3315.2.patch.txt
>
>
> explain select src1.key from src src1 join src src2 on src1.key=src2.key and 
> src1.key < 100;
> In this case, the filter src1.key < 100 in the join condition can be propagated 
> transitively to src2 as src2.key < 100. 


[jira] [Updated] (HIVE-3427) Newly added test testCliDriver_metadata_export_drop is consistently failing on trunk


 [ 
https://issues.apache.org/jira/browse/HIVE-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3427:


Attachment: HIVE-3427.1.patch.txt

> Newly added test testCliDriver_metadata_export_drop is consistently failing 
> on trunk
> 
>
> Key: HIVE-3427
> URL: https://issues.apache.org/jira/browse/HIVE-3427
> Project: Hive
>  Issue Type: Test
>Affects Versions: 0.10.0
>Reporter: Ashutosh Chauhan
>Assignee: Navis
> Attachments: HIVE-3427.1.patch.txt
>
>
> I think it's a new test that was added via HIVE-3068


[jira] [Updated] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias


 [ 
https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3171:


Attachment: HIVE-3171.2.patch.txt

> Bucketed sort merge join doesn't work when multiple files exist for small 
> alias
> ---
>
> Key: HIVE-3171
> URL: https://issues.apache.org/jira/browse/HIVE-3171
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Navis
>  Labels: bucketing, joins, partitioning
> Attachments: HIVE-3171.1.patch.txt, HIVE-3171.2.patch.txt
>
>
> Executing a query with the MAPJOIN hint and the bucketed sort merge join 
> optimizations enabled:
> {noformat}
> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> {noformat}
> works fine with partitioned tables if there is only one partition in the 
> table. However, if you add a second partition, Hive attempts to do a regular 
> map-side join which can fail because the tables are too large. Hive ought to 
> be able to still do the bucketed sort merge join with partitions.


[jira] [Updated] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias


 [ 
https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3171:


Status: Patch Available  (was: Open)

Passed all tests

> Bucketed sort merge join doesn't work when multiple files exist for small 
> alias
> ---
>
> Key: HIVE-3171
> URL: https://issues.apache.org/jira/browse/HIVE-3171
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Navis
>  Labels: bucketing, joins, partitioning
> Attachments: HIVE-3171.1.patch.txt
>
>
> Executing a query with the MAPJOIN hint and the bucketed sort merge join 
> optimizations enabled:
> {noformat}
> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> {noformat}
> works fine with partitioned tables if there is only one partition in the 
> table. However, if you add a second partition, Hive attempts to do a regular 
> map-side join which can fail because the tables are too large. Hive ought to 
> be able to still do the bucketed sort merge join with partitions.


[jira] [Commented] (HIVE-3340) shims unit test failures fails further test progress

2012-09-05 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449293#comment-13449293
 ] 

Hudson commented on HIVE-3340:
--

Integrated in Hive-trunk-h0.21 #1650 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1650/])
HIVE-3340 : shims unit test failures fails further test progress 
(Giridharan Kesavan via Ashutosh Chauhan) (Revision 1381250)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381250
Files : 
* /hive/trunk/build.properties
* /hive/trunk/build.xml


> shims unit test failures fails further test progress
> 
>
> Key: HIVE-3340
> URL: https://issues.apache.org/jira/browse/HIVE-3340
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Giridharan Kesavan
>Assignee: Giridharan Kesavan
> Fix For: 0.10.0
>
> Attachments: HIVE-3340.patch
>
>
> Enable the failonerror flag so that unit tests can continue even when the shims 
> unit tests fail.


Hive-trunk-h0.21 - Build # 1650 - Still Failing

Changes for Build #1638
[namit] HIVE-3393 get_json_object and json_tuple should use Jackson library
(Kevin Wilfong via namit)


Changes for Build #1639

Changes for Build #1640
[ecapriolo] HIVE-3068 Export table metadata as JSON on table drop (Andrew 
Chalfant via egc)


Changes for Build #1641

Changes for Build #1642
[hashutosh] HIVE-3338 : Archives broken for hadoop 1.0 (Vikram Dixit via 
Ashutosh Chauhan)


Changes for Build #1643

Changes for Build #1644

Changes for Build #1645
[cws] HIVE-3413. Fix pdk.PluginTest on hadoop23 (Zhenxiao Luo via cws)


Changes for Build #1646
[cws] HIVE-3056. Ability to bulk update location field in Db/Table/Partition 
records (Shreepadma Venugopalan via cws)

[cws] HIVE-3416 [jira] Fix 
TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS when running Hive on 
hadoop23
(Zhenxiao Luo via Carl Steinbach)

Summary:
HIVE-3416: Fix TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS when 
running Hive on hadoop23

TestAvroSerdeUtils determineSchemaCanReadSchemaFromHDFS is failing when running 
Hive on hadoop23:

$ ant very-clean package -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23

$ ant test -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23 -Dtestcase=TestAvroSerdeUtils

 
java.lang.NoClassDefFoundError: org/apache/hadoop/net/StaticMapping
    at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:534)
    at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:489)
    at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:360)
    at org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS(TestAvroSerdeUtils.java:187)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
    at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.net.StaticMapping
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
    ... 25 more

  

Test Plan: EMPTY

Reviewers: JIRA

Differential Revision: https://reviews.facebook.net/D5025

[cws] HIVE-3424. Error by upgrading a Hive 0.7.0 database to 0.8.0 
(008-HIVE-2246.mysql.sql) (Alexander Alten-Lorenz via cws)

[cws] HIVE-3412. Fix TestCliDriver.repair on Hadoop 0.23.3, 3.0.0, and 
2.2.0-alpha (Zhenxiao Luo via cws)


Changes for Build #1647

Changes for Build #1648
[namit] HIVE-3429 Bucket map join involving table with more than 1 partition 
column causes 
FileNotFoundException (Kevin Wilfong via namit)


Changes for Build #1649
[hashutosh] HIVE-3075 : Improve HiveMetaStore logging (Travis Crawford via 
Ashutosh Chauhan)


Changes for Build #1650
[hashutosh] HIVE-3340 : shims unit test failures fails further test progress 
(Giridharan Kesavan via Ashutosh Chauhan)




1 tests failed.
FAILED:  
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_metadata_export_drop

Error Message:
Unexpected exception See build/ql/tmp/hive.log, or try "ant test ... 
-Dtest.silent=false" to get more logs.

Stack Trace:
junit.framework.AssertionFaile

Re: Review Request: HIVE-3323 ThriftSerde: Enable enum to string conversions

2012-09-05 Thread Travis Crawford

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6915/
---

(Updated Sept. 6, 2012, 12:12 a.m.)

Review request for hive and Ashutosh Chauhan.

Changes
---

This patch makes enum-to-string conversion mandatory, removing the config
option. The patch is much cleaner now.

I just started CI at
https://travis.ci.cloudbees.com/job/HIVE-3323_enum_to_string/ which will take
6-7 hours. If that passes I'll post the patch in the Jira.

Description
---

ThriftSerde: Enable enum to string conversions

This addresses bug HIVE-3323.
https://issues.apache.org/jira/browse/HIVE-3323

Diffs (updated)
-

ql/src/test/queries/clientpositive/convert_enum_to_string.q PRE-CREATION
ql/src/test/results/clientpositive/convert_enum_to_string.q.out PRE-CREATION
serde/if/test/megastruct.thrift PRE-CREATION

serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MegaStruct.java
PRE-CREATION

serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MiniStruct.java
PRE-CREATION

serde/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/serde2/thrift/test/MyEnum.java
PRE-CREATION

serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorFactory.java
b21755e

serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/JavaStringObjectInspector.java
921ce2b

Diff: https://reviews.apache.org/r/6915/diff/

Testing
---

Running CI now after rebasing to master and changing the default to enabled.
Some preliminary feedback would be great, though.

https://travis.ci.cloudbees.com/job/HIVE-3323_enum_to_string/10/

To test, I added a new struct that contains an enum field; we check that its
schema is correctly described and that this property can be enabled/disabled at
runtime.

Something I'm not clear on with Hive is how to write more comprehensive tests
that involve more than just ql commands. For example, take a look at:

http://svn.apache.org/viewvc/incubator/hcatalog/trunk/src/test/org/apache/hcatalog/mapreduce/TestHCatHiveThriftCompatibility.java?view=markup

Here we see an example junit test I wrote that creates a file containing thrift
structs, creates the table, checks its schema, and ensures the query returns
expected output. With the Hive test suite all I add here are ql commands that
check the schema, since I'm not sure how to do the test setup. I'm more than
happy to add a more comprehensive test but would appreciate some guidance to do
that correctly.

Thanks,

Travis Crawford

[jira] [Updated] (HIVE-3427) Newly added test testCliDriver_metadata_export_drop is consistently failing on trunk


 [ 
https://issues.apache.org/jira/browse/HIVE-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3427:


Assignee: Navis
  Status: Patch Available  (was: Open)

https://reviews.facebook.net/D5193

The test was missing a semicolon on its last line, which cleans up the directory.
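A quick check for this class of mistake can be sketched in Python. This is a hypothetical helper, not part of the Hive test harness, and it assumes that lines starting with "--" are comments and that every statement in the script is expected to end with a semicolon.

```python
def last_statement_terminated(script_text):
    """Return True if the last non-blank, non-comment line ends with ';'."""
    lines = [ln.strip() for ln in script_text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("--")]
    return bool(lines) and lines[-1].endswith(";")
```

Running such a check over new test scripts before committing would flag a trailing clean-up command that is silently ignored.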

> Newly added test testCliDriver_metadata_export_drop is consistently failing 
> on trunk
> 
>
> Key: HIVE-3427
> URL: https://issues.apache.org/jira/browse/HIVE-3427
> Project: Hive
>  Issue Type: Test
>Affects Versions: 0.10.0
>Reporter: Ashutosh Chauhan
>Assignee: Navis
>
> I think its a new test which was added via HIVE-3068


[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias


[ 
https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449275#comment-13449275
 ] 

Navis commented on HIVE-3171:
-

I've addressed the comments and just finished a full test run. Will update the patch shortly.

> Bucketed sort merge join doesn't work when multiple files exist for small 
> alias
> ---
>
> Key: HIVE-3171
> URL: https://issues.apache.org/jira/browse/HIVE-3171
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Navis
>  Labels: bucketing, joins, partitioning
> Attachments: HIVE-3171.1.patch.txt
>
>
> Executing a query with the MAPJOIN hint and the bucketed sort merge join 
> optimizations enabled:
> {noformat}
> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> {noformat}
> works fine with partitioned tables if there is only one partition in the 
> table. However, if you add a second partition, Hive attempts to do a regular 
> map-side join which can fail because the tables are too large. Hive ought to 
> be able to still do the bucketed sort merge join with partitions.


[jira] [Updated] (HIVE-3436) Difference in exception string from native method causes script_pipe.q to fail on windows

2012-09-05 Thread Thejas M Nair (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-3436:


Attachment: HIVE-3436.1.patch

HIVE-3436.1.patch - If the OS is Windows, the patch checks for the Windows error 
message to determine whether the exception was a broken pipe.
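The kind of OS-dependent check described here can be sketched as follows in Python. The exact strings the patch matches are not shown in this thread, so the Windows messages below (the standard system texts for a closed or ended pipe) are an assumption, and `is_broken_pipe` is a hypothetical name.

```python
# Assumed Windows system messages for pipe errors; the patch may match others.
WINDOWS_PIPE_MESSAGES = ("the pipe has been ended", "the pipe is being closed")


def is_broken_pipe(message, os_name):
    """Return True if an exception message indicates a broken pipe on the given OS."""
    msg = message.lower()
    if "broken pipe" in msg:
        return True
    if os_name.lower().startswith("windows"):
        return any(text in msg for text in WINDOWS_PIPE_MESSAGES)
    return False
```

Keeping the Windows strings behind an OS check avoids misclassifying unrelated errors on other platforms.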

>  Difference in exception string from native method causes script_pipe.q to 
> fail on windows
> --
>
> Key: HIVE-3436
> URL: https://issues.apache.org/jira/browse/HIVE-3436
> Project: Hive
>  Issue Type: Bug
>Reporter: Thejas M Nair
> Attachments: HIVE-3436.1.patch
>
>



[jira] [Commented] (HIVE-2999) Offline build is not working


[ 
https://issues.apache.org/jira/browse/HIVE-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449181#comment-13449181
 ] 

Ashutosh Chauhan commented on HIVE-2999:


Navis,
You are not alone. I am one too :) I tested your patch, but I still can't 
build with -Doffline=true. Attaching the log. There have been a few build-related 
changes recently that might have affected this. If you update your patch, I 
will try to get this in.
{code}
init:
 [echo] Project: serde

ivy-init-settings:
 [echo] Project: serde

ivy-resolve:

ivy-retrieve:
 [echo] Project: serde
[ivy:retrieve] :: loading settings :: file = 
/Users/ashutosh/workspace/hive-trunk/ivy/ivysettings.xml
[ivy:retrieve] 
[ivy:retrieve] :: problems summary ::
[ivy:retrieve]  WARNINGS
[ivy:retrieve]  module not found: org.apache.hadoop#hadoop-common;0.20.2
[ivy:retrieve]   local: tried
[ivy:retrieve]
/Users/ashutosh/.ivy2/local/org.apache.hadoop/hadoop-common/0.20.2/ivys/ivy.xml
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar(tests):
[ivy:retrieve]
/Users/ashutosh/.ivy2/local/org.apache.hadoop/hadoop-common/0.20.2/testss/hadoop-common.jar
[ivy:retrieve]   apache-snapshot: tried
[ivy:retrieve]
https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.pom
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar(tests):
[ivy:retrieve]
https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2-tests.jar
[ivy:retrieve]   maven2: tried
[ivy:retrieve]
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.pom
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar(tests):
[ivy:retrieve]
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2-tests.jar
[ivy:retrieve]   datanucleus-repo: tried
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar(tests):
[ivy:retrieve]
http://www.datanucleus.org/downloads/maven2/org/apache/hadoop/hadoop-common/0.20.2/hadoop-common-0.20.2.jar
[ivy:retrieve]   sourceforge: tried
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-common;0.20.2!hadoop-common.jar(tests):
[ivy:retrieve]
http://www.sourceforge.net/projects/hadoop-common/files/hadoop-common//hadoop-common-0.20.2.jar
[ivy:retrieve]  module not found: org.apache.hadoop#hadoop-hdfs;0.20.2
[ivy:retrieve]   local: tried
[ivy:retrieve]
/Users/ashutosh/.ivy2/local/org.apache.hadoop/hadoop-hdfs/0.20.2/ivys/ivy.xml
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-hdfs;0.20.2!hadoop-hdfs.jar(tests):
[ivy:retrieve]
/Users/ashutosh/.ivy2/local/org.apache.hadoop/hadoop-hdfs/0.20.2/testss/hadoop-hdfs.jar
[ivy:retrieve]   apache-snapshot: tried
[ivy:retrieve]
https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-hdfs/0.20.2/hadoop-hdfs-0.20.2.pom
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-hdfs;0.20.2!hadoop-hdfs.jar(tests):
[ivy:retrieve]
https://repository.apache.org/content/repositories/snapshots/org/apache/hadoop/hadoop-hdfs/0.20.2/hadoop-hdfs-0.20.2-tests.jar
[ivy:retrieve]   maven2: tried
[ivy:retrieve]
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-hdfs/0.20.2/hadoop-hdfs-0.20.2.pom
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-hdfs;0.20.2!hadoop-hdfs.jar(tests):
[ivy:retrieve]
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-hdfs/0.20.2/hadoop-hdfs-0.20.2-tests.jar
[ivy:retrieve]   datanucleus-repo: tried
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-hdfs;0.20.2!hadoop-hdfs.jar(tests):
[ivy:retrieve]
http://www.datanucleus.org/downloads/maven2/org/apache/hadoop/hadoop-hdfs/0.20.2/hadoop-hdfs-0.20.2.jar
[ivy:retrieve]   sourceforge: tried
[ivy:retrieve]-- artifact 
org.apache.hadoop#hadoop-hdfs;0.20.2!hadoop-hdfs.jar(tests):
[ivy:retrieve]
http://www.sourceforge.net/projects/hadoop-hdfs/files/hadoop-hdfs//hadoop-hdfs-0.20.2.jar
[ivy:retrieve]  ::
[ivy:retrieve]  ::  UNRESOLVED DEPENDENCIES ::
[ivy:retrieve]  ::
[ivy:retrieve]  :: org.apache.hadoop#hadoop-common;0.20.2: not found
[ivy:retrieve]  :: org.apache.hadoop#hadoop-hdfs;0.20.2: not found
[ivy:retrieve]  ::
[ivy:retrieve] 
[ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

BUILD FAILED

{code}

> Offline build is not working
> 
>
> Key: HIVE-2999
> URL: https://issues.apache.org/jira/browse/HIVE-2999
>

[jira] [Created] (HIVE-3436) Difference in exception string from native method causes script_pipe.q to fail on windows

2012-09-05 Thread Thejas M Nair (JIRA)

Thejas M Nair created HIVE-3436:
---

 Summary:  Difference in exception string from native method causes 
script_pipe.q to fail on windows
 Key: HIVE-3436
 URL: https://issues.apache.org/jira/browse/HIVE-3436
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair





[jira] [Updated] (HIVE-3435) Get pdk pluginTest passed when triggered from both builtin tests and pdk tests on hadoop23


 [ 
https://issues.apache.org/jira/browse/HIVE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3435:
---

Attachment: HIVE-3435.1.patch.txt

> Get pdk pluginTest passed when triggered from both builtin tests and pdk 
> tests on hadoop23 
> ---
>
> Key: HIVE-3435
> URL: https://issues.apache.org/jira/browse/HIVE-3435
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
> Attachments: HIVE-3435.1.patch.txt
>
>
> Hive pdk pluginTest is running twice in unit testing, one is triggered from 
> running builtin tests, another is triggered from running pdk tests.
> HIVE-3413 fixed pdk pluginTest on hadoop23 when triggered from running 
> builtin tests. While, when triggered from running pdk tests directly on 
> hadoop23, it is failing:
> Testcase: SELECT tp_rot13('Mixed Up!') FROM onerow; took 6.426 sec
> FAILED
> expected:<[]Zvkrq Hc!> but was:<[2012-09-04 18:13:01,668 WARN [main] 
> conf.HiveConf (HiveConf.java:(73)) - hive-site.xml not found on 
> CLASSPATH
> ]Zvkrq Hc!>
> junit.framework.ComparisonFailure: expected:<[]Zvkrq Hc!> but 
> was:<[2012-09-04 18:13:01,668 WARN [main] conf.HiveConf 
> (HiveConf.java:(73)) - hive-site.xml not found on CLASSPATH
> ]Zvkrq Hc!>


[jira] [Commented] (HIVE-3435) Get pdk pluginTest passed when triggered from both builtin tests and pdk tests on hadoop23


[ 
https://issues.apache.org/jira/browse/HIVE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449179#comment-13449179
 ] 

Zhenxiao Luo commented on HIVE-3435:


Review Request submitted at:
https://reviews.facebook.net/D5187

> Get pdk pluginTest passed when triggered from both builtin tests and pdk 
> tests on hadoop23 
> ---
>
> Key: HIVE-3435
> URL: https://issues.apache.org/jira/browse/HIVE-3435
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
> Attachments: HIVE-3435.1.patch.txt
>
>
> Hive pdk pluginTest is running twice in unit testing, one is triggered from 
> running builtin tests, another is triggered from running pdk tests.
> HIVE-3413 fixed pdk pluginTest on hadoop23 when triggered from running 
> builtin tests. While, when triggered from running pdk tests directly on 
> hadoop23, it is failing:
> Testcase: SELECT tp_rot13('Mixed Up!') FROM onerow; took 6.426 sec
> FAILED
> expected:<[]Zvkrq Hc!> but was:<[2012-09-04 18:13:01,668 WARN [main] 
> conf.HiveConf (HiveConf.java:(73)) - hive-site.xml not found on 
> CLASSPATH
> ]Zvkrq Hc!>
> junit.framework.ComparisonFailure: expected:<[]Zvkrq Hc!> but 
> was:<[2012-09-04 18:13:01,668 WARN [main] conf.HiveConf 
> (HiveConf.java:(73)) - hive-site.xml not found on CLASSPATH
> ]Zvkrq Hc!>


[jira] [Updated] (HIVE-3435) Get pdk pluginTest passed when triggered from both builtin tests and pdk tests on hadoop23


 [ 
https://issues.apache.org/jira/browse/HIVE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-3435:
---

Status: Patch Available  (was: Open)

> Get pdk pluginTest passed when triggered from both builtin tests and pdk 
> tests on hadoop23 
> ---
>
> Key: HIVE-3435
> URL: https://issues.apache.org/jira/browse/HIVE-3435
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
> Attachments: HIVE-3435.1.patch.txt
>
>
> Hive pdk pluginTest is running twice in unit testing, one is triggered from 
> running builtin tests, another is triggered from running pdk tests.
> HIVE-3413 fixed pdk pluginTest on hadoop23 when triggered from running 
> builtin tests. While, when triggered from running pdk tests directly on 
> hadoop23, it is failing:
> Testcase: SELECT tp_rot13('Mixed Up!') FROM onerow; took 6.426 sec
> FAILED
> expected:<[]Zvkrq Hc!> but was:<[2012-09-04 18:13:01,668 WARN [main] 
> conf.HiveConf (HiveConf.java:(73)) - hive-site.xml not found on 
> CLASSPATH
> ]Zvkrq Hc!>
> junit.framework.ComparisonFailure: expected:<[]Zvkrq Hc!> but 
> was:<[2012-09-04 18:13:01,668 WARN [main] conf.HiveConf 
> (HiveConf.java:(73)) - hive-site.xml not found on CLASSPATH
> ]Zvkrq Hc!>


[jira] [Commented] (HIVE-3435) Get pdk pluginTest passed when triggered from both builtin tests and pdk tests on hadoop23


[ 
https://issues.apache.org/jira/browse/HIVE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449176#comment-13449176
 ] 

Zhenxiao Luo commented on HIVE-3435:


The commands I am using to build and test Hive on hadoop23 are:

$ ant very-clean package -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23

$ ant test -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23 -Dtest.continue.on.failure=false

The second command runs the tests on hadoop23. We need to check that both the 
builtin tests and the pdk tests pass.

> Get pdk pluginTest passed when triggered from both builtin tests and pdk 
> tests on hadoop23 
> ---
>
> Key: HIVE-3435
> URL: https://issues.apache.org/jira/browse/HIVE-3435
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
>
> Hive pdk pluginTest is running twice in unit testing, one is triggered from 
> running builtin tests, another is triggered from running pdk tests.
> HIVE-3413 fixed pdk pluginTest on hadoop23 when triggered from running 
> builtin tests. While, when triggered from running pdk tests directly on 
> hadoop23, it is failing:
> Testcase: SELECT tp_rot13('Mixed Up!') FROM onerow; took 6.426 sec
> FAILED
> expected:<[]Zvkrq Hc!> but was:<[2012-09-04 18:13:01,668 WARN [main] 
> conf.HiveConf (HiveConf.java:(73)) - hive-site.xml not found on 
> CLASSPATH
> ]Zvkrq Hc!>
> junit.framework.ComparisonFailure: expected:<[]Zvkrq Hc!> but 
> was:<[2012-09-04 18:13:01,668 WARN [main] conf.HiveConf 
> (HiveConf.java:(73)) - hive-site.xml not found on CLASSPATH
> ]Zvkrq Hc!>


[jira] [Commented] (HIVE-3435) Get pdk pluginTest passed when triggered from both builtin tests and pdk tests on hadoop23


[ 
https://issues.apache.org/jira/browse/HIVE-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449174#comment-13449174
 ] 

Zhenxiao Luo commented on HIVE-3435:


The problems are similar to HIVE-3413:

1. missing dependency
   hadoop-minicluster needs to be added to pdk/ivy.xml as well.
   In pdk/build.xml, the test target needs to depend on "init, setup, 
ivy-retrieve, ivy-retrieve-test", so that when running pdk tests directly, the 
pdk/ivy.xml dependencies are resolved first.

2. configuration files need to be added
   log4j.properties and hive-site.xml need to be added to get rid of the 
following warning messages:

[junit] 2012-08-28 18:13:01,668 WARN [main] conf.HiveConf 
(HiveConf.java:(73)) - hive-site.xml not found on CLASSPATH
[junit] 2012-08-28 19:05:20,679 WARN [main] conf.Configuration 
(Configuration.java:loadProperty(1621)) - 
file:/tmp/cloudera/hive_2012-08-28_19-05-17_531_4347419252405007581/-local-10002/jobconf.xml:an
 attempt to override final parameter: 
mapreduce.job.end-notification.max.retry.interval; Ignoring.
[junit] 2012-08-28 19:05:20,680 WARN [main] conf.Configuration 
(Configuration.java:loadProperty(1621)) - 
file:/tmp/cloudera/hive_2012-08-28_19-05-17_531_4347419252405007581/-local-10002/jobconf.xml:an
 attempt to override final parameter: 
mapreduce.job.end-notification.max.attempts; Ignoring.
   
These configuration files also need to be put in a place from which both 
builtin tests and pdk tests can load them (due to property overload, 
pdk/test-plugin/test/conf cannot be loaded by pdk; my plan is to move them 
to pdk/scripts/conf).

3. HADOOP_ROOT_LOGGER needs to be set in pdk/scripts/build-plugin.xml
   If not set, testutils/hadoop initializes it to "INFO,console", which 
prints log4j warning messages on the console; this is not expected.
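
To illustrate why a console log line breaks the test, here is a minimal Java sketch. It is not taken from the PDK sources: the class name, the rot13 helper, and the placeholder log line are all assumptions made for illustration. It shows that rot13 of 'Mixed Up!' is 'Zvkrq Hc!', and that an exact comparison fails as soon as any extra line is printed before the query result.

```java
// Hypothetical sketch; names are illustrative and not part of the Hive PDK.
class Rot13OutputSketch {
    // Rotate ASCII letters by 13 places; leave every other character unchanged.
    static String rot13(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                sb.append((char) ('a' + (c - 'a' + 13) % 26));
            } else if (c >= 'A' && c <= 'Z') {
                sb.append((char) ('A' + (c - 'A' + 13) % 26));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String expected = rot13("Mixed Up!");
        // Placeholder for any warning that a console appender writes
        // to the same stream as the query result (assumed, not a real log line).
        String polluted = "<console log line>\n" + expected;
        System.out.println(expected);                    // Zvkrq Hc!
        System.out.println(expected.equals(polluted));   // false: exact match fails
        System.out.println(polluted.endsWith(expected)); // true: the result itself is correct
    }
}
```

This is why the fix targets the logging configuration rather than the UDF: the query result is right, but the captured output contains more than the result.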

> Get pdk pluginTest passed when triggered from both builtin tests and pdk 
> tests on hadoop23 
> ---
>
> Key: HIVE-3435
> URL: https://issues.apache.org/jira/browse/HIVE-3435
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.10.0
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Fix For: 0.10.0
>
>
> Hive pdk pluginTest is running twice in unit testing, one is triggered from 
> running builtin tests, another is triggered from running pdk tests.
> HIVE-3413 fixed pdk pluginTest on hadoop23 when triggered from running 
> builtin tests. While, when triggered from running pdk tests directly on 
> hadoop23, it is failing:
> Testcase: SELECT tp_rot13('Mixed Up!') FROM onerow; took 6.426 sec
> FAILED
> expected:<[]Zvkrq Hc!> but was:<[2012-09-04 18:13:01,668 WARN [main] 
> conf.HiveConf (HiveConf.java:(73)) - hive-site.xml not found on 
> CLASSPATH
> ]Zvkrq Hc!>
> junit.framework.ComparisonFailure: expected:<[]Zvkrq Hc!> but 
> was:<[2012-09-04 18:13:01,668 WARN [main] conf.HiveConf 
> (HiveConf.java:(73)) - hive-site.xml not found on CLASSPATH
> ]Zvkrq Hc!>


[jira] [Commented] (HIVE-3141) Bug in SELECT query

2012-09-05 Thread Carl Steinbach (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449167#comment-13449167
 ] 

Carl Steinbach commented on HIVE-3141:
--

Syntax checks like this don't belong in CliDriver because other clients (e.g. 
Thrift) bypass this logic. We should either modify the grammar to find problems 
like this, or else do the same check in the SemanticAnalyzer.

> Bug in SELECT query
> ---
>
> Key: HIVE-3141
> URL: https://issues.apache.org/jira/browse/HIVE-3141
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0
> Environment: OS: Ubuntu 
> Hive version: hive-0.7.1-cdh3u2
> Hadoop : hadoop-0.20.2
>Reporter: ASK
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-3141.2.patch.txt, Hive_bug_3141_resolution.pdf, 
> select_syntax.q, select_syntax.q.out
>
>
> When I try to execute select * (followed by any alphanumeric character) from 
> table, the query throws some issues. It displays the result for select *.
> It does not happen when only numbers follow the *


[jira] [Updated] (HIVE-3141) Bug in SELECT query

2012-09-05 Thread Carl Steinbach (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3141:
-

Status: Open  (was: Patch Available)

> Bug in SELECT query
> ---
>
> Key: HIVE-3141
> URL: https://issues.apache.org/jira/browse/HIVE-3141
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0
> Environment: OS: Ubuntu 
> Hive version: hive-0.7.1-cdh3u2
> Hadoop : hadoop-0.20.2
>Reporter: ASK
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-3141.2.patch.txt, Hive_bug_3141_resolution.pdf, 
> select_syntax.q, select_syntax.q.out
>
>
> When I try to execute select * (followed by any alphanumeric character) from 
> table, the query throws some issues. It displays the result for select *.
> It does not happen when only numbers follow the *


[jira] [Created] (HIVE-3435) Get pdk pluginTest passed when triggered from both builtin tests and pdk tests on hadoop23

Zhenxiao Luo created HIVE-3435:
--

 Summary: Get pdk pluginTest passed when triggered from both 
builtin tests and pdk tests on hadoop23 
 Key: HIVE-3435
 URL: https://issues.apache.org/jira/browse/HIVE-3435
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.10.0
Reporter: Zhenxiao Luo
Assignee: Zhenxiao Luo
 Fix For: 0.10.0


Hive pdk pluginTest runs twice in unit testing: once triggered from running 
the builtin tests, and once triggered from running the pdk tests.

HIVE-3413 fixed pdk pluginTest on hadoop23 when triggered from running the 
builtin tests. However, when triggered from running the pdk tests directly on 
hadoop23, it is failing:

Testcase: SELECT tp_rot13('Mixed Up!') FROM onerow; took 6.426 sec
FAILED
expected:<[]Zvkrq Hc!> but was:<[2012-09-04 18:13:01,668 WARN [main] 
conf.HiveConf (HiveConf.java:(73)) - hive-site.xml not found on 
CLASSPATH
]Zvkrq Hc!>
junit.framework.ComparisonFailure: expected:<[]Zvkrq Hc!> but was:<[2012-09-04 
18:13:01,668 WARN [main] conf.HiveConf (HiveConf.java:(73)) - 
hive-site.xml not found on CLASSPATH
]Zvkrq Hc!>


[jira] [Updated] (HIVE-3434) Batch creation of indexes for tables with a very large number of partitions

2012-09-05 Thread Esteban Gutierrez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez updated HIVE-3434:


Summary: Batch creation of indexes for tables with a very large number of 
partitions  (was: Batch index creation for tables with a very large number of 
partitions)

> Batch creation of indexes for tables with a very large number of partitions
> ---
>
> Key: HIVE-3434
> URL: https://issues.apache.org/jira/browse/HIVE-3434
> Project: Hive
>  Issue Type: Bug
>  Components: Indexing, Metastore
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
>
> When creating indexes for a table with a very large number of partitions, 
> Hive pulls all the metadata definitions from that table before starting 
> the index generation. Even if some of the side effects of this problem 
> can be addressed (OOMEs, timeouts), it would be really helpful to create the 
> indexes in batches (similar to HIVE-2050).


[jira] [Created] (HIVE-3434) Batch index creation for tables with a very large number of partitions

2012-09-05 Thread Esteban Gutierrez (JIRA)

Esteban Gutierrez created HIVE-3434:
---

 Summary: Batch index creation for tables with a very large number 
of partitions
 Key: HIVE-3434
 URL: https://issues.apache.org/jira/browse/HIVE-3434
 Project: Hive
  Issue Type: Bug
  Components: Indexing, Metastore
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


When creating indexes for a table with a very large number of partitions, Hive 
pulls all the metadata definitions from that table before starting the index 
generation. Even if some of the side effects of this problem can be addressed 
(OOMEs, timeouts), it would be really helpful to create the indexes in 
batches (similar to HIVE-2050).
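
As a rough illustration of the batching idea, the following Java sketch splits a list of partition names into fixed-size batches so that only one batch needs to be held and processed at a time. The class and method names are hypothetical and are not Hive metastore APIs; how a real implementation would fetch partition metadata per batch is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of the Hive code base.
class BatchSketch {
    // Split items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            // Copy the sublist so each batch is independent of the source list.
            out.add(new ArrayList<>(items.subList(i, end)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative partition names only.
        List<String> partitions = Arrays.asList("p1", "p2", "p3", "p4", "p5");
        for (List<String> batch : batches(partitions, 2)) {
            // A real implementation would build the index for this batch here.
            System.out.println("processing batch: " + batch);
        }
    }
}
```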


Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #127

See 

--
[...truncated 36579 lines...]
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2012-09-05_14-05-55_214_2185328355983877515/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Copying file: 

[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 

[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/jenkins/hive_2012-09-05_14-05-59_601_1626747595976322056/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2012-09-05_14-05-59_601_1626747595976322056/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] Copying file:

[jira] [Updated] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-09-05 Thread Travis Crawford (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Travis Crawford updated HIVE-3323:
--

Status: Open  (was: Patch Available)

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch, 
> HIVE-3323_enum_to_string.6.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.


[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions

2012-09-05 Thread Travis Crawford (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449063#comment-13449063
 ] 

Travis Crawford commented on HIVE-3323:
---

Thanks for the feedback Ashutosh – I agree making this conversion optional 
introduces complexity, and will remove the option + update the patch.

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch, 
> HIVE-3323_enum_to_string.6.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.


[jira] [Commented] (HIVE-3323) ThriftSerde: Enable enum to string conversions


[ 
https://issues.apache.org/jira/browse/HIVE-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449024#comment-13449024
 ] 

Ashutosh Chauhan commented on HIVE-3323:


The MultiKey stuff you added in OIcache doesn't look particularly clean. I think 
the motivation for that change was that if there are multiple 
ThriftDeserializers in a job and for some reason they have different values for 
this new config key, you wanted to return different OIs with the correct config. 
The intention is good, but I am not sure this will really work. First of all, 
it is a rare condition to have more than one serde in a job; the only case I 
can think of is reading two different tables in the same job (probably for a 
join). Even there, since the configuration object is the same for the whole 
job, you are going to get only one value of your config key for the whole job, 
not one per serde. So even if someone comes up with this rare use case of two 
different ThriftSerDes that differ on this property, I doubt this will work. 
Secondly, this introduces a dependency on commons-collections, which implies we 
need to package and ship this jar to the backend. Thirdly, it introduces code 
complexity. So I think we should drop this MultiKey stuff from OIcache and 
keep it in its original form. You might be able to make this work if you keep 
this config key in Table properties instead of Configuration, but I think we 
are introducing too much complexity for a rare case. At this point I retract 
my suggestion of opening up this option of having struct. Let's always convert 
to string going forward, keeping the code changes minimal and avoiding 
complexity.

Another piece of feedback concerns JavaStringObjectInspector. I think instead 
of testing for the type of the object, you can always return o.toString() there: 
if it is an Enum, that's what you are already doing, and if it is a String, 
toString() on a String returns {{this}}, so we get the desired result in both 
cases and avoid the type-check test. 
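
A small Java sketch of the toString() point, under stated assumptions: the enum below is a hypothetical stand-in for a Thrift-generated enum, and asJavaString is an illustrative helper, not the actual JavaStringObjectInspector method. It relies on two documented JDK behaviours: String.toString() returns the string itself, and Enum.toString() returns the constant's name unless overridden.

```java
// Hypothetical sketch; not the actual Hive object inspector code.
class ToStringSketch {
    // Stand-in for a Thrift-generated enum (assumed not to override toString()).
    enum Color { RED, GREEN }

    // Single code path for both String and Enum inputs, with no instanceof check.
    static String asJavaString(Object o) {
        return o == null ? null : o.toString();
    }

    public static void main(String[] args) {
        String s = "hello";
        // String.toString() returns the same instance, so no copy is made.
        System.out.println(asJavaString(s) == s);
        // The default Enum.toString() returns the constant name.
        System.out.println(asJavaString(Color.RED));
    }
}
```

The null check is an addition for the sketch; whether the real inspector needs it depends on how it is called.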

> ThriftSerde: Enable enum to string conversions
> --
>
> Key: HIVE-3323
> URL: https://issues.apache.org/jira/browse/HIVE-3323
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 0.10.0
>Reporter: Travis Crawford
>Assignee: Travis Crawford
> Attachments: HIVE-3323_enum_to_string.1.patch, 
> HIVE-3323_enum_to_string.2.patch, HIVE-3323_enum_to_string.3.patch, 
> HIVE-3323_enum_to_string.4.patch, HIVE-3323_enum_to_string.5.patch, 
> HIVE-3323_enum_to_string.6.patch
>
>
> When using serde-reported schemas with the ThriftDeserializer, Enum fields 
> are presented as {{struct}}
> Many users expect to work with the string values, which is both easier and 
> more meaningful as the string value communicates what is represented.
> Hive should provide a mechanism to optionally convert enum values to strings.


[jira] [Commented] (HIVE-2517) Support group by on union and struct type


[ 
https://issues.apache.org/jira/browse/HIVE-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448899#comment-13448899
 ] 

Ashutosh Chauhan commented on HIVE-2517:


@Philip : Certainly. Since I wrote the patch, I can't check this in. If some 
other committer volunteers to help, then we can get this in.

> Support group by on union and struct type
> -
>
> Key: HIVE-2517
> URL: https://issues.apache.org/jira/browse/HIVE-2517
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
>  Labels: structtype, uniontype
> Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2517.D2151.1.patch, 
> hive-2517_1.patch, hive-2517_2.patch, hive-2517.patch
>
>
> Currently group by on struct and union types are not supported. This issue 
> will enable support for those.


[jira] [Updated] (HIVE-3340) shims unit test failures fails further test progress


 [ 
https://issues.apache.org/jira/browse/HIVE-3340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3340:
---

   Resolution: Fixed
Fix Version/s: 0.10.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Giri!

> shims unit test failures fails further test progress
> 
>
> Key: HIVE-3340
> URL: https://issues.apache.org/jira/browse/HIVE-3340
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Giridharan Kesavan
>Assignee: Giridharan Kesavan
> Fix For: 0.10.0
>
> Attachments: HIVE-3340.patch
>
>
> Enable the failonerror flag so that unit tests can continue even when the 
> shims unit tests fail.


Hive-trunk-h0.21 - Build # 1649 - Still Failing

Changes for Build #1638
[namit] HIVE-3393 get_json_object and json_tuple should use Jackson library
(Kevin Wilfong via namit)


Changes for Build #1639

Changes for Build #1640
[ecapriolo] HIVE-3068 Export table metadata as JSON on table drop (Andrew 
Chalfant via egc)


Changes for Build #1641

Changes for Build #1642
[hashutosh] HIVE-3338 : Archives broken for hadoop 1.0 (Vikram Dixit via 
Ashutosh Chauhan)


Changes for Build #1643

Changes for Build #1644

Changes for Build #1645
[cws] HIVE-3413. Fix pdk.PluginTest on hadoop23 (Zhenxiao Luo via cws)


Changes for Build #1646
[cws] HIVE-3056. Ability to bulk update location field in Db/Table/Partition 
records (Shreepadma Venugopalan via cws)

[cws] HIVE-3416 [jira] Fix 
TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS when running Hive on 
hadoop23
(Zhenxiao Luo via Carl Steinbach)

Summary:
HIVE-3416: Fix TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS when 
running Hive on hadoop23

TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS is failing when running 
Hive on hadoop23:

$ ant very-clean package -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23

$ ant test -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23 -Dtestcase=TestAvroSerdeUtils

 
java.lang.NoClassDefFoundError: 
org/apache/hadoop/net/StaticMapping
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:534)
at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:489)
at org.apache.hadoop.hdfs.MiniDFSCluster.(MiniDFSCluster.java:360)
at 
org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS(TestAvroSerdeUtils.java:187)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.net.StaticMapping
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
... 25 more

  

Test Plan: EMPTY

Reviewers: JIRA

Differential Revision: https://reviews.facebook.net/D5025

[cws] HIVE-3424. Error by upgrading a Hive 0.7.0 database to 0.8.0 
(008-HIVE-2246.mysql.sql) (Alexander Alten-Lorenz via cws)

[cws] HIVE-3412. Fix TestCliDriver.repair on Hadoop 0.23.3, 3.0.0, and 
2.2.0-alpha (Zhenxiao Luo via cws)


Changes for Build #1647

Changes for Build #1648
[namit] HIVE-3429 Bucket map join involving table with more than 1 partition 
column causes 
FileNotFoundException (Kevin Wilfong via namit)


Changes for Build #1649
[hashutosh] HIVE-3075 : Improve HiveMetaStore logging (Travis Crawford via 
Ashutosh Chauhan)




No tests ran.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1649)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1649/ to 
view the results.

[jira] [Commented] (HIVE-3075) Improve HiveMetaStore logging

2012-09-05 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448889#comment-13448889
 ] 

Hudson commented on HIVE-3075:
--

Integrated in Hive-trunk-h0.21 #1649 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1649/])
HIVE-3075 : Improve HiveMetaStore logging (Travis Crawford via Ashutosh 
Chauhan) (Revision 1381244)

 Result = FAILURE
hashutosh : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381244
Files : 
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java


> Improve HiveMetaStore logging
> -
>
> Key: HIVE-3075
> URL: https://issues.apache.org/jira/browse/HIVE-3075
> Project: Hive
>  Issue Type: Improvement
>Reporter: Travis Crawford
>Assignee: Travis Crawford
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3075_hivemetastore_logging.1.patch
>
>
> {{HiveMetaStore}} logs when actions are taken, which is useful for debugging. 
> Many parameters are thrift objects which have useful toString methods, 
> however, log messages only print certain fields. It would be useful if log 
> messages contained all fields for thrift objects.
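
The difference the HIVE-3075 description points at can be illustrated with a small sketch. TableStub is a hypothetical stand-in for a Thrift-generated struct, and both message formats are invented for illustration; this is not the code in HiveMetaStore.java or in the attached patch.

```java
final class LogAllFields {
    /** Hypothetical stand-in for a Thrift struct; Thrift-generated Java classes provide toString(). */
    static final class TableStub {
        final String dbName;
        final String tableName;
        final String owner;

        TableStub(String dbName, String tableName, String owner) {
            this.dbName = dbName;
            this.tableName = tableName;
            this.owner = owner;
        }

        @Override
        public String toString() {
            return "TableStub(dbName:" + dbName + ", tableName:" + tableName
                    + ", owner:" + owner + ")";
        }
    }

    /** Message that prints only hand-picked fields; anything else is lost from the log. */
    static String selectedFields(TableStub t) {
        return "create_table: db=" + t.dbName + " tbl=" + t.tableName;
    }

    /** Message that delegates to toString(), so every field of the object is logged. */
    static String allFields(TableStub t) {
        return "create_table: " + t;
    }

    public static void main(String[] args) {
        TableStub t = new TableStub("default", "src", "alice");
        System.out.println(selectedFields(t));
        System.out.println(allFields(t));
    }
}
```

The second form also picks up fields added to the struct later without any change to the log statement.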


[jira] [Updated] (HIVE-3075) Improve HiveMetaStore logging


 [ 
https://issues.apache.org/jira/browse/HIVE-3075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3075:
---

   Resolution: Fixed
Fix Version/s: 0.10.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Travis!

> Improve HiveMetaStore logging
> -
>
> Key: HIVE-3075
> URL: https://issues.apache.org/jira/browse/HIVE-3075
> Project: Hive
>  Issue Type: Improvement
>Reporter: Travis Crawford
>Assignee: Travis Crawford
>Priority: Minor
> Fix For: 0.10.0
>
> Attachments: HIVE-3075_hivemetastore_logging.1.patch
>
>
> {{HiveMetaStore}} logs when actions are taken, which is useful for debugging. 
> Many parameters are thrift objects which have useful toString methods, 
> however, log messages only print certain fields. It would be useful if log 
> messages contained all fields for thrift objects.


[jira] [Commented] (HIVE-3427) Newly added test testCliDriver_metadata_export_drop is consistently failing on trunk


[ 
https://issues.apache.org/jira/browse/HIVE-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448881#comment-13448881
 ] 

Ashutosh Chauhan commented on HIVE-3427:


This is keeping the Apache builds from going green: 
https://builds.apache.org/job/Hive-trunk-h0.21/1647/testReport/

> Newly added test testCliDriver_metadata_export_drop is consistently failing 
> on trunk
> 
>
> Key: HIVE-3427
> URL: https://issues.apache.org/jira/browse/HIVE-3427
> Project: Hive
>  Issue Type: Test
>Affects Versions: 0.10.0
>Reporter: Ashutosh Chauhan
>
> I think it's a new test which was added via HIVE-3068


[jira] [Commented] (HIVE-3171) Bucketed sort merge join doesn't work when multiple files exist for small alias


[ 
https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448872#comment-13448872
 ] 

Namit Jain commented on HIVE-3171:
--

@Navis, are you planning to work on this?
Most of the comments are minor, so it would be useful to get it in.

> Bucketed sort merge join doesn't work when multiple files exist for small 
> alias
> ---
>
> Key: HIVE-3171
> URL: https://issues.apache.org/jira/browse/HIVE-3171
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Joey Echeverria
>Assignee: Navis
>  Labels: bucketing, joins, partitioning
> Attachments: HIVE-3171.1.patch.txt
>
>
> Executing a query with the MAPJOIN hint and the bucketed sort merge join 
> optimizations enabled:
> {noformat}
> set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
> set hive.optimize.bucketmapjoin = true;
> set hive.optimize.bucketmapjoin.sortedmerge = true;
> {noformat}
> works fine with partitioned tables if there is only one partition in the 
> table. However, if you add a second partition, Hive attempts to do a regular 
> map-side join which can fail because the tables are too large. Hive ought to 
> be able to still do the bucketed sort merge join with partitions.


Hive-trunk-h0.21 - Build # 1648 - Still Failing

Changes for Build #1638
[namit] HIVE-3393 get_json_object and json_tuple should use Jackson library
(Kevin Wilfong via namit)


Changes for Build #1639

Changes for Build #1640
[ecapriolo] HIVE-3068 Export table metadata as JSON on table drop (Andrew 
Chalfant via egc)


Changes for Build #1641

Changes for Build #1642
[hashutosh] HIVE-3338 : Archives broken for hadoop 1.0 (Vikram Dixit via 
Ashutosh Chauhan)


Changes for Build #1643

Changes for Build #1644

Changes for Build #1645
[cws] HIVE-3413. Fix pdk.PluginTest on hadoop23 (Zhenxiao Luo via cws)


Changes for Build #1646
[cws] HIVE-3056. Ability to bulk update location field in Db/Table/Partition 
records (Shreepadma Venugopalan via cws)

[cws] HIVE-3416 [jira] Fix 
TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS when running Hive on 
hadoop23
(Zhenxiao Luo via Carl Steinbach)

Summary:
HIVE-3416: Fix TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS when 
running Hive on hadoop23

TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS is failing when running 
Hive on hadoop23:

$ ant very-clean package -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23

$ ant test -Dhadoop.version=0.23.1 -Dhadoop-0.23.version=0.23.1 -Dhadoop.mr.rev=23 -Dtestcase=TestAvroSerdeUtils

 
java.lang.NoClassDefFoundError: org/apache/hadoop/net/StaticMapping
at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:534)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:489)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:360)
at 
org.apache.hadoop.hive.serde2.avro.TestAvroSerdeUtils.determineSchemaCanReadSchemaFromHDFS(TestAvroSerdeUtils.java:187)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at 
org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.net.StaticMapping
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
... 25 more
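
The trace above bottoms out in a ClassNotFoundException, which suggests that org.apache.hadoop.net.StaticMapping is not on the test classpath for this Hadoop version. A minimal probe like the following can confirm that before digging into the build files. The class name ClasspathProbe is hypothetical and not part of Hive; it only uses the standard Class.forName call.

```java
final class ClasspathProbe {
    /** Returns true if the named class can be located by this class's class loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; pass another name to probe it instead.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.net.StaticMapping";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

Running it with the same classpath as the failing test shows whether the jar that provides the class is missing from that configuration.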

  

Test Plan: EMPTY

Reviewers: JIRA

Differential Revision: https://reviews.facebook.net/D5025

[cws] HIVE-3424. Error by upgrading a Hive 0.7.0 database to 0.8.0 
(008-HIVE-2246.mysql.sql) (Alexander Alten-Lorenz via cws)

[cws] HIVE-3412. Fix TestCliDriver.repair on Hadoop 0.23.3, 3.0.0, and 
2.2.0-alpha (Zhenxiao Luo via cws)


Changes for Build #1647

Changes for Build #1648
[namit] HIVE-3429 Bucket map join involving table with more than 1 partition 
column causes 
FileNotFoundException (Kevin Wilfong via namit)




No tests ran.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1648)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1648/ to 
view the results.

[jira] [Commented] (HIVE-3429) Bucket map join involving table with more than 1 partition column causes FileNotFoundException

2012-09-05 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448847#comment-13448847
 ] 

Hudson commented on HIVE-3429:
--

Integrated in Hive-trunk-h0.21 #1648 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1648/])
HIVE-3429 Bucket map join involving table with more than 1 partition column 
causes 
FileNotFoundException (Kevin Wilfong via namit) (Revision 1381213)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1381213
Files : 
* /hive/trunk/build-common.xml
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/BucketMapJoinContext.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucketmapjoin7.q
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin7.q.out


> Bucket map join involving table with more than 1 partition column causes 
> FileNotFoundException
> --
>
> Key: HIVE-3429
> URL: https://issues.apache.org/jira/browse/HIVE-3429
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.10.0
>
> Attachments: HIVE-3429.1.patch.txt
>
>
> Running a bucket map join on a table with more than one partition column 
> results in the exception below.  This is because the partition spec is 
> added to the file name, which unintentionally produces a new subdirectory.   
>  [junit] java.io.FileNotFoundException: 
> /Users/kevinwilfong/Documents/hive_driver_start/build/ql/scratchdir/local/hive_2012-09-04_18-35-38_679_3765928822897237252/-local-10002/HashTable-Stage-1/MapJoin-b-21-(ds=2008-04-08
>  (No such file or directory)
> [junit]   at java.io.FileInputStream.open(Native Method)
> [junit]   at java.io.FileInputStream.<init>(FileInputStream.java:120)
> [junit]   at 
> org.apache.hadoop.hive.common.CompressionUtils.tar(CompressionUtils.java:59)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:398)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:135)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> [junit]   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
> [junit]   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1112)
> [junit]   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
> [junit]   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:712)
> [junit]   at 
> org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7(TestMinimrCliDriver.java:288)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at junit.framework.TestCase.runTest(TestCase.java:168)
> [junit]   at junit.framework.TestCase.runBare(TestCase.java:134)
> [junit]   at junit.framework.TestResult$1.protect(TestResult.java:110)
> [junit]   at junit.framework.TestResult.runProtected(TestResult.java:128)
> [junit]   at junit.framework.TestResult.run(TestResult.java:113)
> [junit]   at junit.framework.TestCase.run(TestCase.java:124)
> [junit]   at junit.framework.TestSuite.runTest(TestSuite.java:232)
> [junit]   at junit.framework.TestSuite.run(TestSuite.java:227)
> [junit]   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
> [junit]   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
> [junit]   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
> [junit] java.lang.IllegalArgumentException: Can not create a Path from an 
> empty string
> [junit]   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
> [junit]   at org.apache.hadoop.fs.Path.<init>(Path.java:90)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:381)
> [junit]   at 
> org.ap
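
The HIVE-3429 description above attributes the FileNotFoundException to the partition spec being embedded in a file name. A small sketch of that effect, under stated assumptions: the second partition column (hr=11), the closing parenthesis of the name, and the sanitize helper are all hypothetical, and this is not the change made in the attached patch.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PartitionSpecFileName {
    /** Hypothetical sanitizer: replaces '/' so the spec stays inside one path component. */
    static String sanitize(String partitionSpec) {
        return partitionSpec.replace('/', '_');
    }

    /** Builds the target path the way the description implies: spec pasted into the file name. */
    static Path target(Path dir, String partitionSpec) {
        return dir.resolve("MapJoin-b-21-(" + partitionSpec + ")");
    }

    public static void main(String[] args) {
        Path stageDir = Paths.get("HashTable-Stage-1");
        String spec = "ds=2008-04-08/hr=11"; // two partition columns; hr=11 is assumed

        // The '/' inside the spec splits the intended file name into a directory plus a file.
        Path raw = target(stageDir, spec);
        System.out.println("raw parent:  " + raw.getParent());

        // With the separator replaced, the parent is the stage directory again.
        Path safe = target(stageDir, sanitize(spec));
        System.out.println("safe parent: " + safe.getParent());
    }
}
```

With a single partition column the spec contains no '/', which would explain why the problem only shows up with more than one partition column.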

Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #127

See 


--
[...truncated 5836 lines...]
[ivy:resolve] ... 
(447kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
com.sun.jersey#jersey-core;1.8!jersey-core.jar(bundle) (38ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/com/sun/jersey/jersey-json/1.8/jersey-json-1.8.jar
 ...
[ivy:resolve]  (144kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
com.sun.jersey#jersey-json;1.8!jersey-json.jar(bundle) (227ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/com/sun/jersey/jersey-server/1.8/jersey-server-1.8.jar
 ...
[ivy:resolve] 
 (678kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
com.sun.jersey#jersey-server;1.8!jersey-server.jar(bundle) (345ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/tomcat/jasper-compiler/5.5.23/jasper-compiler-5.5.23.jar
 ...
[ivy:resolve] . (398kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] tomcat#jasper-compiler;5.5.23!jasper-compiler.jar 
(211ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/tomcat/jasper-runtime/5.5.23/jasper-runtime-5.5.23.jar
 ...
[ivy:resolve]  (75kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] tomcat#jasper-runtime;5.5.23!jasper-runtime.jar 
(35ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/javax/servlet/jsp/jsp-api/2.1/jsp-api-2.1.jar ...
[ivy:resolve] .. (98kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] javax.servlet.jsp#jsp-api;2.1!jsp-api.jar (42ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-el/commons-el/1.0/commons-el-1.0.jar ...
[ivy:resolve]  (109kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] commons-el#commons-el;1.0!commons-el.jar (39ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-logging/commons-logging/1.1.1/commons-logging-1.1.1.jar
 ...
[ivy:resolve]  (59kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
commons-logging#commons-logging;1.1.1!commons-logging.jar (40ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-logging/commons-logging-api/1.1/commons-logging-api-1.1.jar
 ...
[ivy:resolve]  (43kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
commons-logging#commons-logging-api;1.1!commons-logging-api.jar (33ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/log4j/log4j/1.2.15/log4j-1.2.15.jar ...
[ivy:resolve] . (382kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] log4j#log4j;1.2.15!log4j.jar (45ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/net/java/dev/jets3t/jets3t/0.6.1/jets3t-0.6.1.jar 
...
[ivy:resolve] ... (314kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] net.java.dev.jets3t#jets3t;0.6.1!jets3t.jar (42ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-lang/commons-lang/2.5/commons-lang-2.5.jar
 ...
[ivy:resolve] . (272kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] commons-lang#commons-lang;2.5!commons-lang.jar 
(38ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar
 ...
[ivy:resolve] ... 
(561kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
commons-collections#commons-collections;3.2.1!commons-collections.jar (52ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar
 ...
[ivy:resolve] . (291kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] 
commons-configuration#commons-configuration;1.6!commons-configuration.jar (46ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/hsqldb/hsqldb/1.8.0.7/hsqldb-1.8.0.7.jar ...
[ivy:resolve] 
...
 (628kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] hsqldb#hsqldb;1.8.0.7!hsqldb.jar (56ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/slf4j/slf4j-api/1.6.1/slf4j-api-1.6.1.jar ...
[ivy:resolve] .. (24kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] org.slf4j#slf4j-api;1.6.1!slf4j-api.jar (31ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar
 ...
[ivy:resolve] .. (9kB)
[ivy:resolve] .. (0kB)
[ivy:resolve]   [SUCCESSFUL ] org.slf4j#slf4j-log4j12;1.6.1!slf4j-log4j12.jar 
(31ms)
[ivy:resolve] downloading 
http://repo1.maven.org/maven2/org/eclipse/jdt/core/3.1.1/core-3.1.1.jar ...
[ivy:resolve] 
..

[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized

2012-09-05 Thread Yin Huai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448825#comment-13448825
 ] 

Yin Huai commented on HIVE-3430:


There is one thing I forgot to add in my last comment.

The current patch for HIVE-2206 can only handle the simpler query example, 
because I have the optimizer check whether the correlation reaches the bottom 
of the tree (i.e., the input tables). In the original example, one of the group 
by operations on "value" starts from an intermediate table, so the current 
implementation cannot optimize it. But if two separate queries are used (one 
for the join operations on "key" and another for the join and group by 
operations on "value", as shown by the simpler example), the current 
implementation should be able to optimize the second one. The idea of YSmart 
covers the original example, but I have not implemented it yet.

> group by followed by join with the same key should be optimized
> ---
>
> Key: HIVE-3430
> URL: https://issues.apache.org/jira/browse/HIVE-3430
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>



[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized

2012-09-05 Thread Yin Huai (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448814#comment-13448814
 ] 

Yin Huai commented on HIVE-3430:


Yes, YSmart (https://issues.apache.org/jira/browse/HIVE-2206) can optimize this 
pattern. 

For the query shown below, two jobs will be generated. The first one takes care 
of the join operation on "key", and the second one takes care of the group by 
and join operations on "value". 
{code:SQL}
select * from
(
  select c.value, count(1) as cnt from
  (
    select b.key, b.value from
    (
      select key, length(value) from T1 where ds = '1'
    ) a
    join
    T2 b on b.ds = '1' and a.key = b.key
  ) c
  group by c.value
) d
join
(
  select value, count(1) as cnt from T2 c where c.ds = '1' group by value
) e
on d.value = e.value;
{code}

> group by followed by join with the same key should be optimized
> ---
>
> Key: HIVE-3430
> URL: https://issues.apache.org/jira/browse/HIVE-3430
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>



[jira] [Updated] (HIVE-3429) Bucket map join involving table with more than 1 partition column causes FileNotFoundException


 [ 
https://issues.apache.org/jira/browse/HIVE-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3429:
-

   Resolution: Fixed
Fix Version/s: 0.10.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

committed. Thanks Kevin

> Bucket map join involving table with more than 1 partition column causes 
> FileNotFoundException
> --
>
> Key: HIVE-3429
> URL: https://issues.apache.org/jira/browse/HIVE-3429
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Kevin Wilfong
>Assignee: Kevin Wilfong
> Fix For: 0.10.0
>
> Attachments: HIVE-3429.1.patch.txt
>
>
> Running a bucket map join on a table with more than one partition column 
> results in the exception below.  This is because the partition spec is 
> added to the file name, which unintentionally produces a new subdirectory.   
>  [junit] java.io.FileNotFoundException: 
> /Users/kevinwilfong/Documents/hive_driver_start/build/ql/scratchdir/local/hive_2012-09-04_18-35-38_679_3765928822897237252/-local-10002/HashTable-Stage-1/MapJoin-b-21-(ds=2008-04-08
>  (No such file or directory)
> [junit]   at java.io.FileInputStream.open(Native Method)
> [junit]   at java.io.FileInputStream.<init>(FileInputStream.java:120)
> [junit]   at 
> org.apache.hadoop.hive.common.CompressionUtils.tar(CompressionUtils.java:59)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:398)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:135)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> [junit]   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1326)
> [junit]   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1112)
> [junit]   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
> [junit]   at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
> [junit]   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:712)
> [junit]   at 
> org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin7(TestMinimrCliDriver.java:288)
> [junit]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> [junit]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> [junit]   at java.lang.reflect.Method.invoke(Method.java:597)
> [junit]   at junit.framework.TestCase.runTest(TestCase.java:168)
> [junit]   at junit.framework.TestCase.runBare(TestCase.java:134)
> [junit]   at junit.framework.TestResult$1.protect(TestResult.java:110)
> [junit]   at junit.framework.TestResult.runProtected(TestResult.java:128)
> [junit]   at junit.framework.TestResult.run(TestResult.java:113)
> [junit]   at junit.framework.TestCase.run(TestCase.java:124)
> [junit]   at junit.framework.TestSuite.runTest(TestSuite.java:232)
> [junit]   at junit.framework.TestSuite.run(TestSuite.java:227)
> [junit]   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
> [junit]   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
> [junit]   at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
> [junit] java.lang.IllegalArgumentException: Can not create a Path from an 
> empty string
> [junit]   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
> [junit]   at org.apache.hadoop.fs.Path.<init>(Path.java:90)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:381)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:194)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:472)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:135)
> [junit]   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> [junit]   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.ja

[jira] [Commented] (HIVE-3432) perform a map-only group by is grouping key matches the sorting properties of the table


[ 
https://issues.apache.org/jira/browse/HIVE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448797#comment-13448797
 ] 

Namit Jain commented on HIVE-3432:
--

A conditional task selects one task or the other.
In the case of map joins, one local job reads the small table and fails or 
succeeds quickly.

In this case, all the mappers would have to be tried, and we might fail only 
after a very long time.
Moreover, it would be very difficult to tell that the task failed because a 
bucket overflowed memory.

> perform a map-only group by is grouping key matches the sorting properties of 
> the table
> ---
>
> Key: HIVE-3432
> URL: https://issues.apache.org/jira/browse/HIVE-3432
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Sambavi Muthukrishnan
>
> There should be an option to use bucketizedinputformat and use map-only group 
> by. There would be no need to perform a map-side aggregation.
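
The idea behind the HIVE-3432 request can be sketched as a streaming aggregation: when the rows a mapper reads are already sorted on the grouping key, equal keys are adjacent, so each group can be finished and emitted as soon as the key changes. The class below is a hypothetical illustration with a simple count aggregate and non-null keys assumed; it is not Hive operator code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SortedGroupByCount {
    /** Counts rows per key, assuming the input is sorted so that equal keys are adjacent. */
    static List<String> countSorted(List<String> sortedKeys) {
        List<String> out = new ArrayList<>();
        String current = null;
        long count = 0;
        for (String key : sortedKeys) {
            if (current != null && !key.equals(current)) {
                // Key changed: the previous group is complete and can be emitted.
                out.add(current + "=" + count);
                count = 0;
            }
            current = key;
            count++;
        }
        if (current != null) {
            out.add(current + "=" + count); // emit the final group
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [a=2, b=1, c=3]
        System.out.println(countSorted(Arrays.asList("a", "a", "b", "c", "c", "c")));
    }
}
```

Only one key and one running aggregate are held at a time, which is why sorted input avoids a hash table of all groups. Input that is bucketed but not sorted gives no such adjacency guarantee.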


[jira] [Commented] (HIVE-3432) perform a map-only group by is grouping key matches the sorting properties of the table


[ 
https://issues.apache.org/jira/browse/HIVE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448790#comment-13448790
 ] 

Sambavi Muthukrishnan commented on HIVE-3432:
-

Can't we do this the way map joins do: attempt it on the map side and fall 
back to a reducer in a conditional task?

> perform a map-only group by is grouping key matches the sorting properties of 
> the table
> ---
>
> Key: HIVE-3432
> URL: https://issues.apache.org/jira/browse/HIVE-3432
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Sambavi Muthukrishnan
>
> There should be an option to use bucketizedinputformat and use map-only group 
> by. There would be no need to perform a map-side aggregation.


[jira] [Commented] (HIVE-3433) Implement CUBE and ROLLUP operators in Hive


[ 
https://issues.apache.org/jira/browse/HIVE-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448787#comment-13448787
 ] 

Sambavi Muthukrishnan commented on HIVE-3433:
-

Provide an efficient implementation of CUBE and ROLLUP in Hive. We can use a 
group by followed by adding rows and re-grouping to provide the CUBE/ROLLUP.
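
One way to read "group by followed by adding rows and re-grouping" is sketched below for ROLLUP(a, b) with a count aggregate. The class name, the "a,b" string encoding of keys, and the use of "*" in place of SQL NULL for rolled-up columns are all assumptions made for illustration; this is not a proposed Hive implementation.

```java
import java.util.Map;
import java.util.TreeMap;

final class RollupSketch {
    /** Expands counts already grouped by "a,b" into the ROLLUP(a, b) levels (a,b), (a,*), (*,*). */
    static Map<String, Long> rollup(Map<String, Long> groupedByAB) {
        Map<String, Long> out = new TreeMap<>();
        for (Map.Entry<String, Long> e : groupedByAB.entrySet()) {
            String a = e.getKey().split(",", 2)[0];
            long n = e.getValue();
            add(out, e.getKey(), n); // finest level, unchanged
            add(out, a + ",*", n);   // added row: subtotal for this value of a
            add(out, "*,*", n);      // added row: grand total
        }
        return out;
    }

    /** Re-grouping step: sums values that land on the same key. */
    private static void add(Map<String, Long> m, String key, long n) {
        Long old = m.get(key);
        m.put(key, old == null ? n : old + n);
    }

    public static void main(String[] args) {
        Map<String, Long> base = new TreeMap<>();
        base.put("x,1", 2L);
        base.put("x,2", 3L);
        base.put("y,1", 4L);
        System.out.println(rollup(base));
    }
}
```

CUBE would add a fourth row per input group, (*,b), along the same lines.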

> Implement CUBE and ROLLUP operators in Hive
> ---
>
> Key: HIVE-3433
> URL: https://issues.apache.org/jira/browse/HIVE-3433
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.9.0
>Reporter: Sambavi Muthukrishnan
>



[jira] [Commented] (HIVE-3432) perform a map-only group by is grouping key matches the sorting properties of the table


[ 
https://issues.apache.org/jira/browse/HIVE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448786#comment-13448786
 ] 

Namit Jain commented on HIVE-3432:
--

For bucketized tables, there is no guarantee that a mapper will be able to hold 
all the data in memory.
So we would always need a reducer.

> perform a map-only group by is grouping key matches the sorting properties of 
> the table
> ---
>
> Key: HIVE-3432
> URL: https://issues.apache.org/jira/browse/HIVE-3432
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Sambavi Muthukrishnan
>
> There should be an option to use bucketizedinputformat and use map-only group 
> by. There would be no need to perform a map-side aggregation.


[jira] [Created] (HIVE-3433) Implement CUBE and ROLLUP operators in Hive

Sambavi Muthukrishnan created HIVE-3433:
---

 Summary: Implement CUBE and ROLLUP operators in Hive
 Key: HIVE-3433
 URL: https://issues.apache.org/jira/browse/HIVE-3433
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Sambavi Muthukrishnan





[jira] [Commented] (HIVE-3432) perform a map-only group by is grouping key matches the sorting properties of the table


[ 
https://issues.apache.org/jira/browse/HIVE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448782#comment-13448782
 ] 

Sambavi Muthukrishnan commented on HIVE-3432:
-

We can do this for bucketed tables as well, not just for tables sorted by the 
grouping key. In that case we will need to fall back to a regular group by in 
the reducer.

> perform a map-only group by is grouping key matches the sorting properties of 
> the table
> ---
>
> Key: HIVE-3432
> URL: https://issues.apache.org/jira/browse/HIVE-3432
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Sambavi Muthukrishnan
>
> There should be an option to use bucketizedinputformat and use map-only group 
> by. There would be no need to perform a map-side aggregation.


[jira] [Assigned] (HIVE-3432) perform a map-only group by is grouping key matches the sorting properties of the table


 [ 
https://issues.apache.org/jira/browse/HIVE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sambavi Muthukrishnan reassigned HIVE-3432:
---

Assignee: Sambavi Muthukrishnan

> perform a map-only group by is grouping key matches the sorting properties of 
> the table
> ---
>
> Key: HIVE-3432
> URL: https://issues.apache.org/jira/browse/HIVE-3432
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Sambavi Muthukrishnan
>
> There should be an option to use bucketizedinputformat and use map-only group 
> by. There would be no need to perform a map-side aggregation.


[jira] [Updated] (HIVE-3141) Bug in SELECT query

2012-09-05 Thread Ajesh Kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajesh Kumar updated HIVE-3141:
--

Attachment: (was: HIVE-3141.1.patch.txt)

> Bug in SELECT query
> ---
>
> Key: HIVE-3141
> URL: https://issues.apache.org/jira/browse/HIVE-3141
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0
> Environment: OS: Ubuntu 
> Hive version: hive-0.7.1-cdh3u2
> Hadoop : hadoop-0.20.2
>Reporter: ASK
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-3141.2.patch.txt, Hive_bug_3141_resolution.pdf, 
> select_syntax.q, select_syntax.q.out
>
>
> When I try to execute select * (immediately followed by any alphanumeric 
> character) from table, the query does not behave correctly: it displays the 
> result of select *.
> This does not happen when only numbers follow the *.


[jira] [Created] (HIVE-3432) perform a map-only group by is grouping key matches the sorting properties of the table

Namit Jain created HIVE-3432:


 Summary: perform a map-only group by is grouping key matches the 
sorting properties of the table
 Key: HIVE-3432
 URL: https://issues.apache.org/jira/browse/HIVE-3432
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain


There should be an option to use bucketizedinputformat and use map-only group 
by. There would be no need to perform a map-side aggregation.



[jira] [Created] (HIVE-3431) Resources on non-local file system should be downloaded to temporary directory sometimes

Navis created HIVE-3431:
---

 Summary: Resources on non-local file system should be downloaded 
to temporary directory sometimes
 Key: HIVE-3431
 URL: https://issues.apache.org/jira/browse/HIVE-3431
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Trivial


"add resource " command downloads the resource file to location 
specified by the conf "hive.downloaded.resources.dir" on the local file system. 
But when the command above is executed concurrently against hive-server for the 
same file, some clients fail with a VM crash, caused by the file being 
overwritten by other requests.

So there should be a configuration that provides a per-request location for the 
add resource command, something like "set 
hiveconf:hive.downloaded.resources.dir=temporary"
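
The per-request location idea can be sketched as below. This is a Python illustration only, not the proposed Hive change; download_resource and the file name udf.jar are hypothetical, and a local file copy stands in for the actual download.

```python
import os
import shutil
import tempfile

def download_resource(source_path, base_dir):
    """Copy a resource into a fresh per-request directory under base_dir."""
    # mkdtemp creates a uniquely named directory, so concurrent requests
    # for the same file never write to the same target path.
    request_dir = tempfile.mkdtemp(prefix="resources_", dir=base_dir)
    target = os.path.join(request_dir, os.path.basename(source_path))
    shutil.copyfile(source_path, target)
    return target

base = tempfile.mkdtemp()
source = os.path.join(base, "udf.jar")
with open(source, "w") as handle:
    handle.write("payload")

first = download_resource(source, base)
second = download_resource(source, base)
```

Two requests for the same resource end up in different directories, so neither can overwrite the other's copy.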


[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized


[ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448654#comment-13448654
 ] 

Navis commented on HIVE-3430:
-

Is the "YSmart" expected to provide this kind of optimization?

> group by followed by join with the same key should be optimized
> ---
>
> Key: HIVE-3430
> URL: https://issues.apache.org/jira/browse/HIVE-3430
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>



[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized


[ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448653#comment-13448653
 ] 

Namit Jain commented on HIVE-3430:
--

The final join should be run as a sort-merge join ideally.

> group by followed by join with the same key should be optimized
> ---
>
> Key: HIVE-3430
> URL: https://issues.apache.org/jira/browse/HIVE-3430
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>



[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized


[ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448652#comment-13448652
 ] 

Namit Jain commented on HIVE-3430:
--

Or a much simpler query:

explain
select b.value, b.cnt as cnt1, d.cnt as cnt2 from
  (select a.value, count(1) as cnt from T1 a where a.ds='1' group by a.value) b
  join
  (select c.value, count(1) as cnt from T2 c where c.ds = '1' group by c.value) d
  on b.value = d.value;


The join needs a separate map-reduce job - it does not use the fact that the 
group by has already distributed the data by join key.
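
The observation above can be illustrated with a small Python sketch (not Hive code; partition and group_then_join are hypothetical names). Once both inputs are distributed by the same key, each partition can do its group by and the join locally, without redistributing the data again.

```python
from collections import Counter

def partition(values, num_partitions):
    """Distribute values into partitions by hash of the key."""
    parts = [[] for _ in range(num_partitions)]
    for value in values:
        parts[hash(value) % num_partitions].append(value)
    return parts

def group_then_join(t1_values, t2_values, num_partitions=2):
    """Count per value on each side, then join the counts on value."""
    joined = []
    # Equal values hash to the same partition index on both sides,
    # so the join only needs to look inside matching partitions.
    for p1, p2 in zip(partition(t1_values, num_partitions),
                      partition(t2_values, num_partitions)):
        cnt1 = Counter(p1)
        cnt2 = Counter(p2)
        for value in cnt1.keys() & cnt2.keys():
            joined.append((value, cnt1[value], cnt2[value]))
    return sorted(joined)

result = group_then_join(["x", "x", "y", "z"], ["x", "y", "y", "w"])
```

Here result holds (value, cnt1, cnt2) tuples, mirroring the columns of the query above.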


> group by followed by join with the same key should be optimized
> ---
>
> Key: HIVE-3430
> URL: https://issues.apache.org/jira/browse/HIVE-3430
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>



[jira] [Commented] (HIVE-3430) group by followed by join with the same key should be optimized


[ 
https://issues.apache.org/jira/browse/HIVE-3430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448651#comment-13448651
 ] 

Namit Jain commented on HIVE-3430:
--

Consider the following query:

select * from
(
  select c.value, count(1) as cnt from
  (
    select b.key, b.value from
      (select key, length(value) from T1 where ds = '1') a
    join
      T2 b on b.ds = '1' and a.key = b.key
  ) c
  group by c.value
) d
join
(select value, count(1) as cnt from T2 c where c.ds = '1' group by value) e
on d.value = e.value;



Every join and group by above has a separate map-reduce job.


> group by followed by join with the same key should be optimized
> ---
>
> Key: HIVE-3430
> URL: https://issues.apache.org/jira/browse/HIVE-3430
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>



[jira] [Created] (HIVE-3430) group by followed by join with the same key should be optimized

Namit Jain created HIVE-3430:


 Summary: group by followed by join with the same key should be 
optimized
 Key: HIVE-3430
 URL: https://issues.apache.org/jira/browse/HIVE-3430
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain





[jira] [Updated] (HIVE-3315) Propagate filers on inner join condition transitively


 [ 
https://issues.apache.org/jira/browse/HIVE-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3315:


Status: Open  (was: Patch Available)

I misunderstood the join conditions. The patch will be updated soon.

> Propagate filers on inner join condition transitively 
> --
>
> Key: HIVE-3315
> URL: https://issues.apache.org/jira/browse/HIVE-3315
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.10.0
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3315.1.patch.txt, HIVE-3315.2.patch.txt
>
>
> explain select src1.key from src src1 join src src2 on src1.key=src2.key and 
> src1.key < 100;
> In this case, the filter src1.key < 100 in the join condition can be 
> propagated transitively to src2 as src2.key < 100. 
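
The transitive propagation described in the issue can be sketched as follows. This is a simplified Python illustration, not the patch itself; propagate_filters and the tuple representation of predicates are hypothetical, and the rule is only assumed to be safe for inner joins, as the issue title says.

```python
def propagate_filters(equalities, filters):
    """Copy each filter across every column equality until nothing new appears.

    equalities: list of (column_a, column_b) pairs from the join condition.
    filters: list of (column, operator, literal) predicates.
    """
    derived = set(filters)
    changed = True
    while changed:
        changed = False
        for left, right in equalities:
            for col, op, literal in list(derived):
                for src, dst in ((left, right), (right, left)):
                    if col == src and (dst, op, literal) not in derived:
                        derived.add((dst, op, literal))
                        changed = True
    return derived

derived = propagate_filters(
    [("src1.key", "src2.key")],
    [("src1.key", "<", 100)],
)
```

For the query above, the derived set contains the original predicate plus the equivalent one on src2.key.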


[jira] [Updated] (HIVE-3387) meta data file size exceeds limit


 [ 
https://issues.apache.org/jira/browse/HIVE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3387:


Attachment: HIVE-3387.1.patch.txt

> meta data file size exceeds limit
> -
>
> Key: HIVE-3387
> URL: https://issues.apache.org/jira/browse/HIVE-3387
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.1
>Reporter: Alexander Alten-Lorenz
>Assignee: Navis
> Fix For: 0.9.1
>
> Attachments: HIVE-3387.1.patch.txt
>
>
> The cause is certainly that we use an array list instead of a set structure 
> in the split locations API. Looks like a bug in Hive's CombineFileInputFormat.
> To reproduce:
> Set mapreduce.jobtracker.split.metainfo.maxsize=1 when submitting the 
> Hive query. Run a big Hive query that writes data into a partitioned table. 
> Due to the large number of splits, you encounter an exception on the job 
> submitted to Hadoop, and the exception says:
> meta data size exceeds 1.
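
The suspected cause (a list instead of a set for split locations) can be illustrated with a small Python sketch. This is not Hive's or Hadoop's code; the function names and host names are made up, and it only shows how repeated hosts inflate the location metadata when many blocks are combined into one split.

```python
def locations_as_list(block_hosts):
    """Collect hosts for a combined split, keeping every repeat."""
    hosts = []
    for block in block_hosts:
        hosts.extend(block)
    return hosts

def locations_as_set(block_hosts):
    """Collect hosts for a combined split, keeping each host once."""
    return sorted({host for block in block_hosts for host in block})

# Four blocks that all live on the same three hosts.
blocks = [["host1", "host2", "host3"]] * 4
as_list = locations_as_list(blocks)
as_set = locations_as_set(blocks)
```

With a list, the number of location entries grows with the number of blocks; with a set, it is bounded by the number of distinct hosts.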


[jira] [Commented] (HIVE-3141) Bug in SELECT query

2012-09-05 Thread Ajesh Kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448582#comment-13448582
 ] 

Ajesh Kumar commented on HIVE-3141:
---

Thanks, Carl.
I have added a review request on reviews.apache.org.
Here is the link: https://reviews.apache.org/r/6919/

> Bug in SELECT query
> ---
>
> Key: HIVE-3141
> URL: https://issues.apache.org/jira/browse/HIVE-3141
> Project: Hive
>  Issue Type: Bug
>  Components: CLI
>Affects Versions: 0.9.0
> Environment: OS: Ubuntu 
> Hive version: hive-0.7.1-cdh3u2
> Hadoop : hadoop-0.20.2
>Reporter: ASK
>Priority: Minor
>  Labels: patch
> Attachments: HIVE-3141.1.patch.txt, HIVE-3141.2.patch.txt, 
> Hive_bug_3141_resolution.pdf, select_syntax.q, select_syntax.q.out
>
>
> When I try to execute select * (immediately followed by any alphanumeric 
> character) from table, the query does not behave correctly: it displays the 
> result of select *.
> This does not happen when only numbers follow the *.


Review Request: HIVE-3141 - Review request

2012-09-05 Thread Ajesh kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6919/
---

Review request for hive and namit jain.


Description
---

Approach: added extra verification of the select statement's syntax to address 
the issue raised. 
This provides an appropriate error message to the user if the syntax is 
wrong.

Fix: changed CliDriver.java to perform an extra check on the syntax of the 
select statement.
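
As a rough idea of what such an extra check might look like, here is a Python sketch. It is not the code in the patch (which changes CliDriver.java); the pattern, the function name check_select_syntax and the error text are all assumptions made for illustration.

```python
import re

# Hypothetical pattern: a '*' in the select list that is immediately
# followed by a letter or underscore, as in "select *abc from t".
_STAR_SUFFIX = re.compile(r"^\s*select\s+\*[A-Za-z_]", re.IGNORECASE)

def check_select_syntax(query):
    """Return an error message for a suspicious select, or None if it looks fine."""
    if _STAR_SUFFIX.search(query):
        return "Invalid select syntax: unexpected characters after '*'"
    return None

bad = check_select_syntax("select *abc from t")
good = check_select_syntax("select * from t")
```

A negative test such as select_syntax.q would then expect the error message rather than the rows of the table.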


This addresses bug HIVE-3141.
https://issues.apache.org/jira/browse/HIVE-3141


Diffs
-

  cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 344606c 
  ql/src/test/queries/clientnegative/select_syntax.q PRE-CREATION 
  ql/src/test/results/clientnegative/select_syntax.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/6919/diff/


Testing
---

Negative Test case: Added select_syntax.q file for negative test.

Test Result: Output of select_syntax.q is available in select_syntax.q.out.


Thanks,

Ajesh kumar

[jira] [Commented] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly


[ 
https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13448563#comment-13448563
 ] 

Navis commented on HIVE-3411:
-

A boolean tag for the filter result is replaced with a byte tag, limiting the 
number of aliases in a single join to at most 8. 
I have not yet decided whether to extend the limit or to throw an exception at 
compile time.
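
One way to picture why a byte tag caps a join at 8 aliases is a bitmask with one bit per alias. The Python sketch below rests on that assumption; the actual bit layout in the patch is not described here, and set_filtered and is_filtered are hypothetical names.

```python
MAX_ALIASES = 8  # a single byte has 8 bits

def set_filtered(tag, alias):
    """Mark the row as filtered with respect to the given alias index."""
    if not 0 <= alias < MAX_ALIASES:
        raise ValueError("alias index does not fit in a byte tag")
    return tag | (1 << alias)

def is_filtered(tag, alias):
    """Check whether the row is filtered with respect to the given alias."""
    return bool(tag & (1 << alias))

tag = set_filtered(0, 0)
tag = set_filtered(tag, 2)
```

Raising an error for a ninth alias corresponds to the "throw exception at compile time" option; extending the limit would mean using a wider tag.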

> Filter predicates on outer join overlapped on single alias is not handled 
> properly
> --
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.10.0
> Environment: ubuntu 10.10
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on an outer join are evaluated in the join 
> operator (or HashSink for MapJoin) and the result is tagged onto the end of 
> each value (as a boolean), which is then used when joining values. But when 
> predicates overlap on a single alias, all the predicates are evaluated with 
> an AND conjunction, which produces an invalid result. 
> For example, with table a containing the values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> The query below has overlapping predicates on alias b, which causes all the 
> values of b to be tagged with true (filtered).
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND 
> b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND 
> c.value=60);
> NULL  NULL  100  40  NULL  NULL
> NULL  NULL  100  50  NULL  NULL
> NULL  NULL  100  60  NULL  NULL
> -- Join predicate
> Join Operator
>   condition map:
>Right Outer Join0 to 1
>Left Outer Join1 to 2
>   condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
> 0 
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2 
> {noformat}
> but this should be 
> {noformat}
> NULL  NULL  100  40  NULL  NULL
> 100   50    100  50  NULL  NULL
> NULL  NULL  100  60  100   60
> {noformat}
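
The expected result in the description can be cross-checked against another SQL engine. Below is a minimal sketch using Python's built-in sqlite3 module. The table name t is hypothetical (it stands in for table a), and the right outer join is rewritten as an equivalent left outer join starting from alias b, because older SQLite versions do not support RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (key INTEGER, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(100, 40), (100, 50), (100, 60)])

# "a right outer join b on P" keeps every row of b, which is the same
# as "b left outer join a on P"; the second join is unchanged.
rows = conn.execute(
    """
    SELECT a.key, a.value, b.key, b.value, c.key, c.value
    FROM t AS b
    LEFT OUTER JOIN t AS a
      ON (a.key = b.key AND a.value = 50 AND b.value = 50)
    LEFT OUTER JOIN t AS c
      ON (b.key = c.key AND b.value = 60 AND c.value = 60)
    ORDER BY b.value
    """
).fetchall()
conn.close()
```

Each predicate only affects the join it belongs to, which is what the "should be" output in the description reflects.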


[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly


 [ 
https://issues.apache.org/jira/browse/HIVE-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3411:


Attachment: HIVE-3411.1.patch.txt

> Filter predicates on outer join overlapped on single alias is not handled 
> properly
> --
>
> Key: HIVE-3411
> URL: https://issues.apache.org/jira/browse/HIVE-3411
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Affects Versions: 0.10.0
> Environment: ubuntu 10.10
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-3411.1.patch.txt
>
>
> Currently, join predicates on an outer join are evaluated in the join 
> operator (or HashSink for MapJoin) and the result is tagged onto the end of 
> each value (as a boolean), which is then used when joining values. But when 
> predicates overlap on a single alias, all the predicates are evaluated with 
> an AND conjunction, which produces an invalid result. 
> For example, with table a containing the values,
> {noformat}
> 100 40
> 100 50
> 100 60
> {noformat}
> The query below has overlapping predicates on alias b, which causes all the 
> values of b to be tagged with true (filtered).
> {noformat}
> select * from a right outer join a b on (a.key=b.key AND a.value=50 AND 
> b.value=50) left outer join a c on (b.key=c.key AND b.value=60 AND 
> c.value=60);
> NULL  NULL  100  40  NULL  NULL
> NULL  NULL  100  50  NULL  NULL
> NULL  NULL  100  60  NULL  NULL
> -- Join predicate
> Join Operator
>   condition map:
>Right Outer Join0 to 1
>Left Outer Join1 to 2
>   condition expressions:
> 0 {VALUE._col0} {VALUE._col1}
> 1 {VALUE._col0} {VALUE._col1}
> 2 {VALUE._col0} {VALUE._col1}
>   filter predicates:
> 0 
> 1 {(VALUE._col1 = 50)} {(VALUE._col1 = 60)}
> 2 
> {noformat}
> but this should be 
> {noformat}
> NULL  NULL  100  40  NULL  NULL
> 100   50    100  50  NULL  NULL
> NULL  NULL  100  60  100   60
> {noformat}


[jira] [Updated] (HIVE-3411) Filter predicates on outer join overlapped on single alias is not handled properly