Build failed in Hudson: Hive-trunk-h0.17 #260

2009-11-01 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/260/

--
[...truncated 8575 lines...]
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_function2.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_function2.q.out
[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: default@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: default@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_function3.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_function3.q.out
[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: default@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: default@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_function4.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_function4.q.out
[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading 

Build failed in Hudson: Hive-trunk-h0.20 #86

2009-11-01 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/86/

--
[...truncated 10686 lines...]
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/ql/test/logs/negative/unknown_function2.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/ql/src/test/results/compiler/errors/unknown_function2.q.out
[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: default@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: default@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/ql/test/logs/negative/unknown_function3.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/ql/src/test/results/compiler/errors/unknown_function3.q.out
[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: default@srcbucket
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: default@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: default@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: default@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: default@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: default@src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/ql/test/logs/negative/unknown_function4.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/ql/src/test/results/compiler/errors/unknown_function4.q.out
[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: 

[jira] Commented: (HIVE-819) Add lazy decompress ability to RCFile

2009-11-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772371#action_12772371
 ] 

Namit Jain commented on HIVE-819:
-

will merge if the tests pass

 Add lazy decompress ability to RCFile
 -

 Key: HIVE-819
 URL: https://issues.apache.org/jira/browse/HIVE-819
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor, Serializers/Deserializers
Reporter: He Yongqiang
Assignee: He Yongqiang
 Fix For: 0.5.0

 Attachments: hive-819-2009-9-12.patch, hive-819-2009-9-21.patch


 This is especially useful for filter scanning.
 For example, for the query 'select a, b, c from table_rc_lazydecompress where
 a > 1;' we only need to decompress the block data of the b and c columns when
 some row's column 'a' in that block satisfies the filter condition.
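
To make the idea concrete, here is a minimal Java sketch of lazy column
decompression; ColumnChunk and RowGroup are hypothetical stand-ins, not the
actual RCFile classes or the attached patch:

import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

// One column's bytes within one row group; decompressed lazily, at most once.
class ColumnChunk {
    private final byte[] compressed;
    private byte[] uncompressed;

    ColumnChunk(byte[] compressed) { this.compressed = compressed; }

    byte[] get() throws DataFormatException {
        if (uncompressed == null) {
            Inflater inf = new Inflater();
            inf.setInput(compressed);
            byte[] buf = new byte[64 * 1024];
            int n = inf.inflate(buf);
            uncompressed = Arrays.copyOf(buf, n);
            inf.end();
        }
        return uncompressed;
    }
}

class RowGroup {
    ColumnChunk a, b, c;

    // Decompress only column 'a' to evaluate the filter; pay for b and c
    // only when some row in this group passes the predicate.
    void scan(java.util.function.Predicate<byte[]> filterOnA) throws DataFormatException {
        if (filterOnA.test(a.get())) {
            byte[] bData = b.get();
            byte[] cData = c.get();
            // ... emit the qualifying rows from bData and cData ...
        }
    }
}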

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-796) RCFile results missing columns from UNION ALL

2009-11-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772372#action_12772372
 ] 

Namit Jain commented on HIVE-796:
-

will merge if the tests pass

 RCFile results missing columns from UNION ALL
 -

 Key: HIVE-796
 URL: https://issues.apache.org/jira/browse/HIVE-796
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: He Yongqiang
 Attachments: hive-796-2009-08-26.patch, hive-796-2009-9-9.patch


 create table tt(a int, b string, c string) row format serde
 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' stored as RCFile;
 load data:
   1 b c
   2 e f
   3 i j
 select * from (
   select b as cola from tt
   union all
   select c as cola from tt) s;
 results:
   NULL
   b
   NULL
   e
   NULL
   i

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-880) user group information not populated for pre and post hook

2009-11-01 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-880:


Fix Version/s: (was: 0.4.1)

Actually this is only fixed in trunk - the patch does not work for branch-0.4.

 user group information not populated for pre and post hook
 --

 Key: HIVE-880
 URL: https://issues.apache.org/jira/browse/HIVE-880
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain
 Fix For: 0.5.0

 Attachments: hive.880.1.patch, hive.880.2.patch, hive.880.3.patch, 
 hive.880.4.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[VOTE] hive release candidate 0.4.1-rc0

2009-11-01 Thread Zheng Shao
I have made a release candidate 0.4.1-rc0.

We've fixed several critical bugs in hive release 0.4.0. We need hive
release 0.4.1 out ASAP.

Here is the list of changes:

HIVE-884. Metastore Server should call System.exit() on error.
(Zheng Shao via pchakka)

HIVE-864. Fix map-join memory-leak.
(Namit Jain via zshao)

HIVE-878. Update the hash table entry before flushing in Group By
hash aggregation (Zheng Shao via namit)

HIVE-882. Create a new directory every time for scratch.
(Namit Jain via zshao)

HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)

HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)

HIVE-883. URISyntaxException when partition value contains special chars.
(Zheng Shao via namit)


Please vote.

--
Yours,
Zheng


[jira] Commented: (HIVE-549) Parallel Execution Mechanism

2009-11-01 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772385#action_12772385
 ] 

Zheng Shao commented on HIVE-549:
-

HIVE-549-v3.patch:
Can you add javadoc comments to all public methods (except getters and 
setters)? I know we were not following that all the time historically, but we 
want to enforce that on all new patches.

It seems to me that shouldLaunch() is a better name than checkLaunch(), but 
it's up to you whether you would like to change it.

Can we change the name of hive.optimize.par to hive.execution.parallel?
"optimize" was meant for plan optimization, and this is an execution-time
option. Also, "par" does not immediately suggest "parallel" to me.


 Parallel Execution Mechanism
 

 Key: HIVE-549
 URL: https://issues.apache.org/jira/browse/HIVE-549
 Project: Hadoop Hive
  Issue Type: Wish
  Components: Query Processor
Reporter: Adam Kramer
Assignee: Chaitanya Mishra
 Attachments: HIVE-549-v3.patch


 In a massively parallel database system, it would be awesome to also 
 parallelize some of the mapreduce phases that our data needs to go through.
 One example that just occurred to me is UNION ALL: when you union two SELECT 
 statements, effectively you could run those statements in parallel. There's 
 no situation (that I can think of, but I don't have a formal proof) in which 
 the left statement would rely on the right statement, or vice versa. So, they 
 could be run at the same time...and perhaps they should be. Or, perhaps there 
 should be a way to make this happen...PARALLEL UNION ALL? PUNION ALL?
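
To make the described idea concrete, here is a minimal Java sketch of running
two independent branches concurrently; runBranch() is a placeholder, not
Hive's actual Task or Driver API:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelUnion {
    // Placeholder: submit the job for one SELECT branch and wait for it.
    static Integer runBranch(String sql) {
        System.out.println("running: " + sql);
        return 0; // exit code
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        // The two sides of a UNION ALL have no data dependency on each other.
        Future<Integer> left  = pool.submit(() -> runBranch("select b as cola from tt"));
        Future<Integer> right = pool.submit(() -> runBranch("select c as cola from tt"));
        // Barrier: the downstream (union) stage starts only after both finish.
        int rc = Math.max(left.get(), right.get());
        pool.shutdown();
        System.out.println("both branches done, rc=" + rc);
    }
}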

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-796) RCFile results missing columns from UNION ALL

2009-11-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772387#action_12772387
 ] 

Namit Jain commented on HIVE-796:
-

Tests failed - the new test that you added.
Can you take a look?

 RCFile results missing columns from UNION ALL
 -

 Key: HIVE-796
 URL: https://issues.apache.org/jira/browse/HIVE-796
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: He Yongqiang
 Attachments: hive-796-2009-08-26.patch, hive-796-2009-9-9.patch


 create table tt(a int, b string, c string) row format serde
 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' stored as RCFile;
 load data:
   1 b c
   2 e f
   3 i j
 select * from (
   select b as cola from tt
   union all
   select c as cola from tt) s;
 results:
   NULL
   b
   NULL
   e
   NULL
   i

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-819) Add lazy decompress ability to RCFile

2009-11-01 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772388#action_12772388
 ] 

Namit Jain commented on HIVE-819:
-

ql/src/test/queries/clientpositive/rcfile_lazydecompress.q failed

 Add lazy decompress ability to RCFile
 -

 Key: HIVE-819
 URL: https://issues.apache.org/jira/browse/HIVE-819
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor, Serializers/Deserializers
Reporter: He Yongqiang
Assignee: He Yongqiang
 Fix For: 0.5.0

 Attachments: hive-819-2009-9-12.patch, hive-819-2009-9-21.patch


 This is especially useful for filter scanning.
 For example, for the query 'select a, b, c from table_rc_lazydecompress where
 a > 1;' we only need to decompress the block data of the b and c columns when
 some row's column 'a' in that block satisfies the filter condition.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-902) cli.sh can not correctly identify Hadoop minor version numbers less than 20

2009-11-01 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772391#action_12772391
 ] 

Zheng Shao commented on HIVE-902:
-

HIVE-902.1.patch:

I remember there is a hadoop version 0.17.2.1. It seems the patch does not
handle this case. Can you change the regex to work for that?
Also, please print out the hadoop version in the error message when we
cannot detect the hadoop version, to make it easier to debug.

Can you also help try it out on earlier versions of bash (bash 3.00 for 
example)?


 cli.sh can not correctly identify Hadoop minor version numbers less than 20
 ---

 Key: HIVE-902
 URL: https://issues.apache.org/jira/browse/HIVE-902
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Carl Steinbach
 Attachments: HIVE-902.1.patch


 cli.sh uses the following logic to detect the version of hadoop:
   version=$($HADOOP version | awk '{print $2;}');
   if [[ $version =~ "^0\.17" ]] || [[ $version =~ "^0\.18" ]] || [[ $version =~ "^0.19" ]]; then
   exec $HADOOP jar $AUX_JARS_CMD_LINE ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   else
   # hadoop 20 or newer - skip the aux_jars option. picked up from hiveconf
   exec $HADOOP jar ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   fi
 Apparently bash doesn't expect you to quote the regex:
 % ./bash -version
 GNU bash, version 4.0.0(1)-release (i386-apple-darwin9.8.0)
 % hadoop version
 Hadoop 0.19.0
 Subversion https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890
 Compiled by ndaley on Fri Nov 14 03:12:29 UTC 2008
 % version=$(hadoop version | awk '{print $2;}')
 % echo $version
 0.19.0 https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 by
 % [[ $version =~ "^0\.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ "^0.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ ^0.19 ]] && echo Yes || echo No
 Yes

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-819) Add lazy decompress ability to RCFile

2009-11-01 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-819:
--

Attachment: hive-819-2009-11-1.patch

Updated the patch against the trunk code (overwrote the .q.out file).

 Add lazy decompress ability to RCFile
 -

 Key: HIVE-819
 URL: https://issues.apache.org/jira/browse/HIVE-819
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor, Serializers/Deserializers
Reporter: He Yongqiang
Assignee: He Yongqiang
 Fix For: 0.5.0

 Attachments: hive-819-2009-11-1.patch, hive-819-2009-9-12.patch, 
 hive-819-2009-9-21.patch


 This is especially useful for filter scanning.
 For example, for the query 'select a, b, c from table_rc_lazydecompress where
 a > 1;' we only need to decompress the block data of the b and c columns when
 some row's column 'a' in that block satisfies the filter condition.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-796) RCFile results missing columns from UNION ALL

2009-11-01 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-796:
--

Attachment: hive-796-2009-11-1.patch

Updated the patch against the trunk code (overwrote the .q.out file). Thanks, Namit!

 RCFile results missing columns from UNION ALL
 -

 Key: HIVE-796
 URL: https://issues.apache.org/jira/browse/HIVE-796
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: He Yongqiang
 Attachments: hive-796-2009-08-26.patch, hive-796-2009-11-1.patch, 
 hive-796-2009-9-9.patch


 create table tt(a int, b string, c string) row format serde
 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' stored as RCFile;
 load data:
   1 b c
   2 e f
   3 i j
 select * from (
   select b as cola from tt
   union all
   select c as cola from tt) s;
 results:
   NULL
   b
   NULL
   e
   NULL
   i

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-902) cli.sh can not correctly identify Hadoop minor version numbers less than 20

2009-11-01 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-902:


Attachment: HIVE-902.2.patch

- Adjusted regex to handle version numbers like 0.17.2.1
- On error, now prints out string returned by 'hadoop version'.
- Verified that this works on Bash 3.2.17


 cli.sh can not correctly identify Hadoop minor version numbers less than 20
 ---

 Key: HIVE-902
 URL: https://issues.apache.org/jira/browse/HIVE-902
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Carl Steinbach
 Attachments: HIVE-902.1.patch, HIVE-902.2.patch


 cli.sh uses the following logic to detect the version of hadoop:
   version=$($HADOOP version | awk '{print $2;}');
   if [[ $version =~ "^0\.17" ]] || [[ $version =~ "^0\.18" ]] || [[ $version =~ "^0.19" ]]; then
   exec $HADOOP jar $AUX_JARS_CMD_LINE ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   else
   # hadoop 20 or newer - skip the aux_jars option. picked up from hiveconf
   exec $HADOOP jar ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   fi
 Apparently bash doesn't expect you to quote the regex:
 % ./bash -version
 GNU bash, version 4.0.0(1)-release (i386-apple-darwin9.8.0)
 % hadoop version
 Hadoop 0.19.0
 Subversion https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890
 Compiled by ndaley on Fri Nov 14 03:12:29 UTC 2008
 % version=$(hadoop version | awk '{print $2;}')
 % echo $version
 0.19.0 https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 by
 % [[ $version =~ "^0\.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ "^0.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ ^0.19 ]] && echo Yes || echo No
 Yes

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [VOTE] hive release candidate 0.4.1-rc0

2009-11-01 Thread Min Zhou
I think there may be a bug still in this release.

hive> select stuff_status from auctions where auction_id='2591238417'
and pt='20091027';

auctions is a table partitioned by date, stored as a textfile without
compression. The query above should return 0 rows, but when
hive.exec.compress.output=true, hive will crash with a StackOverflowError:

java.lang.StackOverflowError
at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
at java.lang.ref.Finalizer.register(Finalizer.java:72)
at java.lang.Object.<init>(Object.java:20)
at java.net.SocketImpl.<init>(SocketImpl.java:27)
at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
at java.net.Socket.setImpl(Socket.java:434)
at java.net.Socket.<init>(Socket.java:68)
at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
at java.io.DataInputStream.read(DataInputStream.java:132)
at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
at java.io.InputStream.read(InputStream.java:85)
at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)

Each mapper will produce an 8-byte deflate file on hdfs (we set
hive.merge.mapfiles=false); their hex representation is like below:

78 9C 03 00 00 00 00 01

This is the reason why FetchOperator.java:272 is called recursively,
causing a stack overflow error.
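
Those 8 bytes are a complete zlib stream for zero-length input: 78 9C is the
zlib header, 03 00 an empty final deflate block, and 00 00 00 01 the Adler-32
of empty data. A quick stand-alone check with stock java.util.zip:

import java.util.zip.Inflater;

public class EmptyDeflateCheck {
    public static void main(String[] args) throws Exception {
        byte[] data = {0x78, (byte) 0x9C, 0x03, 0x00, 0x00, 0x00, 0x00, 0x01};
        Inflater inf = new Inflater();  // zlib format
        inf.setInput(data);
        byte[] out = new byte[16];
        int n = inf.inflate(out);       // consumes header, empty block, checksum
        System.out.println("inflated " + n + " bytes, finished=" + inf.finished());
        // prints: inflated 0 bytes, finished=true
        inf.end();
    }
}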

Regards,
Min


On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao zsh...@gmail.com wrote:
 I have made a release candidate 0.4.1-rc0.

 We've fixed several critical bugs in hive release 0.4.0. We need hive
 release 0.4.1 out ASAP.

 Here is the list of changes:

    HIVE-884. Metastore Server should call System.exit() on error.
    (Zheng Shao via pchakka)

    HIVE-864. Fix map-join memory-leak.
    (Namit Jain via zshao)

    HIVE-878. Update the hash table entry before flushing in Group By
    hash aggregation (Zheng Shao via namit)

    HIVE-882. Create a new directory every time for scratch.
    (Namit Jain via zshao)

    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)

    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)

    HIVE-883. URISyntaxException when partition value contains special chars.
    (Zheng Shao via namit)


 Please vote.

 --
 Yours,
 Zheng




-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com


[jira] Commented: (HIVE-902) cli.sh can not correctly identify Hadoop minor version numbers less than 20

2009-11-01 Thread Zheng Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772437#action_12772437
 ] 

Zheng Shao commented on HIVE-902:
-

+1. Will commit after test passes.


 cli.sh can not correctly identify Hadoop minor version numbers less than 20
 ---

 Key: HIVE-902
 URL: https://issues.apache.org/jira/browse/HIVE-902
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Carl Steinbach
 Attachments: HIVE-902.1.patch, HIVE-902.2.patch


 cli.sh uses the following logic to detect the version of hadoop:
   version=$($HADOOP version | awk '{print $2;}');
   if [[ $version =~ "^0\.17" ]] || [[ $version =~ "^0\.18" ]] || [[ $version =~ "^0.19" ]]; then
   exec $HADOOP jar $AUX_JARS_CMD_LINE ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   else
   # hadoop 20 or newer - skip the aux_jars option. picked up from hiveconf
   exec $HADOOP jar ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   fi
 Apparently bash doesn't expect you to quote the regex:
 % ./bash -version
 GNU bash, version 4.0.0(1)-release (i386-apple-darwin9.8.0)
 % hadoop version
 Hadoop 0.19.0
 Subversion https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890
 Compiled by ndaley on Fri Nov 14 03:12:29 UTC 2008
 % version=$(hadoop version | awk '{print $2;}')
 % echo $version
 0.19.0 https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 by
 % [[ $version =~ "^0\.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ "^0.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ ^0.19 ]] && echo Yes || echo No
 Yes

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: [VOTE] hive release candidate 0.4.1-rc0

2009-11-01 Thread Zheng Shao
Min, can you check the default compression codec in your hadoop conf?
The 8-byte file must be a compressed file, written with that codec, that
represents a 0-length file.

It seems that codec was not able to decompress the stream.

Zheng

On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou coderp...@gmail.com wrote:
 I think there may be a bug still in this release.

 hive> select stuff_status from auctions where auction_id='2591238417'
 and pt='20091027';

 auctions is a table partitioned by date, stored as a textfile without
 compression. The query above should return 0 rows, but when
 hive.exec.compress.output=true, hive will crash with a StackOverflowError:

 java.lang.StackOverflowError
        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
        at java.lang.ref.Finalizer.register(Finalizer.java:72)
        at java.lang.Object.<init>(Object.java:20)
        at java.net.SocketImpl.<init>(SocketImpl.java:27)
        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
        at java.net.Socket.setImpl(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:68)
        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
        at java.io.InputStream.read(InputStream.java:85)
        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)

 Each mapper will produce an 8-byte deflate file on hdfs (we set
 hive.merge.mapfiles=false); their hex representation is like below:

 78 9C 03 00 00 00 00 01

 This is the reason why FetchOperator.java:272 is called recursively,
 causing a stack overflow error.

 Regards,
 Min


 On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao zsh...@gmail.com wrote:
 I have made a release candidate 0.4.1-rc0.

 We've fixed several critical bugs in hive release 0.4.0. We need hive
 release 0.4.1 out ASAP.

 Here is the list of changes:

    HIVE-884. Metastore Server should call System.exit() on error.
    (Zheng Shao via pchakka)

    HIVE-864. Fix map-join memory-leak.
    (Namit Jain via zshao)

    HIVE-878. Update the hash table entry before flushing in Group By
    hash aggregation (Zheng Shao via namit)

    HIVE-882. Create a new directory every time for scratch.
    (Namit Jain via zshao)

    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)

    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via 
 zshao)

    HIVE-883. URISyntaxException when partition value contains special chars.
    (Zheng Shao via namit)


 Please vote.

 --
 Yours,
 Zheng




 --
 My research interests are distributed systems, parallel computing and
 bytecode based virtual machine.

 My profile:
 http://www.linkedin.com/in/coderplay
 My blog:
 http://coderplay.javaeye.com




-- 
Yours,
Zheng


Re: [VOTE] hive release candidate 0.4.1-rc0

2009-11-01 Thread Min Zhou
We use the zip codec by default.
Some repeated frames were omitted from the error stack:
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)
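
The usual remedy for this recursion pattern is to skip a run of empty files
with a loop rather than one recursive call per file; a hypothetical sketch,
not the actual FetchOperator code:

import java.util.Iterator;

class RowFetcher {
    private final Iterator<Iterator<String>> files; // each inner iterator is one file's rows
    private Iterator<String> current;

    RowFetcher(Iterator<Iterator<String>> files) { this.files = files; }

    String getNextRow() {
        while (true) {
            if (current == null) {
                if (!files.hasNext()) return null;  // true end of input
                current = files.next();             // open the next file
            }
            if (current.hasNext()) return current.next();
            current = null;  // empty or exhausted file: loop onward, don't recurse
        }
    }
}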


Thanks,
Min

On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao zsh...@gmail.com wrote:
 Min, can you check the default compression codec in your hadoop conf?
 The 8-byte file must be a compressed file, written with that codec, that
 represents a 0-length file.

 It seems that codec was not able to decompress the stream.

 Zheng

 On Sun, Nov 1, 2009 at 10:49 PM, Min Zhou coderp...@gmail.com wrote:
 I think there may be a bug still in this release.

 hive> select stuff_status from auctions where auction_id='2591238417'
 and pt='20091027';

 auctions is a table partitioned by date, stored as a textfile without
 compression. The query above should return 0 rows, but when
 hive.exec.compress.output=true, hive will crash with a StackOverflowError:

 java.lang.StackOverflowError
        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
        at java.lang.ref.Finalizer.register(Finalizer.java:72)
        at java.lang.Object.<init>(Object.java:20)
        at java.net.SocketImpl.<init>(SocketImpl.java:27)
        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
        at java.net.Socket.setImpl(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:68)
        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
        at java.io.InputStream.read(InputStream.java:85)
        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)

 Each mapper will produce an 8-byte deflate file on hdfs (we set
 hive.merge.mapfiles=false); their hex representation is like below:

 78 9C 03 00 00 00 00 01

 This is the reason why FetchOperator.java:272 is called recursively,
 causing a stack overflow error.

 Regards,
 Min


 On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao zsh...@gmail.com wrote:
 I have made a release candidate 0.4.1-rc0.

 We've fixed several critical bugs in hive release 0.4.0. We need hive
 release 0.4.1 out ASAP.

 Here is the list of changes:

    HIVE-884. Metastore Server should call System.exit() on error.
    (Zheng Shao via pchakka)

    HIVE-864. Fix map-join memory-leak.
    (Namit Jain via zshao)

    HIVE-878. Update the hash table entry before flushing in Group By
    hash aggregation (Zheng Shao via namit)

    HIVE-882. Create a new directory every time for scratch.
    (Namit Jain via zshao)

    HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)

    HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via 
 zshao)

    HIVE-883. URISyntaxException when partition value contains special chars.
    (Zheng Shao via namit)


 Please vote.

 --
 Yours,
 Zheng




 --
 My research interests are distributed systems, parallel computing and
 bytecode based virtual machine.

 My profile:
 http://www.linkedin.com/in/coderplay
 My blog:
 http://coderplay.javaeye.com




 --
 Yours,
 Zheng




-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com


[jira] Assigned: (HIVE-902) cli.sh can not correctly identify Hadoop minor version numbers less than 20

2009-11-01 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao reassigned HIVE-902:
---

Assignee: Carl Steinbach

 cli.sh can not correctly identify Hadoop minor version numbers less than 20
 ---

 Key: HIVE-902
 URL: https://issues.apache.org/jira/browse/HIVE-902
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: HIVE-902.1.patch, HIVE-902.2.patch


 cli.sh uses the following logic to detect the version of hadoop:
   version=$($HADOOP version | awk '{print $2;}');
   if [[ $version =~ "^0\.17" ]] || [[ $version =~ "^0\.18" ]] || [[ $version =~ "^0.19" ]]; then
   exec $HADOOP jar $AUX_JARS_CMD_LINE ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   else
   # hadoop 20 or newer - skip the aux_jars option. picked up from hiveconf
   exec $HADOOP jar ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   fi
 Apparently bash doesn't expect you to quote the regex:
 % ./bash -version
 GNU bash, version 4.0.0(1)-release (i386-apple-darwin9.8.0)
 % hadoop version
 Hadoop 0.19.0
 Subversion https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890
 Compiled by ndaley on Fri Nov 14 03:12:29 UTC 2008
 % version=$(hadoop version | awk '{print $2;}')
 % echo $version
 0.19.0 https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 by
 % [[ $version =~ "^0\.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ "^0.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ ^0.19 ]] && echo Yes || echo No
 Yes

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-902) cli.sh can not correctly identify Hadoop minor version numbers less than 20

2009-11-01 Thread Zheng Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zheng Shao updated HIVE-902:


       Resolution: Fixed
    Fix Version/s: 0.4.1, 0.5.0
     Release Note: HIVE-902. Fix cli.sh to work with hadoop versions less than 20. (Carl Steinbach via zshao)
     Hadoop Flags: [Reviewed]
           Status: Resolved  (was: Patch Available)

Committed. Thanks Carl!

 cli.sh can not correctly identify Hadoop minor version numbers less than 20
 ---

 Key: HIVE-902
 URL: https://issues.apache.org/jira/browse/HIVE-902
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.4.1, 0.5.0

 Attachments: HIVE-902.1.patch, HIVE-902.2.patch


 cli.sh uses the following logic to detect the version of hadoop:
   version=$($HADOOP version | awk '{print $2;}');
   if [[ $version =~ "^0\.17" ]] || [[ $version =~ "^0\.18" ]] || [[ $version =~ "^0.19" ]]; then
   exec $HADOOP jar $AUX_JARS_CMD_LINE ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   else
   # hadoop 20 or newer - skip the aux_jars option. picked up from hiveconf
   exec $HADOOP jar ${HIVE_LIB}/hive_cli.jar $CLASS $HIVE_OPTS "$@"
   fi
 Apparently bash doesn't expect you to quote the regex:
 % ./bash -version
 GNU bash, version 4.0.0(1)-release (i386-apple-darwin9.8.0)
 % hadoop version
 Hadoop 0.19.0
 Subversion https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 -r 713890
 Compiled by ndaley on Fri Nov 14 03:12:29 UTC 2008
 % version=$(hadoop version | awk '{print $2;}')
 % echo $version
 0.19.0 https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.19 by
 % [[ $version =~ "^0\.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ "^0.19" ]] && echo Yes || echo No
 No
 % [[ $version =~ ^0.19 ]] && echo Yes || echo No
 Yes

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.