Build failed in Hudson: Hive-trunk-h0.17 #465

2010-06-10 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/465/changes

Changes:

[athusoo] HIVE-1373. Missing connection pool plugin in Eclipse classpath.
(Vinithra via athusoo)

--
[...truncated 11410 lines...]
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_function4.q.out
 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_function4.q.out
[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/build/ql/test/logs/negative/unknown_table1.q.out
 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.17/ws/hive/ql/src/test/results/compiler/errors/unknown_table1.q.out
[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] 

Hudson build is back to normal : Hive-trunk-h0.18 #468

2010-06-10 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/468/changes




[jira] Updated: (HIVE-895) Add SerDe for Avro serialized data

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-895:


Assignee: (was: Carl Steinbach)

 Add SerDe for Avro serialized data
 --

 Key: HIVE-895
 URL: https://issues.apache.org/jira/browse/HIVE-895
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Jeff Hammerbacher

 As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
 data seems like a solid win.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HIVE-1170) Ivy looks for Hadoop POMs that don't exist

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1170.
--

Resolution: Not A Problem

This is not actually a bug as far as I can tell. Resolving as Not A Problem.

 Ivy looks for Hadoop POMs that don't exist
 --

 Key: HIVE-1170
 URL: https://issues.apache.org/jira/browse/HIVE-1170
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach

 In the event that Ivy cannot satisfy the shim dependencies using
 archive.apache.org, our ivysettings configuration causes it to
 look for Hadoop POMs. This will always fail, since Hadoop POMs do
 not exist (see HADOOP-6382).
 {noformat}
 ivy-retrieve-hadoop-source:
 [ivy:retrieve] :: Ivy 2.0.0-rc2 - 20081028224207 :: 
 http://ant.apache.org/ivy/ :
 :: loading settings :: file = /master/hive/ivy/ivysettings.xml
 [ivy:retrieve] :: resolving dependencies :: 
 org.apache.hadoop.hive#shims;working
 [ivy:retrieve]  confs: [default]
 [ivy:retrieve] :: resolution report :: resolve 953885ms :: artifacts dl 0ms
-
|  |modules||   artifacts   |
|   conf   | number| search|dwnlded|evicted|| number|dwnlded|
-
|  default |   1   |   0   |   0   |   0   ||   0   |   0   |
-
 [ivy:retrieve]
 [ivy:retrieve] :: problems summary ::
 [ivy:retrieve]  WARNINGS
 [ivy:retrieve]  module not found: hadoop#core;0.20.1
 [ivy:retrieve]   hadoop-source: tried
 [ivy:retrieve]-- artifact hadoop#core;0.20.1!hadoop.tar.gz(source):
 [ivy:retrieve]
 http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]   apache-snapshot: tried
 [ivy:retrieve]
 https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]-- artifact hadoop#core;0.20.1!hadoop.tar.gz(source):
 [ivy:retrieve]
 https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]   maven2: tried
 [ivy:retrieve]
 http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]-- artifact hadoop#core;0.20.1!hadoop.tar.gz(source):
 [ivy:retrieve]
 http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.tar.gz
 [ivy:retrieve]  ::
 [ivy:retrieve]  ::  UNRESOLVED DEPENDENCIES ::
 [ivy:retrieve]  ::
 [ivy:retrieve]  :: hadoop#core;0.20.1: not found
 [ivy:retrieve]  ::
 [ivy:retrieve]  ERRORS
 [ivy:retrieve]  Server access Error: Connection timed out 
 url=http://archive.apache.org/dist/hadoop/core/hadoop-0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]  Server access Error: Connection timed out 
 url=https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]  Server access Error: Connection timed out 
 url=https://repository.apache.org/content/repositories/snapshots/hadoop/core/0.20.1/hadoop-0.20.1.tar.gz
 [ivy:retrieve]  Server access Error: Connection timed out 
 url=http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.pom
 [ivy:retrieve]  Server access Error: Connection timed out 
 url=http://repo1.maven.org/maven2/hadoop/core/0.20.1/core-0.20.1.tar.gz
 [ivy:retrieve]
 [ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
 {noformat}
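The fallback behavior in the log above comes from the resolver chain in ivysettings.xml: the source-tarball resolver is tried first, and only on failure do the Maven-style resolvers run, which then look for POMs that Hadoop does not publish. A simplified, hypothetical sketch of such a chain (resolver names and patterns are illustrative, not the actual Hive settings file):

```xml
<ivysettings>
  <settings defaultResolver="resolver-chain"/>
  <resolvers>
    <chain name="resolver-chain" returnFirst="true">
      <!-- 1. try the source tarball on archive.apache.org -->
      <url name="hadoop-source">
        <artifact pattern="http://archive.apache.org/dist/hadoop/core/hadoop-[revision]/hadoop-[revision].tar.gz"/>
      </url>
      <!-- 2. fall back to Maven-style repos, which look for a POM
           that Hadoop does not publish (see HADOOP-6382) -->
      <ibiblio name="apache-snapshot" m2compatible="true"
               root="https://repository.apache.org/content/repositories/snapshots/"/>
      <ibiblio name="maven2" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>
```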




[jira] Resolved: (HIVE-1187) Implement ddldump utility for Hive Metastore

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1187.
--

Resolution: Duplicate

 Implement ddldump utility for Hive Metastore
 

 Key: HIVE-1187
 URL: https://issues.apache.org/jira/browse/HIVE-1187
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Metastore
Affects Versions: 0.6.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach

 Implement a ddldump utility for the Hive metastore that will generate the QL 
 DDL necessary to recreate the state of the current metastore on another 
 metastore instance.
 A major use case for this utility is migrating a metastore from one database 
 to another, e.g. from an embedded Derby instance to a MySQL 
 instance.
 The ddldump utility should support the following features:
 * Ability to generate DDL for specific tables or all tables.
 * Ability to specify a table name prefix for the generated DDL, which will be 
 useful for resolving table name conflicts.
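As a rough sketch of the proposed behavior (plain dicts stand in for metastore objects; the function name and signature are hypothetical, not an actual Hive API):

```python
# Hypothetical sketch of the proposed ddldump utility: given table
# metadata (plain dicts stand in for metastore objects), emit the
# CREATE TABLE statements needed to recreate them, optionally for a
# subset of tables and with a name prefix to avoid conflicts.
def ddldump(tables, names=None, prefix=""):
    """Generate DDL for the named tables (all tables if names is None)."""
    stmts = []
    for t in tables:
        if names is not None and t["name"] not in names:
            continue
        cols = ", ".join(f"{col} {typ}" for col, typ in t["columns"])
        stmts.append(f"CREATE TABLE {prefix}{t['name']} ({cols});")
    return stmts

tables = [
    {"name": "src", "columns": [("key", "STRING"), ("value", "STRING")]},
    {"name": "srcpart", "columns": [("key", "STRING"), ("ds", "STRING")]},
]
print(ddldump(tables, prefix="migrated_"))
```

The prefix argument corresponds to the conflict-resolution feature in the list above: prefixed DDL can be loaded into a target metastore without clobbering existing table names.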




[jira] Assigned: (HIVE-967) Implement show create table

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reassigned HIVE-967:
---

Assignee: Carl Steinbach

 Implement show create table
 -

 Key: HIVE-967
 URL: https://issues.apache.org/jira/browse/HIVE-967
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Adam Kramer
Assignee: Carl Steinbach

 SHOW CREATE TABLE would be very useful in cases where you are trying to 
 figure out the partitioning and/or bucketing scheme for a table. Perhaps this 
 could be implemented by having new tables automatically SET PROPERTIES 
 (create_command='raw text of the create statement')?
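The reporter's suggestion can be sketched as follows; the Catalog class and its methods are purely illustrative stand-ins, not Hive's metastore API:

```python
# Illustrative sketch of the suggestion: record the raw CREATE statement
# as a table property at creation time, so SHOW CREATE TABLE becomes a
# simple property lookup rather than a reconstruction from metadata.
class Catalog:
    def __init__(self):
        self.tables = {}

    def create_table(self, name, create_stmt):
        # stash the raw text, per the reporter's SET PROPERTIES idea
        self.tables[name] = {"properties": {"create_command": create_stmt}}

    def show_create_table(self, name):
        return self.tables[name]["properties"]["create_command"]

cat = Catalog()
stmt = "CREATE TABLE srcpart (key STRING) PARTITIONED BY (ds STRING)"
cat.create_table("srcpart", stmt)
print(cat.show_create_table("srcpart"))
```

The trade-off of this approach is that the stored text can drift from reality after ALTER TABLE, which reconstruction from current metadata would avoid.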




Build failed in Hudson: Hive-trunk-h0.19 #467

2010-06-10 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/467/changes

Changes:

[athusoo] HIVE-1373. Missing connection pool plugin in Eclipse classpath.
(Vinithra via athusoo)

--
[...truncated 14090 lines...]
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_function4.q.out
 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_function4.q.out
[junit] Done query: unknown_function4.q
[junit] Begin query: unknown_table1.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_table1.q.out
 
http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_table1.q.out
[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Loading data to table srcbucket2
[junit] 

[jira] Commented: (HIVE-1139) GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys

2010-06-10 Thread Soundararajan Velu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12877500#action_12877500
 ] 

Soundararajan Velu commented on HIVE-1139:
--

Ning, Aravind, I got this implemented and it looks good so far. I will try 
uploading my modified version after a thorough test. All I did was copy 
the HashMap implementation into the HashMapWrapper (leaving the existing 
functionality intact), so HashMapWrapper now works exactly like HashMap, but I 
have not yet tested the serialization issues; I will do that and update you. 
I think this should help with our OOM issue around GroupBy...

 GroupByOperator sometimes throws OutOfMemory error when there are too many 
 distinct keys
 

 Key: HIVE-1139
 URL: https://issues.apache.org/jira/browse/HIVE-1139
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Arvind Prabhakar

 When a partial aggregation is performed on a mapper, a HashMap is created to 
 keep all distinct keys in main memory. This can lead to an OOM exception when 
 there are too many distinct keys for a particular mapper. A workaround is to 
 set the map split size smaller so that each mapper processes fewer rows. 
 A better solution is to use the persistent HashMapWrapper (currently used in 
 CommonJoinOperator) to spill overflow rows to disk. 
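The proposed spill behavior can be sketched in miniature as follows. This is illustrative only; Hive's HashMapWrapper is a Java class with different internals, and the shelve-backed store here is just a stand-in for a persistent map:

```python
# Miniature sketch of the spill idea: keep up to max_in_memory entries
# in a dict and push overflow entries to an on-disk store (a shelve
# file here), so the aggregation can exceed main memory without an OOM.
import os
import shelve
import tempfile

class SpillingMap:
    def __init__(self, max_in_memory):
        self.max_in_memory = max_in_memory
        self.mem = {}
        self.disk = shelve.open(os.path.join(tempfile.mkdtemp(), "spill"))

    def put(self, key, value):
        if key in self.mem or len(self.mem) < self.max_in_memory:
            self.mem[key] = value          # still fits in main memory
        else:
            self.disk[key] = value         # spill overflow rows to disk

    def get(self, key):
        if key in self.mem:
            return self.mem[key]
        return self.disk.get(key)

m = SpillingMap(max_in_memory=2)
for i in range(5):
    m.put(f"k{i}", i)   # k0, k1 stay in memory; k2..k4 spill
```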




[jira] Commented: (HIVE-1139) GroupByOperator sometimes throws OutOfMemory error when there are too many distinct keys

2010-06-10 Thread Soundararajan Velu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12877502#action_12877502
 ] 

Soundararajan Velu commented on HIVE-1139:
--

To add, XMLEncoder/XMLDecoder work just fine and can handle our SerDe issues.

 GroupByOperator sometimes throws OutOfMemory error when there are too many 
 distinct keys
 

 Key: HIVE-1139
 URL: https://issues.apache.org/jira/browse/HIVE-1139
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Arvind Prabhakar

 When a partial aggregation is performed on a mapper, a HashMap is created to 
 keep all distinct keys in main memory. This can lead to an OOM exception when 
 there are too many distinct keys for a particular mapper. A workaround is to 
 set the map split size smaller so that each mapper processes fewer rows. 
 A better solution is to use the persistent HashMapWrapper (currently used in 
 CommonJoinOperator) to spill overflow rows to disk. 




[jira] Created: (HIVE-1399) Nested UDAFs cause Hive Internal Error (NullPointerException)

2010-06-10 Thread Mayank Lahiri (JIRA)
Nested UDAFs cause Hive Internal Error (NullPointerException)
-

 Key: HIVE-1399
 URL: https://issues.apache.org/jira/browse/HIVE-1399
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Mayank Lahiri
 Fix For: 0.6.0


This query does not make real-world sense, and I'm guessing it's not even 
supported by HQL/SQL, but I'm pretty sure it shouldn't be causing an 
internal error with a NullPointerException. The table normal has just one 
column, called val. I'm running on trunk, svn updated 5 minutes ago, ant clean package.

SELECT percentile(val, percentile(val, 0.5)) FROM normal;

FAILED: Hive Internal Error: java.lang.NullPointerException(null)
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:153)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:587)
at 
org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:708)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:6241)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2301)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2860)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5002)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5524)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6055)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)



I've also recreated this error with a GenericUDAF I'm writing, and also with 
the following:

SELECT percentile(val, percentile()) FROM normal;   
SELECT avg(variance(dob_year)) FROM somedata; // this makes no sense, but 
still a NullPointerException




[jira] Updated: (HIVE-705) Hive HBase Integration (umbrella)

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-705:


Summary: Hive HBase Integration (umbrella)  (was: Let Hive can analyse 
hbase's tables)

 Hive HBase Integration (umbrella)
 -

 Key: HIVE-705
 URL: https://issues.apache.org/jira/browse/HIVE-705
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.6.0
Reporter: Samuel Guo
Assignee: John Sichi
 Fix For: 0.6.0

 Attachments: hbase-0.19.3-test.jar, hbase-0.19.3.jar, 
 hbase-0.20.3-test.jar, hbase-0.20.3.jar, HIVE-705.1.patch, HIVE-705.2.patch, 
 HIVE-705.3.patch, HIVE-705.4.patch, HIVE-705.5.patch, HIVE-705.6.patch, 
 HIVE-705.7.patch, HIVE-705_draft.patch, HIVE-705_revision806905.patch, 
 HIVE-705_revision883033.patch, zookeeper-3.2.2.jar


 Add a SerDe over HBase tables so that Hive can easily analyse the data stored 
 in HBase.




[jira] Updated: (HIVE-1226) support filter pushdown against non-native tables

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1226:
-

Component/s: HBase Handler

 support filter pushdown against non-native tables
 -

 Key: HIVE-1226
 URL: https://issues.apache.org/jira/browse/HIVE-1226
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 For example, HBase's scan object can take filters.




[jira] Updated: (HIVE-705) Hive HBase Integration (umbrella)

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-705:


Component/s: HBase Handler

 Hive HBase Integration (umbrella)
 -

 Key: HIVE-705
 URL: https://issues.apache.org/jira/browse/HIVE-705
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: HBase Handler
Affects Versions: 0.6.0
Reporter: Samuel Guo
Assignee: John Sichi
 Fix For: 0.6.0

 Attachments: hbase-0.19.3-test.jar, hbase-0.19.3.jar, 
 hbase-0.20.3-test.jar, hbase-0.20.3.jar, HIVE-705.1.patch, HIVE-705.2.patch, 
 HIVE-705.3.patch, HIVE-705.4.patch, HIVE-705.5.patch, HIVE-705.6.patch, 
 HIVE-705.7.patch, HIVE-705_draft.patch, HIVE-705_revision806905.patch, 
 HIVE-705_revision883033.patch, zookeeper-3.2.2.jar


 Add a SerDe over HBase tables so that Hive can easily analyse the data stored 
 in HBase.




[jira] Updated: (HIVE-1267) Make CombineHiveInputFormat work with non-native tables

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1267:
-

Component/s: HBase Handler

 Make CombineHiveInputFormat work with non-native tables
 ---

 Key: HIVE-1267
 URL: https://issues.apache.org/jira/browse/HIVE-1267
 Project: Hadoop Hive
  Issue Type: Bug
  Components: HBase Handler, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 As part of fixing HIVE-1257, I am making CombineHiveInputFormat punt when it 
 sees a non-native table.  I need to come up with a real fix to allow 
 CombineHiveInputFormat to deal with native and non-native tables at the same 
 time.




[jira] Updated: (HIVE-758) function to load data from hive to hbase

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-758:


Component/s: UDF

 function to load data from hive to hbase
 

 Key: HIVE-758
 URL: https://issues.apache.org/jira/browse/HIVE-758
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: HBase Handler, UDF
Reporter: Raghotham Murthy
Priority: Minor
 Attachments: hive-758.1.patch, hive-758.2.patch


 support a query like: SELECT hbase_put('hive_hbase_table', rowid, colfamily, 
 col, value, ts) FROM src;




[jira] Updated: (HIVE-1133) Refactor InputFormat and OutputFormat for Hive

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1133:
-

Component/s: HBase Handler
 Serializers/Deserializers

 Refactor InputFormat and OutputFormat for Hive
 --

 Key: HIVE-1133
 URL: https://issues.apache.org/jira/browse/HIVE-1133
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Serializers/Deserializers
Affects Versions: 0.6.0
Reporter: Zheng Shao

 We have run into several problems with the FileInputFormat/OutputFormat in 
 Hive.
 The requirements are:
 R1. We want to support HBase: HIVE-806
 R2. We want to selectively include files based on file names: HIVE-951
 R3. We want to optionally choose to recurse on the directory structure: 
 HIVE-1083
 R4. We want to pass the filter condition into the storage (very useful for 
 HBase, and indexed data format)
 R5. We want to pass the column selection information into the storage 
 (already done as part of the RCFile, but we can do it better)
 We need to structure these requirements and the code structure in a good way 
 to make it extensible.




[jira] Updated: (HIVE-1222) in metastore, do not store names of inputformat/outputformat/serde for non-native tables

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1222:
-

Component/s: HBase Handler
 Metastore

 in metastore, do not store names of inputformat/outputformat/serde for 
 non-native tables
 

 Key: HIVE-1222
 URL: https://issues.apache.org/jira/browse/HIVE-1222
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Metastore, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 Instead, store null and get them dynamically from the storage handler.




[jira] Updated: (HIVE-1221) model storage handler as an attribute on StorageDescriptor

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1221:
-

Component/s: HBase Handler

 model storage handler as an attribute on StorageDescriptor
 --

 Key: HIVE-1221
 URL: https://issues.apache.org/jira/browse/HIVE-1221
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Metastore
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 For initial work on HIVE-705, I modeled storage handler as a table property, 
 but it should really be a first-class attribute on StorageDescriptor.  We'd 
 like to combine this metastore change with others such as HIVE-1073.




[jira] Updated: (HIVE-1223) support partitioning for non-native tables

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1223:
-

Component/s: HBase Handler

 support partitioning for non-native tables
 --

 Key: HIVE-1223
 URL: https://issues.apache.org/jira/browse/HIVE-1223
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Metastore, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 The exact requirements remain to be determined here, since there are a lot of 
 possibilities for what this could mean.  Using HBase as an example, one 
 possibility would be physical partitions such as creating one HBase table per 
 partition, whereas another would be virtual partitions such as one partition 
 per timestamp (e.g. to provide snapshot semantics).




[jira] Updated: (HIVE-1224) refine interaction between views / non-native tables and execution hooks

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1224:
-

Component/s: HBase Handler

 refine interaction between views / non-native tables and execution hooks
 

 Key: HIVE-1224
 URL: https://issues.apache.org/jira/browse/HIVE-1224
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 I need to take a look to see what information is being passed to pre/post 
 exec hooks for operations on views and non-native tables, and see if it is 
 correct and sufficient for all conceivable use cases.




[jira] Updated: (HIVE-1225) enhance storage handler interface to allow for atomic operations

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1225:
-

Component/s: HBase Handler

 enhance storage handler interface to allow for atomic operations
 

 Key: HIVE-1225
 URL: https://issues.apache.org/jira/browse/HIVE-1225
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 For native tables, we support atomic operations such as INSERT by only moving 
 files from tmp to the real location once the operation is complete.  Some 
 storage handlers may be able to support something equivalent; e.g. for HBase, 
 we could purge new timestamps if the operation fails.  Even if we don't go 
 all the way to two-phase-commit, we could at least enable something that 
 handles most simple cases.
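The tmp-then-move commit described above can be sketched like this; a minimal illustration of the pattern under stated assumptions (the function name and shape are hypothetical, not Hive's actual code):

```python
import os
import tempfile

def atomic_write(final_path, data):
    """Write data to a temp file in the target directory, then rename it
    into place. The rename is the atomic commit point, so readers never
    observe a partially written file -- the same idea as Hive moving
    files from tmp to the real location once an INSERT completes."""
    dir_name = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        os.replace(tmp_path, final_path)  # atomic on POSIX filesystems
    except Exception:
        os.unlink(tmp_path)  # on failure, purge the partial output
        raise
```

A failure before the rename leaves only the temp file behind, which is the analogue of purging new HBase timestamps when the operation fails.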




[jira] Updated: (HIVE-1397) histogram() UDAF for a numerical column

2010-06-10 Thread Mayank Lahiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Lahiri updated HIVE-1397:


Status: Patch Available  (was: Open)

I've implemented and tested the algorithm. I'm running some experiments on how 
far from optimal (in terms of MSE) this streaming algorithm gets, but so far it 
seems to perform well when the number of data points is a few orders of 
magnitude larger than the number of bins. For example, I'm getting good 
histograms with 100,000 data points and 20-80 histogram bins.

As I noted before, there are no approximation guarantees in terms of how close 
to optimal the histogram is.

 histogram() UDAF for a numerical column
 ---

 Key: HIVE-1397
 URL: https://issues.apache.org/jira/browse/HIVE-1397
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Mayank Lahiri
Assignee: Mayank Lahiri
 Fix For: 0.6.0

 Attachments: HIVE-1397.1.patch


 A histogram() UDAF to generate an approximate histogram of a numerical (byte, 
 short, double, long, etc.) column. The result is returned as a map of (x,y) 
 histogram pairs, and can be plotted in Gnuplot using impulses (for example). 
 The algorithm is currently adapted from "A Streaming Parallel Decision Tree 
 Algorithm" by Ben-Haim and Tom-Tov, JMLR 11 (2010), and uses space 
 proportional to the number of histogram bins specified. It has no 
 approximation guarantees, but seems to work well when there is a lot of data 
 and a large number (e.g. 50-100) of histogram bins specified.
 A typical call might be:
 SELECT histogram(val, 10) FROM some_table;
 where the result would be a histogram with 10 bins, returned as a Hive map 
 object.
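The core of the Ben-Haim/Tom-Tov streaming idea is to keep at most N (centroid, count) bins and merge the two closest centroids whenever a new point would exceed the budget. A rough sketch of that idea, not the patch's actual Java implementation (the function name is made up):

```python
def add_point(bins, x, max_bins):
    """Insert x into a sorted list of (centroid, count) bins, merging the
    two closest adjacent centroids when the bin budget is exceeded."""
    bins.append((x, 1))
    bins.sort(key=lambda b: b[0])
    if len(bins) > max_bins:
        # find the adjacent pair with the smallest centroid gap
        i = min(range(len(bins) - 1),
                key=lambda j: bins[j + 1][0] - bins[j][0])
        (c1, n1), (c2, n2) = bins[i], bins[i + 1]
        # merge into a count-weighted centroid, preserving total count
        bins[i:i + 2] = [((c1 * n1 + c2 * n2) / (n1 + n2), n1 + n2)]
    return bins

# build an approximate 4-bin histogram of 0..99
hist = []
for v in range(100):
    add_point(hist, float(v), 4)
```

Because merges only average neighbors, space stays proportional to the bin count, which matches the description above; as noted, there is no approximation guarantee.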




[jira] Updated: (HIVE-1227) factor TableSinkOperator out of existing FileSinkOperator

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1227:
-

Component/s: HBase Handler

 factor TableSinkOperator out of existing FileSinkOperator
 -

 Key: HIVE-1227
 URL: https://issues.apache.org/jira/browse/HIVE-1227
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 For non-native tables, a lot of the code in FileSinkOperator is irrelevant 
 and has to be bypassed.  It would be cleaner to factor out an 
 AbstractSinkOperator with subclasses FileSinkOperator and TableSinkOperator.
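The proposed factoring might look roughly like this; the class responsibilities are my reading of the description, and everything beyond the three class names mentioned above is illustrative:

```python
class AbstractSinkOperator:
    """Shared sink plumbing: row forwarding and counters."""
    def __init__(self):
        self.rows_written = 0

    def process(self, row):
        self.write(row)
        self.rows_written += 1

    def write(self, row):
        raise NotImplementedError


class FileSinkOperator(AbstractSinkOperator):
    """Native tables: buffer rows bound for files in a tmp directory."""
    def __init__(self):
        super().__init__()
        self.buffer = []

    def write(self, row):
        self.buffer.append(row)


class TableSinkOperator(AbstractSinkOperator):
    """Non-native tables: hand each row straight to the storage handler,
    bypassing all file/tmp-dir logic."""
    def __init__(self, handler):
        super().__init__()
        self.handler = handler

    def write(self, row):
        self.handler(row)
```

The point of the split is exactly what the description says: the non-native path no longer has to bypass file-oriented code it never needed.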




[jira] Updated: (HIVE-1220) accept TBLPROPERTIES on CREATE TABLE/VIEW

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1220:
-

Component/s: HBase Handler

 accept TBLPROPERTIES on CREATE TABLE/VIEW
 -

 Key: HIVE-1220
 URL: https://issues.apache.org/jira/browse/HIVE-1220
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Query Processor
Affects Versions: 0.5.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0

 Attachments: HIVE-1220.1.patch


 Currently, Hive only supports ALTER TABLE t SET TBLPROPERTIES, but does not 
 allow specification of table properties during CREATE TABLE.  We should allow 
 properties to be set at the time a table or view is created.  This is useful 
 in general, and in particular we want to use this so that storage handler 
 properties (see HIVE-705) unrelated to serdes can be specified here rather 
 than in SERDEPROPERTIES.  See also HIVE-1144 regarding views.




[jira] Updated: (HIVE-1240) support ALTER TABLE on non-native tables

2010-06-10 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1240:
-

Component/s: HBase Handler

 support ALTER TABLE on non-native tables
 

 Key: HIVE-1240
 URL: https://issues.apache.org/jira/browse/HIVE-1240
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: HBase Handler, Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.6.0


 Currently this is prohibited, but at least some cases make sense.




[jira] Created: (HIVE-1400) CombineHiveInputFormat does not set columns needed property

2010-06-10 Thread He Yongqiang (JIRA)
CombineHiveInputFormat does not set columns needed property
-

 Key: HIVE-1400
 URL: https://issues.apache.org/jira/browse/HIVE-1400
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: He Yongqiang


When I was testing a job, I found that CombineHiveInputFormat does not seem to 
pass the needed-columns list to the underlying reader. 
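For context on why this matters: the needed-columns list travels to the reader through the job configuration (via ColumnProjectionUtils, if I read the Hive code right), and a reader that never receives it falls back to materializing every column. A toy illustration of that fallback cost (names are hypothetical):

```python
def read_rows(rows, needed_cols=None):
    """Project each row down to the requested columns.

    If needed_cols is None -- i.e. the needed-columns property was never
    set on the job, as in the bug described above -- every column is
    materialized, which wastes I/O for column-oriented storage formats."""
    if needed_cols is None:
        return [dict(r) for r in rows]
    return [{c: r[c] for c in needed_cols if c in r} for r in rows]
```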




[jira] Updated: (HIVE-1397) histogram() UDAF for a numerical column

2010-06-10 Thread Mayank Lahiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mayank Lahiri updated HIVE-1397:


Attachment: HIVE-1397.1.patch

 histogram() UDAF for a numerical column
 ---

 Key: HIVE-1397
 URL: https://issues.apache.org/jira/browse/HIVE-1397
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Mayank Lahiri
Assignee: Mayank Lahiri
 Fix For: 0.6.0

 Attachments: HIVE-1397.1.patch


 A histogram() UDAF to generate an approximate histogram of a numerical (byte, 
 short, double, long, etc.) column. The result is returned as a map of (x,y) 
 histogram pairs, and can be plotted in Gnuplot using impulses (for example). 
 The algorithm is currently adapted from "A Streaming Parallel Decision Tree 
 Algorithm" by Ben-Haim and Tom-Tov, JMLR 11 (2010), and uses space 
 proportional to the number of histogram bins specified. It has no 
 approximation guarantees, but seems to work well when there is a lot of data 
 and a large number (e.g. 50-100) of histogram bins specified.
 A typical call might be:
 SELECT histogram(val, 10) FROM some_table;
 where the result would be a histogram with 10 bins, returned as a Hive map 
 object.




[jira] Created: (HIVE-1401) Web Interface can only browse default

2010-06-10 Thread Edward Capriolo (JIRA)
Web Interface can only browse default


 Key: HIVE-1401
 URL: https://issues.apache.org/jira/browse/HIVE-1401
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Web UI
Affects Versions: 0.5.0
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Fix For: 0.6.0







[jira] Updated: (HIVE-1401) Web Interface can only browse default

2010-06-10 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-1401:
--

Attachment: HIVE-1401-1-patch.txt

 Web Interface can only browse default
 

 Key: HIVE-1401
 URL: https://issues.apache.org/jira/browse/HIVE-1401
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Web UI
Affects Versions: 0.5.0
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Fix For: 0.6.0

 Attachments: HIVE-1401-1-patch.txt







Hive-Hbase integration problem, ask for help

2010-06-10 Thread Zhou Shuaifeng
Hi Guys,
 
I downloaded the Hive source from the SVN server, built it, and tried to run
the Hive-HBase integration.
 
It works well on all file-based Hive tables, but on the HBase-based tables
the 'insert' command can't run successfully. The 'select' command runs fine.
 
error info is below:
 
hive> INSERT OVERWRITE TABLE hive_zsf SELECT * FROM zsf WHERE id=3;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201006081948_0021, Tracking URL =
http://linux-01:50030/jobdetails.jsp?jobid=job_201006081948_0021
Kill Command = /opt/hadoop/hdfs/bin/../bin/hadoop job
-Dmapred.job.tracker=linux-01:9001 -kill job_201006081948_0021
2010-06-09 16:05:43,898 Stage-0 map = 0%,  reduce = 0%
2010-06-09 16:06:12,131 Stage-0 map = 100%,  reduce = 100%
Ended Job = job_201006081948_0021 with errors
 
Task with the most failures(4):
-
Task ID:
  task_201006081948_0021_m_00
 
URL:
  http://linux-01:50030/taskdetails.jsp?jobid=job_201006081948_0021&tipid=task_201006081948_0021_m_00
-
 
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.ExecDriver
 
 
 
 
I created an HBase-based table with Hive, put some data into the HBase table
through the HBase shell, and can select data from it through Hive:
 
CREATE TABLE hive_zsf1(id int, name string) ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:val")
TBLPROPERTIES ("hbase.table.name" = "hive_zsf1");
 
hbase(main):001:0> scan 'hive_zsf1'
ROW                  COLUMN+CELL
 1                   column=cf1:val, timestamp=1276157509028, value=zsf
 2                   column=cf1:val, timestamp=1276157539051, value=zzf
 3                   column=cf1:val, timestamp=1276157548247, value=zw
 4                   column=cf1:val, timestamp=1276157557115, value=cjl
4 row(s) in 0.0470 seconds
hbase(main):002:0>

hive> select * from hive_zsf1 where id=3;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201006081948_0038, Tracking URL =
http://linux-01:50030/jobdetails.jsp?jobid=job_201006081948_0038
Kill Command = /opt/hadoop/hdfs/bin/../bin/hadoop job
-Dmapred.job.tracker=linux-01:9001 -kill job_201006081948_0038
2010-06-11 10:25:42,049 Stage-1 map = 0%,  reduce = 0%
2010-06-11 10:25:45,090 Stage-1 map = 100%,  reduce = 0%
2010-06-11 10:25:48,133 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201006081948_0038
OK
3   zw
Time taken: 13.526 seconds
hive>

 




 


[jira] Updated: (HIVE-543) provide option to run hive in local mode

2010-06-10 Thread Joydeep Sen Sarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joydeep Sen Sarma updated HIVE-543:
---

Attachment: hive-543.patch.1

a few fixes for better local mode execution:
- provide an alternate log4j configuration for capturing the local mode 
execution log (defaulting to the hive log4j config if none is provided). this 
cleans up the goop on the cli but allows capturing execution-time logs in a 
separate location if desired
- bypass the distributed cache for local mode submissions. saves on hdfs time
  - some cleanup on the set/get MapRedWork code path. it seems to have been 
messed up after the parallel execution changes
- getMRScratchDir now returns a local scratch dir when executing in local 
mode, so we don't hit hdfs unnecessarily in local mode.
- fix to FileUtils.makeQualified because of the above. there was a subtle bug 
in it that was causing file paths to get messed up when using local paths for 
intermediate data
- bypassed query plan serialization/deserialization except for test mode. from 
past experience, xml serialization/deserialization is pretty expensive and it 
makes no sense to subject every query to it.


 provide option to run hive in local mode
 

 Key: HIVE-543
 URL: https://issues.apache.org/jira/browse/HIVE-543
 Project: Hadoop Hive
  Issue Type: Improvement
Reporter: Joydeep Sen Sarma
Assignee: Joydeep Sen Sarma
 Attachments: hive-543.patch.1


 this is a little bit more than just mapred.job.tracker=local
 when run in this mode, multiple concurrent jobs are an issue since they 
 write to the same tmp directories. the following options:
 hadoop.tmp.dir
 mapred.local.dir
 need to be randomized (perhaps based on the query id). 
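The randomization suggested above could be sketched like this; the function and directory layout are purely hypothetical, just to illustrate deriving per-query directories from a query id:

```python
import os

def local_scratch_dirs(base_tmp, query_id):
    """Derive per-query local directories so concurrent local-mode jobs
    don't collide on hadoop.tmp.dir / mapred.local.dir."""
    return {
        "hadoop.tmp.dir": os.path.join(base_tmp, query_id, "tmp"),
        "mapred.local.dir": os.path.join(base_tmp, query_id, "local"),
    }
```

Keying on the query id (rather than a random suffix) has the nice property that a query's scratch space is identifiable for cleanup afterwards.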




[jira] Updated: (HIVE-543) provide option to run hive in local mode

2010-06-10 Thread Joydeep Sen Sarma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joydeep Sen Sarma updated HIVE-543:
---

Status: Patch Available  (was: Open)

- mapred.job.tracker=local continues to be the way to set up hive local mode. 
- admins can provide appropriate mapred.local.dir and mapred.system.dir 
settings for hive clients (for local mode execution). they can do this either 
via hive configuration files or via hadoop client-side-only configuration 
files. for regular cluster jobs, these are controlled by hadoop server-side 
configuration files.

some of the cleanups regarding randomizing local/system directories etc. for 
concurrent queries were already in place (via HIVE-77).


 provide option to run hive in local mode
 

 Key: HIVE-543
 URL: https://issues.apache.org/jira/browse/HIVE-543
 Project: Hadoop Hive
  Issue Type: Improvement
Reporter: Joydeep Sen Sarma
Assignee: Joydeep Sen Sarma
 Attachments: hive-543.patch.1


 this is a little bit more than just mapred.job.tracker=local
 when run in this mode, multiple concurrent jobs are an issue since they 
 write to the same tmp directories. the following options:
 hadoop.tmp.dir
 mapred.local.dir
 need to be randomized (perhaps based on the query id). 
