[jira] Updated: (HIVE-1962) make a libthrift.jar and libfb303.jar in dist package for backward compatibility

2011-02-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1962:
-

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed. Thanks Ning!

> make a libthrift.jar and libfb303.jar in dist package for backward 
> compatibility
> 
>
> Key: HIVE-1962
> URL: https://issues.apache.org/jira/browse/HIVE-1962
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Fix For: 0.7.0
>
> Attachments: HIVE-1962.2.patch, HIVE-1962.patch
>
>
> We have seen an internal user that relies on Hive's distribution library 
> libthrift.jar. After the upgrade of thrift jar to 0.5.0, the jar file is 
> renamed to thrift-0.5.0.jar and similarly for the fb303 jar. We can ask the 
> user to change their dependency to thrift-0.5.0.jar, but later when we 
> upgrade to a new version, the dependency breaks again. It's desirable to 
> create a symlink in the dist/lib directory to link libthrift.jar to 
> thrift-${thrift.version}.jar and the same for fb303. 
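As a sketch of the proposed alias (not the actual build change; the helper class and method names here are hypothetical, only the dist/lib path and jar names come from the description), the symlink could be created at packaging time with java.nio.file:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class CompatLink {
    // Create (or refresh) a versionless alias pointing at the versioned jar,
    // e.g. dist/lib/libthrift.jar -> dist/lib/thrift-0.5.0.jar.
    static Path link(Path distLib, String alias, String versionedJar)
            throws IOException {
        Path target = distLib.resolve(versionedJar);
        Path linkPath = distLib.resolve(alias);
        Files.deleteIfExists(linkPath); // replace a stale link on rebuild
        return Files.createSymbolicLink(linkPath, target);
    }
}
```

An upgrade to a newer Thrift would then only change the versionedJar argument; dependents keep pointing at libthrift.jar.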

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1538) FilterOperator is applied twice with ppd on.

2011-02-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992886#comment-12992886
 ] 

Carl Steinbach commented on HIVE-1538:
--

Posted a review request here: https://reviews.apache.org/r/415/

Looks like the test diffs need to be regenerated, but the code changes apply 
cleanly.

@Yongqiang: Do you have time to finish this review?

> FilterOperator is applied twice with ppd on.
> 
>
> Key: HIVE-1538
> URL: https://issues.apache.org/jira/browse/HIVE-1538
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Amareshwari Sriramadasu
>Assignee: Amareshwari Sriramadasu
> Fix For: 0.7.0
>
> Attachments: patch-1538.txt
>
>
> With hive.optimize.ppd set to true, FilterOperator is applied twice. And it 
> seems second operator is always filtering zero rows.





Review Request: HIVE-1538: FilterOperator is applied twice with ppd on.

2011-02-09 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/415/
---

Review request for hive.


Summary
---

Review request for HIVE-1538


This addresses bug HIVE-1538.
https://issues.apache.org/jira/browse/HIVE-1538


Diffs
-

  trunk/contrib/src/test/results/clientpositive/dboutput.q.out 1038445 
  trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1038445 
  trunk/hbase-handler/src/test/results/hbase_pushdown.q.out 1038445 
  trunk/hbase-handler/src/test/results/hbase_queries.q.out 1038445 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1038445 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 1038445 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 
1038445 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1038445 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpWalkerInfo.java 1038445 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 
1038445 
  trunk/ql/src/test/results/clientpositive/auto_join0.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join11.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join12.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join13.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join14.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join16.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join19.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join20.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join21.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join23.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join4.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join5.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join6.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join7.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join8.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/auto_join9.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucket2.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucket3.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucket4.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/cast1.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/cluster.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/combine2.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/create_view.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 
1038445 
  trunk/ql/src/test/results/clientpositive/filter_join_breaktask.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/groupby_map_ppr.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 
1038445 
  trunk/ql/src/test/results/clientpositive/groupby_ppr.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/groupby_ppr_multi_distinct.q.out 
1038445 
  trunk/ql/src/test/results/clientpositive/implicit_cast1.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input11.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input11_limit.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input14.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input18.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input23.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input24.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input25.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input26.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input2_limit.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input31.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input39.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input42.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input6.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input9.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input_part1.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input_part5.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input_part6.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/input_part7.q.out 1038445 
  trunk/ql/src/test/results/clientpositive/i

[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-09 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992875#comment-12992875
 ] 

Siying Dong commented on HIVE-1517:
---

Looks like we have some trouble with printing the token location in error messages: new CommonToken(int, String) doesn't include location information, so the error message will always read "line 0:-1". Maybe we have to move away from this approach and use separate tokens for the db and table names.
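The symptom can be illustrated with a stand-in that mirrors ANTLR 3's CommonToken(int, String) defaults (illustrative only, not Hive's or ANTLR's actual code): a token synthesized from type and text alone carries no source position, so any error referencing it renders as "line 0:-1".

```java
// Stand-in for a token built via CommonToken(int type, String text):
// synthesized tokens get no source position, hence "line 0:-1" errors.
class SynthesizedToken {
    final int type;
    final String text;
    int line = 0;                 // never set for synthesized tokens
    int charPositionInLine = -1;  // "unknown position" default

    SynthesizedToken(int type, String text) {
        this.type = type;
        this.text = text;
    }

    String errorLocation() {
        return "line " + line + ":" + charPositionInLine;
    }
}
```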

> ability to select across a database
> ---
>
> Key: HIVE-1517
> URL: https://issues.apache.org/jira/browse/HIVE-1517
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Siying Dong
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
> HIVE-1517.3.patch
>
>
> After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
> able to select across a database for this feature to be useful.
> For eg:
> use db1
> create table foo();
> use db2
> select .. from db1.foo.





Build failed in Hudson: Hive-trunk-h0.20 #547

2011-02-09 Thread Apache Hudson Server
See 

Changes:

[namit] HIVE-1948 Add audit logging in the metastore
(Devaraj Das via namit)

--
[...truncated 25316 lines...]
[junit] 2011-02-09 21:10:55,934 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_21-10-53_352_5080872800286226991/-mr-1
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_21-10-56_816_9051942042303308962/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 21:10:59,514 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_21-10-56_816_9051942042303308962/-mr-1
[junit] OK
[junit] PREHOOK: query: select count(1) as c from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_21-10-59_709_2751416800073562522/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 21:11:02,451 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as c from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_21-10-59_709_2751416800073562522/-mr-1
[junit] OK
[junit] -  ---
[junit] 
[junit] Testcase: testExecute took 10.7 sec
[junit] Testcase: testNonHiveCommand took 1.027 sec
[junit] Testcase: testMetastore took 0.31 sec
[junit] Testcase: testGetClusterStatus took 0.099 sec
[junit] Testcase: testFetch took 9.362 sec
[junit] Testcase: testDynamicSerde took 6.528 sec

test-conditions:

gen-test:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

deplo

[jira] Commented: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992865#comment-12992865
 ] 

Carl Steinbach commented on HIVE-1981:
--

Review request: https://reviews.apache.org/r/412/


> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Carl Steinbach
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1981.1.patch.txt
>
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.





[jira] Commented: (HIVE-1788) Add more calls to the metastore thrift interface

2011-02-09 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992848#comment-12992848
 ] 

Paul Yang commented on HIVE-1788:
-

Instead of returning a list of Table objects, could we return a list of the matching table names? Then the user would be responsible for fetching the table objects they need. Also, have you tried measuring the speed of the call when there are many (1000+) tables? It might be very slow, similar to how get_partitions() performs poorly compared to get_partition_names().

Also, with the current approach, won't the offsets be inconsistent if new tables are created between calls?
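The names-first pattern suggested above might look like the following sketch (the MetaClient interface here is hypothetical shorthand for illustration, not the real metastore thrift API):

```java
import java.util.ArrayList;
import java.util.List;

class TableFetch {
    // Hypothetical minimal client surface; the real metastore client differs.
    interface MetaClient {
        List<String> getTableNamesByOwner(String owner);   // cheap: names only
        Object getTable(String dbName, String tableName);  // fetched on demand
    }

    // Return only matching names; the caller fetches full Table objects
    // lazily, avoiding one huge get-all-objects call over 1000+ tables.
    static List<String> matchingTables(MetaClient c, String owner) {
        return new ArrayList<>(c.getTableNamesByOwner(owner));
    }
}
```

Because the name list is a cheap snapshot, the caller can page through it client-side without server-side offsets drifting as tables are created.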

> Add more calls to the metastore thrift interface
> 
>
> Key: HIVE-1788
> URL: https://issues.apache.org/jira/browse/HIVE-1788
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ashish Thusoo
>Assignee: Ashish Thusoo
> Attachments: HIVE-1788.txt
>
>
> For administrative purposes the following calls to the metastore thrift 
> interface would be very useful:
> 1. Get the table metadata for all the tables owned by a particular users
> 2. Ability to iterate over this set of tables
> 3. Ability to change a particular key value property of the table





[jira] Commented: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992827#comment-12992827
 ] 

Ning Zhang commented on HIVE-1981:
--

I will take a look.

> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Carl Steinbach
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1981.1.patch.txt
>
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.





[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-09 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992823#comment-12992823
 ] 

Siying Dong commented on HIVE-1517:
---

For easier browsing of the patch: https://reviews.apache.org/r/413/diff/

> ability to select across a database
> ---
>
> Key: HIVE-1517
> URL: https://issues.apache.org/jira/browse/HIVE-1517
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Siying Dong
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
> HIVE-1517.3.patch
>
>
> After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
> able to select across a database for this feature to be useful.
> For eg:
> use db1
> create table foo();
> use db2
> select .. from db1.foo.





[jira] Updated: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1981:
-

 Priority: Blocker  (was: Major)
Fix Version/s: 0.7.0

> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Carl Steinbach
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1981.1.patch.txt
>
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.





[jira] Updated: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1981:
-

Status: Patch Available  (was: Open)

> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Carl Steinbach
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1981.1.patch.txt
>
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.





Review Request: HIVE-1981: TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/412/
---

Review request for hive.


Summary
---

This is a fix for HIVE-1981. The patch also does some general cleanup on the Ant build files and adds ASF headers to the various ivy.xml files.


This addresses bug HIVE-1981.
https://issues.apache.org/jira/browse/HIVE-1981


Diffs
-

  ant/build.xml 17b3c3a 
  ant/ivy.xml 5c6299b 
  build-common.xml 6bf5d3c 
  build.xml 02132b8 
  cli/build.xml 2d16b91 
  cli/ivy.xml 86ad1ee 
  common/build.xml d9ac07e 
  common/ivy.xml 7c8cb75 
  contrib/build.xml 67948ca 
  contrib/ivy.xml 899ca70 
  hbase-handler/build.xml 88c227a 
  hbase-handler/ivy.xml 899ca70 
  hwi/build.xml 76bffa8 
  hwi/ivy.xml e6d0de5 
  ivy.xml d3ed592 
  ivy/libraries.properties 0ede62a 
  jdbc/build.xml 9d9a59e 
  jdbc/ivy.xml PRE-CREATION 
  metastore/build.xml 7f01f91 
  odbc/build.xml db9f4af 
  ql/build.xml 50c604e 
  ql/ivy.xml c25fa51 
  serde/build.xml 51ac8dd 
  service/build.xml 3c54625 
  service/ivy.xml PRE-CREATION 
  shims/build.xml 2021cfb 
  shims/ivy.xml 82b6688 

Diff: https://reviews.apache.org/r/412/diff


Testing
---


Thanks,

Carl



[jira] Updated: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1981:
-

Attachment: HIVE-1981.1.patch.txt

> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Carl Steinbach
> Attachments: HIVE-1981.1.patch.txt
>
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.





[jira] Commented: (HIVE-1803) Implement bitmap indexing in Hive

2011-02-09 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992822#comment-12992822
 ] 

John Sichi commented on HIVE-1803:
--

Some review comments:

* Need to factor out all of the code duplicated from the compact index handler; share it in package org.apache.hadoop.hive.ql.index. Use abstract classes in cases where behavior needs to be overridden; otherwise, just share concrete classes there.
* If we're going to publish the new UDFs as visible out of the box (not just internal to the index implementation), then they need unit tests of their own, as well as some documentation about the representation they operate on (maybe best to wait and see how compression shakes out first). Also, the ones that turn out not to be generally applicable need to be named more specifically.
* For dense bitmaps, I think you can probably use java.util.BitSet instead of rolling so much of your own (at least where you have control over the bit array representation).
* The name attribute in the annotation for GenericUDAFCollectBitmapSet is incorrect.
* In HiveIndex.java, the symbol should be just BITMAP_TABLE (not BITMAP_SUMMARY_TABLE), since the bitmap is actually quite detailed.
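For the dense-bitmap suggestion, a minimal sketch of how java.util.BitSet could cover the combining logic (the AND-of-row-bitmaps framing is illustrative, not the patch's actual code):

```java
import java.util.BitSet;

class DenseBitmap {
    // Intersect two row bitmaps, as an index would when ANDing predicates.
    // BitSet handles the word-level bit twiddling internally; inputs are
    // left untouched by cloning the first operand.
    static BitSet and(BitSet a, BitSet b) {
        BitSet out = (BitSet) a.clone();
        out.and(b);
        return out;
    }
}
```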


> Implement bitmap indexing in Hive
> -
>
> Key: HIVE-1803
> URL: https://issues.apache.org/jira/browse/HIVE-1803
> Project: Hive
>  Issue Type: New Feature
>Reporter: Marquis Wang
>Assignee: Marquis Wang
> Attachments: HIVE-1803.1.patch, HIVE-1803.2.patch, 
> bitmap_index_1.png, bitmap_index_2.png
>
>
> Implement bitmap index handler to complement compact indexing.





[jira] Updated: (HIVE-1517) ability to select across a database

2011-02-09 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-1517:
--

Attachment: HIVE-1517.3.patch

Modified based on Carl's previous patch:
1. Fix a concurrency issue by locking the DB for every table too.
2. Fix CREATE TABLE AS and CREATE TABLE LIKE; also make CREATE TABLE AS share the same code as CREATE TABLE for generating the default path.
3. Fix TABLESAMPLE, DROP TABLE, ALTER TABLE DROP PARTITION, etc.
4. Fix handling of identical table names in different databases in the same query (the PartitionPruner's key problem).
5. Fix a character-escaping problem.
6. Fix ANALYZE TABLE, which mistakenly allowed xxx.xxx.
7. Fix DynamicSerDe just to get the unit test to pass.
8. Lots of test case result changes, since in DESCRIBE EXTENDED table names are now printed as default.tab instead of tab. The possible way to avoid printing "default" (to make fewer changes to the test outputs) seemed riskier, so I didn't take it; I did, however, implement it on my machine to make sure those test output modifications are right.

Some issues:
1. I couldn't figure out why Carl's patch modified SemanticAnalyzer.processTable() to stop using aliasIndex, so I reverted that change.
2. DESCRIBE on a foreign table is not supported, and we don't give a good error message.
3. `db.tab` is not blocked so far, since I found it's a pretty complicated issue that might need more thought.
4. UnparseTranslator has become a little ugly now that it unescapes identifiers. I found it's really hard to keep Carl's syntax rules and still support this cleanly. Maybe, as a follow-up, we should make DB and table names two different tokens instead of passing 'xxx.xxx' to the semantic analyzer.

I'm still running the test suites from beginning to end, so something may still be broken.

> ability to select across a database
> ---
>
> Key: HIVE-1517
> URL: https://issues.apache.org/jira/browse/HIVE-1517
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Siying Dong
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt, 
> HIVE-1517.3.patch
>
>
> After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
> able to select across a database for this feature to be useful.
> For eg:
> use db1
> create table foo();
> use db2
> select .. from db1.foo.





Build failed in Hudson: Hive-trunk-h0.20 #546

2011-02-09 Thread Apache Hudson Server
See 

Changes:

[jvs] HIVE-1969. TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing
on trunk
(Ning Zhang via jvs)

--
[...truncated 25826 lines...]
[junit] 2011-02-09 16:01:01,444 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_16-00-58_854_4269874519992694384/-mr-1
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_16-01-02_192_5114616326688557771/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 16:01:04,854 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_16-01-02_192_5114616326688557771/-mr-1
[junit] OK
[junit] PREHOOK: query: select count(1) as c from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_16-01-05_099_8661859615395556405/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 16:01:07,782 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as c from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_16-01-05_099_8661859615395556405/-mr-1
[junit] OK
[junit] -  ---
[junit] 
[junit] Testcase: testExecute took 10.142 sec
[junit] Testcase: testNonHiveCommand took 0.793 sec
[junit] Testcase: testMetastore took 0.278 sec
[junit] Testcase: testGetClusterStatus took 0.137 sec
[junit] Testcase: testFetch took 9.491 sec
[junit] Testcase: testDynamicSerde took 6.356 sec

test-conditions:

gen-test:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set t

[jira] Resolved: (HIVE-1972) HiveResultset is always returning null for Array Data types in the select Query

2011-02-09 Thread Steven Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Wong resolved HIVE-1972.
---

Resolution: Duplicate

Should have been fixed by HIVE-1378.

> HiveResultset is always returning null for Array Data types in the select 
> Query
> ---
>
> Key: HIVE-1972
> URL: https://issues.apache.org/jira/browse/HIVE-1972
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.5.0
> Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
> Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
>Priority: Minor
>
> Execute the following Hive Queries 
> {noformat}
> 1) create table samplearray(a int,b int,c array)row format delimited 
> fields terminated by '@' collection items terminated by '$' stored as 
> textfile;
> 2) LOAD DATA INPATH '/user/dataloc/details3.txt' OVERWRITE INTO TABLE 
> samplearray
> 3) Now execute the select statement "select c from emp;" using HiveStatement 
> API
> 4) Now Iterate through the returned HiveResultSet, the array column is always 
> null.
> {noformat}





[jira] Updated: (HIVE-1965) Auto convert mapjoin should not throw exception if the top operator is union operator.

2011-02-09 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1965:
---

Attachment: HIVE-1965.1.patch

a quick fix. will run tests...

> Auto convert mapjoin should not throw exception if the top operator is union 
> operator.
> --
>
> Key: HIVE-1965
> URL: https://issues.apache.org/jira/browse/HIVE-1965
> Project: Hive
>  Issue Type: Bug
>Reporter: He Yongqiang
> Attachments: HIVE-1965.1.patch
>
>


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1962) make a libthrift.jar and libfb303.jar in dist package for backward compatibility

2011-02-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992788#comment-12992788
 ] 

Ning Zhang commented on HIVE-1962:
--

Carl, any progress on this? Thanks

> make a libthrift.jar and libfb303.jar in dist package for backward 
> compatibility
> 
>
> Key: HIVE-1962
> URL: https://issues.apache.org/jira/browse/HIVE-1962
> Project: Hive
>  Issue Type: Improvement
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1962.2.patch, HIVE-1962.patch
>
>
> We have seen an internal user who relies on Hive's distribution library 
> libthrift.jar. After the upgrade of the thrift jar to 0.5.0, the jar file is 
> renamed to thrift-0.5.0.jar, and similarly for the fb303 jar. We can ask the 
> user to change their dependency to thrift-0.5.0.jar, but later, when we 
> upgrade to a new version, the dependency breaks again. It's desirable to 
> create a symlink in the dist/lib directory linking libthrift.jar to 
> thrift-${thrift.version}.jar, and the same for fb303. 
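A minimal sketch of the symlink idea in Python (illustrative only — the actual fix would live in the Ant build, and the jar version used here is an assumption, not the value of ${thrift.version}):

```python
import os

def link_unversioned(lib_dir, version="0.5.0"):
    """Create stable unversioned names, e.g. libthrift.jar -> thrift-0.5.0.jar,
    so a downstream dependency on libthrift.jar survives version upgrades."""
    for base in ("thrift", "fb303"):
        target = "%s-%s.jar" % (base, version)            # versioned jar in dist/lib
        link = os.path.join(lib_dir, "lib%s.jar" % base)
        if os.path.lexists(link):                         # drop a stale link first
            os.remove(link)
        os.symlink(target, link)                          # relative symlink
```

On the next version bump only the `version` argument changes; consumers keep depending on libthrift.jar.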

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1941) support explicit view partitioning

2011-02-09 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1941:
-

Attachment: HIVE-1941.3.patch

Running HIVE-1941.3.patch through tests now.

> support explicit view partitioning
> --
>
> Key: HIVE-1941
> URL: https://issues.apache.org/jira/browse/HIVE-1941
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: John Sichi
> Attachments: HIVE-1941.1.patch, HIVE-1941.2.patch, HIVE-1941.3.patch
>
>
> Allow creation of a view with an explicit partitioning definition, and 
> support ALTER VIEW ADD/DROP PARTITION for instantiating partitions.
> For more information, see
> http://wiki.apache.org/hadoop/Hive/PartitionedViews

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Resolved: (HIVE-1958) Not able to get the output for Map datatype using ResultSet

2011-02-09 Thread Steven Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Wong resolved HIVE-1958.
---

Resolution: Duplicate

Should have been fixed by HIVE-1378.

> Not able to get the output for Map datatype using ResultSet
> ---
>
> Key: HIVE-1958
> URL: https://issues.apache.org/jira/browse/HIVE-1958
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.5.0
> Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
> Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
>Priority: Minor
>
> Not able to retrieve the data for the Map data type using the HiveResultSet APIs.
> Ex:
> {noformat}
> create table maptable(details map) row format delimited map keys 
> terminated by '#';
> load data LOCAL inpath '/home/chinna/maptest.txt' overwrite into table 
> maptable;
> {noformat}
> *Input Data*
> {noformat}
> key1#100
> key2#200
> {noformat}
> Retrieved using resultset API's
> *Output*
> {noformat}
> Row of map {}
> Row of map {}
> {noformat}

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Resolved: (HIVE-1957) Complete data could not be retrieved using ResultSet API's when some of the input record column values are blank.

2011-02-09 Thread Steven Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Wong resolved HIVE-1957.
---

Resolution: Duplicate

Should have been fixed by HIVE-1378.

> Complete data could not be retrieved using ResultSet API's when some of the 
> input record column values are blank. 
> --
>
> Key: HIVE-1957
> URL: https://issues.apache.org/jira/browse/HIVE-1957
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.5.0
> Environment: Hadoop 0.20.1, Hive0.5.0 and SUSE Linux Enterprise 
> Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
>
> Complete data could not be retrieved using the ResultSet APIs when some of the 
> input record column values are blank. In CLI mode, all the data is retrieved 
> as expected, but with ResultSet.next(), not all the data is retrieved.
> Ex:
> create table emp(empno String,ename String,deptno int) row format delimited 
> fields terminated by '@';
> select aemp.empno from emp aemp;
> *Input Data*
> {noformat}
> 333@chinna@20
> 444@@40
> @rao@50
> 555@lalam@40
> 666@jagan@78
> {noformat}
> *Actual output*
> {noformat}
> 333
> 444
> {noformat}
> *Expected output*
> {noformat}
> 333
> 444
> 555
> 666
> {noformat}
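The bug above comes down to how empty delimited fields are treated; as a minimal non-Hive illustration, a plain split on the delimiter keeps empty fields, so no rows should be dropped:

```python
# The sample rows from the report; '@' is the field delimiter.
rows = ["333@chinna@20", "444@@40", "@rao@50", "555@lalam@40", "666@jagan@78"]

# str.split keeps empty fields ('444@@40' -> ['444', '', '40']), so every
# input row still yields an empno value, blank or not.
empno = [r.split("@")[0] for r in rows]
assert empno == ["333", "444", "", "555", "666"]
```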

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-818. Create a Hive CLI that connects to hive ThriftServer

2011-02-09 Thread Ning Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/407/
---

(Updated 2011-02-09 15:11:19.773829)


Review request for hive.


Changes
---

Updated the diff according to the new patch. 


Summary
---

Copied from JIRA HIVE-818: 

This patch does the following:

- add 2 options (-h, -p) to the CLI to specify the hostname and port of the 
Hive server.
- change the HiveServer to output non-Hive commands (non-Driver) to a temp file, 
and change the fetchOne/fetchN/fetchAll functions to get results from the temp 
file.
- change the fetchOne function to throw a HiveServerException (error code 0) when 
reaching the end of the result set, rather than sending an empty string.

Caveats:

- session.err from the HiveServer is still not sent back to the client, so the 
progress of a Hadoop job is not shown on the client side in remote mode (I 
think there is a JIRA open for this already; if not, I will file a follow-up 
JIRA).
- no end-to-end unit test for remote mode yet. I manually tested HiveServer and 
the CLI in remote mode (set/dfs/SQL commands) and in combination with the -e/-f 
options. I will file a follow-up JIRA for creating a unit test suite for the 
remote-mode CLI.


Diffs (updated)
-

  trunk/build.xml 1069164 
  trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java 1069164 
  trunk/cli/src/java/org/apache/hadoop/hive/cli/CliSessionState.java 1069164 
  trunk/cli/src/java/org/apache/hadoop/hive/cli/OptionsProcessor.java 1069164 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1069164 
  trunk/service/if/hive_service.thrift 1069164 
  trunk/service/src/gen/thrift/gen-cpp/ThriftHive.h 1069164 
  trunk/service/src/gen/thrift/gen-cpp/ThriftHive.cpp 1069164 
  trunk/service/src/gen/thrift/gen-cpp/ThriftHive_server.skeleton.cpp 1069164 
  
trunk/service/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/service/ThriftHive.java
 1069164 
  trunk/service/src/gen/thrift/gen-php/hive_service/ThriftHive.php 1069164 
  trunk/service/src/gen/thrift/gen-py/hive_service/ThriftHive-remote 1069164 
  trunk/service/src/gen/thrift/gen-py/hive_service/ThriftHive.py 1069164 
  trunk/service/src/gen/thrift/gen-rb/thrift_hive.rb 1069164 
  trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1069164 

Diff: https://reviews.apache.org/r/407/diff


Testing
---

Passed all unit tests. 

Also manually tested HiveServer and CLI remote mode by:
 1) $ hive --service hiveserver
 2) in another terminal: hive -h localhost
 3) tested the following command:
- set; -- get all parameters
- set hive.stats.autogather;  -- check default parameter value
- set hive.stats.autogather=false;  -- change parameter value
- set hive.stats.autogather;  -- check parameter value got changed
- select * from src;  -- Hive query but no Hadoop job
- select count(*) from src; -- Hive query and Hadoop job is created
- select k from src; -- negative test case where SemanticAnalyzer throw an 
exception
- show partitions srcpart;  -- Hive Query but no hadoop job
- explain select count(*) from srcpart where ds is not null; -- explain 
query


Thanks,

Ning



[jira] Updated: (HIVE-818) Create a Hive CLI that connects to hive ThriftServer

2011-02-09 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-818:


Attachment: Hive-881_2.patch

Resolved some conflicts with the current trunk. 

> Create a Hive CLI that connects to hive ThriftServer
> 
>
> Key: HIVE-818
> URL: https://issues.apache.org/jira/browse/HIVE-818
> Project: Hive
>  Issue Type: New Feature
>  Components: Clients, Server Infrastructure
>Reporter: Edward Capriolo
>Assignee: Ning Zhang
> Attachments: HIVE-818.patch, Hive-881_2.patch
>
>
> We should have an alternate CLI that works by interacting with the 
> HiveServer, in this way it will be ready when/if we deprecate the current CLI.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992774#comment-12992774
 ] 

Carl Steinbach commented on HIVE-1517:
--

> `db.table` should be considered illegal. Is that right?

I think that's correct. See the following page for a complete discussion: 
http://dev.mysql.com/doc/refman/5.0/en/identifiers.html

It also looks like we need to tighten up the way the grammar handles 
Identifiers:

{code}
Identifier
:
(Letter | Digit) (Letter | Digit | '_')*
| '`' RegexComponent+ '`'
;

RegexComponent
: 'a'..'z' | 'A'..'Z' | '0'..'9' | '_'
| PLUS | STAR | QUESTION | MINUS | DOT
| LPAREN | RPAREN | LSQUARE | RSQUARE | LCURLY | RCURLY
| BITWISEXOR | BITWISEOR | DOLLAR
;
{code}

Defining quoted identifiers in terms of RegexComponent permits a lot of illegal 
characters.
I think we actually want something like this:


{code}
Identifier
: (Letter | Digit) (Letter | Digit | '_')*
| '`' (Letter | Digit) (Letter | Digit | '_')* '`'
;
{code}
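As a rough illustration (Python regexes approximating the two lexer rules above — not the actual ANTLR grammar), the RegexComponent form accepts a dot inside backquotes, so `db.table` lexes as one identifier, while the tightened form rejects it:

```python
import re

# Approximations of the two quoted-identifier rules above (illustrative only).
loose = re.compile(r"`[A-Za-z0-9_+*?\-.()\[\]{}^|$]+`")  # '`' RegexComponent+ '`'
tight = re.compile(r"`[A-Za-z0-9][A-Za-z0-9_]*`")        # tightened rule

assert loose.fullmatch("`db.table`") is not None  # loose rule accepts the dot
assert tight.fullmatch("`db.table`") is None      # tightened rule rejects it
assert tight.fullmatch("`table1`") is not None    # plain names still match
```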

> ability to select across a database
> ---
>
> Key: HIVE-1517
> URL: https://issues.apache.org/jira/browse/HIVE-1517
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Siying Dong
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt
>
>
> After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
> able to select across a database for this feature to be useful.
> For eg:
> use db1
> create table foo();
> use db2
> select .. from db1.foo.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-02-09 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992773#comment-12992773
 ] 

John Sichi commented on HIVE-1078:
--

After HIVE-1941, it's important for CREATE OR REPLACE VIEW to preserve any 
existing view partitions.  Since we don't store any column definitions with the 
view partitions, this should be trivial.


> CREATE VIEW followup:  CREATE OR REPLACE
> 
>
> Key: HIVE-1078
> URL: https://issues.apache.org/jira/browse/HIVE-1078
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: John Sichi
>
> Currently, replacing a view requires
> DROP VIEW v;
> CREATE VIEW v AS ;
> CREATE OR REPLACE would allow these to be combined into a single operation.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1950) Block merge for RCFile

2011-02-09 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1950:
---

Attachment: HIVE-1950.3.patch

> Block merge for RCFile
> --
>
> Key: HIVE-1950
> URL: https://issues.apache.org/jira/browse/HIVE-1950
> Project: Hive
>  Issue Type: New Feature
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-1950.1.patch, HIVE-1950.2.patch, HIVE-1950.3.patch
>
>
> In our env, there are a lot of small files inside one partition/table. In 
> order to reduce the namenode load, we have one dedicated housekeeping job 
> running to merge these files. Right now the merge is an 'insert overwrite' in 
> Hive, and requires decompressing and recompressing the data. This jira adds a 
> command in Hive to do the merge without decompressing and recompressing the 
> data.
> Something like "alter table tbl_name [partition ()] merge files". In this 
> jira the new command will only support RCFile, since it needs some new APIs 
> in the file format.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-1950

2011-02-09 Thread Yongqiang He

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/388/
---

(Updated 2011-02-09 14:58:18.003664)


Review request for hive.


Changes
---

update based on the new diff.


Summary
---

early review


This addresses bug HIVE-1950.
https://issues.apache.org/jira/browse/HIVE-1950


Diffs (updated)
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ConditionalTask.java 1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
1067716 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHook.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1067716 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Throttle.java 1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 
1068083 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 1068083 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java 1067036 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 
PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java
 PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeOutputFormat.java
 PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java
 PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java
 PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
 PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileValueBufferWrapper.java
 PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1067036 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/AlterTablePartMergeFilesDesc.java
 PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1067036 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 1067036 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1068083 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/StatsWork.java 1067036 
  trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java 1067036 
  trunk/ql/src/test/queries/clientnegative/merge_negative_1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/merge_negative_2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/alter_merge.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/alter_merge_stats.q PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/merge_negative_1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/merge_negative_2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/alter_merge.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/alter_merge_stats.q.out PRE-CREATION 
  trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 
1069093 
  trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 
1069093 
  trunk/shims/src/common/java/org/apache/hadoop/hive/shims/CombineHiveKey.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/388/diff


Testing
---


Thanks,

Yongqiang



[jira] Commented: (HIVE-1517) ability to select across a database

2011-02-09 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992761#comment-12992761
 ] 

Siying Dong commented on HIVE-1517:
---

Double-checking character escaping: for db.table, we want the correct escaped 
form to be `db`.`table`, and `db.table` should be considered illegal. Is that 
right?

> ability to select across a database
> ---
>
> Key: HIVE-1517
> URL: https://issues.apache.org/jira/browse/HIVE-1517
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Siying Dong
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1517.1.patch.txt, HIVE-1517.2.patch.txt
>
>
> After  https://issues.apache.org/jira/browse/HIVE-675, we need a way to be 
> able to select across a database for this feature to be useful.
> For eg:
> use db1
> create table foo();
> use db2
> select .. from db1.foo.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1980) Merging using mapreduce rather than map-only job failed in case of dynamic partition inserts

2011-02-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992753#comment-12992753
 ] 

Ning Zhang commented on HIVE-1980:
--

I remember that this case is not supported by design; we should throw an error 
in the SemanticAnalyzer. 

In the first dynamic partition insert patch we disabled merge completely 
because, when HiveInputFormat is used (e.g. in Hadoop 0.17), the partition 
columns are not passed to the reducer (part of the partition columns exist in 
the HDFS directory). So the reducer will create one file that may mix data from 
different partitions. In HIVE-1307 we enabled merge for CombineHiveInputFormat. 
However, we should disable merge for dynamic partition inserts using 
HiveInputFormat. 

> Merging using mapreduce rather than map-only job failed in case of dynamic 
> partition inserts
> 
>
> Key: HIVE-1980
> URL: https://issues.apache.org/jira/browse/HIVE-1980
> Project: Hive
>  Issue Type: Bug
>Reporter: Ning Zhang
>Assignee: Ning Zhang
>
> In a dynamic partition insert, if merge is set to true and 
> hive.mergejob.maponly=false, the merge MapReduce job will fail. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1981:
-

Component/s: Build Infrastructure
   Assignee: Carl Steinbach  (was: Devaraj Das)

> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Reporter: Ning Zhang
>Assignee: Carl Steinbach
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992747#comment-12992747
 ] 

Carl Steinbach commented on HIVE-1981:
--

This is probably caused by my patch for HIVE-1970. I'll take a look.

> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>Reporter: Ning Zhang
>Assignee: Devaraj Das
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1307) More generic and efficient merge method

2011-02-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992741#comment-12992741
 ] 

Ning Zhang commented on HIVE-1307:
--

Sorry I missed your comment, Ted. Yes, this is a bug, and it was fixed in the 
current trunk (0.7-SNAPSHOT). 

> More generic and efficient merge method
> ---
>
> Key: HIVE-1307
> URL: https://issues.apache.org/jira/browse/HIVE-1307
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Fix For: 0.6.0
>
> Attachments: HIVE-1307.0.patch, HIVE-1307.2.patch, HIVE-1307.3.patch, 
> HIVE-1307.3_java.patch, HIVE-1307.4.patch, HIVE-1307.5.patch, 
> HIVE-1307.6.patch, HIVE-1307.7.patch, HIVE-1307.8.patch, HIVE-1307.9.patch, 
> HIVE-1307.patch, HIVE-1307_2_branch_0.6.patch, HIVE-1307_branch_0.6.patch, 
> HIVE-1307_java_only.patch
>
>
> Currently if hive.merge.mapfiles/mapredfiles=true, a new mapreduce job is 
> created to read the input files and output to one reducer for merging. This MR 
> job is created at compile time, one MR job per partition. In the dynamic 
> partition case, multiple partitions could be created at execution time, making 
> it impossible to generate the merging MR job at compile time. 
> We should generalize the merge framework to allow multiple partitions and 
> most of the time a map-only job should be sufficient if we use 
> CombineHiveInputFormat. 

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Assigned: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang reassigned HIVE-1981:


Assignee: Devaraj Das

Devaraj, can you please take a look?

> TestHadoop20SAuthBridge failed on current trunk
> ---
>
> Key: HIVE-1981
> URL: https://issues.apache.org/jira/browse/HIVE-1981
> Project: Hive
>  Issue Type: Bug
>Reporter: Ning Zhang
>Assignee: Devaraj Das
>
> I'm on the latest trunk and ant package test failed on 
> TestHadoop20SAuthBridge.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1948) Have audit logging in the Metastore

2011-02-09 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1948:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed. Thanks Devaraj

> Have audit logging in the Metastore
> ---
>
> Key: HIVE-1948
> URL: https://issues.apache.org/jira/browse/HIVE-1948
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.7.0
>
> Attachments: audit-log-2.patch, audit-log-3.patch, audit-log.1.patch, 
> audit-log.patch
>
>
> It would be good to have audit logging in the metastore, similar to Hadoop's 
> NameNode audit logging. This would allow administrators to dig into details 
> about which user performed metadata operations (like create/drop 
> tables/partitions) and from where (IP address).

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Created: (HIVE-1981) TestHadoop20SAuthBridge failed on current trunk

2011-02-09 Thread Ning Zhang (JIRA)
TestHadoop20SAuthBridge failed on current trunk
---

 Key: HIVE-1981
 URL: https://issues.apache.org/jira/browse/HIVE-1981
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang


I'm on the latest trunk and ant package test failed on TestHadoop20SAuthBridge.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1950) Block merge for RCFile

2011-02-09 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992701#comment-12992701
 ] 

He Yongqiang commented on HIVE-1950:


>>2. Move RCFile check to SemanticAnalyzer from runtime.
SemanticAnalyzer only throws SemanticException; we probably should keep that 
contract. Moving the check into SemanticAnalyzer would require it to handle a 
lot of HiveExceptions (thrown by getTable etc.).

> Block merge for RCFile
> --
>
> Key: HIVE-1950
> URL: https://issues.apache.org/jira/browse/HIVE-1950
> Project: Hive
>  Issue Type: New Feature
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-1950.1.patch, HIVE-1950.2.patch
>
>
> In our env, there are a lot of small files inside one partition/table. In 
> order to reduce the namenode load, we have one dedicated housekeeping job 
> running to merge these files. Right now the merge is an 'insert overwrite' in 
> Hive, and requires decompressing and recompressing the data. This jira adds a 
> command in Hive to do the merge without decompressing and recompressing the 
> data.
> Something like "alter table tbl_name [partition ()] merge files". In this 
> jira the new command will only support RCFile, since it needs some new APIs 
> in the file format.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1969) TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk

2011-02-09 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1969:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed.  Thanks Ning!


> TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
> ---
>
> Key: HIVE-1969
> URL: https://issues.apache.org/jira/browse/HIVE-1969
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Ning Zhang
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1969.patch
>
>
> I haven't looked into it yet but saw this at the end of the .q.out:
> +Ended Job = job_201102071402_0020 with errors
> +FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1788) Add more calls to the metastore thrift interface

2011-02-09 Thread Ashish Thusoo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Thusoo updated HIVE-1788:


Attachment: HIVE-1788.txt

Patch attached.

This is also available at

https://reviews.apache.org/r/409/


> Add more calls to the metastore thrift interface
> 
>
> Key: HIVE-1788
> URL: https://issues.apache.org/jira/browse/HIVE-1788
> Project: Hive
>  Issue Type: New Feature
>Reporter: Ashish Thusoo
>Assignee: Ashish Thusoo
> Attachments: HIVE-1788.txt
>
>
> For administrative purposes the following calls to the metastore thrift 
> interface would be very useful:
> 1. Get the table metadata for all the tables owned by a particular user
> 2. Ability to iterate over this set of tables
> 3. Ability to change a particular key value property of the table

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1969) TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk

2011-02-09 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992682#comment-12992682
 ] 

John Sichi commented on HIVE-1969:
--

+1.  Will commit.


> TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
> ---
>
> Key: HIVE-1969
> URL: https://issues.apache.org/jira/browse/HIVE-1969
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Ning Zhang
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1969.patch
>
>
> I haven't looked into it yet but saw this at the end of the .q.out:
> +Ended Job = job_201102071402_0020 with errors
> +FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-1788: This patch adds the ability to get tables by owners from the metastore.

2011-02-09 Thread Ashish Thusoo

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/409/
---

(Updated 2011-02-09 11:39:00.975024)


Review request for hive and Paul Yang.


Summary (updated)
---

This patch adds the ability to get tables by owner from the metastore. The API 
added to the metastore is

list get_tables_by_owner(string owner, long offset, int limit)

The offset and limit are included so that the tables can be fetched in small 
batches. The tables are returned sorted by database name, then table name.
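A client could consume the call in batches like this (a sketch under assumptions — `client` is a hypothetical metastore client object exposing get_tables_by_owner; the real Thrift-generated client API may differ):

```python
def fetch_all_by_owner(client, owner, batch_size=100):
    """Page through get_tables_by_owner using offset/limit until a short
    batch signals the end of the (db-name, table-name)-sorted result."""
    tables, offset = [], 0
    while True:
        batch = client.get_tables_by_owner(owner, offset, batch_size)
        tables.extend(batch)
        if len(batch) < batch_size:  # short (or empty) batch: no more rows
            return tables
        offset += batch_size
```

Because results are sorted by (database name, table name), paging with a fixed offset/limit is deterministic as long as the table set does not change mid-scan.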


This addresses bug HIVE-1788.
https://issues.apache.org/jira/browse/HIVE-1788


Diffs
-

  http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/scripts/upgrade/derby/upgrade-0.8.0.derby.sql
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/scripts/upgrade/mysql/upgrade-0.8.0.mysql.sql
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/scripts/upgrade/postgres/upgrade-0.8.0.postgres.sql
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java
 1068698 
  http://svn.apache.org/repos/asf/hive/trunk/metastore/src/model/package.jdo 
1068698 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1068698 

Diff: https://reviews.apache.org/r/409/diff


Testing
---

Metastore tests pass.
Some other unit tests seem to be broken.


Thanks,

Ashish



[jira] Created: (HIVE-1980) Merging using mapreduce rather than map-only job failed in case of dynamic partition inserts

2011-02-09 Thread Ning Zhang (JIRA)
Merging using mapreduce rather than map-only job failed in case of dynamic 
partition inserts


 Key: HIVE-1980
 URL: https://issues.apache.org/jira/browse/HIVE-1980
 Project: Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang


In a dynamic partition insert, if merge is set to true and 
hive.mergejob.maponly=false, the merge MapReduce job will fail. 
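A minimal sketch of the failing setup described above (only hive.mergejob.maponly is named in this report; the hive.merge.mapfiles property and the table/column names are illustrative assumptions):

```sql
-- Sketch of the configuration that reaches the failing code path.
-- hive.merge.mapfiles is an assumed name for the merge-enable switch;
-- only hive.mergejob.maponly is taken from the report.
SET hive.merge.mapfiles=true;      -- enable merging of small output files
SET hive.mergejob.maponly=false;   -- use a MapReduce merge job, not map-only

-- A dynamic partition insert then triggers the merge stage:
INSERT OVERWRITE TABLE dst PARTITION (ds)
SELECT key, value, ds FROM src;
```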

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1969) TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk

2011-02-09 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1969:
-

Status: Patch Available  (was: Open)

> TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
> ---
>
> Key: HIVE-1969
> URL: https://issues.apache.org/jira/browse/HIVE-1969
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Ning Zhang
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1969.patch
>
>
> I haven't looked into it yet but saw this at the end of the .q.out:
> +Ended Job = job_201102071402_0020 with errors
> +FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1969) TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk

2011-02-09 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1969:
-

Attachment: HIVE-1969.patch

There are two things that make these tests fail: 
1) The recent change of the default input format from HiveInputFormat to 
CombineHiveInputFormat makes the merge job submission fail. The current mini 
HDFS implementation used in minimr doesn't support CombineFileInputFormat. 

2) Even after I changed the input format to HiveInputFormat and made the merge 
use a MapReduce job rather than a map-only job, it still failed because of a 
bug that exists in that code path. HIVE-1980 was filed for that.

For now we should remove merge_dynamic_partition[23].q from minimr.query.files 
since otherwise they won't be tested in regular TestCliDriver. I will add the 
equivalent test cases to minimr test in HIVE-1980.
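As a sketch, pinning the input format at the top of a minimr .q file would work around point 1 (this mirrors the hive.input.format setting mentioned elsewhere on this list for HIVE-1979; whether a per-test override is the right fix here is an assumption):

```sql
-- Sketch: force HiveInputFormat for minimr tests, since the mini HDFS
-- used by TestMinimrCliDriver does not support CombineFileInputFormat.
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
```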

> TestMinimrCliDriver merge_dynamic_partition2 and 3 are failing on trunk
> ---
>
> Key: HIVE-1969
> URL: https://issues.apache.org/jira/browse/HIVE-1969
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: John Sichi
>Assignee: Ning Zhang
>Priority: Blocker
> Fix For: 0.7.0
>
> Attachments: HIVE-1969.patch
>
>
> I haven't looked into it yet but saw this at the end of the .q.out:
> +Ended Job = job_201102071402_0020 with errors
> +FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.MapRedTask

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1950) Block merge for RCFile

2011-02-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992674#comment-12992674
 ] 

Namit Jain commented on HIVE-1950:
--

1. Can you change merge_files to concatenate?
   alter table  concatenate;

2. Move RCFile check to SemanticAnalyzer from runtime.

3. More comments: DDLTask.java/mergeFiles
   RCFile: all the new functions etc.
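A sketch of the syntax being suggested in point 1 (the table name and partition spec are illustrative; the final grammar is up to the patch):

```sql
-- Proposed rename: 'merge files' becomes 'concatenate'.
ALTER TABLE tbl_name PARTITION (ds='2011-02-09') CONCATENATE;
```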


> Block merge for RCFile
> --
>
> Key: HIVE-1950
> URL: https://issues.apache.org/jira/browse/HIVE-1950
> Project: Hive
>  Issue Type: New Feature
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-1950.1.patch, HIVE-1950.2.patch
>
>
> In our env, there are a lot of small files inside one partition/table. In 
> order to reduce the namenode load, we have one dedicated housekeeping job 
> running to merge these files. Right now the merge is an 'insert overwrite' in 
> hive, and requires decompressing the data and recompressing it. This jira is 
> to add a command in Hive to do the merge without decompressing and 
> recompressing the data.
> Something like "alter table tbl_name [partition ()] merge files". In this 
> jira the new command will only support RCFile, since it needs some new APIs 
> in the file format.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Hudson: Hive-trunk-h0.20 #545

2011-02-09 Thread Apache Hudson Server
See 

--
[...truncated 25392 lines...]
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_11-08-31_327_4767368152041876847/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 11:08:33,881 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_11-08-31_327_4767368152041876847/-mr-1
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_11-08-34_049_8416191857889998057/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 11:08:36,591 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_11-08-34_049_8416191857889998057/-mr-1
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_11-08-36_758_8861724302206474949/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 11:08:39,297 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011

[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

2011-02-09 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1978:
---

Status: Patch Available  (was: Open)

> Hive SymlinkTextInputFormat does not estimate input size correctly
> --
>
> Key: HIVE-1978
> URL: https://issues.apache.org/jira/browse/HIVE-1978
> Project: Hive
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-1978.1.patch, HIVE-1978.2.patch
>
>


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

2011-02-09 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1978:
---

Attachment: HIVE-1978.2.patch

fixed a typo

> Hive SymlinkTextInputFormat does not estimate input size correctly
> --
>
> Key: HIVE-1978
> URL: https://issues.apache.org/jira/browse/HIVE-1978
> Project: Hive
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-1978.1.patch, HIVE-1978.2.patch
>
>


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1978) Hive SymlinkTextInputFormat does not estimate input size correctly

2011-02-09 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992644#comment-12992644
 ] 

He Yongqiang commented on HIVE-1978:


Namit, a .q test file cannot cover what this jira does. From a .q file, it 
is very difficult to verify that SymlinkTextInputFormat gets the input size 
correctly.

>>getContentSummary' in all existing input formats.
There is no guarantee that the input format comes from Hive, so it is very 
difficult to change all input formats.

> Hive SymlinkTextInputFormat does not estimate input size correctly
> --
>
> Key: HIVE-1978
> URL: https://issues.apache.org/jira/browse/HIVE-1978
> Project: Hive
>  Issue Type: Improvement
>Reporter: He Yongqiang
>Assignee: He Yongqiang
> Attachments: HIVE-1978.1.patch
>
>


-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1948) Have audit logging in the Metastore

2011-02-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992613#comment-12992613
 ] 

Namit Jain commented on HIVE-1948:
--

+1

> Have audit logging in the Metastore
> ---
>
> Key: HIVE-1948
> URL: https://issues.apache.org/jira/browse/HIVE-1948
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.7.0
>
> Attachments: audit-log-2.patch, audit-log-3.patch, audit-log.1.patch, 
> audit-log.patch
>
>
> It would be good to have audit logging in the metastore, similar to Hadoop's 
> NameNode audit logging. This would allow administrators to dig into details 
> about which user performed metadata operations (like create/drop 
> tables/partitions) and from where (IP address).

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992612#comment-12992612
 ] 

Namit Jain commented on HIVE-1918:
--

Please file bugs for the above cases - 


The changes for import look fine.
You also need to make similar changes for export.

> Add export/import facilities to the hive system
> ---
>
> Key: HIVE-1918
> URL: https://issues.apache.org/jira/browse/HIVE-1918
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Krishna Kumar
>Assignee: Krishna Kumar
> Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, 
> HIVE-1918.patch.3.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf
>
>
> This is an enhancement request to add export/import features to hive.
> With this language extension, the user can export the data of the table - 
> which may be located in different hdfs locations in case of a partitioned 
> table - as well as the metadata of the table into a specified output 
> location. This output location can then be moved over to another different 
> hadoop/hive instance and imported there.  
> This should work independent of the source and target metastore dbms used; 
> for instance, between derby and mysql.
> For partitioned tables, the ability to export/import a subset of the 
> partition must be supported.
> Howl will add more features on top of this: The ability to create/use the 
> exported data even in the absence of hive, using MR or Pig. Please see 
> http://wiki.apache.org/pig/Howl/HowlImportExport for these details.
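As a sketch of the language extension being described (the keywords, table name, and paths are illustrative assumptions, not the committed grammar):

```sql
-- Illustrative only: export a partition's data and metadata to a location,
-- then import it on a different hadoop/hive instance.
EXPORT TABLE employees PARTITION (ds='2011-02-09')
    TO '/warehouse/export/employees';

IMPORT FROM '/warehouse/export/employees';
```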

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992606#comment-12992606
 ] 

Krishna Kumar commented on HIVE-1918:
-

Hmm. LoadSemanticAnalyzer (which knows the table) does not add it to the 
outputs, but the MoveTask it schedules does. 

Similarly, CREATE-TABLE does not add the entity, but the DDLTask it schedules 
does. This may be fine only because the entity does not exist at compile time?

ADD-PARTITION adds the table as an *input* at compile time and the partition 
itself is added as an output at execution time. Should not the table be an 
output (at compile time) as well - for authorization/concurrency purposes?

Anyway, where the import operates on existing tables/partitions, I will add 
them at compile time. If the entity is being created as part of the task, then 
the task will be adding them to inputs/outputs at runtime. Is this fine?


> Add export/import facilities to the hive system
> ---
>
> Key: HIVE-1918
> URL: https://issues.apache.org/jira/browse/HIVE-1918
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Krishna Kumar
>Assignee: Krishna Kumar
> Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, 
> HIVE-1918.patch.3.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf
>
>
> This is an enhancement request to add export/import features to hive.
> With this language extension, the user can export the data of the table - 
> which may be located in different hdfs locations in case of a partitioned 
> table - as well as the metadata of the table into a specified output 
> location. This output location can then be moved over to another different 
> hadoop/hive instance and imported there.  
> This should work independent of the source and target metastore dbms used; 
> for instance, between derby and mysql.
> For partitioned tables, the ability to export/import a subset of the 
> partition must be supported.
> Howl will add more features on top of this: The ability to create/use the 
> exported data even in the absence of hive, using MR or Pig. Please see 
> http://wiki.apache.org/pig/Howl/HowlImportExport for these details.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992555#comment-12992555
 ] 

Namit Jain commented on HIVE-1918:
--

Tasks only add them when they may not be available at compile time - for 
example, in the case of dynamic partitions.
Otherwise the semantic analyzer is supposed to add them.

> Add export/import facilities to the hive system
> ---
>
> Key: HIVE-1918
> URL: https://issues.apache.org/jira/browse/HIVE-1918
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Krishna Kumar
>Assignee: Krishna Kumar
> Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, 
> HIVE-1918.patch.3.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf
>
>
> This is an enhancement request to add export/import features to hive.
> With this language extension, the user can export the data of the table - 
> which may be located in different hdfs locations in case of a partitioned 
> table - as well as the metadata of the table into a specified output 
> location. This output location can then be moved over to another different 
> hadoop/hive instance and imported there.  
> This should work independent of the source and target metastore dbms used; 
> for instance, between derby and mysql.
> For partitioned tables, the ability to export/import a subset of the 
> partition must be supported.
> Howl will add more features on top of this: The ability to create/use the 
> exported data even in the absence of hive, using MR or Pig. Please see 
> http://wiki.apache.org/pig/Howl/HowlImportExport for these details.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1948) Have audit logging in the Metastore

2011-02-09 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HIVE-1948:
--

Status: Patch Available  (was: Open)

> Have audit logging in the Metastore
> ---
>
> Key: HIVE-1948
> URL: https://issues.apache.org/jira/browse/HIVE-1948
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.7.0
>
> Attachments: audit-log-2.patch, audit-log-3.patch, audit-log.1.patch, 
> audit-log.patch
>
>
> It would be good to have audit logging in the metastore, similar to Hadoop's 
> NameNode audit logging. This would allow administrators to dig into details 
> about which user performed metadata operations (like create/drop 
> tables/partitions) and from where (IP address).

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Updated: (HIVE-1948) Have audit logging in the Metastore

2011-02-09 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HIVE-1948:
--

Attachment: audit-log-3.patch

Regenerated patch

> Have audit logging in the Metastore
> ---
>
> Key: HIVE-1948
> URL: https://issues.apache.org/jira/browse/HIVE-1948
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.7.0
>
> Attachments: audit-log-2.patch, audit-log-3.patch, audit-log.1.patch, 
> audit-log.patch
>
>
> It would be good to have audit logging in the metastore, similar to Hadoop's 
> NameNode audit logging. This would allow administrators to dig into details 
> about which user performed metadata operations (like create/drop 
> tables/partitions) and from where (IP address).

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Hudson: Hive-trunk-h0.20 #544

2011-02-09 Thread Apache Hudson Server
See 

Changes:

[nzhang] HIVE-1971. Verbose/echo mode for the Hive CLI (Jonathan Natkins via 
Ning Zhang)

--
[...truncated 24135 lines...]
[junit] 2011-02-09 07:10:23,095 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_07-10-20_442_6861395604252812301/-mr-1
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_07-10-23_896_6998938275565321940/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 07:10:26,605 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_07-10-23_896_6998938275565321940/-mr-1
[junit] OK
[junit] PREHOOK: query: select count(1) as c from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_07-10-26_846_6491588939046426848/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 07:10:29,566 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as c from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_07-10-26_846_6491588939046426848/-mr-1
[junit] OK
[junit] -  ---
[junit] 
[junit] Testcase: testExecute took 10.847 sec
[junit] Testcase: testNonHiveCommand took 0.951 sec
[junit] Testcase: testMetastore took 0.299 sec
[junit] Testcase: testGetClusterStatus took 0.119 sec
[junit] Testcase: testFetch took 9.549 sec
[junit] Testcase: testDynamicSerde took 6.5 sec

test-conditions:

gen-test:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

Build failed in Hudson: Hive-trunk-h0.20 #543

2011-02-09 Thread Apache Hudson Server
See 

Changes:

[namit] HIVE-1979 set hive.input.format in hbase_bulk.m
(John Sichi via namit)

--
[...truncated 24113 lines...]
[junit] 2011-02-09 03:31:43,058 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_03-31-40_448_5685608303993419247/-mr-1
[junit] OK
[junit] Hive history 
file=
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (key int, value 
string)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 

[junit] Loading data to table testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_03-31-43_859_1559995288725461204/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks is set to 0 since there's no reduce operator
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 03:31:46,529 null map = 100%,  reduce = 0%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select key, value from testhivedrivertable where 
key > 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_03-31-43_859_1559995288725461204/-mr-1
[junit] OK
[junit] PREHOOK: query: select count(1) as c from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_03-31-46_726_4364199030952824896/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=
[junit] Job running in-process (local Hadoop)
[junit] 2011-02-09 03:31:49,413 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as c from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-02-09_03-31-46_726_4364199030952824896/-mr-1
[junit] OK
[junit] -  ---
[junit] 
[junit] Testcase: testExecute took 10.274 sec
[junit] Testcase: testNonHiveCommand took 0.806 sec
[junit] Testcase: testMetastore took 0.283 sec
[junit] Testcase: testGetClusterStatus took 0.091 sec
[junit] Testcase: testFetch took 9.374 sec
[junit] Testcase: testDynamicSerde took 6.372 sec

test-conditions:

gen-test:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
:40:
 warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds


[jira] Updated: (HIVE-1971) Verbose/echo mode for the Hive CLI

2011-02-09 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1971:
-

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed. Thanks Jonathan!

> Verbose/echo mode for the Hive CLI
> --
>
> Key: HIVE-1971
> URL: https://issues.apache.org/jira/browse/HIVE-1971
> Project: Hive
>  Issue Type: New Feature
>  Components: CLI
>Reporter: Jonathan Natkins
>Assignee: Jonathan Natkins
> Attachments: HIVE-1971.1.patch.txt, HIVE-1971.2.patch.txt
>
>
> It would be very beneficial to have a mode which allows a user to run a SQL 
> script, and have each command echoed to the console as it's executed.  This 
> would be useful in figuring out which SQL statement is causing failures 
> during test runs, especially when running particularly long scripts.
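A minimal sketch of what such an echo mode might look like; the `run_script` helper, its naive statement splitting, and the `execute` callback are illustrative assumptions, not Hive CLI code:

```python
def run_script(sql_text, execute, echo=True):
    """Run each ';'-terminated statement, echoing it first when asked.

    `execute` is a caller-supplied callback standing in for the real
    driver; the naive split on ';' ignores semicolons inside string
    literals, so this is only a sketch.
    """
    for raw in sql_text.split(";"):
        stmt = raw.strip()
        if not stmt:
            continue
        if echo:
            # Echo before executing, so a failure in a long script
            # points directly at the offending statement.
            print(stmt + ";")
        execute(stmt)
```

With echoing on, the last line printed before an error identifies the statement that caused it, which is exactly the debugging scenario described above.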

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Commented: (HIVE-1918) Add export/import facilities to the hive system

2011-02-09 Thread Krishna Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992373#comment-12992373
 ] 

Krishna Kumar commented on HIVE-1918:
-

Importing into existing tables is now supported, but the checks (to see whether 
the imported table and the target table are compatible) have been kept fairly 
simple for now. Please see ImportSemanticAnalyzer.checkTable. The schemas 
(column and partition) of the two should match exactly, except for comments. 
Since we are just moving files (rather than rewriting records), I think there 
will be issues if the metadata schema does not exactly match the data 
serialization (in terms of types, number of columns, etc.).
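In that spirit, here is a simplified illustration of such a compatibility check. This is a stand-in for ImportSemanticAnalyzer.checkTable with an assumed (name, type, comment) column representation, not the actual patch code:

```python
def schemas_match(imported_cols, target_cols):
    """Return True when two column schemas match exactly except for
    comments, as the check described above requires.

    Columns are modeled as (name, type, comment) tuples; only name and
    type participate in the comparison, and order matters because the
    import just moves files rather than rewriting records.
    """
    def strip_comments(cols):
        return [(name, col_type) for name, col_type, _comment in cols]
    return strip_comments(imported_cols) == strip_comments(target_cols)
```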

Re the earlier comment about outputs/inputs: got what you meant. I will add 
the table/partition to the inputs in ExportSemanticAnalyzer. But in the case 
of imports, I see that the tasks themselves add the entity operated upon to 
the inputs/outputs list. Isn't that too late for authorization/concurrency, 
even though it may work for replication? Or are both the sem. analyzers and 
the tasks expected to add them? In the case of a newly created 
table/partition, the sem. analyzer does not have a handle?

> Add export/import facilities to the hive system
> ---
>
> Key: HIVE-1918
> URL: https://issues.apache.org/jira/browse/HIVE-1918
> Project: Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Krishna Kumar
>Assignee: Krishna Kumar
> Attachments: HIVE-1918.patch.1.txt, HIVE-1918.patch.2.txt, 
> HIVE-1918.patch.3.txt, HIVE-1918.patch.txt, hive-metastore-er.pdf
>
>
> This is an enhancement request to add export/import features to hive.
> With this language extension, the user can export the data of a table - 
> which may be located in different HDFS locations in the case of a 
> partitioned table - as well as the metadata of the table to a specified 
> output location. This output location can then be moved over to a 
> different hadoop/hive instance and imported there.
> This should work independently of the source and target metastore DBMS 
> used; for instance, between derby and mysql.
> For partitioned tables, the ability to export/import a subset of the 
> partitions must be supported.
> Howl will add more features on top of this: The ability to create/use the 
> exported data even in the absence of hive, using MR or Pig. Please see 
> http://wiki.apache.org/pig/Howl/HowlImportExport for these details.
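The export/import workflow can be sketched at the file level as follows. The `_metadata` file name and JSON layout are assumptions for illustration (the real format is defined by the patch and the Howl wiki page above), and local directories stand in for HDFS paths:

```python
import json
import os
import shutil

def export_table(data_dir, metadata, out_dir):
    """Write a self-contained export: the table's metadata as JSON plus
    a copy of its data files, so the whole directory can be moved to
    another hadoop/hive instance.
    """
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, "_metadata"), "w") as f:
        json.dump(metadata, f)
    # Data files are copied verbatim; records are never rewritten.
    shutil.copytree(data_dir, os.path.join(out_dir, "data"))

def import_table(export_dir, dest_dir):
    """Read the exported metadata back and move the data files into
    place; a real import would first run compatibility checks against
    any existing target table.
    """
    with open(os.path.join(export_dir, "_metadata")) as f:
        metadata = json.load(f)
    shutil.copytree(os.path.join(export_dir, "data"), dest_dir)
    return metadata
```

Because only the JSON descriptor and raw files travel, the sketch is indifferent to which metastore DBMS (derby, mysql, ...) either side uses, which mirrors the requirement above.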

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira