date:20120611


 [ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Ding updated HIVE-3078:
-

Attachment: hive_input_output.patch

Fix the inputs/outputs for create table, drop table, create table like, create 
table as, create view, and drop view.

 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-2694) Add FORMAT UDF


 [ 
https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2694:
-

Status: Open  (was: Patch Available)

@Zhenxiao: Looks like the negative testcase outputs need to be updated. Can you 
please do this and then resubmit? Thanks.

 Add FORMAT UDF
 --

 Key: HIVE-2694
 URL: https://issues.apache.org/jira/browse/HIVE-2694
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
 Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, 
 HIVE-2694.3.patch.txt, HIVE-2694.D1149.1.patch, HIVE-2694.D1149.2.patch, 
 HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, HIVE-2694.D2673.1.patch





[jira] [Updated] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-11 Thread Mithun Radhakrishnan (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Mithun Radhakrishnan updated HIVE-3098:
---

Status: Patch Available (was: Open)

Memory leak from large number of FileSystem instances in FileSystem.CACHE.
(Must cache UGIs.)
-

Key: HIVE-3098
URL: https://issues.apache.org/jira/browse/HIVE-3098
Project: Hive
Issue Type: Bug
Components: Shims
Affects Versions: 0.9.0
Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security
turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
Attachments: HIVE-3098.patch

The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing
the Oracle backend).
The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads,
in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had
100 instances of FileSystem, whose combined retained-mem consumed the
entire heap.
It boiled down to hadoop::UserGroupInformation::equals() being implemented
such that the Subject member is compared for equality (==), and not
equivalence (.equals()). This causes equivalent UGI instances to compare as
unequal, and causes a new FileSystem instance to be created and cached.
The UGI.equals() is so implemented, incidentally, as a fix for yet another
problem (HADOOP-6670); so it is unlikely that that implementation can be
modified.
The solution for this is to check for UGI equivalence in HCatalog (i.e. in
the Hive metastore), using a cache for UGI instances in the shims.
I have a patch to fix this. I'll upload it shortly. I just ran an overnight
test to confirm that the memory-leak has been arrested.
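
The failure mode and the proposed fix can be modelled with a small sketch. This is not Hadoop or Hive code: Ugi, FsCache, UgiCache and the URI "hdfs://nn" are hypothetical stand-ins, and the only assumption carried over from the description is that two equivalent UGIs do not compare as equal unless they are the same object.

```python
# Illustrative model only; none of these classes are Hadoop APIs.

class Ugi:
    """Stand-in for a UGI whose equality is identity-based.

    No __eq__/__hash__ override is defined, so Python compares by identity,
    which mirrors comparing the Subject member with == rather than .equals().
    """
    def __init__(self, user):
        self.user = user


class FsCache:
    """Stand-in for a FileSystem cache keyed by (uri, ugi)."""
    def __init__(self):
        self.entries = {}

    def get(self, uri, ugi):
        key = (uri, ugi)
        if key not in self.entries:
            # Pretend this object is an expensive FileSystem instance.
            self.entries[key] = object()
        return self.entries[key]


class UgiCache:
    """Hypothetical shim-level cache: one canonical Ugi per user name."""
    def __init__(self):
        self._by_user = {}

    def get(self, user):
        if user not in self._by_user:
            self._by_user[user] = Ugi(user)
        return self._by_user[user]


def simulate(requests, canonicalize):
    """Return how many FileSystem stand-ins end up cached."""
    fs_cache, ugi_cache = FsCache(), UgiCache()
    for user in requests:
        ugi = ugi_cache.get(user) if canonicalize else Ugi(user)
        fs_cache.get("hdfs://nn", ugi)
    return len(fs_cache.entries)


requests = ["alice", "bob"] * 50          # 100 requests from 2 users
leaky = simulate(requests, canonicalize=False)
bounded = simulate(requests, canonicalize=True)
print(leaky, bounded)
```

Without the UGI cache every request adds a new entry; with it, the number of entries is bounded by the number of distinct users.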


[jira] [Commented] (HIVE-2694) Add FORMAT UDF

2012-06-11 Thread Zhenxiao Luo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292938#comment-13292938
 ] 

Zhenxiao Luo commented on HIVE-2694:


@Carl: negative testcase outputs updated in the new patch HIVE-2694.4.patch.txt.

 Add FORMAT UDF
 --

 Key: HIVE-2694
 URL: https://issues.apache.org/jira/browse/HIVE-2694
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
 Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, 
 HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, 
 HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, 
 HIVE-2694.D2673.1.patch





[jira] [Updated] (HIVE-2694) Add FORMAT UDF

2012-06-11 Thread Zhenxiao Luo (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-2694:
---

Attachment: HIVE-2694.4.patch.txt

 Add FORMAT UDF
 --

 Key: HIVE-2694
 URL: https://issues.apache.org/jira/browse/HIVE-2694
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
 Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, 
 HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, 
 HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, 
 HIVE-2694.D2673.1.patch





[jira] [Updated] (HIVE-2694) Add FORMAT UDF

2012-06-11 Thread Zhenxiao Luo (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-2694:
---

Status: Patch Available  (was: Open)

 Add FORMAT UDF
 --

 Key: HIVE-2694
 URL: https://issues.apache.org/jira/browse/HIVE-2694
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
 Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, 
 HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, 
 HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, 
 HIVE-2694.D2673.1.patch





[jira] [Commented] (HIVE-3106) Add option to make multi inserts more atomic

[
https://issues.apache.org/jira/browse/HIVE-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292940#comment-13292940
]

Kevin Wilfong commented on HIVE-3106:
-

Spoke with njain offline. He suggested adding a dummy task which depends on the
tasks each move task would depend on, and which has move tasks as its children.
This will reduce the number of dependency edges in the dependency graph. This
dummy task (DependencyCollectionTask) will only be added if this option is
turned on.
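
To make the edge-count argument concrete, here is a rough sketch. The task names (stage-N, move-N) are made up for illustration; only DependencyCollectionTask comes from the comment above, and the graph is modelled as a plain set of (parent, child) pairs rather than Hive's task classes.

```python
def edges_direct(parents, moves):
    """Every move task depends directly on every parent task."""
    return {(p, m) for p in parents for m in moves}


def edges_with_collector(parents, moves, collector="DependencyCollectionTask"):
    """Parents feed one dummy task; the dummy task feeds every move task."""
    into_collector = {(p, collector) for p in parents}
    out_of_collector = {(collector, m) for m in moves}
    return into_collector | out_of_collector


# Hypothetical task names, used only to count edges.
parents = [f"stage-{i}" for i in range(4)]
moves = [f"move-{i}" for i in range(3)]

direct = edges_direct(parents, moves)
collected = edges_with_collector(parents, moves)
print(len(direct), len(collected))
```

With n parent tasks and m move tasks, the direct wiring needs n * m edges while the dummy task needs n + m, and in both cases no move task can start before all of the parents have finished.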

Add option to make multi inserts more atomic

Key: HIVE-3106
URL: https://issues.apache.org/jira/browse/HIVE-3106
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Attachments: HIVE-3106.1.patch.txt

Currently, with multi-insert queries, as soon as the output of one of the inserts
is ready, the move task associated with that insert is run, creating the
table/partition. However, if concurrency is enabled, the lock on this
table/partition is not released until the entire query finishes, which can be
much later.
This causes issues if, for example, a user is waiting for an output of the
multi-insert query which is created long before the other outputs, and
checking for its existence using the metastore's Thrift methods
(get_table/get_partition). In that case, the user will run their query
which uses the output, and it will experience a timeout trying to acquire the
lock on the table/partition.
If all the move tasks depend on the parents of all other move tasks, the
output creation will be much closer to atomic, relieving this problem.

[jira] [Updated] (HIVE-2969) Log Time To Submit metric with PerfLogger


 [ 
https://issues.apache.org/jira/browse/HIVE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-2969:


Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed

 Log Time To Submit metric with PerfLogger
 -

 Key: HIVE-2969
 URL: https://issues.apache.org/jira/browse/HIVE-2969
 Project: Hive
  Issue Type: Wish
  Components: Logging
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Attachments: HIVE-2969.D2919.1.patch


 Logging the time from when Driver.run starts to when we begin submitting jobs 
 to map reduce would be helpful in determining how much of the lag in starting 
 a query is due to Hive vs. Hadoop.


[jira] [Assigned] (HIVE-3085) make parallel tests work


 [ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Ding reassigned HIVE-3085:


Assignee: Shuai Ding  (was: Namit Jain)

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
 [root@MC, 4]
   ],
   other_hosts: [
   [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Created] (HIVE-3109) metastore state not cleared

Namit Jain created HIVE-3109:


 Summary: metastore state not cleared
 Key: HIVE-3109
 URL: https://issues.apache.org/jira/browse/HIVE-3109
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain


When some of the tests are run in a certain order, random bugs are encountered.

ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q

leads to an error in stats1.q

We ran into this error as part of parallel testing (HIVE-3085).
As part of HIVE-3085, this will be fixed temporarily by clearing
hive.metastore.partition.inherit.table.properties at the end of the test.

But, in general, any property set in one .q file should not affect anything
in other tests.
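
One way to get that isolation is to snapshot the configuration before each test and restore it afterwards. The sketch below only illustrates the idea: FakeConf and run_isolated are hypothetical and are not part of the Hive test harness, and the value "a,b" is an arbitrary placeholder. Only the property name comes from this issue.

```python
class FakeConf:
    """Hypothetical stand-in for a shared, mutable configuration object."""
    def __init__(self, defaults):
        self.values = dict(defaults)


def run_isolated(conf, test_fn):
    """Run test_fn, then restore conf to the state it had before the test."""
    snapshot = dict(conf.values)
    try:
        return test_fn(conf)
    finally:
        # Restore even if the test raised, so later tests see clean state.
        conf.values.clear()
        conf.values.update(snapshot)


KEY = "hive.metastore.partition.inherit.table.properties"


def first_test(conf):
    conf.values[KEY] = "a,b"      # placeholder value set by the first test
    return conf.values[KEY]


def second_test(conf):
    return conf.values[KEY]       # should see the default, not "a,b"


conf = FakeConf({KEY: ""})
first_seen = run_isolated(conf, first_test)
second_seen = run_isolated(conf, second_test)
print(repr(first_seen), repr(second_seen))
```

The restore happens in a finally block so that a failing test cannot leak its settings into the next one.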


[jira] [Assigned] (HIVE-3109) metastore state not cleared


 [ 
https://issues.apache.org/jira/browse/HIVE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-3109:


Assignee: Ashutosh Chauhan

 metastore state not cleared
 ---

 Key: HIVE-3109
 URL: https://issues.apache.org/jira/browse/HIVE-3109
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Ashutosh Chauhan

 When some of the tests are run in a certain order, random bugs are encountered.
 ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q
 leads to an error in stats1.q
 We ran into this error as part of parallel testing (HIVE-3085).
 As part of HIVE-3085, this will be fixed temporarily by clearing
 hive.metastore.partition.inherit.table.properties at the end of the test.
 But, in general, any property set in one .q file should not affect anything
 in other tests.


[jira] [Commented] (HIVE-3106) Add option to make multi inserts more atomic


[ 
https://issues.apache.org/jira/browse/HIVE-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292961#comment-13292961
 ] 

Carl Steinbach commented on HIVE-3106:
--

@Kevin: I added some comments on phabricator.

 Add option to make multi inserts more atomic
 

 Key: HIVE-3106
 URL: https://issues.apache.org/jira/browse/HIVE-3106
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3106.1.patch.txt


 Currently, with multi-insert queries, as soon as the output of one of the inserts 
 is ready, the move task associated with that insert is run, creating the 
 table/partition.  However, if concurrency is enabled, the lock on this 
 table/partition is not released until the entire query finishes, which can be 
 much later.
 This causes issues if, for example, a user is waiting for an output of the 
 multi-insert query which is created long before the other outputs, and 
 checking for its existence using the metastore's Thrift methods 
 (get_table/get_partition).  In that case, the user will run their query 
 which uses the output, and it will experience a timeout trying to acquire the 
 lock on the table/partition.
 If all the move tasks depend on the parents of all other move tasks, the 
 output creation will be much closer to atomic, relieving this problem.


[jira] [Commented] (HIVE-3109) metastore state not cleared


[ 
https://issues.apache.org/jira/browse/HIVE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292967#comment-13292967
 ] 

Namit Jain commented on HIVE-3109:
--

Assigning to you, Ashutosh.
Feel free to un-assign or pass it.
I vaguely remember you working on this parameter, but this is a more generic 
test cleanup problem.

 metastore state not cleared
 ---

 Key: HIVE-3109
 URL: https://issues.apache.org/jira/browse/HIVE-3109
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Ashutosh Chauhan

 When some of the tests are run in a certain order, random bugs are encountered.
 ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q
 leads to an error in stats1.q
 We ran into this error as part of parallel testing (HIVE-3085).
 As part of HIVE-3085, this will be fixed temporarily by clearing
 hive.metastore.partition.inherit.table.properties at the end of the test.
 But, in general, any property set in one .q file should not affect anything
 in other tests.


[jira] [Commented] (HIVE-3085) make parallel tests work


[ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292969#comment-13292969
 ] 

Shuai Ding commented on HIVE-3085:
--

https://reviews.facebook.net/D3585


 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
 [root@MC, 4]
   ],
   other_hosts: [
   [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support


 [ 
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gang Tim Liu updated HIVE-3072:
---

Summary: Hive List Bucketing - DDL support  (was: Hive List Bucketing - DDL 
support (single column))

 Hive List Bucketing - DDL support
 -

 Key: HIVE-3072
 URL: https://issues.apache.org/jira/browse/HIVE-3072
 Project: Hive
  Issue Type: New Feature
  Components: SQL
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

 If a hive table column has skewed keys, query performance on non-skewed key 
 is always impacted. Hive List Bucketing feature will address it:
 https://cwiki.apache.org/Hive/listbucketing.html
 This jira issue will track DDL change for the feature. It's for single skewed 
 column.


[jira] [Updated] (HIVE-3072) Hive List Bucketing - DDL support

[
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Gang Tim Liu updated HIVE-3072:
---

Description:
If a hive table column has skewed keys, query performance on non-skewed key is
always impacted. Hive List Bucketing feature will address it:

https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track DDL change for the feature. It's for both single
skewed column and multiple columns.

was:
If a hive table column has skewed keys, query performance on non-skewed key is
always impacted. Hive List Bucketing feature will address it:

https://cwiki.apache.org/Hive/listbucketing.html

This jira issue will track DDL change for the feature. It's for single skewed
column.

Hive List Bucketing - DDL support
-

Key: HIVE-3072
URL: https://issues.apache.org/jira/browse/HIVE-3072
Project: Hive
Issue Type: New Feature
Components: SQL
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

If a hive table column has skewed keys, query performance on non-skewed key
is always impacted. Hive List Bucketing feature will address it:
https://cwiki.apache.org/Hive/listbucketing.html
This jira issue will track DDL change for the feature. It's for both single
skewed column and multiple columns.

[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support


[ 
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292978#comment-13292978
 ] 

Gang Tim Liu commented on HIVE-3072:


Making progress on DML. The following syntax has started to work:
create table T (c1 string, c2 string) list bucketed by (c1) with skew ('x1');
create table T (c1 string, c2 string, c3 string) list bucketed by (c1, c2) with skew (('x1', 'x2'), ('y1', 'y2'));


 Hive List Bucketing - DDL support
 -

 Key: HIVE-3072
 URL: https://issues.apache.org/jira/browse/HIVE-3072
 Project: Hive
  Issue Type: New Feature
  Components: SQL
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

 If a hive table column has skewed keys, query performance on non-skewed key 
 is always impacted. Hive List Bucketing feature will address it:
 https://cwiki.apache.org/Hive/listbucketing.html
 This jira issue will track DDL change for the feature. It's for both single 
 skewed column and multiple columns.


[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support

[
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292987#comment-13292987
]

Gang Tim Liu commented on HIVE-3072:

We have rethought the release approach. We can deliver DDL and DML as separate
patches or as a single patch. Each has pros and cons; neither is perfect. The
separate-patch approach can make the release more manageable. A single patch
makes more sense as a release because, with DDL but no DML, you can't
experience list bucketing.

We have to pick one, and we chose the single-patch approach. It reduces the
overhead of a multiple-patch release, gives the community more time to review
the proposal, and leaves room for us to adjust according to the proposal review.

I will call for a proposal review again today.

Hive List Bucketing - DDL support
-

Key: HIVE-3072
URL: https://issues.apache.org/jira/browse/HIVE-3072
Project: Hive
Issue Type: New Feature
Components: SQL
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

Re: Hive List Bucketing - Feature Review

2012-06-11 Thread Gang Liu

Dear all hive developers,

We are making good progress on implementing the list bucketing feature. It
should be available within weeks.

We'd like to call for a feature review again; please provide your comments.

Thanks

Tim

On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote:

Dear all,

Please review the proposal and provide your comments:

https://cwiki.apache.org/Hive/listbucketing.html


Thanks

Tim

Re: Hive List Bucketing - Feature Review

2012-06-11 Thread Carl Steinbach

This link may work better for some people:

https://cwiki.apache.org/confluence/display/Hive/ListBucketing

Thanks.

Carl

On Mon, Jun 11, 2012 at 12:03 PM, Gang Liu g...@fb.com wrote:

 Dear all hive developers,

 We are making good progress of implementing the list bucketing feature. It
 should be available soon in weeks.

 We'd like to call feature review again and please provide your comments.

 Thanks

 Tim

 On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote:

 Dear all,
 
 Please review the proposal and provide your comments:
 
 https://cwiki.apache.org/Hive/listbucketing.html
 
 
 Thanks
 
 Tim

Re: Hive List Bucketing - Feature Review

2012-06-11 Thread Carl Steinbach

+ hcatalog-dev

On Mon, Jun 11, 2012 at 12:09 PM, Carl Steinbach c...@cloudera.com wrote:

 This link may work better for some people:

 https://cwiki.apache.org/confluence/display/Hive/ListBucketing

 Thanks.

 Carl


 On Mon, Jun 11, 2012 at 12:03 PM, Gang Liu g...@fb.com wrote:

 Dear all hive developers,

 We are making good progress of implementing the list bucketing feature. It
 should be available soon in weeks.

 We'd like to call feature review again and please provide your comments.

 Thanks

 Tim

 On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote:

 Dear all,
 
 Please review the proposal and provide your comments:
 
 https://cwiki.apache.org/Hive/listbucketing.html
 
 
 Thanks
 
 Tim

[jira] [Created] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.

Namit Jain created HIVE-3110:


 Summary: ant very-clean package dies if user does not have 
permissions to remove dir.
 Key: HIVE-3110
 URL: https://issues.apache.org/jira/browse/HIVE-3110
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain





[jira] [Updated] (HIVE-2694) Add FORMAT UDF


 [ 
https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2694:
-

   Resolution: Fixed
Fix Version/s: 0.10.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Zhenxiao!

 Add FORMAT UDF
 --

 Key: HIVE-2694
 URL: https://issues.apache.org/jira/browse/HIVE-2694
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
 Fix For: 0.10.0

 Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, 
 HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, 
 HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, 
 HIVE-2694.D2673.1.patch





[jira] [Updated] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.


 [ 
https://issues.apache.org/jira/browse/HIVE-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3110:
-

Status: Patch Available  (was: Open)

 ant very-clean package dies if user does not have permissions to remove dir.
 

 Key: HIVE-3110
 URL: https://issues.apache.org/jira/browse/HIVE-3110
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




[jira] [Commented] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.


[ 
https://issues.apache.org/jira/browse/HIVE-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293002#comment-13293002
 ] 

Namit Jain commented on HIVE-3110:
--

https://reviews.facebook.net/D3591

 ant very-clean package dies if user does not have permissions to remove dir.
 

 Key: HIVE-3110
 URL: https://issues.apache.org/jira/browse/HIVE-3110
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




Re: Hive List Bucketing - Feature Review

2012-06-11 Thread Gang Liu

Hi Carl, thanks Tim

On 6/11/12 12:14 PM, Carl Steinbach c...@cloudera.com wrote:

+ hcatalog-dev

On Mon, Jun 11, 2012 at 12:09 PM, Carl Steinbach c...@cloudera.com wrote:

 This link may work better for some people:

 https://cwiki.apache.org/confluence/display/Hive/ListBucketing

 Thanks.

 Carl


 On Mon, Jun 11, 2012 at 12:03 PM, Gang Liu g...@fb.com wrote:

 Dear all hive developers,

 We are making good progress of implementing the list bucketing feature. It
 should be available soon in weeks.

 We'd like to call feature review again and please provide your comments.

 Thanks

 Tim

 On 6/1/12 10:13 AM, Gang Liu g...@fb.com wrote:

 Dear all,
 
 Please review the proposal and provide your comments:
 
 https://cwiki.apache.org/Hive/listbucketing.html
 
 
 Thanks
 
 Tim

[jira] [Updated] (HIVE-3110) ant very-clean package dies if user does not have permissions to remove dir.


 [ 
https://issues.apache.org/jira/browse/HIVE-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3110:
-

Status: Open  (was: Patch Available)

 ant very-clean package dies if user does not have permissions to remove dir.
 

 Key: HIVE-3110
 URL: https://issues.apache.org/jira/browse/HIVE-3110
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support


[ 
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293008#comment-13293008
 ] 

Carl Steinbach commented on HIVE-3072:
--

If this feature requires metastore changes then I'd like to request that the 
first patch contain only changes to the metastore schema and metastore Thrift 
API. I would also prefer that the DML and DDL changes go in as a single patch 
since it a) prevents half-implemented features from showing up in releases and 
b) demonstrates that the feature actually works.

 Hive List Bucketing - DDL support
 -

 Key: HIVE-3072
 URL: https://issues.apache.org/jira/browse/HIVE-3072
 Project: Hive
  Issue Type: New Feature
  Components: SQL
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

 If a hive table column has skewed keys, query performance on non-skewed key 
 is always impacted. Hive List Bucketing feature will address it:
 https://cwiki.apache.org/Hive/listbucketing.html
 This jira issue will track DDL change for the feature. It's for both single 
 skewed column and multiple columns.


[jira] [Updated] (HIVE-3085) make parallel tests work


 [ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3085:
-

Component/s: Testing Infrastructure

I added some comments to the review. Thanks.

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
 [root@MC, 4]
   ],
   other_hosts: [
   [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Updated] (HIVE-2969) Log Time To Submit metric with PerfLogger

2012-06-11 Thread Ashutosh Chauhan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-2969:
---

Affects Version/s: (was: 0.10.0)
Fix Version/s: 0.10.0

 Log Time To Submit metric with PerfLogger
 -

 Key: HIVE-2969
 URL: https://issues.apache.org/jira/browse/HIVE-2969
 Project: Hive
  Issue Type: Wish
  Components: Logging
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-2969.D2919.1.patch


 Logging the time from when Driver.run starts to when we begin submitting jobs 
 to map reduce would be helpful in determining how much of the lag in starting 
 a query is due to Hive vs. Hadoop.
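
As a rough illustration of the measurement being asked for, here is a small
Python sketch of a begin/end timer. It is not Hive's PerfLogger API; the
class SimplePerfLogger, its methods, and the key name "time_to_submit" are
all hypothetical.

```python
import time


class SimplePerfLogger:
    """Records elapsed time between begin(key) and end(key) calls."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so tests can use a fake clock
        self._starts = {}
        self.durations = {}

    def begin(self, key):
        """Mark the start of a timed section."""
        self._starts[key] = self._clock()

    def end(self, key):
        """Mark the end of a timed section and return the elapsed seconds."""
        elapsed = self._clock() - self._starts.pop(key)
        self.durations[key] = elapsed
        return elapsed
```

In this picture, begin("time_to_submit") would be called where the driver
starts running a query and end("time_to_submit") just before the first job
is submitted, so the recorded duration is the part of the startup lag spent
before Hadoop is involved.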


[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


[ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293043#comment-13293043
 ] 

Carl Steinbach commented on HIVE-3078:
--

Is this ready for review? If so can you please submit a review request? Thanks.

 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Updated] (HIVE-3013) TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly


 [ 
https://issues.apache.org/jira/browse/HIVE-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3013:
-

   Resolution: Fixed
Fix Version/s: 0.10.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 TestCliDriver cannot be debugged with eclipse since hadoop_home is set 
 incorrectly
 --

 Key: HIVE-3013
 URL: https://issues.apache.org/jira/browse/HIVE-3013
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.9.0
Reporter: Namit Jain
Assignee: Carl Steinbach
 Fix For: 0.10.0

 Attachments: HIVE-3013.2.patch.txt, HIVE-3013.3.patch.txt, 
 hive.3013.1.patch





[jira] [Commented] (HIVE-3085) make parallel tests work

2012-06-11 Thread Edward Capriolo (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293055#comment-13293055
 ] 

Edward Capriolo commented on HIVE-3085:
---

Do the 'parallel' tests still require a shared NFS mount? A while back someone 
told me I did not need NFS anymore because hadoop 'give me Big Datas'.  Really 
though this shared NFS mount destroys the utility of this toolkit for me.

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
     [root@MC, 4]
   ],
   other_hosts: [
     [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Commented] (HIVE-3013) TestCliDriver cannot be debugged with eclipse since hadoop_home is set incorrectly


[ 
https://issues.apache.org/jira/browse/HIVE-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293056#comment-13293056
 ] 

Carl Steinbach commented on HIVE-3013:
--

@Ashutosh:

bq. HiveCLI.launch launches successfully, but doesnt work because of incorrect 
config.

It seems to work for me, but maybe I'm just not trying the right commands. When 
you run it how does it fail?

bq. FWIW, I ran all 887 ql tests from TestCliDriver. 36 of them failed. 
Investigating those will require separate kind of work so could be taken up in 
followup issue.

I don't suppose you still have the list of failures available?

 TestCliDriver cannot be debugged with eclipse since hadoop_home is set 
 incorrectly
 --

 Key: HIVE-3013
 URL: https://issues.apache.org/jira/browse/HIVE-3013
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.9.0
Reporter: Namit Jain
Assignee: Carl Steinbach
 Fix For: 0.10.0

 Attachments: HIVE-3013.2.patch.txt, HIVE-3013.3.patch.txt, 
 hive.3013.1.patch





[jira] [Commented] (HIVE-3106) Add option to make multi inserts more atomic


[ 
https://issues.apache.org/jira/browse/HIVE-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293059#comment-13293059
 ] 

Kevin Wilfong commented on HIVE-3106:
-

Per Carl's comments, I explicitly stated the advantages/disadvantages, removed 
'atomic' from the name of the configuration variable (as this is not really 
true), and removed references to outputs in the description of the config.

Also, fixed an issue, where if a file was taking a long time to produce, there 
would still be a long time between when the tables/partitions are produced and 
when the locks on them are released. Now, when the option is set, the 
DependencyCollection task depends on the dependencies of the move tasks for 
files, but the move tasks for files do not depend on the DependencyCollection 
task, as there are no locks on these files so there would not be any advantage.

Added a new test case for this additional functionality.

 Add option to make multi inserts more atomic
 

 Key: HIVE-3106
 URL: https://issues.apache.org/jira/browse/HIVE-3106
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-3106.1.patch.txt


 Currently, with multi-insert queries, as soon as the output of one of the 
 inserts is ready the move task associated with that insert is run, creating 
 the table/partition.  However, if concurrency is enabled the lock on this 
 table/partition is not released until the entire query finishes, which can be 
 much later.
 This causes issues if, for example, a user is waiting for an output of the 
 multi-insert query which is created long before the other outputs, and 
 checking for its existence using the metastore's Thrift methods 
 (get_table/get_partition).  In that case, the user will run their query 
 which uses the output, and it will experience a timeout trying to acquire the 
 lock on the table/partition.
 If all the move tasks depend on the parents of all other move tasks, the 
 output creation will be much closer to atomic, relieving this problem.
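
The dependency change proposed in the last sentence can be pictured with a
small Python sketch. It models tasks as plain strings and is not Hive's task
graph code; the function name make_moves_wait_for_all and the task names are
hypothetical.

```python
def make_moves_wait_for_all(parents_of):
    """Return new dependencies where every move task waits for all parents.

    parents_of maps each move task name to the set of tasks that produce
    its input. In the result, each move task depends on the union of those
    sets, so no move can run until every insert's output is ready.
    """
    all_parents = set()
    for parents in parents_of.values():
        all_parents |= set(parents)
    # Give each move task its own copy of the combined parent set.
    return {move: set(all_parents) for move in parents_of}


# Two inserts in one multi-insert query: move_a would otherwise run as soon
# as mr_1 finishes, long before move_b.
before = {"move_a": {"mr_1"}, "move_b": {"mr_2", "mr_3"}}
after = make_moves_wait_for_all(before)
```

The trade-off in this picture is that the early output becomes visible later
than it otherwise would, in exchange for all outputs appearing close together.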


[jira] [Updated] (HIVE-3085) make parallel tests work


 [ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3085:
-

Attachment: hive.3085.2.patch

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch, hive.3085.2.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
     [root@MC, 4]
   ],
   other_hosts: [
     [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Commented] (HIVE-3072) Hive List Bucketing - DDL support


[ 
https://issues.apache.org/jira/browse/HIVE-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293082#comment-13293082
 ] 

Gang Tim Liu commented on HIVE-3072:


Yes, we are heading to a single patch approach.

Yes, this feature requires metastore change.


 Hive List Bucketing - DDL support
 -

 Key: HIVE-3072
 URL: https://issues.apache.org/jira/browse/HIVE-3072
 Project: Hive
  Issue Type: New Feature
  Components: SQL
Reporter: Gang Tim Liu
Assignee: Gang Tim Liu

 If a hive table column has skewed keys, query performance on non-skewed keys 
 is always impacted. The Hive List Bucketing feature will address this:
 https://cwiki.apache.org/Hive/listbucketing.html
 This jira issue will track DDL change for the feature. It's for both single 
 skewed column and multiple columns.


[jira] [Assigned] (HIVE-2693) Add DECIMAL data type


 [ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reassigned HIVE-2693:


Assignee: Josh Wills

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Reporter: Carl Steinbach
Assignee: Josh Wills

 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.


[jira] [Commented] (HIVE-2693) Add DECIMAL data type


[ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293088#comment-13293088
 ] 

Carl Steinbach commented on HIVE-2693:
--

Review request from a while ago: https://reviews.facebook.net/D1221


 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Reporter: Carl Steinbach
Assignee: Josh Wills

 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.


[jira] [Created] (HIVE-3111) reduce the time for parallel unit tests for hive

Namit Jain created HIVE-3111:


 Summary: reduce the time for parallel unit tests for hive
 Key: HIVE-3111
 URL: https://issues.apache.org/jira/browse/HIVE-3111
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain


1. Run the other tests in parallel with TestCliDriver and TestNegativeCliDriver
2. Run the tests that need super-user privilege in parallel with TestCliDriver 
and TestNegativeCliDriver
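
A generic sketch of the idea in Python, using only the standard library. It
is not the ptest implementation; run_groups_in_parallel and the run_group
callback are hypothetical, and the group names are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def run_groups_in_parallel(groups, run_group, max_workers=None):
    """Run run_group(name) for every test group concurrently.

    Returns a dict mapping each group name to whatever run_group returned.
    """
    workers = max_workers or max(1, len(groups))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(run_group, name) for name in groups}
        # result() blocks until that group has finished.
        return {name: future.result() for name, future in futures.items()}
```

Here the "other" tests would simply be one more group submitted alongside
TestCliDriver and TestNegativeCliDriver instead of running after them.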


[jira] [Commented] (HIVE-3085) make parallel tests work

[
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293095#comment-13293095
]

Namit Jain commented on HIVE-3085:
--

Right now, the tests do require a shared mount if you want to run on multiple
machines.
This is good if we don't want to compile across all the machines.

Having said that, I am also planning to use it on my machine only, and this
should still help to finish the tests in about 1.5 hours.
In that case, I was able to use local disk on my machine.

This can be further optimized. Some options are:

1. Run the other tests in parallel with TestCliDriver and TestNegativeCliDriver
2. Run the tests that need super-user privilege in parallel with TestCliDriver
and TestNegativeCliDriver

Filed https://issues.apache.org/jira/browse/HIVE-3111 for that.

make parallel tests work

Key: HIVE-3085
URL: https://issues.apache.org/jira/browse/HIVE-3085
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
Attachments: hive.3085.1.patch, hive.3085.2.patch

https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
I was trying to run the tests using the instructions above.
I was able to run them using a single machine (parallelism of 4 in ~2 hours).
The conf. file is as follows: .hive_ptest.conf
{
  qfile_hosts: [
    [root@MC, 4]
  ],
  other_hosts: [
    [root@MC, 1]
  ],
  master_base_path: /data/users/tmp,
  host_base_path: /data/users/hivetests,
  java_home: /usr/local/jdk-6u24-64
}

[jira] [Updated] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


 [ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shuai Ding updated HIVE-3078:
-

Status: Patch Available  (was: Open)

 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Updated] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


 [ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3078:
-

Status: Open  (was: Patch Available)

Please submit a review request on Phabricator or Reviewboard. Thanks.

 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Commented] (HIVE-3085) make parallel tests work


[ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293117#comment-13293117
 ] 

Shuai Ding commented on HIVE-3085:
--

https://reviews.facebook.net/D3585

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch, hive.3085.2.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
     [root@MC, 4]
   ],
   other_hosts: [
     [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


[ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293119#comment-13293119
 ] 

Carl Steinbach commented on HIVE-3078:
--

And please exclude the test updates from the review request to make it easier 
to read.

 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Commented] (HIVE-3085) make parallel tests work


[ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293121#comment-13293121
 ] 

Namit Jain commented on HIVE-3085:
--

@Carl, I had a question about ant very-clean package.
That takes a very long time (~20 min.) since we are downloading so many jar 
files.

Wouldn't it be better not to populate hive*jar in ivy in our local builds?
Then, ant clean package can run much faster.

Or, what is the downside of removing '*hive*.jar' from .ivy2 and then running 
ant clean package?
Other jars rarely change, but this saves ~10 min. in compile time.

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch, hive.3085.2.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
     [root@MC, 4]
   ],
   other_hosts: [
     [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Commented] (HIVE-3085) make parallel tests work


[ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293124#comment-13293124
 ] 

Shuai Ding commented on HIVE-3085:
--

https://reviews.facebook.net/D3585


 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch, hive.3085.2.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
     [root@MC, 4]
   ],
   other_hosts: [
     [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Created] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed

Namit Jain created HIVE-3112:


 Summary: clear hive.metastore.partition.inherit.table.properties 
till HIVE-3109 is fixed
 Key: HIVE-3112
 URL: https://issues.apache.org/jira/browse/HIVE-3112
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain





[jira] [Updated] (HIVE-2693) Add DECIMAL data type

2012-06-11 Thread Josh Wills (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Josh Wills updated HIVE-2693:
-

Attachment: HIVE-2693.patch

The old version of this change.

 Add DECIMAL data type
 -

 Key: HIVE-2693
 URL: https://issues.apache.org/jira/browse/HIVE-2693
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, Types
Reporter: Carl Steinbach
Assignee: Josh Wills
 Attachments: HIVE-2693.patch


 Add support for the DECIMAL data type. HIVE-2272 (TIMESTAMP) provides a nice 
 template for how to do this.


[jira] [Commented] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed


[ 
https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293135#comment-13293135
 ] 

Kevin Wilfong commented on HIVE-3112:
-

+1 Will run tests

 clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is 
 fixed
 ---

 Key: HIVE-3112
 URL: https://issues.apache.org/jira/browse/HIVE-3112
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

[
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Bhushan Mandhani updated HIVE-2989:
---

Attachment: HIVE-2989.9.patch.txt

Updated per comments.

Adding Table Links to Hive
--

Key: HIVE-2989
URL: https://issues.apache.org/jira/browse/HIVE-2989
Project: Hive
Issue Type: Improvement
Components: Metastore, Query Processor, Security
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Attachments: HIVE-2989.1.patch.txt, HIVE-2989.2.patch.txt,
HIVE-2989.3.patch.txt, HIVE-2989.4.patch.txt, HIVE-2989.5.patch.txt,
HIVE-2989.6.patch.txt, HIVE-2989.9.patch.txt

Original Estimate: 672h
Remaining Estimate: 672h

This will add Table Links to Hive. This will be an alternate mechanism for a
user to access tables and data in a database that is different from the one
he is associated with. This feature can be used to provide access control (if
access to databasename.tablename in queries and use database X is turned
off in conjunction).
If db X wants to access one or more partitions from table T in db Y, the user
will issue:
CREATE [STATIC] LINK TO T@Y LINKPROPERTIES ('RETENTION'='N')
New partitions added to T will automatically be added to the link as well and
become available to X. However, if the link is specified to be static, that
will not be the case. The X user will then have to explicitly import each
partition of T that he needs. The command above will not actually make any
existing partitions of T available to X. Instead, we provide the following
command to add an existing partition to a link:
ALTER LINK T@Y ADD PARTITION (ds='2012-04-27')
The user will need to execute the above for each existing partition that
needs to be imported. For future partitions, Hive will take care of this. An
imported partition can be dropped from a link using a similar command. We
just specify DROP instead of ADD. For querying the linked table, the X
user will refer to it as T@Y. Link Tables will only have read access and not
be writable. The entire Table Link along with all its imported partitions can
be dropped as follows:
DROP LINK TO T@Y
The above commands are purely MetaStore operations. The implementation will
rely on replicating the entire partition metadata when a partition is added
to a link. For every link that is created, we will add a new row to table
TBLS. The TBL_TYPE column will have a new kind of value LINK_TABLE (or
STATIC_LINK_TABLE if the link has been specified as static). A new column
LINK_TBL_ID will be added which will contain the id of the imported table. It
will be NULL for all other table types including the regular managed tables.
When a partition is added to a link, the new row in the table PARTITIONS will
point to the LINK_TABLE in the same database and not the master table in the
other database. We will replicate all the metadata for this partition from
the master database. The advantage of this approach is that fewer changes
will be needed in query processing and DDL for LINK_TABLEs. Also, commands
like SHOW TABLES and SHOW PARTITIONS will work as expected for
LINK_TABLEs too. Of course, even though the metadata is not shared, the
underlying data on disk is still shared. Hive still needs to know that when
dropping a partition which belongs to a LINK_TABLE, it should not drop the
underlying data from HDFS. Views and external tables cannot be imported from
one database to another.
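
To make the metadata layout described above easier to follow, here is a toy
Python model of it. This is only a sketch under simplifying assumptions: the
Table class and the three functions are hypothetical, partitions are kept in
a dict instead of a PARTITIONS table, and "has a LINK_TBL_ID" is used as the
only test for whether data may be deleted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Table:
    """One row of TBLS, reduced to the columns mentioned in the description."""
    tbl_id: int
    db: str
    name: str
    tbl_type: str = "MANAGED_TABLE"
    link_tbl_id: Optional[int] = None               # NULL for non-link tables
    partitions: dict = field(default_factory=dict)  # partition spec -> location


def create_link(tables, next_id, target, into_db, static=False):
    """Add a link row for `target` in database `into_db`.

    No existing partitions are imported at creation time.
    """
    link = Table(tbl_id=next_id, db=into_db, name=target.name,
                 tbl_type="STATIC_LINK_TABLE" if static else "LINK_TABLE",
                 link_tbl_id=target.tbl_id)
    tables[next_id] = link
    return link


def add_partition_to_link(link, target, spec):
    """Replicate the partition metadata; the location on disk stays shared."""
    link.partitions[spec] = target.partitions[spec]


def drop_partition(table, spec, delete_data):
    """Drop partition metadata; delete data only for non-link tables."""
    location = table.partitions.pop(spec)
    if table.link_tbl_id is None:
        delete_data(location)
```

In this model, dropping a partition from the link removes only the link's
copy of the metadata, while dropping it from the master table also removes
the underlying data.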

[jira] [Updated] (HIVE-2989) Adding Table Links to Hive

[
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Bhushan Mandhani updated HIVE-2989:
---

Hadoop Flags: (was: Reviewed)
Status: Patch Available (was: Open)

Adding Table Links to Hive
--

Original Estimate: 672h
Remaining Estimate: 672h

[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


[ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293137#comment-13293137
 ] 

Shuai Ding commented on HIVE-3078:
--

I have binaries in the diff, so I can't use arc.


 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


[ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293162#comment-13293162
 ] 

Carl Steinbach commented on HIVE-3078:
--

I don't see any binaries in the patch. Not sure what you're referring to.

 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Created] (HIVE-3113) Querying of Table Links

Bhushan Mandhani created HIVE-3113:
--

 Summary: Querying of Table Links
 Key: HIVE-3113
 URL: https://issues.apache.org/jira/browse/HIVE-3113
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor


Implementation of querying of Table Links


[jira] [Updated] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed


 [ 
https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3112:
-

Status: Open  (was: Patch Available)

@Namit: Please add a comment to the test that points back to HIVE-3112. Thanks.

 clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is 
 fixed
 ---

 Key: HIVE-3112
 URL: https://issues.apache.org/jira/browse/HIVE-3112
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




[jira] [Created] (HIVE-3114) Split Thrift interface for Table Link Creation

Bhushan Mandhani created HIVE-3114:
--

 Summary: Split Thrift interface for Table Link Creation
 Key: HIVE-3114
 URL: https://issues.apache.org/jira/browse/HIVE-3114
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor


Table Link creation through Thrift currently goes through the same method as 
Table creation. We want to move it out of there and into its own method.


[jira] [Created] (HIVE-3115) Table Links and Authorization

Bhushan Mandhani created HIVE-3115:
--

 Summary: Table Links and Authorization
 Key: HIVE-3115
 URL: https://issues.apache.org/jira/browse/HIVE-3115
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor


Incorporate Table Links into the existing authorization framework in Hive. Add 
tests to check that no breach of security permissions is possible.


[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


[ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293174#comment-13293174
 ] 

Shuai Ding commented on HIVE-3078:
--

https://reviews.facebook.net/D3603


 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Commented] (HIVE-3078) Add inputs/outputs for create table, create view and so forth


[ 
https://issues.apache.org/jira/browse/HIVE-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293175#comment-13293175
 ] 

Shuai Ding commented on HIVE-3078:
--

 https://reviews.facebook.net/D3603

 Add inputs/outputs for create table, create view and so forth
 -

 Key: HIVE-3078
 URL: https://issues.apache.org/jira/browse/HIVE-3078
 Project: Hive
  Issue Type: Bug
Reporter: Shuai Ding
Assignee: Shuai Ding
 Attachments: hive_input_output.patch





[jira] [Commented] (HIVE-3085) make parallel tests work

[
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293176#comment-13293176
]

Carl Steinbach commented on HIVE-3085:
--

@Namit

bq. Won't it be better to not populate hive*jar in ivy in our local builds?

I don't think we can do that and also continue to list inter-subproject
dependencies in the ivy.xml files. The basic reason why this doesn't work
correctly for Hive is that the build is still not using Ivy correctly. More
specifically, we're manually specifying the order in which subprojects are
built instead of letting Ivy determine the order through dependency analysis.

bq. Or, what is the downside of removing 'hive.jar' from .ivy2 and then running
ant clean package.

The main downside is that the user may have configured ivy.cache.dir to be
something other than ~/.ivy2, so to be sure that you're deleting the right
files you have to get the value of ${ivy.cache.dir} (which didn't seem that
straightforward the last time I looked at it:
http://ant.apache.org/ivy/history/2.0.0/use/cleancache.html).

make parallel tests work

[jira] [Created] (HIVE-3116) Make very-clean Ant target more selective

Carl Steinbach created HIVE-3116:


 Summary: Make very-clean Ant target more selective
 Key: HIVE-3116
 URL: https://issues.apache.org/jira/browse/HIVE-3116
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach





[jira] [Commented] (HIVE-3116) Make very-clean Ant target more selective


[ 
https://issues.apache.org/jira/browse/HIVE-3116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293180#comment-13293180
 ] 

Carl Steinbach commented on HIVE-3116:
--

Currently the very-clean target depends on ivy:cleancache. The original 
motivation for adding very-clean was to flush Hive artifacts out of the local 
Ivy cache, but the ivy:cleancache task actually deletes everything in ~/.ivy2.

The following page indicates that the ivy:cleancache task can be configured to 
use a specific Ivy settings file, which may allow us to limit the deleted 
artifacts to Hive only: 
http://ant.apache.org/ivy/history/2.0.0/use/cleancache.html

Also relevant: http://ant.apache.org/ivy/history/2.0.0/settings/caches.html
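
A rough sketch of the selective clean described above, written in Python 
rather than Ant purely for illustration. It assumes the cache keeps one 
top-level directory per organisation and that Hive's artifacts sit under 
org.apache.hive. The function name clean_org_from_cache and that organisation 
string are assumptions, not something taken from Hive's build.

```python
import shutil
from pathlib import Path


def clean_org_from_cache(cache_dir, organisation="org.apache.hive"):
    """Delete only one organisation's entries from an Ivy-style cache.

    Returns the list of paths that were removed. Everything else in
    cache_dir is left alone, unlike a full cache wipe.
    """
    removed = []
    cache = Path(cache_dir)
    if not cache.is_dir():
        return removed
    for entry in sorted(cache.iterdir()):
        # Assumed layout: <cache_dir>/<organisation>/<module>/...
        if entry.is_dir() and entry.name == organisation:
            shutil.rmtree(entry)
            removed.append(entry)
    return removed
```

The caller still has to pass in the real cache location, so a non-default 
ivy.cache.dir has to be resolved before calling this.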


 Make very-clean Ant target more selective
 -

 Key: HIVE-3116
 URL: https://issues.apache.org/jira/browse/HIVE-3116
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach




[jira] [Updated] (HIVE-2745) Remove Hive's runtime dependency on bin/hadoop


 [ 
https://issues.apache.org/jira/browse/HIVE-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2745:
-

Summary: Remove Hive's runtime dependency on bin/hadoop  (was: Remove 
Hive's runtime/test dependency on bin/hadoop)

 Remove Hive's runtime dependency on bin/hadoop
 --

 Key: HIVE-2745
 URL: https://issues.apache.org/jira/browse/HIVE-2745
 Project: Hive
  Issue Type: Improvement
  Components: Build Infrastructure, Query Processor
Reporter: Carl Steinbach




[jira] [Created] (HIVE-3117) Determine order of subproject builds using ivy:buildlist task

Carl Steinbach created HIVE-3117:


 Summary: Determine order of subproject builds using ivy:buildlist 
task
 Key: HIVE-3117
 URL: https://issues.apache.org/jira/browse/HIVE-3117
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach





[jira] [Commented] (HIVE-3117) Determine order of subproject builds using ivy:buildlist task


[ 
https://issues.apache.org/jira/browse/HIVE-3117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293190#comment-13293190
 ] 

Carl Steinbach commented on HIVE-3117:
--

We should use the ivy:buildlist task to determine the order of subproject 
builds instead of hardcoding this in the root build.xml file. Problems with the 
current approach include a) the fact that we're likely to pick up dirty Hive 
artifacts from the local Ivy cache and b) the fact that it's hard to prevent 
the subprojects from evolving circular dependencies.

References:
* http://ant.apache.org/ivy/history/latest-milestone/tutorial/multiproject.html
* http://stackoverflow.com/questions/4106143/ivy-simple-shared-repository
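
To illustrate what deriving the build order from declared dependencies buys 
us, here is a minimal sketch in Python (3.9 or later, for graphlib). It is not 
how ivy:buildlist is implemented; it only shows the idea of a topological sort 
that also rejects circular dependencies. The function name build_order and the 
dependency map are made up for the example.

```python
from graphlib import CycleError, TopologicalSorter


def build_order(dependencies):
    """Return subprojects ordered so each one follows its dependencies.

    dependencies maps a subproject name to the names it depends on.
    Raises ValueError if the subprojects depend on each other in a cycle.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # err.args[1] holds the nodes that form the cycle.
        cycle = " -> ".join(err.args[1])
        raise ValueError(f"circular dependency: {cycle}") from err


# Hypothetical dependency map, for illustration only.
example = {
    "shims": [],
    "common": ["shims"],
    "serde": ["common"],
    "metastore": ["serde"],
    "ql": ["metastore", "serde"],
}
print(build_order(example))
```

With the order computed this way, a circular dependency fails the build at 
once instead of going unnoticed behind a hardcoded list.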


 Determine order of subproject builds using ivy:buildlist task
 -

 Key: HIVE-3117
 URL: https://issues.apache.org/jira/browse/HIVE-3117
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Carl Steinbach




[jira] [Commented] (HIVE-3085) make parallel tests work


[ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293191#comment-13293191
 ] 

Carl Steinbach commented on HIVE-3085:
--

@Namit: I filed two followup tickets: HIVE-3116 and HIVE-3117. Thanks.

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch, hive.3085.2.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   qfile_hosts: [
 [root@MC, 4]
   ],
   other_hosts: [
   [root@MC, 1]
   ],
   master_base_path: /data/users/tmp,
   host_base_path: /data/users/hivetests,
   java_home: /usr/local/jdk-6u24-64
 }


[jira] [Commented] (HIVE-3115) Table Links and Authorization


[ 
https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293195#comment-13293195
 ] 

Carl Steinbach commented on HIVE-3115:
--

@Bhushan: Is this going to be done before or after HIVE-3113?

 Table Links and Authorization
 -

 Key: HIVE-3115
 URL: https://issues.apache.org/jira/browse/HIVE-3115
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Incorporate Table Links into the existing authorization framework in Hive. 
 Add tests to check that no breach of security permissions is possible.


[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation


[ 
https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293198#comment-13293198
 ] 

Carl Steinbach commented on HIVE-3114:
--

@Bhushan: What's the timeline for doing this? I'm concerned that the Metastore 
Thrift interface is one of Hive's de facto public APIs, and any new 
functionality that appears in a release will need to be supported going 
forward. Why not just fix this in HIVE-2989 and eliminate the possibility that 
we're going to get stuck with an interface that we already know is broken?

 Split Thrift interface for Table Link Creation
 --

 Key: HIVE-3114
 URL: https://issues.apache.org/jira/browse/HIVE-3114
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Table Link creation through Thrift currently goes through the same method as 
 Table creation. We want to move it out of there and into its own method.


[jira] [Commented] (HIVE-2989) Adding Table Links to Hive

[
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293200#comment-13293200
]

Carl Steinbach commented on HIVE-2989:
--

@Bhushan: I think HIVE-3114 (Split Thrift interface for TableLink creation)
should be done in this patch instead of splitting it out into a followup
ticket. Here's what I said in HIVE-3114:

bq. I'm concerned that the Metastore Thrift interface is one of Hive's de facto
public APIs, and any new functionality that appears in a release will need to
be supported going forward. Why not just fix this in HIVE-2989 and eliminate
the possibility that we're going to get stuck with an interface that we already
know is broken?

Adding Table Links to Hive
--

Original Estimate: 672h
Remaining Estimate: 672h

[jira] [Commented] (HIVE-2989) Adding Table Links to Hive

[
https://issues.apache.org/jira/browse/HIVE-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293201#comment-13293201
]

Carl Steinbach commented on HIVE-2989:
--

@Bhushan: I'll look over the rest of the patch later tonight. Thanks.

Adding Table Links to Hive
--

Original Estimate: 672h
Remaining Estimate: 672h

[jira] [Updated] (HIVE-1362) column level statistics


 [ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1362:
-

Assignee: Shreepadma Venugopalan
  Labels:   (was: gsoc gsoc2012)

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang
Assignee: Shreepadma Venugopalan




[jira] [Resolved] (HIVE-1940) Query Optimization Using Column Metadata and Histograms


 [ 
https://issues.apache.org/jira/browse/HIVE-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1940.
--

Resolution: Duplicate

Resolving this as a duplicate of HIVE-1938 and HIVE-1362.

 Query Optimization Using Column Metadata and Histograms
 ---

 Key: HIVE-1940
 URL: https://issues.apache.org/jira/browse/HIVE-1940
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Query Processor, Statistics
Reporter: Anja Gruenheid
 Attachments: Agruenheid_ideas11.pdf, HiveMetaStore.pdf


 The current basis for cost-based query optimization in Hive is information 
 gathered on tables and partitions. To make further improvements in query 
 optimization possible, the next step is to develop and implement 
 possibilities to gather information on columns as discussed in issue HIVE-33. 
 After that, an implementation of histograms is a possible option to use and 
 collect run-time statistics. Next to the actual implementation of these 
 features, it is also necessary to develop a consistent storage model for the 
 MetaStore.


[jira] [Commented] (HIVE-2950) Hive should store the full table schema in partition storage descriptors

2012-06-11 Thread Ashutosh Chauhan (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293218#comment-13293218
 ] 

Ashutosh Chauhan commented on HIVE-2950:


@Travis,
Can you upload the latest patch to JIRA? Unfortunately, Phabricator doesn't let 
you download a patch file.

 Hive should store the full table schema in partition storage descriptors
 

 Key: HIVE-2950
 URL: https://issues.apache.org/jira/browse/HIVE-2950
 Project: Hive
  Issue Type: Bug
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-2950.D2769.1.patch


 Hive tables have a schema, which is copied into the partition storage 
 descriptor when adding a partition. Currently only columns stored in the 
 table storage descriptor are copied - columns that are reported by the serde 
 are not copied. Instead of copying the table storage descriptor columns into 
 the partition columns, the full table schema should be copied.
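
 The difference between the two behaviors can be sketched roughly as follows. 
 This is plain Python with made-up helper names and dict-based stand-ins for 
 the metastore objects, not Hive's actual code. It assumes only that a table's 
 full schema is its storage descriptor columns when there are any, and 
 otherwise whatever the serde reports.

```python
def partition_columns_current(table):
    """Current behavior: copy only the columns stored in the table SD."""
    return list(table["sd_cols"])


def partition_columns_proposed(table, serde_columns):
    """Proposed behavior: copy the full table schema.

    serde_columns is a hypothetical callable standing in for asking the
    serde which columns it reports for this table.
    """
    cols = table["sd_cols"] or serde_columns(table)
    return list(cols)


# Table whose schema comes from the serde, so the SD column list is empty.
person = {"name": "person_test", "sd_cols": []}


def reported(table):
    # Made-up columns, for illustration only.
    return [("name", "string"), ("age", "int")]


print(partition_columns_current(person))
print(partition_columns_proposed(person, reported))
```

 For a table that lists its columns explicitly, both functions return the 
 same thing, which matches the first case shown in the details.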
 DETAILS
 This is a little long but is necessary to show 3 things: current behavior 
 when explicitly listing columns, behavior with HIVE-2941 patched in and serde 
 reported columns, and finally the behavior with this patch (full table schema 
 copied into the partition storage descriptor).
 Here's an example of what currently happens. Note the following:
 * the two manually-defined fields defined for the table are listed in the 
 table storage descriptor.
 * both fields are present in the partition storage descriptor
 This works great because users who query for a partition can look at its 
 storage descriptor and get the schema.
 {code}
 hive> create external table foo_test (name string, age int) partitioned by 
 (part_dt string);
 hive> describe extended foo_test;
 OK
 name  string  
 age   int 
 part_dt   string  

 Detailed Table Information Table(tableName:foo_test, dbName:travis_test, 
 owner:travis, createTime:1334256062, lastAccessTime:0, retention:0, 
 sd:StorageDescriptor(cols:[FieldSchema(name:name, type:string, comment:null), 
 FieldSchema(name:age, type:int, comment:null), FieldSchema(name:part_dt, 
 type:string, comment:null)], 
 location:hdfs://foo.com/warehouse/travis_test.db/foo_test, 
 inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
 outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
 compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
 serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
 parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
 parameters:{}, primaryRegionName:, secondaryRegions:[]), 
 partitionKeys:[FieldSchema(name:part_dt, type:string, comment:null)], 
 parameters:{EXTERNAL=TRUE, transient_lastDdlTime=1334256062}, 
 viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE) 
 Time taken: 0.082 seconds
 hive> alter table foo_test add partition (part_dt = '20120331T00Z') 
 location 'hdfs://foo.com/foo/2012/03/31/00';
 hive> describe extended foo_test partition (part_dt = '20120331T00Z');
 OK
 name  string  
 age   int 
 part_dt   string  

 Detailed Partition Information Partition(values:[20120331T00Z], 
 dbName:travis_test, tableName:foo_test, createTime:1334256131, 
 lastAccessTime:0, sd:StorageDescriptor(cols:[FieldSchema(name:name, 
 type:string, comment:null), FieldSchema(name:age, type:int, comment:null), 
 FieldSchema(name:part_dt, type:string, comment:null)], 
 location:hdfs://foo.com/foo/2012/03/31/00, 
 inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
 outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
 compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
 serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
 parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
 parameters:{}, primaryRegionName:, secondaryRegions:[]), 
 parameters:{transient_lastDdlTime=1334256131})  
 {code}
 CURRENT BEHAVIOR WITH HIVE-2941 PATCHED IN
 Now let's examine what happens when creating a table when the serde reports 
 the schema. Notice the following:
 * The table storage descriptor contains an empty list of columns. However, 
 the table schema is available from the serde reflecting on the serialization 
 class.
 * The partition storage descriptor does contain a single part_dt column 
 that was copied from the table partition keys. The actual data columns are 
 not present.
 {code}
 hive> create external table travis_test.person_test partitioned by (part_dt 
 string) row format serde com.twitter.elephantbird.hive.serde.ThriftSerDe 
 with serdeproperties 
 (serialization.class=com.twitter.elephantbird.examples.thrift.Person) 
 stored as inputformat

[jira] [Commented] (HIVE-2950) Hive should store the full table schema in partition storage descriptors

2012-06-11 Thread Travis Crawford (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293227#comment-13293227
 ] 

Travis Crawford commented on HIVE-2950:
---

The patch is actually the same - the one attached to Jira is up-to-date.

 Hive should store the full table schema in partition storage descriptors
 

 Key: HIVE-2950
 URL: https://issues.apache.org/jira/browse/HIVE-2950
 Project: Hive
  Issue Type: Bug
Reporter: Travis Crawford
Assignee: Travis Crawford
 Attachments: HIVE-2950.D2769.1.patch



[jira] [Commented] (HIVE-3115) Table Links and Authorization


[ 
https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293233#comment-13293233
 ] 

Namit Jain commented on HIVE-3115:
--

This will be done after HIVE-3113 from Facebook.
Having said that, this is an open JIRA, and if someone wants to work on it, we 
would love to review it and take it forward.

 Table Links and Authorization
 -

 Key: HIVE-3115
 URL: https://issues.apache.org/jira/browse/HIVE-3115
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Incorporate Table Links into the existing authorization framework in Hive. 
 Add tests to check that no breach of security permissions is possible.


[jira] [Commented] (HIVE-3115) Table Links and Authorization


[ 
https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293236#comment-13293236
 ] 

Carl Steinbach commented on HIVE-3115:
--

@Namit: I think this ticket either has to be done at the same time as HIVE-3113 or 
before it. Otherwise you're adding a pretty big security hole with the table 
links feature.

 Table Links and Authorization
 -

 Key: HIVE-3115
 URL: https://issues.apache.org/jira/browse/HIVE-3115
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Incorporate Table Links into the existing authorization framework in Hive. 
 Add tests to check that no breach of security permissions is possible.


[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation


[ 
https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293237#comment-13293237
 ] 

Namit Jain commented on HIVE-3114:
--

I don't think so - there are many features (views/indexes etc.) which have been 
around for a long time without a thrift interface.

Given the fact that there is a validity check and existing thrift APIs cannot 
create invalid objects (barring bugs in the validity check),
HIVE-3114 should not be a pre-req. for HIVE-2989.



 Split Thrift interface for Table Link Creation
 --

 Key: HIVE-3114
 URL: https://issues.apache.org/jira/browse/HIVE-3114
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Table Link creation through Thrift currently goes through the same method as 
 Table creation. We want to move it out of there and into its own method.


[jira] [Commented] (HIVE-3115) Table Links and Authorization


[ 
https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293239#comment-13293239
 ] 

Namit Jain commented on HIVE-3115:
--

Security has been added recently, and there are many existing features which do 
not work very well with it.
HIVE-3078 is completing the inputs/outputs list and trying to plug some of 
these holes.
Given that there are many issues in security currently, it seems wrong to put 
the burden of security on links.
Links is a new feature, and lots of bells and whistles will be added over time.

 Table Links and Authorization
 -

 Key: HIVE-3115
 URL: https://issues.apache.org/jira/browse/HIVE-3115
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Incorporate Table Links into the existing authorization framework in Hive. 
 Add tests to check that no breach of security permissions is possible.


[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation

[
https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293243#comment-13293243
]

Carl Steinbach commented on HIVE-3114:
--

bq. I don't think so - there are many features (views/indexes etc.) which have
been around for a long time without a thrift interface.

That's true, and it was a mistake to do it that way. Instead of continuing to
compound the effects of an earlier bad decision, can we please invest a
little extra time and actually make the situation better?

Also, this is the sort of thing that should have been described up front in the
design document. Since the List Bucketing feature requires similar changes I'd
like to see any metastore API changes that the feature requires explained in
the design doc before they appear in a patch.

Split Thrift interface for Table Link Creation
--

Key: HIVE-3114
URL: https://issues.apache.org/jira/browse/HIVE-3114
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Table Link creation through Thrift currently goes through the same method as
Table creation. We want to move it out of there and into its own method.

[jira] [Commented] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed


[ 
https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293246#comment-13293246
 ] 

Namit Jain commented on HIVE-3112:
--

Comments

 clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is 
 fixed
 ---

 Key: HIVE-3112
 URL: https://issues.apache.org/jira/browse/HIVE-3112
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




[jira] [Updated] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed


 [ 
https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-3112:
-

Status: Patch Available  (was: Open)

 clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is 
 fixed
 ---

 Key: HIVE-3112
 URL: https://issues.apache.org/jira/browse/HIVE-3112
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation


[ 
https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293250#comment-13293250
 ] 

Namit Jain commented on HIVE-3114:
--

@Carl, I agree with the comment on list bucketing. Please address it in the 
wiki; we should definitely take that into account.

Agreed, it was a mistake to have an unclean thrift API. But we cannot penalize 
a single feature, which we need urgently, for that.
I am all for the thrift cleanup - it is just that the timing is not right from 
our side. I would be happy to help in any way if someone 
else takes on the thrift API cleanup effort.

@Bhushan, in the links wiki - can you add a follow-up for the thrift interface?
Or just add a link to all the follow-up jiras there.

 Split Thrift interface for Table Link Creation
 --

 Key: HIVE-3114
 URL: https://issues.apache.org/jira/browse/HIVE-3114
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Table Link creation through Thrift currently goes through the same method as 
 Table creation. We want to move it out of there and into its own method.


[jira] [Commented] (HIVE-3109) metastore state not cleared


[ 
https://issues.apache.org/jira/browse/HIVE-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293255#comment-13293255
 ] 

Kevin Wilfong commented on HIVE-3109:
-

HIVE-3112 unsets the parameter 
hive.metastore.partition.inherit.table.properties at the end of
ql/src/test/queries/clientpositive/part_inherit_tbl_props.q
ql/src/test/queries/clientpositive/part_inherit_tbl_props_with_star.q

As part of this JIRA, the unsetting should be removed, and the ant test command 
in the Description should still work.

 metastore state not cleared
 ---

 Key: HIVE-3109
 URL: https://issues.apache.org/jira/browse/HIVE-3109
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Ashutosh Chauhan

 When some of the tests are run in a particular order, random bugs are encountered.
 ant test -Dtestcase=TestCliDriver -Dqfile=part_inherit_tbl_props.q,stats1.q
 leads to an error in stats1.q
 We ran into this error as part of parallel testing (HIVE-3085).
 As part of HIVE-3085, this will be fixed temporarily by clearing
 hive.metastore.partition.inherit.table.properties at the end of the test.
 But, in general, any property set in one .q file should not affect anything
 in other tests.
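
One way to picture the isolation asked for above is a test driver that snapshots its configuration before each test and restores it afterwards. The following is only a sketch under stated assumptions: ConfIsolationSketch and runIsolated() are hypothetical names, a plain Map stands in for the session configuration, and none of this is Hive's actual test driver code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-test configuration isolation. This is not
// Hive's real test driver; a plain Map stands in for the session conf.
public class ConfIsolationSketch {
    private final Map<String, String> conf = new HashMap<>();

    public void set(String key, String value) {
        conf.put(key, value);
    }

    public String get(String key) {
        return conf.get(key);
    }

    // Runs one test body, then restores the configuration to the state it
    // had before the test started, even if the test throws.
    public void runIsolated(Runnable testBody) {
        Map<String, String> snapshot = new HashMap<>(conf);
        try {
            testBody.run();
        } finally {
            conf.clear();
            conf.putAll(snapshot);
        }
    }

    public static void main(String[] args) {
        ConfIsolationSketch driver = new ConfIsolationSketch();
        String key = "hive.metastore.partition.inherit.table.properties";
        // The first "test" sets the property; it must not leak to later tests.
        driver.runIsolated(() -> driver.set(key, "a,b"));
        System.out.println("after test: " + driver.get(key));
    }
}
```

With this shape, a .q file would not need to unset its own properties at the end, because the driver discards whatever the test changed.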


[jira] [Commented] (HIVE-3061) hive.binary.record.max.length is a magic string

2012-06-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293259#comment-13293259
 ] 

Hudson commented on HIVE-3061:
--

Integrated in Hive-trunk-h0.21 #1479 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1479/])
HIVE-3061 hive.binary.record.max.length is a magic string
(Edward Capriolo via namit) (Revision 1348808)

 Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1348808
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/BinaryRecordReader.java


 hive.binary.record.max.length is a magic string
 ---

 Key: HIVE-3061
 URL: https://issues.apache.org/jira/browse/HIVE-3061
 Project: Hive
  Issue Type: Task
Affects Versions: 0.8.1
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Attachments: HIVE-3061.patch.1.txt





[jira] [Commented] (HIVE-2694) Add FORMAT UDF

2012-06-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293260#comment-13293260
 ] 

Hudson commented on HIVE-2694:
--

Integrated in Hive-trunk-h0.21 #1479 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1479/])
HIVE-2694. Add FORMAT UDF (Zhenxiao Luo via cws) (Revision 1348976)

 Result = SUCCESS
cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1348976
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFFormatNumber.java
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong1.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong2.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong3.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong4.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong5.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong6.q
* /hive/trunk/ql/src/test/queries/clientnegative/udf_format_number_wrong7.q
* /hive/trunk/ql/src/test/queries/clientpositive/udf_format_number.q
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong1.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong2.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong3.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong4.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong5.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong6.q.out
* /hive/trunk/ql/src/test/results/clientnegative/udf_format_number_wrong7.q.out
* /hive/trunk/ql/src/test/results/clientpositive/show_functions.q.out
* /hive/trunk/ql/src/test/results/clientpositive/udf_format_number.q.out


 Add FORMAT UDF
 --

 Key: HIVE-2694
 URL: https://issues.apache.org/jira/browse/HIVE-2694
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Carl Steinbach
Assignee: Zhenxiao Luo
 Fix For: 0.10.0

 Attachments: HIVE-2694.1.patch.txt, HIVE-2694.2.patch.txt, 
 HIVE-2694.3.patch.txt, HIVE-2694.4.patch.txt, HIVE-2694.D1149.1.patch, 
 HIVE-2694.D1149.2.patch, HIVE-2694.D1149.3.patch, HIVE-2694.D2673.1.patch, 
 HIVE-2694.D2673.1.patch





[jira] [Comment Edited] (HIVE-3114) Split Thrift interface for Table Link Creation

[
https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293261#comment-13293261
]

Carl Steinbach edited comment on HIVE-3114 at 6/12/12 1:37 AM:
---

bq. Agreed it was a mistake for unclean thrift API. But, we cannot penalize a
single feature, which we need urgently, for that. I am all for the thrift
cleanup - it is just that the timing is not right from our side.

I'm not asking you to clean up the entire metastore Thrift API. All I'm asking
is for you to add a createTableLink() method instead of overloading the
createTable() method. It's fine if both createTable() and createTableLink() use
a common codepath behind the Thrift API, but the Thrift API needs to call these
things out as distinct operations.
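
For illustration only, the shape of this request can be sketched in plain Java rather than Thrift IDL. Apart from the method names createTable() and createTableLink(), every name and parameter below is hypothetical, and none of it is the actual metastore API: the point is just two distinct public operations sharing one private code path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the real Hive metastore API: two distinct
// public operations that delegate to one shared private code path.
public class MetastoreSketch {
    // Records what was created, standing in for real metastore state.
    private final List<String> created = new ArrayList<>();

    // Public operation for ordinary tables.
    public void createTable(String db, String name) {
        createObject("TABLE", db, name, null);
    }

    // Public operation for links; callers do not overload createTable().
    // The "target" parameter is an assumption made for this sketch.
    public void createTableLink(String db, String name, String target) {
        if (target == null || target.isEmpty()) {
            throw new IllegalArgumentException("a link needs a target");
        }
        createObject("LINK", db, name, target);
    }

    // Common code path behind both operations.
    private void createObject(String kind, String db, String name, String target) {
        String entry = kind + ":" + db + "." + name;
        if (target != null) {
            entry += "->" + target;
        }
        created.add(entry);
    }

    public List<String> created() {
        return created;
    }

    public static void main(String[] args) {
        MetastoreSketch ms = new MetastoreSketch();
        ms.createTable("default", "t1");
        ms.createTableLink("default", "t1_link", "other_db.t1");
        System.out.println(ms.created());
    }
}
```

Keeping the shared logic private means the public interface can name each operation separately without duplicating the implementation.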

bq. @Bhushan, in the links wiki - can you add a follow-up for the thrift
interface ?

When are these followups going to be addressed? If they aren't committed in
time for the 0.10.0 release are you OK with us backing out these changes?

was (Author: cwsteinbach):
bq. Agreed it was a mistake for unclean thrift API. But, we cannot penalize
a single feature, which we need urgently, for that.
I am all for the thrift cleanup - it is just that the timing is not right from
our side.

bq. @Bhushan, in the links wiki - can you add a follow-up for the thrift
interface ?

When are these followups going to be addressed? If they aren't committed in
time for the 0.10.0 release are you OK with us backing out these changes?

Split Thrift interface for Table Link Creation
--

Table Link creation through Thrift currently goes through the same method as
Table creation. We want to move it out of there and into its own method.

[jira] [Commented] (HIVE-3115) Table Links and Authorization

[
https://issues.apache.org/jira/browse/HIVE-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293263#comment-13293263
]

Namit Jain commented on HIVE-3115:
--

If there is no-one outside the walls of Facebook who is interested in using
links, then not having authorization for links should not be a problem for
anyone outside Facebook anyway.
I am not saying having authorization for links is not a good idea - advanced
users like Facebook will also need it, but this should not be coupled to the
patch.
It can definitely be done in a follow-up. If I remember right, security was
added for most of the new features in follow-ups. In fact, having the patch will
make it easy for other contributors, or us, to quickly address security.

Everyone has their own priority of features, and they are free to work on their
own priorities. Some features which are very useful for advanced users may not
be applicable for many other
users right away, but they do get a free ride in the long run. Anyway, we
already had a long discussion on the wiki - and there is no point repeating it.

Table Links and Authorization
-

Key: HIVE-3115
URL: https://issues.apache.org/jira/browse/HIVE-3115
Project: Hive
Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

Incorporate Table Links into the existing authorization framework in Hive.
Add tests to check that no breach of security permissions is possible.

[jira] [Commented] (HIVE-3114) Split Thrift interface for Table Link Creation


[ 
https://issues.apache.org/jira/browse/HIVE-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293264#comment-13293264
 ] 

Namit Jain commented on HIVE-3114:
--

Absolutely not, a feature may not be ready for 0.10.
That does not mean that the code for that feature will be deleted.

If 'links' is not available in 0.10, let us document it clearly - when all the 
jiras are ready, 'links' will be available in that release.

 Split Thrift interface for Table Link Creation
 --

 Key: HIVE-3114
 URL: https://issues.apache.org/jira/browse/HIVE-3114
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.10.0
Reporter: Bhushan Mandhani
Assignee: Bhushan Mandhani
Priority: Minor

 Table Link creation through Thrift currently goes through the same method as 
 Table creation. We want to move it out of there and into its own method.


[jira] [Commented] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed


[ 
https://issues.apache.org/jira/browse/HIVE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293284#comment-13293284
 ] 

Kevin Wilfong commented on HIVE-3112:
-

+1
Looks like Namit addressed Carl's comments.

 clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is 
 fixed
 ---

 Key: HIVE-3112
 URL: https://issues.apache.org/jira/browse/HIVE-3112
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain




[jira] [Commented] (HIVE-3085) make parallel tests work


[ 
https://issues.apache.org/jira/browse/HIVE-3085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293289#comment-13293289
 ] 

Shuai Ding commented on HIVE-3085:
--

https://reviews.facebook.net/D3585

 make parallel tests work
 

 Key: HIVE-3085
 URL: https://issues.apache.org/jira/browse/HIVE-3085
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Shuai Ding
 Attachments: hive.3085.1.patch, hive.3085.2.patch


 https://cwiki.apache.org/Hive/unit-test-parallel-execution.html
 I was trying to run the tests using the instructions above.
 I was able to run them using a single machine (parallelism of 4 in ~2 hours).
 The conf. file is as follows: .hive_ptest.conf
 {
   "qfile_hosts": [
     ["root@MC", 4]
   ],
   "other_hosts": [
     ["root@MC", 1]
   ],
   "master_base_path": "/data/users/tmp",
   "host_base_path": "/data/users/hivetests",
   "java_home": "/usr/local/jdk-6u24-64"
 }


[jira] [Updated] (HIVE-3112) clear hive.metastore.partition.inherit.table.properties till HIVE-3109 is fixed