[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6345:
---

Status: Open  (was: Patch Available)

The current .4 patch has a merge mistake and introduces regressions. Will 
update shortly.

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch






--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6345:
---

Attachment: HIVE-6345.5.patch

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6345:
---

Status: Patch Available  (was: Open)

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








[jira] [Commented] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905108#comment-13905108
 ] 

Remus Rusanu commented on HIVE-6345:


Patch .5 fixes the regression cases

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








[jira] [Commented] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905374#comment-13905374
 ] 

Hive QA commented on HIVE-6433:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629687/HIVE-6433.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5106 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_truncate_column_buckets
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1407/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1407/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629687

 SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
 

 Key: HIVE-6433
 URL: https://issues.apache.org/jira/browse/HIVE-6433
 Project: Hive
  Issue Type: Sub-task
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6433.patch


 Follow up jira for HIVE-5952.
 If a user/role has admin option on a role, then the user should be able to 
 grant/revoke other users to/from the role.





[jira] [Commented] (HIVE-6456) Implement Parquet schema evolution

2014-02-19 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905454#comment-13905454
 ] 

Brock Noland commented on HIVE-6456:


[~jcoffey]

Yes let's do that since I don't know what a unit test would look like and it 
will give you some time to work on it.

BTW tests pass but JIRA was down when it tried posting the comment:

http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1392/execution.txt

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629593/HIVE-6456.patch

{color:green}SUCCESS:{color} +1 5133 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1392/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1392/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

 Implement Parquet schema evolution
 --

 Key: HIVE-6456
 URL: https://issues.apache.org/jira/browse/HIVE-6456
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Trivial
 Attachments: HIVE-6456.patch


 In HIVE-5783 we removed schema evolution:
 https://github.com/Parquet/parquet-mr/pull/297/files#r9824155





[jira] [Commented] (HIVE-6456) Implement Parquet schema evolution

2014-02-19 Thread Justin Coffey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905450#comment-13905450
 ] 

Justin Coffey commented on HIVE-6456:
-

Brock and I had the same thought offline.  Not sure what the protocol is here: 
should I open a separate ticket?

 Implement Parquet schema evolution
 --

 Key: HIVE-6456
 URL: https://issues.apache.org/jira/browse/HIVE-6456
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Trivial
 Attachments: HIVE-6456.patch


 In HIVE-5783 we removed schema evolution:
 https://github.com/Parquet/parquet-mr/pull/297/files#r9824155





JIRA was down last night so some precommits did not run

2014-02-19 Thread Brock Noland
EOM


[jira] [Created] (HIVE-6463) unit test for evolving schema in parquet files

2014-02-19 Thread Justin Coffey (JIRA)
Justin Coffey created HIVE-6463:
---

 Summary: unit test for evolving schema in parquet files
 Key: HIVE-6463
 URL: https://issues.apache.org/jira/browse/HIVE-6463
 Project: Hive
  Issue Type: Test
Reporter: Justin Coffey
Assignee: Justin Coffey


Unit test(s) for patch found in #HIVE-6456





[jira] [Commented] (HIVE-6456) Implement Parquet schema evolution

2014-02-19 Thread Justin Coffey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905459#comment-13905459
 ] 

Justin Coffey commented on HIVE-6456:
-

done and linked.

 Implement Parquet schema evolution
 --

 Key: HIVE-6456
 URL: https://issues.apache.org/jira/browse/HIVE-6456
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
Priority: Trivial
 Attachments: HIVE-6456.patch


 In HIVE-5783 we removed schema evolution:
 https://github.com/Parquet/parquet-mr/pull/297/files#r9824155





[jira] [Resolved] (HIVE-4413) Parse Exception : character '@' not supported while granting privileges to user in a Secure Cluster through hive client.

2014-02-19 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HIVE-4413.
---

Resolution: Duplicate

HIVE-3807 should resolve this (the specific need of @ in secure clusters)

 Parse Exception : character '@' not supported while granting privileges to 
 user in a Secure Cluster through hive client.
 

 Key: HIVE-4413
 URL: https://issues.apache.org/jira/browse/HIVE-4413
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.10.0
Reporter: Navin Madathil
  Labels: cli, hive

 While running through the Hive CLI, the hive grant command throws a ParseException: 
 '@' not supported. But in a secure cluster (Kerberos) the username is 
 appended with the realm name, separated by the character '@'. Without giving the 
 full username the permissions are not granted to the intended user.
 grant all on table tablename to user user@REALM





[jira] [Commented] (HIVE-6461) Run Release Audit tool, fix missing license issues

2014-02-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905472#comment-13905472
 ] 

Hive QA commented on HIVE-6461:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629703/HIVE-6461.1.patch

{color:green}SUCCESS:{color} +1 5133 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1409/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1409/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629703

 Run Release Audit tool, fix missing license issues
 --

 Key: HIVE-6461
 URL: https://issues.apache.org/jira/browse/HIVE-6461
 Project: Hive
  Issue Type: Bug
Reporter: Harish Butani
Assignee: Harish Butani
Priority: Trivial
 Attachments: HIVE-6461.1.patch


 run mvn apache-rat:check and add apache license in flagged files.





[jira] [Commented] (HIVE-5275) HiveServer2 should respect hive.aux.jars.path property and add aux jars to distributed cache

2014-02-19 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905490#comment-13905490
 ] 

Brock Noland commented on HIVE-5275:


Is this true? I have not observed this myself.

 HiveServer2 should respect hive.aux.jars.path property and add aux jars to 
 distributed cache
 

 Key: HIVE-5275
 URL: https://issues.apache.org/jira/browse/HIVE-5275
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Alex Favaro

 HiveServer2 currently ignores the hive.aux.jars.path property in 
 hive-site.xml. That means that the only way to use a custom SerDe is to add 
 it to AUX_CLASSPATH on the server and manually distribute the jar to the 
 cluster nodes. Hive CLI does this automatically when hive.aux.jars.path is 
 set. It would be nice if HiveServer2 did the same.





[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6416:
---

Status: Open  (was: Patch Available)

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch


 Vectorized mathematical functions for decimal type.





[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6416:
---

Attachment: HIVE-6416.3.patch

The updated patch fixes the review comments and addresses the test failure.

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch


 Vectorized mathematical functions for decimal type.





[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6416:
---

Status: Patch Available  (was: Open)

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch


 Vectorized mathematical functions for decimal type.





[jira] [Commented] (HIVE-6359) beeline -f fails on scripts with tabs in them.

2014-02-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905529#comment-13905529
 ] 

Xuefu Zhang commented on HIVE-6359:
---

[~navis] I just realized it's a simple change, but thanks for the review link.

+1

 beeline -f fails on scripts with tabs in them.
 --

 Key: HIVE-6359
 URL: https://issues.apache.org/jira/browse/HIVE-6359
 Project: Hive
  Issue Type: Bug
Reporter: Carter Shanklin
Assignee: Navis
Priority: Minor
 Attachments: HIVE-6359.1.patch.txt, HIVE-6359.2.patch.txt


 NO PRECOMMIT TESTS
 On a recent trunk build I used beeline -f on a script with tabs in it.
 Beeline rather unhelpfully attempts to perform tab expansion on the tabs and 
 the query fails. Here's a screendump.
 {code}
 Connecting to jdbc:hive2://mymachine:1/mydb
 Connected to: Apache Hive (version 0.13.0-SNAPSHOT)
 Driver: Hive JDBC (version 0.13.0-SNAPSHOT)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 Beeline version 0.13.0-SNAPSHOT by Apache Hive
 0: jdbc:hive2://mymachine:1/mydb select  i_brand_id as brand_id, i_brand 
 as brand,
 . . . . . . . . . . . . . . . . . . . . . . .  
 Display all 560 possibilities? (y or n) 
 . . . . . . . . . . . . . . . . . . . . . . .  ager_id=36
 . . . . . . . . . . . . . . . . . . . . . . .  
 Display all 560 possibilities? (y or n) 
 . . . . . . . . . . . . . . . . . . . . . . .  d d_moy=12
 . . . . . . . . . . . . . . . . . . . . . . .  
 Display all 560 possibilities? (y or n) 
 . . . . . . . . . . . . . . . . . . . . . . .  d d_year=2001
 . . . . . . . . . . . . . . . . . . . . . . . and ss_sold_date 
 between '2001-12-01' and '2001-12-31'
 . . . . . . . . . . . . . . . . . . . . . . .  group by i_brand, i_brand_id
 . . . . . . . . . . . . . . . . . . . . . . .  order by ext_price desc, 
 brand_id
 . . . . . . . . . . . . . . . . . . . . . . . limit 100 ;
 Error: Error while compiling statement: FAILED: ParseException line 1:65 
 missing FROM at 'd_moy' near 'd' in from source (state=42000,code=4)
 Closing: org.apache.hive.jdbc.HiveConnection
 {code}
 The same query works fine if I replace tabs with some spaces.
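
 As a stopgap, the tab replacement described above can be scripted before the
 file is handed to beeline -f. This is only a minimal sketch: the function name
 replace_tabs and the 4-space width are assumptions for illustration and are
 not part of Beeline.

```python
def replace_tabs(script_text: str, width: int = 4) -> str:
    """Replace every tab character with `width` spaces."""
    return script_text.replace("\t", " " * width)


# Hypothetical fragment of a query script that uses tabs for indentation.
query = "select\ti_brand_id as brand_id,\n\ti_brand as brand"
cleaned = replace_tabs(query)
print(cleaned)
```

 The cleaned text can then be written to a temporary file and passed to
 beeline -f in place of the original script.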





[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-02-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905581#comment-13905581
 ] 

Hive QA commented on HIVE-6375:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629690/HIVE-6375.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5134 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_hadoop20
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_ctas
{noformat}

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1410/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1410/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629690

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Attachments: HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272





[jira] [Updated] (HIVE-5958) SQL std auth - authorize statements that work with paths

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5958:


Attachment: HIVE-5958.7.patch

HIVE-5958.7.patch fixes the locking timeout issue in 
TestPermsGrp.testCustomPerms, TestJdbcWithMiniHS2.testURIDatabaseName, and 
TestHiveServer2.testConnection. These tests attempted to disable locking but 
were not doing it the right way; I have fixed that in the tests. When the db got 
added as an output of the create table command in this patch, the locking had an 
object to lock and tried to kick in. 
Other tests had failed because I didn't generate the patch with git diff -a, 
and some q.out files got treated as binary files.


 SQL std auth - authorize statements that work with paths
 

 Key: HIVE-5958
 URL: https://issues.apache.org/jira/browse/HIVE-5958
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5958.1.patch, HIVE-5958.2.patch, HIVE-5958.3.patch, 
 HIVE-5958.4.patch, HIVE-5958.5.patch, HIVE-5958.6.patch, HIVE-5958.7.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 Statements such as create table and alter table that specify a path URI should 
 be allowed under the new authorization scheme only if the URI (path) specified has 
 permissions including read/write and ownership of the file/dir and its 
 children.
 Also, fix the issue of the database not getting set as output for create-table.





[jira] [Created] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried

2014-02-19 Thread Thejas M Nair (JIRA)
Thejas M Nair created HIVE-6464:
---

 Summary: Test configuration: reduce the duration for which lock 
attempts are retried
 Key: HIVE-6464
 URL: https://issues.apache.org/jira/browse/HIVE-6464
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
 Attachments: HIVE-6464.1.patch

Lock attempts are retried for 60 seconds * 100 before giving up. Most 
tests attempt to disable locking but sometimes don't do it correctly, and 
changes can cause the locking to kick in. Locking fails (at least in the HS2 
related tests) because of problems in creating the zookeeper entries in test 
mode. When a locking attempt kicks in and fails, it can end up waiting for 
6000 seconds before failing.

As the tests are not trying to test parallel locking, there is no reason to 
wait this long in the tests. 
We should update hive-site.xml used by tests for smaller duration.
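
To make the arithmetic concrete, here is a small sketch of the worst-case wait
as the number of retries times the sleep interval. The function name and the
reduced values (2 retries, 1 second apart) are illustrative assumptions, not
taken from the patch. I believe the relevant settings are hive.lock.numretries
and hive.lock.sleep.between.retries, but treat those property names as an
assumption as well.

```python
def worst_case_lock_wait(num_retries: int, sleep_seconds: int) -> int:
    """Upper bound, in seconds, on time spent retrying a lock that never succeeds."""
    return num_retries * sleep_seconds


# Values from the description above: 100 retries, 60 seconds apart.
current = worst_case_lock_wait(100, 60)

# Hypothetical smaller values that a test hive-site.xml might use.
reduced = worst_case_lock_wait(2, 1)

print(current, reduced)
```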





[jira] [Assigned] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair reassigned HIVE-6464:
---

Assignee: Thejas M Nair

 Test configuration: reduce the duration for which lock attempts are retried
 ---

 Key: HIVE-6464
 URL: https://issues.apache.org/jira/browse/HIVE-6464
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6464.1.patch


 Lock attempts are retried for 60 seconds * 100 before giving up. Most 
 tests attempt to disable locking but sometimes don't do it correctly, and 
 changes can cause the locking to kick in. Locking fails (at least in the HS2 
 related tests) because of problems in creating the zookeeper entries in test 
 mode. When a locking attempt kicks in and fails, it can end up waiting for 
 6000 seconds before failing.
 As the tests are not trying to test parallel locking, there is no reason to 
 wait this long in the tests. 
 We should update hive-site.xml used by tests for smaller duration.





[jira] [Updated] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6464:


Attachment: HIVE-6464.1.patch

 Test configuration: reduce the duration for which lock attempts are retried
 ---

 Key: HIVE-6464
 URL: https://issues.apache.org/jira/browse/HIVE-6464
 Project: Hive
  Issue Type: Bug
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6464.1.patch


 Lock attempts are retried for 60 seconds * 100 before giving up. Most 
 tests attempt to disable locking but sometimes don't do it correctly, and 
 changes can cause the locking to kick in. Locking fails (at least in the HS2 
 related tests) because of problems in creating the zookeeper entries in test 
 mode. When a locking attempt kicks in and fails, it can end up waiting for 
 6000 seconds before failing.
 As the tests are not trying to test parallel locking, there is no reason to 
 wait this long in the tests. 
 We should update hive-site.xml used by tests for smaller duration.





Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18250/#review34869
---



ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
https://reviews.apache.org/r/18250/#comment65253

We need to pass the roleNames argument to this function and check that the user 
has admin option on these roles. For example, the role in grant-role could be 
role A while the current role is role B. The check is happening now on role B only.
What should we do if a user is a member with admin option of role Y because 
it belongs to role X and role X has admin option on Y?
Should we check that X is in the current role in that case? I guess so; 
that will make it consistent with the rest of the current role behavior.
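
A rough sketch of the check suggested above, assuming the role grants are
available as a simple in-memory mapping. Every name here (Grants,
has_admin_option, can_grant_roles) is hypothetical and is not the
SQLStdHiveAccessController API; the sketch only illustrates checking each role
named in the statement against the user and the user's current roles.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical model: principal name -> list of (role granted, with admin option?)
Grants = Dict[str, List[Tuple[str, bool]]]


def has_admin_option(principals: Set[str], target_role: str, grants: Grants) -> bool:
    """True if any of the given principals holds target_role WITH ADMIN OPTION."""
    return any(
        role == target_role and admin
        for principal in principals
        for role, admin in grants.get(principal, [])
    )


def can_grant_roles(user: str, current_roles: Iterable[str],
                    role_names: Iterable[str], grants: Grants) -> bool:
    """Check every role named in the grant/revoke statement, not the current role."""
    # Admin option may be held directly by the user or through a current role.
    principals = {user} | set(current_roles)
    return all(has_admin_option(principals, role, grants) for role in role_names)


# alice belongs to role X, and role X has admin option on role Y.
grants = {"alice": [("X", False)], "X": [("Y", True)]}
print(can_grant_roles("alice", {"X"}, ["Y"], grants))  # X is a current role
print(can_grant_roles("alice", {"B"}, ["Y"], grants))  # X is not a current role
```

In this sketch the admin option held through role X only counts while X is one
of the current roles, which matches the behavior proposed in the last question.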



ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
https://reviews.apache.org/r/18250/#comment65252

ADMIN_ONLY_MSG is not the right message with this change. For the 
grant/revoke roles statements, we should change it to : ADMIN_ONLY_MSG + 
HAS_ADMIN_PRIV_MSG


- Thejas Nair


On Feb. 19, 2014, 3:31 a.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18250/
 ---
 
 (Updated Feb. 19, 2014, 3:31 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6433
 https://issues.apache.org/jira/browse/HIVE-6433
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
  c1afaee 
   ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION 
   ql/src/test/results/clientpositive/authorization_role_grant2.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18250/diff/
 
 
 Testing
 ---
 
 Added new test
 
 
 Thanks,
 
 Ashutosh Chauhan
 




[jira] [Commented] (HIVE-5926) Load Data OverWrite Into Table Throw org.apache.hadoop.hive.ql.metadata.HiveException

2014-02-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905683#comment-13905683
 ] 

Xuefu Zhang commented on HIVE-5926:
---

[~tianyi] I'm not sure if you're still working on this issue, but would you 
like to move this forward? Thanks.

 Load Data OverWrite Into Table Throw 
 org.apache.hadoop.hive.ql.metadata.HiveException
 -

 Key: HIVE-5926
 URL: https://issues.apache.org/jira/browse/HIVE-5926
 Project: Hive
  Issue Type: Bug
  Components: Database/Schema
Affects Versions: 0.12.0
 Environment: OS: Red Hat Enterprise Linux Server release 6.2
 HDFS: CDH-4.2.1
 MAPRED: CDH-4.2.1-mr1
Reporter: Yi Tian
Assignee: Yi Tian
 Attachments: HIVE-5926.patch


 step1: create table 
 step2: load data 
 load data inpath '/tianyi/usys_etl_map_total.del' overwrite into table 
 tianyi_test3
 step3: copy file back
 hadoop fs -cp /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del /tianyi
 step4: load data again
 load data inpath '/tianyi/usys_etl_map_total.del' overwrite into table 
 tianyi_test3
 here we can see the error in console:
 Failed with exception Error moving: 
 hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: 
 /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MoveTask
 we can find error detail in hive.log:
 2013-12-03 17:26:41,717 ERROR exec.Task (SessionState.java:printError(419)) - 
 Failed with exception Error moving: 
 hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: 
 /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
 org.apache.hadoop.hive.ql.metadata.HiveException: Error moving: 
 hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: 
 /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
   at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2323)
   at org.apache.hadoop.hive.ql.metadata.Table.replaceFiles(Table.java:639)
   at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1441)
   at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:283)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
   at 
 org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
 Caused by: java.io.IOException: Error moving: 
 hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: 
 /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
   at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2317)
   ... 20 more
 2013-12-03 17:26:41,718 ERROR ql.Driver (SessionState.java:printError(419)) - 
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MoveTask





[jira] [Commented] (HIVE-4501) HS2 memory leak - FileSystem objects in FileSystem.CACHE

2014-02-19 Thread Abin Shahab (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905687#comment-13905687
 ] 

Abin Shahab commented on HIVE-4501:
---

What is the progress on this issue?

 HS2 memory leak - FileSystem objects in FileSystem.CACHE
 

 Key: HIVE-4501
 URL: https://issues.apache.org/jira/browse/HIVE-4501
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Thejas M Nair
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-4501.1.patch, HIVE-4501.1.patch, HIVE-4501.1.patch, 
 HIVE-4501.trunk.patch


 org.apache.hadoop.fs.FileSystem objects are getting accumulated in 
 FileSystem.CACHE, with HS2 in unsecure mode.
 As a workaround, it is possible to set fs.hdfs.impl.disable.cache and 
 fs.file.impl.disable.cache to true.
 Users should not have to bother with this extra configuration. 
 As a workaround disable impersonation by setting hive.server2.enable.doAs to 
 false.





[jira] [Updated] (HIVE-6405) Support append feature for HCatalog

2014-02-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6405:
---

Status: Open  (was: Patch Available)

 Support append feature for HCatalog
 ---

 Key: HIVE-6405
 URL: https://issues.apache.org/jira/browse/HIVE-6405
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6405.patch


 HCatalog currently treats all tables as immutable - i.e. all tables and 
 partitions can be written to only once, and not appended. The nuances of what 
 this means are as follows:
  * A non-partitioned table can be written to, and data in it is never updated 
 from then on unless you drop and recreate.
  * A partitioned table may support appending of a sort, by 
 adding new partitions to the table, but once written, the partitions 
 themselves cannot have any new data added to them.
 Hive, on the other hand, does allow us to INSERT INTO a table, thus 
 allowing us append semantics. There is benefit to both of these models, and 
 so, our goal is as follows:
 a) Introduce a notion of an immutable table, wherein all tables are not 
 immutable by default, and have this be a table property. If this property is 
 set for a table, and we attempt to write to a table that already has data (or 
 a partition), disallow INSERT INTO into it from hive. This property being 
 set will allow hive to mimic HCatalog's current immutable-table property. 
 (I'm going to create a separate sub-task to cover this bit, and focus on the 
 HCatalog-side on this jira)
 b) As long as that flag is not set, HCatalog should be changed to allow 
 appends into it as well, and not simply error out if data already exists in a 
 table.





[jira] [Updated] (HIVE-6405) Support append feature for HCatalog

2014-02-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6405:
---

Status: Patch Available  (was: Open)

 Support append feature for HCatalog
 ---

 Key: HIVE-6405
 URL: https://issues.apache.org/jira/browse/HIVE-6405
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6405.patch




[jira] [Updated] (HIVE-6405) Support append feature for HCatalog

2014-02-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6405:
---

Attachment: HIVE-6405.patch

Attaching patch, this depends on HIVE-6406 being patched in.

 Support append feature for HCatalog
 ---

 Key: HIVE-6405
 URL: https://issues.apache.org/jira/browse/HIVE-6405
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6405.patch




[jira] [Updated] (HIVE-6405) Support append feature for HCatalog

2014-02-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6405:
---

Release Note: 
Introduces append feature for HCatalog writes.

Previously, if an unpartitioned table had data in it, or if a partition in a 
partitioned table had data in it, or if the partition even existed, HCat would 
fail if a user attempted to write to them. Now, that behaviour is extended so 
that the strict behaviour exists only if the table in question has a parameter 
immutable set to true (see HIVE-6406).

With this patch, we can append to existing partitions or non-partitioned tables 
that already have data in them, as long as the new data being written is 
compatible with the old data (i.e. one cannot mix file formats when attempting 
an append).

As a further note, append is currently not compatible with dynamic 
partitioning, and a dynamic partitioning job is still unable to append to a 
table, even if it is a mutable table.
  Status: Patch Available  (was: Open)
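
The write rule in the release note above can be sketched as a small predicate. This is only an illustration under stated assumptions, not HCatalog code: the class AppendRule, the method canWrite, and all parameter names are hypothetical, and the dynamic-partitioning restriction is modelled as a plain flag.

```java
// Hypothetical sketch of the write rule described in the release note.
// None of these names come from HCatalog; they exist only for illustration.
class AppendRule {
    /**
     * immutable           value of the table's "immutable" parameter
     * targetExists        target table has data, or target partition exists
     * sameFileFormat      new data uses the same file format as the old data
     * dynamicPartitioning the job uses dynamic partitioning
     */
    static boolean canWrite(boolean immutable, boolean targetExists,
                            boolean sameFileFormat, boolean dynamicPartitioning) {
        if (!targetExists) {
            return true;          // first write to an empty target is allowed
        }
        if (immutable) {
            return false;         // strict behaviour: no append
        }
        if (dynamicPartitioning) {
            return false;         // append is not supported with dynamic partitioning
        }
        return sameFileFormat;    // append only when the formats are compatible
    }

    public static void main(String[] args) {
        System.out.println(canWrite(false, true, true, false)); // mutable table, same format
        System.out.println(canWrite(true, true, true, false));  // immutable table with data
    }
}
```

Ordering the checks this way keeps the pre-existing behaviour (a first write always succeeds) separate from the new append path.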

 Support append feature for HCatalog
 ---

 Key: HIVE-6405
 URL: https://issues.apache.org/jira/browse/HIVE-6405
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6405.patch




[jira] [Commented] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905701#comment-13905701
 ] 

Hive QA commented on HIVE-6345:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629743/HIVE-6345.5.patch

{color:green}SUCCESS:{color} +1 5147 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1411/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1411/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629743

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6345:
---

  Resolution: Fixed
Release Note: Committed to trunk. Thanks to Remus!!
  Status: Resolved  (was: Patch Available)

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6345:
---

Fix Version/s: 0.13.0

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








[jira] [Commented] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905750#comment-13905750
 ] 

Jitendra Nath Pandey commented on HIVE-6345:


Committed to trunk. Thanks to Remus!!

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6345:
---

Release Note:   (was: Committed to trunk. Thanks to Remus!!)

 Add DECIMAL support to vectorized JOIN operators
 

 Key: HIVE-6345
 URL: https://issues.apache.org/jira/browse/HIVE-6345
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Reporter: Remus Rusanu
Assignee: Remus Rusanu
  Labels: vectorization
 Fix For: 0.13.0

 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, 
 HIVE-6345.4.patch, HIVE-6345.5.patch








Re: Review Request 18254: HIVE-6375 Implement CTAS and column rename for parquet

2014-02-19 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18254/#review34879
---



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/18254/#comment65260

This line seems to be a dupe of line 5665.


- Xuefu Zhang


On Feb. 19, 2014, 12:42 a.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18254/
 ---
 
 (Updated Feb. 19, 2014, 12:42 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6375
 https://issues.apache.org/jira/browse/HIVE-6375
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 There is a Hive bug in SemanticAnalyzer that chooses different names for 
 columns in the CreateTable task and the FileSink task.  
 columnInfo.getInternalName() was used in one place, and fieldSchema still 
 used columnInfo.getAlias() if it is available.  This change makes both 
 consistent, favoring columnInfo.getAlias if it is available.
 
 This was not revealed before because other file formats like RCFile seem to 
 use column-ordinal position, and the Avro file format stores the schema 
 separately altogether.
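
 The naming rule described above (use the alias when one is available, otherwise fall back to the internal name) can be sketched as follows. This is a standalone illustration, not the SemanticAnalyzer change itself: ColumnNames.chooseName is a hypothetical helper standing in for the calls to columnInfo.getAlias() and columnInfo.getInternalName(), and treating "available" as "not null" is an assumption.

```java
// Hypothetical helper, not part of Hive. It only illustrates applying the
// same alias-first rule wherever a column name has to be chosen.
class ColumnNames {
    static String chooseName(String alias, String internalName) {
        // Assumption: an alias counts as "available" when it is not null.
        return alias != null ? alias : internalName;
    }

    public static void main(String[] args) {
        System.out.println(chooseName("id", "_col0"));  // alias present
        System.out.println(chooseName(null, "_col0"));  // no alias, fall back
    }
}
```

 Calling the same helper from both places is what keeps the two tasks from disagreeing on a column's name.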
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd 
   ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION 
   ql/src/test/results/clientpositive/ctas.q.out 9668855 
   ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18254/diff/
 
 
 Testing
 ---
 
 Added parquet_ctas.q.  Covers cases where the column name is taken directly 
 from the input table (implied alias), where the name is auto-generated, where 
 the name is specified as an alias, and a mix of the three.
 
 
 Thanks,
 
 Szehon Ho
 




[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-02-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905769#comment-13905769
 ] 

Xuefu Zhang commented on HIVE-6375:
---

Patch looks good. Minor comment on RB. The test diff needs to be fixed.

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Attachments: HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272





Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Ashutosh Chauhan


 On Feb. 19, 2014, 4:31 p.m., Thejas Nair wrote:
  ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java,
   line 278
  https://reviews.apache.org/r/18250/diff/2/?file=497456#file497456line278
 
  We need to pass the roleNames argument to this function and check that 
  user has admin option on these roles. For example the role in grant-role 
  could be role A while current role is role B. The check is happening now on 
  role B only.
  What should we do if a user is a member, with admin option, of role Y 
  because it belongs to role X and role X has admin option on Y?
  Should we check that X is in the current role in that case? I guess so, 
  that will make it consistent with the rest of the current role behavior.

Let's say user X has the admin option on role A. User X now wants to grant role 
A to user B. IMO, user X's current role should be A. He shouldn't be allowed to 
grant role A to user B if his current role is C. That is what is currently 
implemented. It seems you are suggesting that user X should be allowed to grant 
role A to user B even if his current role is C. To me, this seems 
counterintuitive. I'm not sure what the standard says here.
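
The two behaviours being compared can be written as predicates. This is a hypothetical sketch, not code from SQLStdHiveAccessController: the class GrantCheck, both method names, and the set-based representation of admin options are assumptions, and relaxedCheck reflects the suggestion only as it is read in this reply.

```java
import java.util.Set;

// Hypothetical illustration of the two policies discussed in this thread.
class GrantCheck {
    // Behaviour described as currently implemented: the role being granted
    // must be the user's current role, and the user must hold ADMIN OPTION on it.
    static boolean strictCheck(String currentRole, String roleToGrant,
                               Set<String> rolesWithAdminOption) {
        return currentRole.equals(roleToGrant)
                && rolesWithAdminOption.contains(roleToGrant);
    }

    // Alternative as read in this reply: ADMIN OPTION on the granted role is
    // enough, whatever the current role is.
    static boolean relaxedCheck(String roleToGrant, Set<String> rolesWithAdminOption) {
        return rolesWithAdminOption.contains(roleToGrant);
    }
}
```

With admin option on role A and current role C, strictCheck rejects a grant of A while relaxedCheck accepts it, which is exactly the case under discussion.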


- Ashutosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18250/#review34869
---


On Feb. 19, 2014, 3:31 a.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18250/
 ---
 
 (Updated Feb. 19, 2014, 3:31 a.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6433
 https://issues.apache.org/jira/browse/HIVE-6433
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
  c1afaee 
   ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION 
   ql/src/test/results/clientpositive/authorization_role_grant2.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18250/diff/
 
 
 Testing
 ---
 
 Added new test
 
 
 Thanks,
 
 Ashutosh Chauhan
 




[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6416:
---

Status: Open  (was: Patch Available)

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch


 Vectorized mathematical functions for decimal type.





[jira] [Commented] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf

2014-02-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905808#comment-13905808
 ] 

Hive QA commented on HIVE-6459:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629653/HIVE-6459.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1415/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1415/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1415/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'shims/aggregator/pom.xml'
Reverted 'packaging/src/main/assembly/bin.xml'
Reverted 'conf/hive-default.xml.template'
Reverted 'bin/hive'
Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java'
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java'
Reverted 'ql/pom.xml'
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf target datanucleus.log ant/target shims/target shims/0.20/target 
shims/0.20S/target shims/0.23/target shims/aggregator/target 
shims/common/target shims/common-secure/target packaging/target 
hbase-handler/target testutils/target jdbc/target metastore/target 
itests/target itests/hcatalog-unit/target itests/test-serde/target 
itests/qtest/target itests/hive-unit/target itests/custom-serde/target 
itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target 
hcatalog/server-extensions/target hcatalog/core/target 
hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target 
hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen 
service/target contrib/target serde/target beeline/target odbc/target 
cli/target ql/dependency-reduced-pom.xml ql/target 
ql/src/java/org/apache/hadoop/hive/ql/exec/HiveAuxClasspathBuilder.java 
ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JarCache.java
+ svn update
U    common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java
U    common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java
A    common/src/java/org/apache/hive/common/util/Decimal128FastBuffer.java
A    serde/src/test/org/apache/hadoop/hive/serde2/io/TestHiveDecimalWritable.java
U    serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java
U    ql/src/gen/vectorization/UDAFTemplates/VectorUDAFSum.txt
A    ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt
A    ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxDecimal.txt
U    ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVar.txt
U    ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMax.txt
U    ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvg.txt
U    ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxString.txt
A    ql/src/test/queries/clientpositive/vector_decimal_mapjoin.q
A    ql/src/test/queries/clientpositive/vector_decimal_aggregate.q
A    ql/src/test/results/clientpositive/vector_decimal_mapjoin.q.out
A    ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out
U    ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorGroupByOperator.java
U    ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorRowBatchFromObjectIterables.java
U    ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
U    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java
U    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java
U    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java
U    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java
U    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java
U    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
U    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java
A    ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFSumDecimal.java
A

[jira] [Updated] (HIVE-6330) Metastore support for permanent UDFs

2014-02-19 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-6330:
-

Attachment: HIVE-6330.8.patch

resubmitting patch to run unit tests

 Metastore support for permanent UDFs
 

 Key: HIVE-6330
 URL: https://issues.apache.org/jira/browse/HIVE-6330
 Project: Hive
  Issue Type: Sub-task
  Components: UDF
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-6330.1.patch, HIVE-6330.2.patch, HIVE-6330.3.patch, 
 HIVE-6330.4.patch, HIVE-6330.5.patch, HIVE-6330.6.patch, HIVE-6330.7.patch, 
 HIVE-6330.8.patch


 Allow CREATE FUNCTION to add metastore entry for the created function, so 
 that it only needs to be added to Hive once.





[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6416:
---

Attachment: HIVE-6416.4.patch

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch, 
 HIVE-6416.4.patch


 Vectorized mathematical functions for decimal type.





[jira] [Commented] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905862#comment-13905862
 ] 

Jitendra Nath Pandey commented on HIVE-6416:


Patch re-based against the latest trunk.

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch, 
 HIVE-6416.4.patch


 Vectorized mathematical functions for decimal type.





[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.

2014-02-19 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HIVE-6416:
---

Status: Patch Available  (was: Open)

 Vectorized mathematical functions for decimal type.
 ---

 Key: HIVE-6416
 URL: https://issues.apache.org/jira/browse/HIVE-6416
 Project: Hive
  Issue Type: Sub-task
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey
 Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch, 
 HIVE-6416.4.patch


 Vectorized mathematical functions for decimal type.





[jira] [Commented] (HIVE-5994) ORC RLEv2 encodes wrongly for large negative BIGINTs (64 bits )

2014-02-19 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905871#comment-13905871
 ] 

Prasanth J commented on HIVE-5994:
--

Puneeth,

This issue can happen with large positive values as well. The reason is that 
when the number of repetitions of a large number is >= 3 and <= 10, 
SHORT_REPEAT encoding is used. 
https://github.com/apache/hive/blob/branch-0.12/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RunLengthIntegerWriterV2.java#L35

This encoding zigzag encodes the repeating value. So in your case when 
470327563395383L is zigzag encoded, the MSB bit (64th) is set which will be 
considered as a negative value according to this bug. 

I tested your test case with trunk and it works fine. Applying the patch 
attached in this JIRA should also work.
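
The effect described above can be reproduced with a small standalone sketch. It is an illustration only: ZigZagDemo and its methods are hypothetical and are not Hive's SerializationUtils, and signedBitCount is merely an assumed shape of a bit count that treats its argument as signed.

```java
// Hypothetical illustration; these helpers are not Hive's SerializationUtils.
class ZigZagDemo {
    // Standard zigzag mapping of a signed 64-bit value:
    // 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    static long zigzagEncode(long v) {
        return (v << 1) ^ (v >> 63);
    }

    // Assumed shape of the bug: a bit count using a signed comparison.
    // It returns 0 when the top bit is set, because the value looks negative.
    static int signedBitCount(long value) {
        int count = 0;
        while (value > 0) {
            count++;
            value >>>= 1;
        }
        return count;
    }

    // Bit count that treats the value as an unsigned 64-bit quantity.
    static int unsignedBitCount(long value) {
        return 64 - Long.numberOfLeadingZeros(value);
    }

    public static void main(String[] args) {
        long encoded = zigzagEncode(Long.MIN_VALUE); // all 64 bits set
        System.out.println(signedBitCount(encoded));   // 0
        System.out.println(unsignedBitCount(encoded)); // 64
    }
}
```

In this sketch a positive input of 2^62 or more also produces an encoded value with the 64th bit set, which matches the remark that large positive values are affected too.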

 ORC RLEv2 encodes wrongly for large negative BIGINTs  (64 bits )
 

 Key: HIVE-5994
 URL: https://issues.apache.org/jira/browse/HIVE-5994
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Fix For: 0.13.0

 Attachments: HIVE-5994.1.patch


 For large negative BIGINTs, zigzag encoding will yield a large value (64-bit 
 value) with the MSB set to 1. This value is interpreted as a negative value in 
 the SerializationUtils.findClosestNumBits(long value) function. This resulted 
 in a wrong computation of the total number of bits required, which results in 
 wrong encoding/decoding of values.





[jira] [Updated] (HIVE-6344) Add DECIMAL support to vectorized group by operator

2014-02-19 Thread Remus Rusanu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Remus Rusanu updated HIVE-6344:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

The patch for HIVE-6345 contains the fix for HIVE-6344 too, as much of the code 
was common.

 Add DECIMAL support to vectorized group by operator
 ---

 Key: HIVE-6344
 URL: https://issues.apache.org/jira/browse/HIVE-6344
 Project: Hive
  Issue Type: Sub-task
Reporter: Remus Rusanu
Assignee: Remus Rusanu
 Fix For: 0.13.0

 Attachments: HIVE-6344.1.patch








[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization

2014-02-19 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-6455:
-

Attachment: HIVE-6455.4.patch

Reuploading patch as HIVE QA did not run yesterday.

 Scalable dynamic partitioning and bucketing optimization
 

 Key: HIVE-6455
 URL: https://issues.apache.org/jira/browse/HIVE-6455
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: optimization
 Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, 
 HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch


 The current implementation of dynamic partitioning works by keeping at least 
 one record writer open per dynamic partition directory. In the case of 
 bucketing there can be multispray file writers, which further add to the 
 number of open record writers. The record writers of column-oriented file 
 formats (like ORC, RCFile etc.) keep some sort of in-memory buffers (value 
 buffers or compression buffers) open all the time to buffer up the rows and 
 compress them before flushing them to disk. Since these buffers are maintained 
 on a per-column basis, the amount of constant memory that will be required at 
 runtime increases as the number of partitions and the number of columns per 
 partition increase. This often leads to OutOfMemory (OOM) exceptions in 
 mappers or reducers, depending on the number of open record writers. Users 
 often tune the JVM heap size (runtime memory) to get over such OOM issues. 
 With this optimization, the dynamic partition columns and bucketing columns 
 (in case of bucketed tables) are sorted before being fed to the reducers. 
 Since the partitioning and bucketing columns are sorted, each reducer can 
 keep only one record writer open at any time, thereby reducing the memory 
 pressure on the reducers. This optimization scales well as the number of 
 partitions and the number of columns per partition increase, at the cost of 
 sorting the columns.
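
 The memory argument above can be illustrated with a toy simulation that counts how many writers are open at once. This is a hypothetical sketch, not Hive's file sink code: WriterCount and its methods are invented names, and a "writer" is represented only by its partition key.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation; a "writer" is represented only by its partition key.
class WriterCount {
    // Unsorted input: a writer is opened for each new key and none can be
    // closed early, because the same key may show up again later.
    static int peakOpenWritersUnsorted(List<String> partitionKeys) {
        Set<String> open = new HashSet<>();
        for (String key : partitionKeys) {
            open.add(key);
        }
        return open.size();
    }

    // Sorted input: once the key changes, the previous key never comes back,
    // so its writer can be closed before the next one is opened.
    static int peakOpenWritersSorted(List<String> sortedKeys) {
        Set<String> open = new HashSet<>();
        String current = null;
        int peak = 0;
        for (String key : sortedKeys) {
            if (!key.equals(current)) {
                if (current != null) {
                    open.remove(current); // close the previous writer
                }
                open.add(key);            // open the writer for the new key
                current = key;
            }
            peak = Math.max(peak, open.size());
        }
        return peak;
    }
}
```

 For keys b, a, c, a, b the unsorted count is 3, while the same keys sorted (a, a, b, b, c) never need more than one open writer.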





[jira] [Commented] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf

2014-02-19 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905905#comment-13905905
 ] 

Remus Rusanu commented on HIVE-6459:


HIVE-6345 just got in, which adds decimal support for vectorized 
aggregates, including AVG. It is probably going to conflict with your patch, as 
vectorized AVG must match the intermediate sum (p,s). If necessary, I will look 
at your patch tomorrow (I'm on UTC+2) and see how it needs to account for the 
vectorized aggregate code (it should be a minor change).

 Change the precison/scale for intermediate sum result in the avg() udf 
 ---

 Key: HIVE-6459
 URL: https://issues.apache.org/jira/browse/HIVE-6459
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-6459.patch


 The avg() udf, when applied to a decimal column, selects the precision/scale 
 of the intermediate sum field as (p+4, s+4), which is the same as the 
 precision/scale of the avg() result. However, the additional scale increase 
 is unnecessary, and data overflow may occur. The requested change is that 
 for the intermediate sum result, the precision/scale is set to (p+10, s), 
 which is consistent with the sum() udf. The avg() result still keeps its 
 precision/scale.
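
A small worked sketch of the arithmetic behind this request, under the assumption that the number of digits available to the left of the decimal point is precision minus scale. The helper names are hypothetical, and any cap on maximum precision is deliberately ignored here.

```python
def old_intermediate_sum_type(p, s):
    """Current behaviour per the description: (p+4, s+4)."""
    return p + 4, s + 4


def new_intermediate_sum_type(p, s):
    """Requested behaviour per the description: (p+10, s)."""
    return p + 10, s


def integer_digits(precision, scale):
    """Digits left of the decimal point that the type can hold."""
    return precision - scale
```

For a decimal(10, 2) column, the old rule gives (14, 6), which still has only 8 integer digits, the same as the input column, so a running sum has no headroom. The requested rule gives (20, 2), which has 18 integer digits.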





[jira] [Commented] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf

2014-02-19 Thread Remus Rusanu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905907#comment-13905907
 ] 

Remus Rusanu commented on HIVE-6459:


thanks for fixing this btw.

 Change the precison/scale for intermediate sum result in the avg() udf 
 ---

 Key: HIVE-6459
 URL: https://issues.apache.org/jira/browse/HIVE-6459
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-6459.patch


 The avg() udf, when applied to a decimal column, selects the precision/scale 
 of the intermediate sum field as (p+4, s+4), which is the same as the 
 precision/scale of the avg() result. However, the additional scale increase 
 is unnecessary, and data overflow may occur. The requested change is that 
 for the intermediate sum result, the precision/scale is set to (p+10, s), 
 which is consistent with the sum() udf. The avg() result still keeps its 
 precision/scale.





[jira] [Updated] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf

2014-02-19 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-6459:
--

Attachment: HIVE-6459.1.patch

Patch #1 is rebased with latest trunk.

 Change the precison/scale for intermediate sum result in the avg() udf 
 ---

 Key: HIVE-6459
 URL: https://issues.apache.org/jira/browse/HIVE-6459
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.13.0
Reporter: Xuefu Zhang
Assignee: Xuefu Zhang
 Attachments: HIVE-6459.1.patch, HIVE-6459.patch


 The avg() udf, when applied to a decimal column, selects the precision/scale 
 of the intermediate sum field as (p+4, s+4), which is the same as the 
 precision/scale of the avg() result. However, the additional scale increase 
 is unnecessary, and data overflow may occur. The requested change is that 
 for the intermediate sum result, the precision/scale is set to (p+10, s), 
 which is consistent with the sum() udf. The avg() result still keeps its 
 precision/scale.





[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support

2014-02-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905932#comment-13905932
 ] 

Lefty Leverenz commented on HIVE-5317:
--

Off topic:  This ticket has 100 watchers.  Is that a record?

 Implement insert, update, and delete in Hive with full ACID support
 ---

 Key: HIVE-5317
 URL: https://issues.apache.org/jira/browse/HIVE-5317
 Project: Hive
  Issue Type: New Feature
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: InsertUpdatesinHive.pdf


 Many customers want to be able to insert, update and delete rows from Hive 
 tables with full ACID support. The use cases are varied, but the forms of 
 the queries that should be supported are:
 * INSERT INTO tbl SELECT …
 * INSERT INTO tbl VALUES ...
 * UPDATE tbl SET … WHERE …
 * DELETE FROM tbl WHERE …
 * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN 
 ...
 * SET TRANSACTION LEVEL …
 * BEGIN/END TRANSACTION
 Use Cases
 * Once an hour, a set of inserts and updates (up to 500k rows) for various 
 dimension tables (e.g. customer, inventory, stores) needs to be processed. The 
 dimension tables have primary keys and are typically bucketed and sorted on 
 those keys.
 * Once a day, a small set (up to 100k rows) of records needs to be deleted for 
 regulatory compliance.
 * Once an hour, a log of transactions is exported from an RDBMS and the fact 
 tables need to be updated (up to 1m rows) to reflect the new data. The 
 transactions are a combination of inserts, updates, and deletes. The table is 
 partitioned and bucketed.





[jira] [Commented] (HIVE-860) Persistent distributed cache

2014-02-19 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905960#comment-13905960
 ] 

Brock Noland commented on HIVE-860:
---

Hmm. I will take a look.

 Persistent distributed cache
 

 Key: HIVE-860
 URL: https://issues.apache.org/jira/browse/HIVE-860
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.12.0
Reporter: Zheng Shao
Assignee: Brock Noland
 Fix For: 0.13.0

 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
 HIVE-860.patch, HIVE-860.patch, HIVE-860.patch


 DistributedCache is shared across multiple jobs if the HDFS file name is the 
 same.
 We need to make sure Hive puts the same file into the same location every time 
 and does not overwrite it if the file content is the same.
 We can achieve 2 different results:
 A1. Files added with the same name, timestamp, and md5 in the same session 
 will have a single copy in distributed cache.
 A2. Files added with the same name, timestamp, and md5 will have a single 
 copy in distributed cache.
 A2 has a bigger benefit in sharing but may raise a question on when Hive 
 should clean it up in HDFS.
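
The difference between A1 and A2 comes down to whether the session is part of the deduplication key. A minimal sketch, assuming an in-memory index; `cache_key`, `DistributedCacheIndex` and the example locations are hypothetical names for illustration and not part of Hive or Hadoop.

```python
def cache_key(name, timestamp, md5, session_id=None):
    """Build a dedup key; pass session_id for A1, omit it for A2."""
    base = (name, timestamp, md5)
    return base if session_id is None else (session_id,) + base


class DistributedCacheIndex:
    """Maps a dedup key to the single cached copy for that key."""

    def __init__(self):
        self._copies = {}

    def add(self, key, location):
        # The first location registered for a key wins; later adds reuse it
        # instead of overwriting the existing copy.
        return self._copies.setdefault(key, location)
```

Under A2 two sessions that add the same file resolve to the same copy, which is where the extra sharing comes from. It is also why cleanup is harder: no single session owns the copy.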





[jira] [Updated] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6422:


Status: Patch Available  (was: Open)

 SQL std auth - revert change for view keyword in grant statement
 

 Key: HIVE-6422
 URL: https://issues.apache.org/jira/browse/HIVE-6422
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6422.1.patch


 The SQL standard does not support the VIEW keyword in the GRANT statement. 
 HIVE-6181, which was added as part of the SQL standard changes, needs to be 
 reverted.





[jira] [Updated] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6422:


Attachment: HIVE-6422.1.patch

 SQL std auth - revert change for view keyword in grant statement
 

 Key: HIVE-6422
 URL: https://issues.apache.org/jira/browse/HIVE-6422
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6422.1.patch


 The SQL standard does not support the VIEW keyword in the GRANT statement. 
 HIVE-6181, which was added as part of the SQL standard changes, needs to be 
 reverted.





[jira] [Commented] (HIVE-860) Persistent distributed cache

2014-02-19 Thread Dmitriy V. Ryaboy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905992#comment-13905992
 ] 

Dmitriy V. Ryaboy commented on HIVE-860:


[~brocknoland] please note my and Aniket's comments in PIG-2672 -- the real 
solution for this is YARN-1492 (which doesn't help existing hadoop 
installations, granted.. but you should plan on using that for future hadoop 
versions, as it will be more core, have better sharing across multiple users 
and tools, etc).

 Persistent distributed cache
 

 Key: HIVE-860
 URL: https://issues.apache.org/jira/browse/HIVE-860
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.12.0
Reporter: Zheng Shao
Assignee: Brock Noland
 Fix For: 0.13.0

 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
 HIVE-860.patch, HIVE-860.patch, HIVE-860.patch


 DistributedCache is shared across multiple jobs if the HDFS file name is the 
 same.
 We need to make sure Hive puts the same file into the same location every time 
 and does not overwrite it if the file content is the same.
 We can achieve 2 different results:
 A1. Files added with the same name, timestamp, and md5 in the same session 
 will have a single copy in distributed cache.
 A2. Files added with the same name, timestamp, and md5 will have a single 
 copy in distributed cache.
 A2 has a bigger benefit in sharing but may raise a question on when Hive 
 should clean it up in HDFS.





[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf

2014-02-19 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906031#comment-13906031
 ] 

Ravi Prakash commented on HIVE-6037:


I'm new to this, so apologies in advance if I didn't do something right. 
Reverting the commit has not helped. I still see the same error as in 
https://issues.apache.org/jira/browse/HIVE-6037?focusedCommentId=13904349&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13904349
 . I checked out the commit before this 
(5893677435f165bee81d1c5be4300321f9bf47fb) and it built fine.

 Synchronize HiveConf with hive-default.xml.template and support show conf
 -

 Key: HIVE-6037
 URL: https://issues.apache.org/jira/browse/HIVE-6037
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Reporter: Navis
Assignee: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, 
 HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, 
 HIVE-6037.2.patch.txt, HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, 
 HIVE-6037.6.patch.txt, HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, 
 HIVE-6037.9.patch.txt, HIVE-6037.patch


 see HIVE-5879





[jira] [Commented] (HIVE-860) Persistent distributed cache

2014-02-19 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906032#comment-13906032
 ] 

Brock Noland commented on HIVE-860:
---

Thank you very much for bringing that up! Yes, I noted that earlier as well and 
should have mentioned it here.

 Persistent distributed cache
 

 Key: HIVE-860
 URL: https://issues.apache.org/jira/browse/HIVE-860
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.12.0
Reporter: Zheng Shao
Assignee: Brock Noland
 Fix For: 0.13.0

 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
 HIVE-860.patch, HIVE-860.patch, HIVE-860.patch


 DistributedCache is shared across multiple jobs if the HDFS file name is the 
 same.
 We need to make sure Hive puts the same file into the same location every time 
 and does not overwrite it if the file content is the same.
 We can achieve 2 different results:
 A1. Files added with the same name, timestamp, and md5 in the same session 
 will have a single copy in distributed cache.
 A2. Files added with the same name, timestamp, and md5 will have a single 
 copy in distributed cache.
 A2 has a bigger benefit in sharing but may raise a question on when Hive 
 should clean it up in HDFS.





[jira] [Updated] (HIVE-860) Persistent distributed cache

2014-02-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-860:
--

Attachment: HIVE-860.patch

This should fix it. The problem is that when ExecDriver is run as main(), the 
instance variable conf is null and has to be ignored in favor of job.

 Persistent distributed cache
 

 Key: HIVE-860
 URL: https://issues.apache.org/jira/browse/HIVE-860
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.12.0
Reporter: Zheng Shao
Assignee: Brock Noland
 Fix For: 0.13.0

 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, 
 HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch


 DistributedCache is shared across multiple jobs if the HDFS file name is the 
 same.
 We need to make sure Hive puts the same file into the same location every time 
 and does not overwrite it if the file content is the same.
 We can achieve 2 different results:
 A1. Files added with the same name, timestamp, and md5 in the same session 
 will have a single copy in distributed cache.
 A2. Files added with the same name, timestamp, and md5 will have a single 
 copy in distributed cache.
 A2 has a bigger benefit in sharing but may raise a question on when Hive 
 should clean it up in HDFS.





Re: Review Request 18200: HIVE-860 - Persistent distributed cache

2014-02-19 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18200/
---

(Updated Feb. 19, 2014, 8:35 p.m.)


Review request for hive.


Changes
---

Latest update.


Bugs: HIVE-860
https://issues.apache.org/jira/browse/HIVE-860


Repository: hive-git


Description
---

Caches auxiliary jars and remote runtime jars in /user/$user/.hiveJars by their 
sha1 hash. This results in:

1) faster queries
2) less distributed cache churn
3) a smaller/cleaner hive-exec jar
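
As a rough illustration of caching by content hash, the sketch below derives a cache location from a jar's sha1. Only the /user/$user/.hiveJars directory and the use of sha1 come from the description above. The function name and the `<sha1>.jar` file naming are assumptions made for the example and are not taken from the patch.

```python
import hashlib
import posixpath


def cached_jar_path(user, jar_bytes):
    """Return a cache path under /user/<user>/.hiveJars keyed by sha1.

    The "<sha1>.jar" naming is a hypothetical layout for illustration.
    """
    digest = hashlib.sha1(jar_bytes).hexdigest()
    return posixpath.join("/user", user, ".hiveJars", digest + ".jar")
```

Identical jar contents always map to the same path, so a jar that is already present does not need to be uploaded again. That is the source of the reduced distributed cache churn.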


Diffs (updated)
-

  bin/hive 3bd949f 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  conf/hive-default.xml.template 0d08aa2 
  packaging/src/main/assembly/bin.xml a97ef7d 
  ql/pom.xml 53d0b9e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HiveAuxClasspathBuilder.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 288da8e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JarCache.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 326654f 
  shims/aggregator/pom.xml 7aa8c4c 

Diff: https://reviews.apache.org/r/18200/diff/


Testing
---

Tested manually on a cluster.


Thanks,

Brock Noland



[jira] [Commented] (HIVE-6406) Introduce immutable-table table property and if set, disallow insert-into

2014-02-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906047#comment-13906047
 ] 

Ashutosh Chauhan commented on HIVE-6406:


+1 
Since this is a new protection mode, in addition to the existing ones (like 
NO_DROP, OFFLINE), it makes sense to have this new mode supported via syntax 
like the earlier ones. That's only syntactic sugar, which could be done in a 
follow-up.

 Introduce immutable-table table property and if set, disallow insert-into
 -

 Key: HIVE-6406
 URL: https://issues.apache.org/jira/browse/HIVE-6406
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6406.2.patch, HIVE-6406.3.patch, HIVE-6406.patch


 As part of HIVE-6405's attempt to make HCatalog and Hive behave in similar 
 ways with regard to immutable tables, this is a companion task to introduce 
 the notion of an immutable table, wherein tables are not immutable by 
 default, and immutability is a table property. If this property is set for a 
 table, and we attempt to write to a table (or a partition) that already has 
 data, disallow INSERT INTO into it from Hive (if the destination directory 
 is non-empty). Setting this property will allow Hive to mimic HCatalog's 
 current immutable-table property.
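
The rule in this description can be sketched as a single check. This is an illustration only, not the patch: the property key "immutable", the exception type and the function name are assumptions, and a list of file names stands in for the destination directory listing.

```python
class ImmutableTableError(Exception):
    """Raised when INSERT INTO targets a non-empty immutable table."""


def check_insert_into(table_properties, destination_files):
    """Allow INSERT INTO unless the table is immutable and already has data."""
    # Tables are not immutable by default, so a missing property means "false".
    immutable = table_properties.get("immutable", "false").lower() == "true"
    if immutable and destination_files:
        raise ImmutableTableError(
            "INSERT INTO is not allowed: table is immutable and non-empty"
        )
    return True
```

Note that the first write into an empty immutable table still succeeds. Only writes into a destination that already holds data are rejected.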





[jira] [Commented] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement

2014-02-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906050#comment-13906050
 ] 

Ashutosh Chauhan commented on HIVE-6422:


It will be good to retain the test case here.

 SQL std auth - revert change for view keyword in grant statement
 

 Key: HIVE-6422
 URL: https://issues.apache.org/jira/browse/HIVE-6422
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6422.1.patch


 The SQL standard does not support the VIEW keyword in the GRANT statement. 
 HIVE-6181, which was added as part of the SQL standard changes, needs to be 
 reverted.





[jira] [Commented] (HIVE-6405) Support append feature for HCatalog

2014-02-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906058#comment-13906058
 ] 

Sushanth Sowmyan commented on HIVE-6405:


After a brief discussion with Ashutosh on the nature of HIVE-6406, I'm going to 
create another task to add ql grammar to support modification of the 
immutability property in a manner similar to existing grammar for 
NO_DROP/OFFLINE, so that this can be treated as another kind of data 
protection, and so that users will not have to deal with explicitly modifying 
TBLPROPERTIES.

 Support append feature for HCatalog
 ---

 Key: HIVE-6405
 URL: https://issues.apache.org/jira/browse/HIVE-6405
 Project: Hive
  Issue Type: Bug
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6405.patch


 HCatalog currently treats all tables as immutable - i.e. all tables and 
 partitions can be written to only once, and not appended. The nuances of what 
 this means are as follows:
  * A non-partitioned table can be written to, and data in it is never updated 
 from then on unless you drop and recreate it.
  * A partitioned table may support appending of a sort, by adding new 
 partitions to the table, but once written, the partitions themselves cannot 
 have any new data added to them.
 Hive, on the other hand, does allow us to INSERT INTO a table, thus 
 allowing us append semantics. There is benefit to both of these models, and 
 so, our goal is as follows:
 a) Introduce a notion of an immutable table, wherein tables are not 
 immutable by default, and have this be a table property. If this property is 
 set for a table, and we attempt to write to a table (or a partition) that 
 already has data, disallow INSERT INTO into it from Hive. This property being 
 set will allow Hive to mimic HCatalog's current immutable-table property. 
 (I'm going to create a separate sub-task to cover this bit, and focus on the 
 HCatalog-side on this jira)
 b) As long as that flag is not set, HCatalog should be changed to allow 
 appends into it as well, and not simply error out if data already exists in a 
 table.





[jira] [Created] (HIVE-6465) Introduce ql grammar for immutability property

2014-02-19 Thread Sushanth Sowmyan (JIRA)
Sushanth Sowmyan created HIVE-6465:
--

 Summary: Introduce ql grammar for immutability property
 Key: HIVE-6465
 URL: https://issues.apache.org/jira/browse/HIVE-6465
 Project: Hive
  Issue Type: Sub-task
Reporter: Sushanth Sowmyan








[jira] [Updated] (HIVE-6465) Introduce ql grammar for immutability property

2014-02-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-6465:
---

Description: 
HIVE-6406 introduces a notion of an immutable table in Hive. In essence, it is 
a data protection feature similar to current Hive protections like OFFLINE and 
NO_DROP. Thus, rather than making people muck around with TBLPROPERTIES as its 
interface, we should have ql grammar for it.


 Introduce ql grammar for immutability property
 --

 Key: HIVE-6465
 URL: https://issues.apache.org/jira/browse/HIVE-6465
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan

 HIVE-6406 introduces a notion of an immutable table in Hive. In essence, it 
 is a data protection feature similar to current Hive protections like OFFLINE 
 and NO_DROP. Thus, rather than making people muck around with TBLPROPERTIES 
 as its interface, we should have ql grammar for it.





[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support

2014-02-19 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906059#comment-13906059
 ] 

Alan Gates commented on HIVE-5317:
--

MAPREDUCE-279, at 109, currently outscores us. There may be others, but it 
would be cool to have more watchers than Yarn. :)

 Implement insert, update, and delete in Hive with full ACID support
 ---

 Key: HIVE-5317
 URL: https://issues.apache.org/jira/browse/HIVE-5317
 Project: Hive
  Issue Type: New Feature
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Attachments: InsertUpdatesinHive.pdf


 Many customers want to be able to insert, update and delete rows from Hive 
 tables with full ACID support. The use cases are varied, but the forms of 
 the queries that should be supported are:
 * INSERT INTO tbl SELECT …
 * INSERT INTO tbl VALUES ...
 * UPDATE tbl SET … WHERE …
 * DELETE FROM tbl WHERE …
 * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN 
 ...
 * SET TRANSACTION LEVEL …
 * BEGIN/END TRANSACTION
 Use Cases
 * Once an hour, a set of inserts and updates (up to 500k rows) for various 
 dimension tables (e.g. customer, inventory, stores) needs to be processed. The 
 dimension tables have primary keys and are typically bucketed and sorted on 
 those keys.
 * Once a day, a small set (up to 100k rows) of records needs to be deleted for 
 regulatory compliance.
 * Once an hour, a log of transactions is exported from an RDBMS and the fact 
 tables need to be updated (up to 1m rows) to reflect the new data. The 
 transactions are a combination of inserts, updates, and deletes. The table is 
 partitioned and bucketed.





[jira] [Updated] (HIVE-6382) PATCHED_BLOB encoding in ORC will corrupt data in some cases

2014-02-19 Thread Prasanth J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth J updated HIVE-6382:
-

Attachment: HIVE-6382.3.patch

Added configuration to orc readers to support skipping of corrupted data.

 PATCHED_BLOB encoding in ORC will corrupt data in some cases
 

 Key: HIVE-6382
 URL: https://issues.apache.org/jira/browse/HIVE-6382
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Attachments: HIVE-6382.1.patch, HIVE-6382.2.patch, HIVE-6382.3.patch


 In PATCHED_BLOB encoding (added in HIVE-4123), gapVsPatchList is an array of 
 long that stores gap (g) between the values that are patched and the patch 
 value (p). The maximum distance of gap can be 511 that require 8 bits to 
 encode. And patch values can take more than 56 bits. When patch values take 
 more than 56 bits, p + g will become > 64 bits, which cannot be packed into a 
 long. This will result in data corruption in the case where patch values 
 are > 56 bits. 
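
The overflow can be reproduced with a small sketch. It is not the ORC writer code: the packing layout (gap in the high bits, patch in the low bits) and the helper names are assumptions for illustration, and a 64-bit mask simulates a Java long because Python integers do not overflow on their own.

```python
MASK_64 = (1 << 64) - 1  # simulate truncation to a 64-bit long


def pack_gap_and_patch(gap, patch, patch_width):
    """Pack gap above patch in one 64-bit word; bits above 64 are lost."""
    return ((gap << patch_width) | patch) & MASK_64


def unpack_gap_and_patch(packed, patch_width):
    """Split a packed word back into (gap, patch)."""
    return packed >> patch_width, packed & ((1 << patch_width) - 1)
```

With an 8-bit gap of 200 and a 56-bit patch, the pair round-trips intact because 8 + 56 = 64. With a 60-bit patch the same gap needs bits 60 to 67, so its top four bits are cut off and the unpacked gap comes back as 8 instead of 200.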





[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property

2014-02-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906063#comment-13906063
 ] 

Sushanth Sowmyan commented on HIVE-6465:


Current grammar for data protections is as follows:

{noformat}
ALTER TABLE table_name [PARTITION partition_spec] ENABLE|DISABLE NO_DROP;
ALTER TABLE table_name [PARTITION partition_spec] ENABLE|DISABLE OFFLINE;
{noformat}

Proposed new grammar for immutability:

{noformat}
ALTER TABLE table_name ENABLE|DISABLE IMMUTABILITY;
{noformat}

 Introduce ql grammar for immutability property
 --

 Key: HIVE-6465
 URL: https://issues.apache.org/jira/browse/HIVE-6465
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan

 HIVE-6406 introduces a notion of an immutable table in Hive. In essence, it 
 is a data protection feature similar to current Hive protections like OFFLINE 
 and NO_DROP. Thus, rather than making people muck around with TBLPROPERTIES 
 as its interface, we should have ql grammar for it.





[jira] [Commented] (HIVE-6406) Introduce immutable-table table property and if set, disallow insert-into

2014-02-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906071#comment-13906071
 ] 

Sushanth Sowmyan commented on HIVE-6406:


Thanks, Ashutosh. I've created HIVE-6465 for that.

 Introduce immutable-table table property and if set, disallow insert-into
 -

 Key: HIVE-6406
 URL: https://issues.apache.org/jira/browse/HIVE-6406
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-6406.2.patch, HIVE-6406.3.patch, HIVE-6406.patch


 As part of HIVE-6405's attempt to make HCatalog and Hive behave in similar 
 ways with regard to immutable tables, this is a companion task to introduce 
 the notion of an immutable table, wherein tables are not immutable by 
 default, and immutability is a table property. If this property is set for a 
 table, and we attempt to write to a table (or a partition) that already has 
 data, disallow INSERT INTO into it from Hive (if the destination directory 
 is non-empty). Setting this property will allow Hive to mimic HCatalog's 
 current immutable-table property.





[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default

2014-02-19 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-5232:
---

Description: 
[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
for async execution in HS2. There are some proposed improvements in followup 
JIRAs:
# [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217]
# [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229]
# [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230]
# [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441]

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute to be asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This can enable the server to report back error sooner to the client. It can 
also be useful in cases where a statement.cancel in a different thread and the 
original thread will now be able to detect that.

  was:
[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
for async execution in HS2. There are some proposed improvements in followup 
JIRAs:
# [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217]
# [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229]
# [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230]
# [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441]

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute to be asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This is likely going to provide performance benefits as a long running query 
need not keep the underlying TCP connection open for the entire duration.


 Make JDBC use the new HiveServer2 async execution API by default
 

 Key: HIVE-5232
 URL: https://issues.apache.org/jira/browse/HIVE-5232
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. There are some proposed improvements in followup 
 JIRAs:
 # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217]
 # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229]
 # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230]
 # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441]
 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] 
 which assumes that execute to be asynchronous by default.
  
 Once they are in, we can think of using the async API as the default for 
 JDBC. This can enable the server to report back error sooner to the client. 
 It can also be useful in cases where a statement.cancel in a different thread 
 and the original thread will now be able to detect that.
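
The cancel scenario mentioned in this description can be sketched with plain threads. This is not the Hive JDBC or HiveServer2 API: `AsyncStatement`, its methods and the state names are hypothetical, and the sketch only shows why a polling (async) execute can notice a cancel issued from another thread.

```python
import threading


class AsyncStatement:
    """Toy statement whose execute call polls instead of blocking."""

    def __init__(self):
        self._cancelled = threading.Event()
        self._done = threading.Event()

    def cancel(self):
        """Called from a different thread than the one executing."""
        self._cancelled.set()

    def finish(self):
        """Stand-in for the server reporting that the query completed."""
        self._done.set()

    def execute_async(self, poll_interval=0.01):
        """Poll for completion; return as soon as a cancel is observed."""
        while True:
            if self._cancelled.is_set():
                return "CANCELED"
            if self._done.wait(poll_interval):
                return "FINISHED"
```

Because the executing thread regains control on every poll, it sees the cancel flag shortly after another thread sets it. A single blocking call offers no such checkpoint.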





[jira] [Created] (HIVE-6466) Add support for pluggable authentication modules (PAM) in HiveServer2

2014-02-19 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-6466:
--

 Summary: Add support for pluggable authentication modules (PAM) in 
HiveServer2
 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0


More on PAM in these articles:
http://www.tuxradar.com/content/how-pam-works
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
 





[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default

2014-02-19 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-5232:
---

Description: 
[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
for async execution in HS2. There are some proposed improvements in followup 
JIRAs:
# [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217]
# [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229]
# [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230]
# [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441]

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute is asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This can enable the server to report errors back to the client sooner. It can 
also be useful in cases where a statement.cancel is done in a different thread 
- the original thread will now be able to detect the cancel, as opposed to the 
use of the blocking execute calls, in which statement.cancel will be a no-op. 

  was:
[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
for async execution in HS2. There are some proposed improvements in followup 
JIRAs:
# [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217]
# [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229]
# [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230]
# [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441]

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute is asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This can enable the server to report errors back to the client sooner. It can 
also be useful in cases where a statement.cancel is issued in a different 
thread, since the original thread will now be able to detect that.


 Make JDBC use the new HiveServer2 async execution API by default
 

 Key: HIVE-5232
 URL: https://issues.apache.org/jira/browse/HIVE-5232
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. There are some proposed improvements in followup 
 JIRAs:
 # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217]
 # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229]
 # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230]
 # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441]
 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] 
 which assumes that execute is asynchronous by default.
  
 Once they are in, we can think of using the async API as the default for 
 JDBC. This can enable the server to report errors back to the client sooner. 
 It can also be useful in cases where a statement.cancel is done in a 
 different thread - the original thread will now be able to detect the cancel, 
 as opposed to the use of the blocking execute calls, in which 
 statement.cancel will be a no-op. 
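
 The cancel behaviour described above can be illustrated with a minimal sketch 
 in plain Java. It does not use the Hive JDBC driver or the HS2 Thrift API: 
 PollingStatement, its State enum, and the pollMillis parameter are hypothetical 
 names invented for this example. The only point is that a caller which waits 
 in short polling intervals can notice a cancel requested from another thread, 
 whereas a caller stuck inside a single blocking call cannot.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a client-side statement; not Hive's JDBC class. */
class PollingStatement {
    enum State { FINISHED, CANCELED }

    // Set by cancel(), read by the thread that is waiting for the result.
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);

    /** May be called from any thread, in the spirit of Statement.cancel(). */
    void cancel() {
        cancelRequested.set(true);
    }

    /**
     * Asynchronous-style wait: the work is assumed to be running elsewhere and
     * 'done' is released when it completes. Because the wait is split into
     * short polls, a cancel from another thread is seen between polls.
     */
    State awaitCompletion(CountDownLatch done, long pollMillis) throws InterruptedException {
        while (!done.await(pollMillis, TimeUnit.MILLISECONDS)) {
            if (cancelRequested.get()) {
                return State.CANCELED;
            }
        }
        return State.FINISHED;
    }
}
```

 With a single blocking wait (for example done.await() with no timeout) the 
 waiting thread would never reach the cancelRequested check, which mirrors the 
 no-op cancel mentioned in the description.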





[jira] [Commented] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement

2014-02-19 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906112#comment-13906112
 ] 

Ashutosh Chauhan commented on HIVE-6422:


+1

 SQL std auth - revert change for view keyword in grant statement
 

 Key: HIVE-6422
 URL: https://issues.apache.org/jira/browse/HIVE-6422
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6422.1.patch, HIVE-6422.2.patch


 The SQL standard does not support the view keyword in the grant statement. 
 HIVE-6181, which was added as part of the SQL standard changes, needs to be 
 reverted.





[jira] [Updated] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-6422:


Attachment: HIVE-6422.2.patch

Good point about test coverage. I have enhanced the authorization_view_sqlstd.q 
test to add more coverage and also remove the view keyword used there. It now 
also tests the grant/revoke statements on views with and without the table 
keyword.


 SQL std auth - revert change for view keyword in grant statement
 

 Key: HIVE-6422
 URL: https://issues.apache.org/jira/browse/HIVE-6422
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-6422.1.patch, HIVE-6422.2.patch


 The SQL standard does not support the view keyword in the grant statement. 
 HIVE-6181, which was added as part of the SQL standard changes, needs to be 
 reverted.





[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property

2014-02-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906126#comment-13906126
 ] 

Lefty Leverenz commented on HIVE-6465:
--

What about CREATE TABLE?  Would it also have ENABLE|DISABLE IMMUTABILITY or 
keep on using TBLPROPERTIES(immutable=true) as shown in HIVE-6406, or 
neither?  (According to the wiki NO_DROP and OFFLINE only exist as ALTER TABLE 
options, but maybe the doc is incomplete.)

* [DDL:  Create Table 
|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable]
* [DDL:  Alter Table/Partition Protections 
|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionProtections]

 Introduce ql grammar for immutability property
 --

 Key: HIVE-6465
 URL: https://issues.apache.org/jira/browse/HIVE-6465
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan

 HIVE-6406 introduces a notion of an immutable table in hive. In essence, it 
 is a data protection feature similar to current hive protections like OFFLINE 
 and NO_DROP. Thus, rather than having its interface being people mucking 
 around TBLPROPERTIES, we should have ql grammar for it.





[jira] [Commented] (HIVE-5958) SQL std auth - authorize statements that work with paths

2014-02-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906127#comment-13906127
 ] 

Hive QA commented on HIVE-5958:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629792/HIVE-5958.7.patch

{color:green}SUCCESS:{color} +1 5143 tests passed

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1416/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1416/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12629792

 SQL std auth - authorize statements that work with paths
 

 Key: HIVE-5958
 URL: https://issues.apache.org/jira/browse/HIVE-5958
 Project: Hive
  Issue Type: Sub-task
  Components: Authorization
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Attachments: HIVE-5958.1.patch, HIVE-5958.2.patch, HIVE-5958.3.patch, 
 HIVE-5958.4.patch, HIVE-5958.5.patch, HIVE-5958.6.patch, HIVE-5958.7.patch

   Original Estimate: 72h
  Remaining Estimate: 72h

 Statements such as create table and alter table that specify a path URI should 
 be allowed under the new authorization scheme only if the user has permissions 
 on the specified URI (path), including read/write and ownership of the 
 file/dir and its children.
 Also, fix the issue of the database not getting set as output for create-table.





[jira] [Commented] (HIVE-6405) Support append feature for HCatalog

2014-02-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906134#comment-13906134
 ] 

Hive QA commented on HIVE-6405:
---



{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12629811/HIVE-6405.patch

Test results: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1418/testReport
Console output: 
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1418/console

Messages:
{noformat}
 This message was trimmed, see log for full details 
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hbase-handler ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ 
hive-hbase-handler ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/main/resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-hbase-handler 
---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ 
hive-hbase-handler ---
[INFO] Compiling 18 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ 
hive-hbase-handler ---
[debug] execute contextualize
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/test/resources
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hbase-handler 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp/conf
 [copy] Copying 5 files to 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp/conf
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-hbase-handler ---
[INFO] Compiling 4 source files to 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/test-classes
[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-hbase-handler 
---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-hbase-handler ---
[INFO] Building jar: 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/hive-hbase-handler-0.13.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
hive-hbase-handler ---
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/hive-hbase-handler-0.13.0-SNAPSHOT.jar
 to 
/data/hive-ptest/working/maven/org/apache/hive/hive-hbase-handler/0.13.0-SNAPSHOT/hive-hbase-handler-0.13.0-SNAPSHOT.jar
[INFO] Installing 
/data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/pom.xml to 
/data/hive-ptest/working/maven/org/apache/hive/hive-hbase-handler/0.13.0-SNAPSHOT/hive-hbase-handler-0.13.0-SNAPSHOT.pom
[INFO] 
[INFO] 
[INFO] Building Hive HCatalog 0.13.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hcatalog ---
[INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/hcatalog 
(includes = [datanucleus.log, derby.log], excludes = [])
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-hcatalog ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hcatalog ---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/target/tmp
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/target/warehouse
[mkdir] Created dir: 
/data/hive-ptest/working/apache-svn-trunk-source/hcatalog/target/tmp/conf
 [copy] 

Re: Review Request 15435: Add long polling to asynchronous execution in HiveServer2

2014-02-19 Thread Thejas Nair

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15435/#review34933
---



service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java
https://reviews.apache.org/r/15435/#comment65313

it would be better to call the constructor with more arguments from this 
one.




service/src/java/org/apache/hive/service/cli/operation/Operation.java
https://reviews.apache.org/r/15435/#comment65311

I think we should make runAsync field final, as it is not supposed to 
change.



service/src/java/org/apache/hive/service/cli/operation/Operation.java
https://reviews.apache.org/r/15435/#comment65312

I think it is cleaner to have this constructor call the other one (instead of 
the other way around):
this(parentSession, opType, false);
That way initialization will be in one constructor, and it will be clear 
what all variables get initialized to.
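
The two suggestions for Operation.java (a final runAsync field and a 
delegating constructor) can be sketched together as follows. OperationSketch 
is a simplified, hypothetical stand-in: the real Operation class takes 
different parameter types, and String is used here only to keep the example 
self-contained.

```java
/** Hypothetical, simplified stand-in for the Operation class under review. */
class OperationSketch {
    private final String parentSession;
    private final String opType;
    // final: assigned exactly once in the constructor and never changed afterwards.
    private final boolean runAsync;

    /** Shorter constructor delegates and supplies the default (synchronous). */
    OperationSketch(String parentSession, String opType) {
        this(parentSession, opType, false);
    }

    /** All field initialization happens in this one constructor. */
    OperationSketch(String parentSession, String opType, boolean runAsync) {
        this.parentSession = parentSession;
        this.opType = opType;
        this.runAsync = runAsync;
    }

    boolean shouldRunAsync() {
        return runAsync;
    }

    String describe() {
        return opType + "@" + parentSession;
    }
}
```

Because every path ends in the three-argument constructor, the compiler can 
verify that each final field is assigned exactly once.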



- Thejas Nair


On Feb. 18, 2014, 1:16 p.m., Vaibhav Gumashta wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15435/
 ---
 
 (Updated Feb. 18, 2014, 1:16 p.m.)
 
 
 Review request for hive, Carl Steinbach and Thejas Nair.
 
 
 Bugs: HIVE-5217
 https://issues.apache.org/jira/browse/HIVE-5217
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Add long polling to asynchronous execution in HiveServer2
 
 
 Diffs
 -
 
   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
   service/src/java/org/apache/hive/service/cli/CLIService.java 56b357a 
   
 service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java
  e973f83 
   service/src/java/org/apache/hive/service/cli/operation/Operation.java 
 58a28b6 
   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
 03a37c8 
   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java 8ec8d43 
 
 Diff: https://reviews.apache.org/r/15435/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Vaibhav Gumashta
 




Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18250/
---

(Updated Feb. 19, 2014, 9:51 p.m.)


Review request for hive.


Changes
---

Incorporated Thejas's feedback. Also added a new negative test.


Bugs: HIVE-6433
https://issues.apache.org/jira/browse/HIVE-6433


Repository: hive-git


Description
---

SQL std auth - allow grant/revoke roles if user has ADMIN OPTION


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
 c1afaee 
  ql/src/test/queries/clientnegative/authorization_role_grant.q PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_role_grant.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_set_role_neg2.q.out eec684d 
  ql/src/test/results/clientpositive/authorization_role_grant2.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/18250/diff/


Testing
---

Added new test


Thanks,

Ashutosh Chauhan



Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18250/
---

(Updated Feb. 19, 2014, 9:51 p.m.)


Review request for hive.


Bugs: HIVE-6433
https://issues.apache.org/jira/browse/HIVE-6433


Repository: hive-git


Description
---

SQL std auth - allow grant/revoke roles if user has ADMIN OPTION


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java
 c1afaee 
  ql/src/test/queries/clientnegative/authorization_role_grant.q PRE-CREATION 
  ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_role_grant.q.out 
PRE-CREATION 
  ql/src/test/results/clientnegative/authorization_set_role_neg2.q.out eec684d 
  ql/src/test/results/clientpositive/authorization_role_grant2.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/18250/diff/


Testing
---

Added new test


Thanks,

Ashutosh Chauhan



[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase

2014-02-19 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-2365:
---

Attachment: HIVE-2365.2.patch.txt

This patch separates HFile generation from the new completebulkload task. Also 
adds a couple more test cases.

 SQL support for bulk load into HBase
 

 Key: HIVE-2365
 URL: https://issues.apache.org/jira/browse/HIVE-2365
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: John Sichi
Assignee: Nick Dimiduk
 Fix For: 0.13.0

 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, 
 HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch


 Support the "as simple as this" SQL for bulk load from Hive into HBase.





[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase

2014-02-19 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-2365:
---

Status: Open  (was: Patch Available)

 SQL support for bulk load into HBase
 

 Key: HIVE-2365
 URL: https://issues.apache.org/jira/browse/HIVE-2365
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: John Sichi
Assignee: Nick Dimiduk
 Fix For: 0.13.0

 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, 
 HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch


 Support the "as simple as this" SQL for bulk load from Hive into HBase.





[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6433:
---

Status: Patch Available  (was: Open)

 SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
 

 Key: HIVE-6433
 URL: https://issues.apache.org/jira/browse/HIVE-6433
 Project: Hive
  Issue Type: Sub-task
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6433.1.patch, HIVE-6433.patch


 Follow up jira for HIVE-5952.
 If a user/role has the admin option on a role, then that user should be able 
 to grant/revoke other users to/from the role.
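
 The rule above can be sketched as a small in-memory check. RoleGrantModel and 
 its methods are hypothetical names for illustration only; this is not the code 
 in SQLStdHiveAccessController, and it deliberately ignores any other way a 
 principal might be allowed to manage roles.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of role membership; not Hive's implementation. */
class RoleGrantModel {
    // role name -> (member name -> whether that member holds ADMIN OPTION on the role)
    private final Map<String, Map<String, Boolean>> members = new HashMap<>();

    /** Records that 'member' belongs to 'role', optionally WITH ADMIN OPTION. */
    void addMember(String role, String member, boolean adminOption) {
        members.computeIfAbsent(role, r -> new HashMap<>()).put(member, adminOption);
    }

    /** A principal may grant or revoke a role only if it holds ADMIN OPTION on it. */
    boolean canGrantOrRevoke(String principal, String role) {
        Map<String, Boolean> roleMembers = members.get(role);
        return roleMembers != null && Boolean.TRUE.equals(roleMembers.get(principal));
    }
}
```

 Plain membership without the admin option is not enough in this sketch, and 
 an unknown role or principal is always denied.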





[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase

2014-02-19 Thread Nick Dimiduk (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HIVE-2365:
---

Status: Patch Available  (was: Open)

 SQL support for bulk load into HBase
 

 Key: HIVE-2365
 URL: https://issues.apache.org/jira/browse/HIVE-2365
 Project: Hive
  Issue Type: Improvement
  Components: HBase Handler
Reporter: John Sichi
Assignee: Nick Dimiduk
 Fix For: 0.13.0

 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, 
 HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch


 Support the "as simple as this" SQL for bulk load from Hive into HBase.





[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6433:
---

Status: Open  (was: Patch Available)

 SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
 

 Key: HIVE-6433
 URL: https://issues.apache.org/jira/browse/HIVE-6433
 Project: Hive
  Issue Type: Sub-task
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6433.1.patch, HIVE-6433.patch


 Follow up jira for HIVE-5952.
 If a user/role has the admin option on a role, then that user should be able 
 to grant/revoke other users to/from the role.





[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION

2014-02-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6433:
---

Attachment: HIVE-6433.1.patch

Incorporated Thejas's feedback.

 SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
 

 Key: HIVE-6433
 URL: https://issues.apache.org/jira/browse/HIVE-6433
 Project: Hive
  Issue Type: Sub-task
Reporter: Thejas M Nair
Assignee: Ashutosh Chauhan
 Attachments: HIVE-6433.1.patch, HIVE-6433.patch


 Follow up jira for HIVE-5952.
 If a user/role has the admin option on a role, then that user should be able 
 to grant/revoke other users to/from the role.





[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5232:


Description: 
[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
for async execution in HS2. There are some proposed improvements in followup 
JIRAs:
HIVE-5217
HIVE-5229
HIVE-5230
HIVE-5441

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute is asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This can enable the server to report errors back to the client sooner. It can 
also be useful in cases where a statement.cancel is done in a different thread 
- the original thread will now be able to detect the cancel, as opposed to the 
use of the blocking execute calls, in which statement.cancel will be a no-op. 

  was:
[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
for async execution in HS2. There are some proposed improvements in followup 
JIRAs:
# [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217]
# [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229]
# [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230]
# [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441]

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute is asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This can enable the server to report errors back to the client sooner. It can 
also be useful in cases where a statement.cancel is done in a different thread 
- the original thread will now be able to detect the cancel, as opposed to the 
use of the blocking execute calls, in which statement.cancel will be a no-op. 


 Make JDBC use the new HiveServer2 async execution API by default
 

 Key: HIVE-5232
 URL: https://issues.apache.org/jira/browse/HIVE-5232
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch


 [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
 for async execution in HS2. There are some proposed improvements in followup 
 JIRAs:
 HIVE-5217
 HIVE-5229
 HIVE-5230
 HIVE-5441
 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] 
 which assumes that execute is asynchronous by default.
  
 Once they are in, we can think of using the async API as the default for 
 JDBC. This can enable the server to report errors back to the client sooner. 
 It can also be useful in cases where a statement.cancel is done in a 
 different thread - the original thread will now be able to detect the cancel, 
 as opposed to the use of the blocking execute calls, in which 
 statement.cancel will be a no-op. 





Re: Review Request 18254: HIVE-6375 Implement CTAS and column rename for parquet

2014-02-19 Thread Szehon Ho

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18254/
---

(Updated Feb. 19, 2014, 9:56 p.m.)


Review request for hive.


Changes
---

Incorporated review feedback.

Updated more test case results for explain CTAS.

It seems that the test table srcbucket, as a bucketed (multi-file) table, 
returns rows from a select query in a nondeterministic order, so the test now 
first inserts into a staging table using sort by.


Bugs: HIVE-6375
https://issues.apache.org/jira/browse/HIVE-6375


Repository: hive-git


Description
---

There is a Hive bug in SemanticAnalyzer that chooses different names for 
columns in the CreateTable task and the FileSink task.  
columnInfo.getInternalName() was used in one place, and fieldSchema still used 
columnInfo.getAlias() if it is available.  This change makes both consistent, 
favoring columnInfo.getAlias if it is available.

This was not revealed before because other file formats like RCFile seem to use 
column ordinal position, and Avro stores the schema separately altogether.
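
The naming rule described above (use the alias when one is available, 
otherwise fall back to the internal name) can be sketched as a single helper. 
ColumnNaming and chooseName are hypothetical names for illustration and are not 
the actual SemanticAnalyzer code; the "_col0" value in the usage note is only 
an example of an auto-generated internal name.

```java
/** Hypothetical helper; illustrates the rule, not the actual SemanticAnalyzer change. */
class ColumnNaming {
    /**
     * Returns the alias when one is available, otherwise the internal name.
     * Using the same helper for both the table schema and the file writer
     * keeps the two sets of column names consistent.
     */
    static String chooseName(String alias, String internalName) {
        return (alias != null && !alias.isEmpty()) ? alias : internalName;
    }
}
```

For example, chooseName("key", "_col0") yields "key", while 
chooseName(null, "_col0") yields "_col0". A format that matches columns by 
name would otherwise see one name in the table definition and another in the 
written files.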


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd 
  ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION 
  ql/src/test/results/clientpositive/ctas.q.out 9668855 
  ql/src/test/results/clientpositive/ctas_hadoop20.q.out 0ec0af5 
  ql/src/test/results/clientpositive/merge3.q.out 3df75b7 
  ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/18254/diff/


Testing
---

Added parquet_ctas.q.  Covers cases where the column name is taken directly 
from the input table (implied alias), where the name is auto-generated, where 
the name is specified as an alias, and a mix of the three.


Thanks,

Szehon Ho



[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default

2014-02-19 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-5232:


Description: 
HIVE-4617 provides support for async execution in HS2. There are some proposed 
improvements in followup JIRAs:
HIVE-5217
HIVE-5229
HIVE-5230
HIVE-5441

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute is asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This can enable the server to report errors back to the client sooner. It can 
also be useful in cases where a statement.cancel is done in a different thread 
- the original thread will now be able to detect the cancel, as opposed to the 
use of the blocking execute calls, in which statement.cancel will be a no-op. 

  was:
[HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support 
for async execution in HS2. There are some proposed improvements in followup 
JIRAs:
HIVE-5217
HIVE-5229
HIVE-5230
HIVE-5441

There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which 
assumes that execute is asynchronous by default.
 
Once they are in, we can think of using the async API as the default for JDBC. 
This can enable the server to report errors back to the client sooner. It can 
also be useful in cases where a statement.cancel is done in a different thread 
- the original thread will now be able to detect the cancel, as opposed to the 
use of the blocking execute calls, in which statement.cancel will be a no-op. 


 Make JDBC use the new HiveServer2 async execution API by default
 

 Key: HIVE-5232
 URL: https://issues.apache.org/jira/browse/HIVE-5232
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch


 HIVE-4617 provides support for async execution in HS2. There are some 
 proposed improvements in followup JIRAs:
 HIVE-5217
 HIVE-5229
 HIVE-5230
 HIVE-5441
 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] 
 which assumes that execute is asynchronous by default.
  
 Once they are in, we can think of using the async API as the default for 
 JDBC. This can enable the server to report errors back to the client sooner. 
 It can also be useful in cases where a statement.cancel is done in a 
 different thread - the original thread will now be able to detect the cancel, 
 as opposed to the use of the blocking execute calls, in which 
 statement.cancel will be a no-op. 





[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property

2014-02-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906159#comment-13906159
 ] 

Sushanth Sowmyan commented on HIVE-6465:


The Alter Table syntax acts as syntactic sugar for the background parameter, so 
the TBLPROPERTIES would also continue working. This is consistent with how 
OFFLINE/NO_DROP work as well, which are simply Properties at a Partition/Table 
level - PROTECT_MODE=OFFLINE/NO_DROP/NO_DROP_CASCADE/READ_ONLY would 
achieve the same thing for them.


 Introduce ql grammar for immutability property
 --

 Key: HIVE-6465
 URL: https://issues.apache.org/jira/browse/HIVE-6465
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan

 HIVE-6406 introduces a notion of an immutable table in hive. In essence, it 
 is a data protection feature similar to current hive protections like OFFLINE 
 and NO_DROP. Thus, rather than having its interface being people mucking 
 around TBLPROPERTIES, we should have ql grammar for it.





[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property

2014-02-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906163#comment-13906163
 ] 

Sushanth Sowmyan commented on HIVE-6465:


This, btw, might also indicate that maybe we should prefer a reuse of 
ProtectMode, and go with PROTECT_MODE=IMMUTABLE or WRITE_ONLY for this 
scenario.

 Introduce ql grammar for immutability property
 --

 Key: HIVE-6465
 URL: https://issues.apache.org/jira/browse/HIVE-6465
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog, Metastore, Query Processor, Thrift API
Reporter: Sushanth Sowmyan

 HIVE-6406 introduces a notion of an immutable table in hive. In essence, it 
 is a data protection feature similar to current hive protections like OFFLINE 
 and NO_DROP. Thus, rather than having its interface being people mucking 
 around TBLPROPERTIES, we should have ql grammar for it.





[jira] [Updated] (HIVE-6375) Fix CTAS for parquet

2014-02-19 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho updated HIVE-6375:


Attachment: HIVE-6375.2.patch

Thanks, Xuefu, for the review.  Incorporated the feedback and fixed the test 
output.  It seems a select from srcbucket has some randomness in its result, 
since it is a bucketed table.

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Attachments: HIVE-6375.2.patch, HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272





[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken

2014-02-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-6356:
---

Status: Patch Available  (was: Open)

Marking this as Patch Available. Let's get this in. The code changes needed 
to bump up the HBase version are rather small.

 Dependency injection in hbase storage handler is broken
 ---

 Key: HIVE-6356
 URL: https://issues.apache.org/jira/browse/HIVE-6356
 Project: Hive
  Issue Type: Bug
  Components: HBase Handler
Reporter: Navis
Priority: Minor
 Fix For: 0.13.0

 Attachments: HIVE-6356.1.patch.txt, HIVE-6356.2.patch.txt, 
 HIVE-6356.3.patch.txt, HIVE-6356.addendum.00.patch


 Dependent jars for HBase are not added to tmpjars, which is caused by a 
 change in the method signature of TableMapReduceUtil.addDependencyJars.





[jira] [Updated] (HIVE-6326) Split generation in ORC may generate wrong split boundaries because of unaccounted padded bytes

2014-02-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-6326:
-

Fix Version/s: 0.13.0

 Split generation in ORC may generate wrong split boundaries because of 
 unaccounted padded bytes
 ---

 Key: HIVE-6326
 URL: https://issues.apache.org/jira/browse/HIVE-6326
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Prasanth J
Assignee: Prasanth J
  Labels: orcfile
 Fix For: 0.13.0

 Attachments: HIVE-6326.1.patch, HIVE-6326.2.patch, HIVE-6326.3.patch, 
 HIVE-6326.4.patch


 HIVE-5091 added padding to ORC files to avoid ORC stripes straddling HDFS 
 blocks. The length of these padding bytes is not stored in the stripe 
 information. OrcInputFormat.getSplits() uses stripeInformation.getLength() for 
 split computation. stripeInformation.getLength() is the sum of the index 
 length, data length and stripe footer length. It does not account for the 
 padding bytes, which may result in a wrong split boundary.
 The fix is to use the offset of the next stripe to compute the length of the 
 current stripe, so that the padding bytes are included as well.
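
The fix described above can be sketched as follows. This is not the ORC reader code or the attached patch: Stripe and effectiveLengths are hypothetical names, and the offsets in main are made-up numbers chosen only to show the arithmetic.

```java
import java.util.Arrays;

public class StripeLengthSketch {
    /** Hypothetical stand-in for a stripe's metadata. */
    static final class Stripe {
        final long offset; // where the stripe starts in the file
        final long length; // index + data + footer, excluding any padding

        Stripe(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * For each stripe, returns the span it occupies in the file including any
     * trailing padding: the distance to the next stripe's offset. The last
     * stripe has no successor, so its recorded length is used.
     */
    static long[] effectiveLengths(Stripe[] stripes) {
        long[] out = new long[stripes.length];
        for (int i = 0; i < stripes.length; i++) {
            out[i] = (i + 1 < stripes.length)
                ? stripes[i + 1].offset - stripes[i].offset
                : stripes[i].length;
        }
        return out;
    }

    public static void main(String[] args) {
        Stripe[] stripes = {
            new Stripe(3, 100),   // ends at 103, next starts at 128: 25 padding bytes
            new Stripe(128, 90),  // ends at 218, next starts at 228: 10 padding bytes
            new Stripe(228, 50)
        };
        // Prints [125, 100, 50]; the recorded lengths alone would give 100, 90, 50.
        System.out.println(Arrays.toString(effectiveLengths(stripes)));
    }
}
```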





Re: Review Request 18254: HIVE-6375 Implement CTAS and column rename for parquet

2014-02-19 Thread Mohammad Islam

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18254/#review34937
---

Ship it!


- Mohammad Islam


On Feb. 19, 2014, 9:56 p.m., Szehon Ho wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/18254/
 ---
 
 (Updated Feb. 19, 2014, 9:56 p.m.)
 
 
 Review request for hive.
 
 
 Bugs: HIVE-6375
 https://issues.apache.org/jira/browse/HIVE-6375
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 There is a Hive bug in SemanticAnalyzer that chooses different names for 
 columns in the CreateTable task and the FileSink task.  
 columnInfo.getInternalName() was used in one place, and fieldSchema still 
 used columnInfo.getAlias() if it is available.  This change makes both 
 consistent, favoring columnInfo.getAlias if it is available.
 
 This was not revealed before because other file formats like RCFile seem to 
 use the column's ordinal position, and an Avro file stores the schema 
 separately altogether.
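
A minimal sketch of the naming rule described above, with both code paths calling one helper. This is not the SemanticAnalyzer change itself: the ColumnInfo class here is a hypothetical stand-in that models only the two getters named in the description, and treating a null alias as "not available" is an assumption.

```java
public class CtasColumnNameSketch {
    /** Hypothetical stand-in for Hive's ColumnInfo. */
    static final class ColumnInfo {
        private final String internalName;
        private final String alias; // assumed null when no alias is available

        ColumnInfo(String internalName, String alias) {
            this.internalName = internalName;
            this.alias = alias;
        }

        String getInternalName() {
            return internalName;
        }

        String getAlias() {
            return alias;
        }
    }

    /** One rule for every caller: prefer the alias, fall back to the internal name. */
    static String outputColumnName(ColumnInfo col) {
        return col.getAlias() != null ? col.getAlias() : col.getInternalName();
    }

    public static void main(String[] args) {
        System.out.println(outputColumnName(new ColumnInfo("_col0", "key")));
        System.out.println(outputColumnName(new ColumnInfo("_col1", null)));
    }
}
```

If the table schema and the file writer both take the name from the same helper, a format that resolves columns by name sees matching names on both sides.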
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd 
   ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION 
   ql/src/test/results/clientpositive/ctas.q.out 9668855 
   ql/src/test/results/clientpositive/ctas_hadoop20.q.out 0ec0af5 
   ql/src/test/results/clientpositive/merge3.q.out 3df75b7 
   ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/18254/diff/
 
 
 Testing
 ---
 
 Added parquet_ctas.q.  Covers cases where column name is gotten directly from 
 input table (implied alias), where name is auto-generated, where name is 
 specified as alias, and a mix of the three.
 
 
 Thanks,
 
 Szehon Ho
 




[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-02-19 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906215#comment-13906215
 ] 

Mohammad Kamrul Islam commented on HIVE-6375:
-

+1 
Reviewed the patch.

CTAS for avro doesn't work for the same reason (HIVE-5803).
Hopefully, the patch will help avro as well.

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Attachments: HIVE-6375.2.patch, HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272





[jira] [Commented] (HIVE-5803) Support CTAS from a non-avro table to an avro table

2014-02-19 Thread Mohammad Kamrul Islam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906217#comment-13906217
 ] 

Mohammad Kamrul Islam commented on HIVE-5803:
-

The linked Jira might solve this problem as well.

 Support CTAS from a non-avro table to an avro table
 ---

 Key: HIVE-5803
 URL: https://issues.apache.org/jira/browse/HIVE-5803
 Project: Hive
  Issue Type: Task
Reporter: Mohammad Kamrul Islam
Assignee: Carl Steinbach

 Hive currently does not work with HQL like :
 CREATE TABLE AVRO-BASE-TABLE as SELECT * from NON_AVRO_TABLE;
 Actually, the statement itself succeeds. But when I run SELECT * from 
 AVRO-BASED-TABLE .. it fails.
 This JIRA depends on HIVE-3159 that translates TypeInfo to Avro schema.
 Findings so far: CTAS uses internal column names (in place of using the 
 column names provided in select) when creating the AVRO data file. In other 
 words, the avro data file has column names of the form _col0, _col1, whereas 
 the table column names are different.
 I tested with the following test cases and it failed:
 - verify 1) can create table using create table as select from non-avro table 
 2) LOAD avro data into new table and read data from the new table
 CREATE TABLE simple_kv_txt (key STRING, value STRING) STORED AS TEXTFILE;
 DESCRIBE simple_kv_txt;
 LOAD DATA LOCAL INPATH '../data/files/kv1.txt' INTO TABLE simple_kv_txt;
 SELECT * FROM simple_kv_txt ORDER BY KEY;
 CREATE TABLE copy_doctors ROW FORMAT SERDE 
 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' as SELECT key 
 as key, value as value FROM simple_kv_txt;
 DESCRIBE copy_doctors;
 SELECT * FROM copy_doctors;
  





[jira] [Commented] (HIVE-6375) Fix CTAS for parquet

2014-02-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906225#comment-13906225
 ] 

Szehon Ho commented on HIVE-6375:
-

Yea, looks like a similar issue.

 Fix CTAS for parquet
 

 Key: HIVE-6375
 URL: https://issues.apache.org/jira/browse/HIVE-6375
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Brock Noland
Assignee: Szehon Ho
Priority: Critical
  Labels: Parquet
 Attachments: HIVE-6375.2.patch, HIVE-6375.patch


 More details here:
 https://github.com/Parquet/parquet-mr/issues/272





Review Request 18291: Add support for pluggable authentication modules (PAM) in HiveServer2

2014-02-19 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18291/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-6466
https://issues.apache.org/jira/browse/HIVE-6466


Repository: hive-git


Description
---

Refer to the JIRA: https://issues.apache.org/jira/browse/HIVE-6466


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 
  pom.xml 9aef665 
  service/pom.xml b1002e2 
  service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java b92fd83 
  service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java PRE-CREATION 

Diff: https://reviews.apache.org/r/18291/diff/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Updated] (HIVE-6466) Add support for pluggable authentication modules (PAM) in HiveServer2

2014-02-19 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-6466:
---

Attachment: HIVE-6466.1.patch

Rb link: https://reviews.apache.org/r/18291

 Add support for pluggable authentication modules (PAM) in HiveServer2
 -

 Key: HIVE-6466
 URL: https://issues.apache.org/jira/browse/HIVE-6466
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Affects Versions: 0.13.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
 Fix For: 0.13.0

 Attachments: HIVE-6466.1.patch


 More on PAM in these articles:
 http://www.tuxradar.com/content/how-pam-works
 https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html
  





[jira] [Commented] (HIVE-5636) Introduce getPartitionColumns() functionality from HCatInputFormat

2014-02-19 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906251#comment-13906251
 ] 

Sushanth Sowmyan commented on HIVE-5636:


Note: None of the reported failures are due to this patch. With Daniel's +1, 
I'm going to go ahead and commit.

 Introduce getPartitionColumns() functionality from HCatInputFormat
 --

 Key: HIVE-5636
 URL: https://issues.apache.org/jira/browse/HIVE-5636
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.12.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-5636.2.patch, HIVE-5636.patch


 As of HCat 0.5, we made the class InputJobInfo private for hcatalog use only, 
 and we made it so that setInput would not modify the InputJobInfo being 
 passed in.
 However, if a user of HCatInputFormat wants to get what Partitioning columns 
 or Data columns exist for the job, they are not able to do so directly from 
 HCatInputFormat and are forced to use InputJobInfo, which currently does not 
 work. Thus, we need to expose this functionality.





[jira] [Created] (HIVE-6467) Metastore DBS.OWNER_TYPE value got spaces at the end

2014-02-19 Thread Jason Dere (JIRA)
Jason Dere created HIVE-6467:


 Summary: Metastore DBS.OWNER_TYPE  value got spaces at the end
 Key: HIVE-6467
 URL: https://issues.apache.org/jira/browse/HIVE-6467
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Jason Dere


Trying to tinker with the metastore upgrade scripts and did the following steps 
on a brand new Derby DB:

From derby:
{noformat}
run 'hive-schema-0.12.0.derby.sql';
run 'upgrade-0.12.0-to-0.13.0.derby.sql';
{noformat}

From Hive:
{noformat}
show tables;
{noformat}

I then hit the error below.  It appears that in the metastore DBS table, the 
row with defaultdb was created with the value ROLE  , with spaces at the end, 
where ROLE was expected.

{noformat}
2014-02-19 14:49:19,824 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - java.lang.IllegalArgumentException: No enum const class org.apache.hadoop.hive.metastore.api.PrincipalType.ROLE  
at java.lang.Enum.valueOf(Enum.java:196)
at org.apache.hadoop.hive.metastore.api.PrincipalType.valueOf(PrincipalType.java:14)
at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
at com.sun.proxy.$Proxy7.getDatabase(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:753)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
at com.sun.proxy.$Proxy8.get_database(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:895)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
at com.sun.proxy.$Proxy9.getDatabase(Unknown Source)
at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1150)
at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1139)
at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2372)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1566)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1339)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1010)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1000)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:687)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{noformat}
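
The failure at the top of the trace can be reproduced in isolation: Enum.valueOf requires an exact match, so a value with trailing spaces is rejected. The sketch below uses a local enum as a stand-in for org.apache.hadoop.hive.metastore.api.PrincipalType; parseOwnerType and rejectedByValueOf are hypothetical helpers, and trimming on read is only one possible workaround, not necessarily the fix this issue will take.

```java
public class OwnerTypeSketch {
    /** Local stand-in for the metastore's PrincipalType enum. */
    enum PrincipalType { USER, ROLE, GROUP }

    /** Parses a stored owner type, tolerating surrounding whitespace. */
    static PrincipalType parseOwnerType(String stored) {
        return PrincipalType.valueOf(stored.trim());
    }

    /** Returns true if a plain Enum.valueOf call rejects the raw string. */
    static boolean rejectedByValueOf(String raw) {
        try {
            PrincipalType.valueOf(raw);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String stored = "ROLE  "; // value with trailing spaces, as in the report
        System.out.println(rejectedByValueOf(stored)); // true
        System.out.println(parseOwnerType(stored));    // ROLE
    }
}
```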




