[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6345: --- Status: Open (was: Patch Available) the current .4 patch has a merge mistake and introduces regressions. Will update shortly. Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6345: --- Attachment: HIVE-6345.5.patch Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6345: --- Status: Patch Available (was: Open) Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905108#comment-13905108 ] Remus Rusanu commented on HIVE-6345: Patch .5 fixes the regression cases Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905374#comment-13905374 ] Hive QA commented on HIVE-6433: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629687/HIVE-6433.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 5106 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_truncate_column_buckets {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1407/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1407/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12629687 SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Attachments: HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6456) Implement Parquet schema evolution
[ https://issues.apache.org/jira/browse/HIVE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905454#comment-13905454 ] Brock Noland commented on HIVE-6456: [~jcoffey] Yes let's do that since I don't know what a unit test would look like and it will give you some time to work on it. BTW tests pass but JIRA was down when it tried posting the comment: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-1392/execution.txt {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629593/HIVE-6456.patch {color:green}SUCCESS:{color} +1 5133 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1392/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1392/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} Implement Parquet schema evolution -- Key: HIVE-6456 URL: https://issues.apache.org/jira/browse/HIVE-6456 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Trivial Attachments: HIVE-6456.patch In HIVE-5783 we removed schema evolution: https://github.com/Parquet/parquet-mr/pull/297/files#r9824155 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6456) Implement Parquet schema evolution
[ https://issues.apache.org/jira/browse/HIVE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905450#comment-13905450 ] Justin Coffey commented on HIVE-6456: - brock and I had the same thought offline. Not sure what the protocol is here: should I open a separate ticket? Implement Parquet schema evolution -- Key: HIVE-6456 URL: https://issues.apache.org/jira/browse/HIVE-6456 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Trivial Attachments: HIVE-6456.patch In HIVE-5783 we removed schema evolution: https://github.com/Parquet/parquet-mr/pull/297/files#r9824155 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
JIRA was down last night so some precommits did not run
EOM
[jira] [Created] (HIVE-6463) unit test for evolving schema in parquet files
Justin Coffey created HIVE-6463: --- Summary: unit test for evolving schema in parquet files Key: HIVE-6463 URL: https://issues.apache.org/jira/browse/HIVE-6463 Project: Hive Issue Type: Test Reporter: Justin Coffey Assignee: Justin Coffey Unit test(s) for patch found in #HIVE-6456 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6456) Implement Parquet schema evolution
[ https://issues.apache.org/jira/browse/HIVE-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905459#comment-13905459 ] Justin Coffey commented on HIVE-6456: - done and linked. Implement Parquet schema evolution -- Key: HIVE-6456 URL: https://issues.apache.org/jira/browse/HIVE-6456 Project: Hive Issue Type: Improvement Reporter: Brock Noland Assignee: Brock Noland Priority: Trivial Attachments: HIVE-6456.patch In HIVE-5783 we removed schema evolution: https://github.com/Parquet/parquet-mr/pull/297/files#r9824155 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (HIVE-4413) Parse Exception : character '@' not supported while granting privileges to user in a Secure Cluster through hive client.
[ https://issues.apache.org/jira/browse/HIVE-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HIVE-4413. --- Resolution: Duplicate HIVE-3807 should resolve this (the specific need of @ in secure clusters) Parse Exception : character '@' not supported while granting privileges to user in a Secure Cluster through hive client. Key: HIVE-4413 URL: https://issues.apache.org/jira/browse/HIVE-4413 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.10.0 Reporter: Navin Madathil Labels: cli, hive While running through the Hive CLI, the hive grant command throws a ParseException: '@' not supported. But in a secure cluster (Kerberos) the username is appended with the realm name, separated by the character '@'. Without giving the full username, the permissions are not granted to the intended user. grant all on table tablename to user user@REALM -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6461) Run Release Audit tool, fix missing license issues
[ https://issues.apache.org/jira/browse/HIVE-6461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905472#comment-13905472 ] Hive QA commented on HIVE-6461: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629703/HIVE-6461.1.patch {color:green}SUCCESS:{color} +1 5133 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1409/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1409/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12629703 Run Release Audit tool, fix missing license issues -- Key: HIVE-6461 URL: https://issues.apache.org/jira/browse/HIVE-6461 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Priority: Trivial Attachments: HIVE-6461.1.patch run mvn apache-rat:check and add apache license in flagged files. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5275) HiveServer2 should respect hive.aux.jars.path property and add aux jars to distributed cache
[ https://issues.apache.org/jira/browse/HIVE-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905490#comment-13905490 ] Brock Noland commented on HIVE-5275: Is this true? I have not observed this myself. HiveServer2 should respect hive.aux.jars.path property and add aux jars to distributed cache Key: HIVE-5275 URL: https://issues.apache.org/jira/browse/HIVE-5275 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Alex Favaro HiveServer2 currently ignores the hive.aux.jars.path property in hive-site.xml. That means that the only way to use a custom SerDe is to add it to AUX_CLASSPATH on the server and manually distribute the jar to the cluster nodes. Hive CLI does this automatically when hive.aux.jars.path is set. It would be nice if HiveServer2 did the same. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6416: --- Status: Open (was: Patch Available) Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6416: --- Attachment: HIVE-6416.3.patch The update patch fixes the review comments and addresses the test failure. Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6416: --- Status: Patch Available (was: Open) Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6359) beeline -f fails on scripts with tabs in them.
[ https://issues.apache.org/jira/browse/HIVE-6359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905529#comment-13905529 ] Xuefu Zhang commented on HIVE-6359: --- [~navis] I just realized it's a simple change, but thanks for the review link. +1 beeline -f fails on scripts with tabs in them. -- Key: HIVE-6359 URL: https://issues.apache.org/jira/browse/HIVE-6359 Project: Hive Issue Type: Bug Reporter: Carter Shanklin Assignee: Navis Priority: Minor Attachments: HIVE-6359.1.patch.txt, HIVE-6359.2.patch.txt NO PRECOMMIT TESTS On a recent trunk build I used beeline -f on a script with tabs in it. Beeline rather unhelpfully attempts to perform tab expansion on the tabs and the query fails. Here's a screendump.
{code}
Connecting to jdbc:hive2://mymachine:1/mydb
Connected to: Apache Hive (version 0.13.0-SNAPSHOT)
Driver: Hive JDBC (version 0.13.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 0.13.0-SNAPSHOT by Apache Hive
0: jdbc:hive2://mymachine:1/mydb select i_brand_id as brand_id, i_brand as brand,
. . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n)
. . . . . . . . . . . . . . . . . . . . . . . ager_id=36
. . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n)
. . . . . . . . . . . . . . . . . . . . . . . d d_moy=12
. . . . . . . . . . . . . . . . . . . . . . . Display all 560 possibilities? (y or n)
. . . . . . . . . . . . . . . . . . . . . . . d d_year=2001
. . . . . . . . . . . . . . . . . . . . . . . and ss_sold_date between '2001-12-01' and '2001-12-31'
. . . . . . . . . . . . . . . . . . . . . . . group by i_brand, i_brand_id
. . . . . . . . . . . . . . . . . . . . . . . order by ext_price desc, brand_id
. . . . . . . . . . . . . . . . . . . . . . . limit 100 ;
Error: Error while compiling statement: FAILED: ParseException line 1:65 missing FROM at 'd_moy' near 'd' in from source (state=42000,code=4)
Closing: org.apache.hive.jdbc.HiveConnection
{code}
The same query works fine if I replace tabs with some spaces. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
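The workaround mentioned in the HIVE-6359 report (replacing tabs with spaces before running the script) can be sketched as a small pre-processing step. This is only an illustration of the workaround, not the change made in the attached patches; the class name TabStripper, the method replaceTabs and the choice of four spaces per tab are all assumptions made for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper, not part of Hive or Beeline. */
final class TabStripper {
    private TabStripper() {}

    /** Replaces every tab character in the script with the given number of spaces. */
    static String replaceTabs(String script, int spacesPerTab) {
        StringBuilder spaces = new StringBuilder();
        for (int i = 0; i < spacesPerTab; i++) {
            spaces.append(' ');
        }
        return script.replace("\t", spaces.toString());
    }

    /** Reads the input script, strips tabs, and writes the result to the output path. */
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: TabStripper <in.sql> <out.sql>");
            return;
        }
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        String script = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
        // Four spaces per tab is an arbitrary choice for this sketch.
        Files.write(out, replaceTabs(script, 4).getBytes(StandardCharsets.UTF_8));
    }
}
```

The cleaned file would then be passed to beeline -f in place of the original script.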
[jira] [Commented] (HIVE-6375) Fix CTAS for parquet
[ https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905581#comment-13905581 ] Hive QA commented on HIVE-6375: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629690/HIVE-6375.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5134 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ctas_hadoop20 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_ctas {noformat} Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1410/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1410/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12629690 Fix CTAS for parquet Key: HIVE-6375 URL: https://issues.apache.org/jira/browse/HIVE-6375 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Brock Noland Assignee: Szehon Ho Priority: Critical Labels: Parquet Attachments: HIVE-6375.patch More details here: https://github.com/Parquet/parquet-mr/issues/272 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5958) SQL std auth - authorize statements that work with paths
[ https://issues.apache.org/jira/browse/HIVE-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5958: Attachment: HIVE-5958.7.patch HIVE-5958.7.patch - fixes the locking timeout issue in TestPermsGrp.testCustomPerms, TestJdbcWithMiniHS2.testURIDatabaseName and TestHiveServer2.testConnection. These tests attempted to disable locking but were not doing it the right way; I have fixed that in the tests. When db got added as output in the create table command in this patch, the locking had an object to lock and tried to get kicked off. Other tests had failed because I didn't generate the patch with git diff -a, and some q.out files got treated as binary files. SQL std auth - authorize statements that work with paths Key: HIVE-5958 URL: https://issues.apache.org/jira/browse/HIVE-5958 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-5958.1.patch, HIVE-5958.2.patch, HIVE-5958.3.patch, HIVE-5958.4.patch, HIVE-5958.5.patch, HIVE-5958.6.patch, HIVE-5958.7.patch Original Estimate: 72h Remaining Estimate: 72h Statements such as create table and alter table that specify a path URI should be allowed under the new authorization scheme only if the URI (path) specified has permissions including read/write and ownership of the file/dir and its children. Also, fix issue of database not getting set as output for create-table. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried
Thejas M Nair created HIVE-6464: --- Summary: Test configuration: reduce the duration for which lock attempts are retried Key: HIVE-6464 URL: https://issues.apache.org/jira/browse/HIVE-6464 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Attachments: HIVE-6464.1.patch Lock attempts are being done for 60 seconds * 100 before it gives up. Most tests attempt to disable locking but sometimes don't do it correctly and changes can cause the locking to kick in. Locking fails, (at least in the HS2 related tests) because of problems in creating the zookeeper entries in test mode. When locking attempt kicks in and that fails, it can end up waiting for 6000 seconds before failing. As the tests are not trying to test parallel locking, there is no reason to wait this long in the tests. We should update hive-site.xml used by tests for smaller duration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
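The arithmetic behind the 6000-second figure in HIVE-6464 is simply the sleep between retries multiplied by the number of retries. A minimal sketch follows; the class and method names are hypothetical, and the smaller test-mode values (1 second, 5 retries) are illustrative assumptions only, not the values chosen in HIVE-6464.1.patch.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical calculator for the worst-case lock wait described in the report. */
final class LockRetryBudget {
    private LockRetryBudget() {}

    /** Worst-case wait when every attempt fails: sleep between retries times number of retries. */
    static long worstCaseWaitSeconds(long sleepBetweenRetriesSeconds, int numRetries) {
        return sleepBetweenRetriesSeconds * numRetries;
    }

    public static void main(String[] args) {
        // Values quoted in the report: 60 seconds * 100 retries.
        long current = worstCaseWaitSeconds(60, 100);
        System.out.println(current + " seconds (" + TimeUnit.SECONDS.toMinutes(current) + " minutes)");

        // Assumed smaller values for a test configuration; purely illustrative.
        long reduced = worstCaseWaitSeconds(1, 5);
        System.out.println(reduced + " seconds");
    }
}
```

With the quoted values a single failing lock attempt can stall a test for 100 minutes, which is why the report proposes a shorter duration in the hive-site.xml used by tests.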
[jira] [Assigned] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried
[ https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair reassigned HIVE-6464: --- Assignee: Thejas M Nair Test configuration: reduce the duration for which lock attempts are retried --- Key: HIVE-6464 URL: https://issues.apache.org/jira/browse/HIVE-6464 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6464.1.patch Lock attempts are being done for 60 seconds * 100 before it gives up. Most tests attempt to disable locking but sometimes don't do it correctly and changes can cause the locking to kick in. Locking fails, (at least in the HS2 related tests) because of problems in creating the zookeeper entries in test mode. When locking attempt kicks in and that fails, it can end up waiting for 6000 seconds before failing. As the tests are not trying to test parallel locking, there is no reason to wait this long in the tests. We should update hive-site.xml used by tests for smaller duration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6464) Test configuration: reduce the duration for which lock attempts are retried
[ https://issues.apache.org/jira/browse/HIVE-6464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6464: Attachment: HIVE-6464.1.patch Test configuration: reduce the duration for which lock attempts are retried --- Key: HIVE-6464 URL: https://issues.apache.org/jira/browse/HIVE-6464 Project: Hive Issue Type: Bug Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6464.1.patch Lock attempts are being done for 60 seconds * 100 before it gives up. Most tests attempt to disable locking but sometimes don't do it correctly and changes can cause the locking to kick in. Locking fails, (at least in the HS2 related tests) because of problems in creating the zookeeper entries in test mode. When locking attempt kicks in and that fails, it can end up waiting for 6000 seconds before failing. As the tests are not trying to test parallel locking, there is no reason to wait this long in the tests. We should update hive-site.xml used by tests for smaller duration. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18250/#review34869 ---

ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java https://reviews.apache.org/r/18250/#comment65253

We need to pass the roleNames argument to this function and check that the user has admin option on these roles. For example, the role in grant-role could be role A while the current role is role B. The check is happening now on role B only. What should we do if a user is a member with admin option of role Y because it belongs to role X and role X has admin option on Y? Should we check that X is in the current role in that case? I guess so; that will make it consistent with the rest of the current role behavior.

ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java https://reviews.apache.org/r/18250/#comment65252

ADMIN_ONLY_MSG is not the right message with this change. For the grant/revoke roles statements, we should change it to: ADMIN_ONLY_MSG + HAS_ADMIN_PRIV_MSG

- Thejas Nair

On Feb. 19, 2014, 3:31 a.m., Ashutosh Chauhan wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18250/ --- (Updated Feb. 19, 2014, 3:31 a.m.)
Review request for hive.
Bugs: HIVE-6433 https://issues.apache.org/jira/browse/HIVE-6433
Repository: hive-git
Description --- SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
Diffs -
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java c1afaee
ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION
ql/src/test/results/clientpositive/authorization_role_grant2.q.out PRE-CREATION
Diff: https://reviews.apache.org/r/18250/diff/
Testing --- Added new test
Thanks, Ashutosh Chauhan
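The first review comment on request 18250 asks for the check to run against the roles named in the grant/revoke statement rather than against the current role only. A rough sketch of that idea is below. It is not the code in SQLStdHiveAccessController: the class AdminOptionCheck, the method rolesMissingAdminOption and the plain string sets are assumptions made for illustration, and the open question about admin option inherited through another role is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of checking ADMIN OPTION per requested role. */
final class AdminOptionCheck {
    private AdminOptionCheck() {}

    /**
     * Returns the roles named in the statement on which the user lacks ADMIN OPTION.
     * An empty result means the grant/revoke may proceed.
     */
    static List<String> rolesMissingAdminOption(Collection<String> requestedRoles,
                                                Set<String> rolesWithAdminOption) {
        List<String> missing = new ArrayList<String>();
        for (String role : requestedRoles) {
            if (!rolesWithAdminOption.contains(role)) {
                missing.add(role);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The scenario from the review: the statement names role A, the user has admin option on B only.
        Set<String> adminOn = new HashSet<String>(Arrays.asList("B"));
        List<String> missing = rolesMissingAdminOption(Arrays.asList("A"), adminOn);
        System.out.println(missing.isEmpty()
                ? "grant allowed"
                : "grant denied, no ADMIN OPTION on: " + missing);
    }
}
```

Returning the list of offending roles, rather than a boolean, would also make it easier to build a more specific error message than ADMIN_ONLY_MSG.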
[jira] [Commented] (HIVE-5926) Load Data OverWrite Into Table Throw org.apache.hadoop.hive.ql.metadata.HiveException
[ https://issues.apache.org/jira/browse/HIVE-5926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905683#comment-13905683 ] Xuefu Zhang commented on HIVE-5926: --- [~tianyi] I'm not sure if you're still working on this issue, but would you like to move this forward? Thanks. Load Data OverWrite Into Table Throw org.apache.hadoop.hive.ql.metadata.HiveException - Key: HIVE-5926 URL: https://issues.apache.org/jira/browse/HIVE-5926 Project: Hive Issue Type: Bug Components: Database/Schema Affects Versions: 0.12.0 Environment: OS: Red Hat Enterprise Linux Server release 6.2 HDFS: CDH-4.2.1 MAPRED: CDH-4.2.1-mr1 Reporter: Yi Tian Assignee: Yi Tian Attachments: HIVE-5926.patch
step1: create table
step2: load data
load data inpath '/tianyi/usys_etl_map_total.del' overwrite into table tianyi_test3
step3: copy file back
hadoop fs -cp /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del /tianyi
step4: load data again
load data inpath '/tianyi/usys_etl_map_total.del' overwrite into table tianyi_test3
here we can see the error in console:
Failed with exception Error moving: hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
we can find error detail in hive.log:
2013-12-03 17:26:41,717 ERROR exec.Task (SessionState.java:printError(419)) - Failed with exception Error moving: hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
org.apache.hadoop.hive.ql.metadata.HiveException: Error moving: hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
 at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2323)
 at org.apache.hadoop.hive.ql.metadata.Table.replaceFiles(Table.java:639)
 at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1441)
 at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:283)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
 at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:675)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.io.IOException: Error moving: hdfs://ocdccluster/tianyi/usys_etl_map_total.del into: /user/hive/warehouse/tianyi_test3/usys_etl_map_total.del
 at org.apache.hadoop.hive.ql.metadata.Hive.replaceFiles(Hive.java:2317)
 ... 20 more
2013-12-03 17:26:41,718 ERROR ql.Driver (SessionState.java:printError(419)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-4501) HS2 memory leak - FileSystem objects in FileSystem.CACHE
[ https://issues.apache.org/jira/browse/HIVE-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905687#comment-13905687 ] Abin Shahab commented on HIVE-4501: --- What is the progress on this issue? HS2 memory leak - FileSystem objects in FileSystem.CACHE Key: HIVE-4501 URL: https://issues.apache.org/jira/browse/HIVE-4501 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-4501.1.patch, HIVE-4501.1.patch, HIVE-4501.1.patch, HIVE-4501.trunk.patch org.apache.hadoop.fs.FileSystem objects are getting accumulated in FileSystem.CACHE, with HS2 in unsecure mode. As a workaround, it is possible to set fs.hdfs.impl.disable.cache and fs.file.impl.disable.cache to true. Users should not have to bother with this extra configuration. As a workaround disable impersonation by setting hive.server2.enable.doAs to false. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6405) Support append feature for HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6405: --- Status: Open (was: Patch Available) Support append feature for HCatalog --- Key: HIVE-6405 URL: https://issues.apache.org/jira/browse/HIVE-6405 Project: Hive Issue Type: Bug Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6405.patch HCatalog currently treats all tables as immutable - i.e. all tables and partitions can be written to only once, and not appended. The nuances of what this means are as follows: * A non-partitioned table can be written to, and data in it is never updated from then on unless you drop and recreate. * A partitioned table may support appending of a sort by adding new partitions to the table, but once written, the partitions themselves cannot have any new data added to them. Hive, on the other hand, does allow us to INSERT INTO a table, thus allowing us append semantics. There is benefit to both of these models, and so, our goal is as follows: a) Introduce a notion of an immutable table, wherein all tables are not immutable by default, and have this be a table property. If this property is set for a table, and we attempt to write to a table that already has data (or a partition), disallow INSERT INTO it from hive. This property being set will allow hive to mimic HCatalog's current immutable-table property. (I'm going to create a separate sub-task to cover this bit, and focus on the HCatalog-side on this jira) b) As long as that flag is not set, HCatalog should be changed to allow appends into it as well, and not simply error out if data already exists in a table. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
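The goal described in HIVE-6405 boils down to a two-input decision: a write is refused only when the table is flagged immutable and the target table or partition already holds data. The sketch below illustrates that rule only. It is not taken from HIVE-6405.patch, and the names AppendPolicy and writeAllowed are hypothetical.

```java
/** Hypothetical illustration of the append rule proposed in the issue description. */
final class AppendPolicy {
    private AppendPolicy() {}

    /**
     * A write is rejected only when the table is marked immutable AND the target
     * (table or partition) already contains data; every other combination is allowed.
     */
    static boolean writeAllowed(boolean tableIsImmutable, boolean targetHasData) {
        return !(tableIsImmutable && targetHasData);
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        for (boolean immutable : flags) {
            for (boolean hasData : flags) {
                System.out.println("immutable=" + immutable
                        + " hasData=" + hasData
                        + " -> allowed=" + writeAllowed(immutable, hasData));
            }
        }
    }
}
```

Under this reading, the first write into an empty immutable table still succeeds, which matches the write-once behaviour HCatalog has today, while tables without the flag accept appends.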
[jira] [Updated] (HIVE-6405) Support append feature for HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6405: --- Status: Patch Available (was: Open) Support append feature for HCatalog --- Key: HIVE-6405 URL: https://issues.apache.org/jira/browse/HIVE-6405 Project: Hive Issue Type: Bug Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6405.patch HCatalog currently treats all tables as immutable - i.e. all tables and partitions can be written to only once, and not appended. The nuances of what this means are as follows: * A non-partitioned table can be written to, and data in it is never updated from then on unless you drop and recreate. * A partitioned table may support appending of a sort by adding new partitions to the table, but once written, the partitions themselves cannot have any new data added to them. Hive, on the other hand, does allow us to INSERT INTO a table, thus allowing us append semantics. There is benefit to both of these models, and so, our goal is as follows: a) Introduce a notion of an immutable table, wherein all tables are not immutable by default, and have this be a table property. If this property is set for a table, and we attempt to write to a table that already has data (or a partition), disallow INSERT INTO it from hive. This property being set will allow hive to mimic HCatalog's current immutable-table property. (I'm going to create a separate sub-task to cover this bit, and focus on the HCatalog-side on this jira) b) As long as that flag is not set, HCatalog should be changed to allow appends into it as well, and not simply error out if data already exists in a table. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6405) Support append feature for HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6405: --- Attachment: HIVE-6405.patch Attaching patch; this depends on HIVE-6406 being patched in. Support append feature for HCatalog --- Key: HIVE-6405 URL: https://issues.apache.org/jira/browse/HIVE-6405 Project: Hive Issue Type: Bug Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6405.patch HCatalog currently treats all tables as immutable, i.e. all tables and partitions can be written to only once and cannot be appended to. The nuances of what this means are as follows: * A non-partitioned table can be written to, and the data in it is never updated from then on unless you drop and recreate the table. * A partitioned table supports appending of a sort, by adding new partitions to the table, but once written, the partitions themselves cannot have any new data added to them. Hive, on the other hand, does allow us to INSERT INTO a table, thus giving us append semantics. There is benefit to both of these models, and so our goal is as follows: a) Introduce a notion of an immutable table as a table property, with tables not immutable by default. If this property is set for a table, and we attempt to write to a table (or a partition) that already has data, disallow INSERT INTO into it from Hive. Setting this property will allow Hive to mimic HCatalog's current immutable-table behaviour. (I'm going to create a separate sub-task to cover this bit, and focus on the HCatalog side in this jira.) b) As long as that flag is not set, HCatalog should be changed to allow appends as well, and not simply error out if data already exists in a table. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6405) Support append feature for HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6405: --- Release Note: Introduces append feature for HCatalog writes. Previously, if an unpartitioned table had data in it, or if a partition in a partitioned table had data in it, or if the partition even existed, HCat would fail if a user attempted to write to them. Now, that strict behaviour applies only if the table in question has a parameter immutable set to true (see HIVE-6406). With this patch, we can append to existing partitions or non-partitioned tables that already have data in them, as long as the new data being written is compatible with the old data (i.e. one cannot mix file formats when attempting an append). As a further note, append is currently not compatible with dynamic partitioning, and a dynamic partitioning job is still unable to append to a table, even if it is a mutable table. Status: Patch Available (was: Open) Support append feature for HCatalog --- Key: HIVE-6405 URL: https://issues.apache.org/jira/browse/HIVE-6405 Project: Hive Issue Type: Bug Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6405.patch HCatalog currently treats all tables as immutable, i.e. all tables and partitions can be written to only once and cannot be appended to. The nuances of what this means are as follows: * A non-partitioned table can be written to, and the data in it is never updated from then on unless you drop and recreate the table. * A partitioned table supports appending of a sort, by adding new partitions to the table, but once written, the partitions themselves cannot have any new data added to them. Hive, on the other hand, does allow us to INSERT INTO a table, thus giving us append semantics.
There is benefit to both of these models, and so our goal is as follows: a) Introduce a notion of an immutable table as a table property, with tables not immutable by default. If this property is set for a table, and we attempt to write to a table (or a partition) that already has data, disallow INSERT INTO into it from Hive. Setting this property will allow Hive to mimic HCatalog's current immutable-table behaviour. (I'm going to create a separate sub-task to cover this bit, and focus on the HCatalog side in this jira.) b) As long as that flag is not set, HCatalog should be changed to allow appends as well, and not simply error out if data already exists in a table. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
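The write rule in the HIVE-6405 release note above can be sketched as a small decision function. This is only an illustration under stated assumptions, not HCatalog's actual code: the class AppendCheckSketch, the Outcome enum, and the checkWrite method are hypothetical names, and the table's immutable parameter is modeled as a plain boolean.

```java
// Hypothetical sketch of the append rule from the release note; not HCatalog code.
final class AppendCheckSketch {
    /** Possible results of a write attempt (names are illustrative only). */
    enum Outcome { WRITE_NEW, APPEND, REJECT }

    /**
     * @param immutable           table has the parameter immutable set to true
     * @param targetExists        the table already has data, or the partition exists
     * @param dynamicPartitioning the job uses dynamic partitioning
     * @param sameFileFormat      new data uses the same file format as the old data
     */
    static Outcome checkWrite(boolean immutable, boolean targetExists,
                              boolean dynamicPartitioning, boolean sameFileFormat) {
        if (!targetExists) {
            return Outcome.WRITE_NEW;   // first write is always allowed
        }
        if (immutable) {
            return Outcome.REJECT;      // strict, pre-patch behaviour
        }
        if (dynamicPartitioning) {
            return Outcome.REJECT;      // append is not supported with dynamic partitioning
        }
        if (!sameFileFormat) {
            return Outcome.REJECT;      // file formats cannot be mixed in an append
        }
        return Outcome.APPEND;
    }

    public static void main(String[] args) {
        // Mutable table, existing data, static partitioning, same format.
        System.out.println(checkWrite(false, true, false, true));
    }
}
```

The order of the checks is a design choice of this sketch; the release note only states which combinations fail, not the order in which they are tested.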
[jira] [Commented] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905701#comment-13905701 ] Hive QA commented on HIVE-6345: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629743/HIVE-6345.5.patch {color:green}SUCCESS:{color} +1 5147 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1411/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1411/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12629743 Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6345: --- Resolution: Fixed Release Note: Committed to trunk. Thanks to Remus!! Status: Resolved (was: Patch Available) Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6345: --- Fix Version/s: 0.13.0 Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905750#comment-13905750 ] Jitendra Nath Pandey commented on HIVE-6345: Committed to trunk. Thanks to Remus!! Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6345) Add DECIMAL support to vectorized JOIN operators
[ https://issues.apache.org/jira/browse/HIVE-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6345: --- Release Note: (was: Committed to trunk. Thanks to Remus!!) Add DECIMAL support to vectorized JOIN operators Key: HIVE-6345 URL: https://issues.apache.org/jira/browse/HIVE-6345 Project: Hive Issue Type: Sub-task Components: Query Processor Reporter: Remus Rusanu Assignee: Remus Rusanu Labels: vectorization Fix For: 0.13.0 Attachments: HIVE-6345.2.patch, HIVE-6345.3.patch, HIVE-6345.3.patch, HIVE-6345.4.patch, HIVE-6345.5.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18254: HIVE-6375 Implement CTAS and column rename for parquet
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18254/#review34879 --- ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java https://reviews.apache.org/r/18254/#comment65260 This line seems to be a dupe of line 5665. - Xuefu Zhang On Feb. 19, 2014, 12:42 a.m., Szehon Ho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18254/ --- (Updated Feb. 19, 2014, 12:42 a.m.) Review request for hive. Bugs: HIVE-6375 https://issues.apache.org/jira/browse/HIVE-6375 Repository: hive-git Description --- There is a Hive bug in SemanticAnalyzer that chooses different names for columns in the CreateTable task and the FileSink task. columnInfo.getInternalName() was used in one place, while fieldSchema still used columnInfo.getAlias() if it was available. This change makes both consistent, favoring columnInfo.getAlias() if it is available. This was not revealed before because other file formats like RcFile seem to use column-ordinal position, and an Avro file stores the schema separately altogether. Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION ql/src/test/results/clientpositive/ctas.q.out 9668855 ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18254/diff/ Testing --- Added parquet_ctas.q. Covers cases where the column name is taken directly from the input table (implied alias), where the name is auto-generated, where the name is specified as an alias, and a mix of the three. Thanks, Szehon Ho
[jira] [Commented] (HIVE-6375) Fix CTAS for parquet
[ https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905769#comment-13905769 ] Xuefu Zhang commented on HIVE-6375: --- Patch looks good. Minor comment on RB. The test diff needs to be fixed. Fix CTAS for parquet Key: HIVE-6375 URL: https://issues.apache.org/jira/browse/HIVE-6375 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Brock Noland Assignee: Szehon Ho Priority: Critical Labels: Parquet Attachments: HIVE-6375.patch More details here: https://github.com/Parquet/parquet-mr/issues/272 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
On Feb. 19, 2014, 4:31 p.m., Thejas Nair wrote: ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java, line 278 https://reviews.apache.org/r/18250/diff/2/?file=497456#file497456line278 We need to pass the roleNames argument to this function and check that the user has admin option on these roles. For example, the role in grant-role could be role A while the current role is role B. The check is happening now on role B only. What should we do if a user is a member with admin option of role Y because it belongs to role X, and role X has admin option on Y? Should we check that X is in the current role in that case? I guess so; that will make it consistent with the rest of the current role behavior. Let's say user X has an admin option on role A. User X now wants to grant role A to user B. IMO, user X's current role should be A. He shouldn't be allowed to grant role A to user B if his current role is C. Currently, that is what is implemented. It seems you are suggesting that user X should be allowed to grant role A to user B even if his current role is C. To me, this seems counterintuitive. Not sure what the standard says here. - Ashutosh --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18250/#review34869 --- On Feb. 19, 2014, 3:31 a.m., Ashutosh Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18250/ --- (Updated Feb. 19, 2014, 3:31 a.m.) Review request for hive.
Bugs: HIVE-6433 https://issues.apache.org/jira/browse/HIVE-6433 Repository: hive-git Description --- SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Diffs - ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java c1afaee ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION ql/src/test/results/clientpositive/authorization_role_grant2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18250/diff/ Testing --- Added new test Thanks, Ashutosh Chauhan
[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6416: --- Status: Open (was: Patch Available) Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf
[ https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905808#comment-13905808 ] Hive QA commented on HIVE-6459: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629653/HIVE-6459.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1415/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1415/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-1415/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'shims/aggregator/pom.xml' Reverted 'packaging/src/main/assembly/bin.xml' Reverted 'conf/hive-default.xml.template' Reverted 'bin/hive' Reverted 'common/src/java/org/apache/hadoop/hive/conf/HiveConf.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java' Reverted 'ql/pom.xml' ++ awk '{print $2}' ++ egrep -v '^X|^Performing status on external' ++ svn status --no-ignore + rm -rf target datanucleus.log ant/target shims/target shims/0.20/target shims/0.20S/target shims/0.23/target shims/aggregator/target shims/common/target shims/common-secure/target packaging/target hbase-handler/target testutils/target jdbc/target metastore/target itests/target itests/hcatalog-unit/target itests/test-serde/target itests/qtest/target itests/hive-unit/target itests/custom-serde/target itests/util/target hcatalog/target hcatalog/storage-handlers/hbase/target hcatalog/server-extensions/target hcatalog/core/target hcatalog/webhcat/svr/target hcatalog/webhcat/java-client/target hcatalog/hcatalog-pig-adapter/target hwi/target common/target common/src/gen service/target contrib/target serde/target beeline/target odbc/target cli/target ql/dependency-reduced-pom.xml ql/target ql/src/java/org/apache/hadoop/hive/ql/exec/HiveAuxClasspathBuilder.java ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JarCache.java + svn update U common/src/java/org/apache/hadoop/hive/common/type/UnsignedInt128.java U common/src/java/org/apache/hadoop/hive/common/type/Decimal128.java A common/src/java/org/apache/hive/common/util/Decimal128FastBuffer.java A serde/src/test/org/apache/hadoop/hive/serde2/io/TestHiveDecimalWritable.java U serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java U ql/src/gen/vectorization/UDAFTemplates/VectorUDAFSum.txt A ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVarDecimal.txt A ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxDecimal.txt
U ql/src/gen/vectorization/UDAFTemplates/VectorUDAFVar.txt U ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMax.txt U ql/src/gen/vectorization/UDAFTemplates/VectorUDAFAvg.txt U ql/src/gen/vectorization/UDAFTemplates/VectorUDAFMinMaxString.txt A ql/src/test/queries/clientpositive/vector_decimal_mapjoin.q A ql/src/test/queries/clientpositive/vector_decimal_aggregate.q A ql/src/test/results/clientpositive/vector_decimal_mapjoin.q.out A ql/src/test/results/clientpositive/vector_decimal_aggregate.q.out U ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorGroupByOperator.java U ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorRowBatchFromObjectIterables.java U ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExpressionDescriptor.java U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java U ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapper.java A ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorUDAFSumDecimal.java A
[jira] [Updated] (HIVE-6330) Metastore support for permanent UDFs
[ https://issues.apache.org/jira/browse/HIVE-6330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-6330: - Attachment: HIVE-6330.8.patch resubmitting patch to run unit tests Metastore support for permanent UDFs Key: HIVE-6330 URL: https://issues.apache.org/jira/browse/HIVE-6330 Project: Hive Issue Type: Sub-task Components: UDF Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-6330.1.patch, HIVE-6330.2.patch, HIVE-6330.3.patch, HIVE-6330.4.patch, HIVE-6330.5.patch, HIVE-6330.6.patch, HIVE-6330.7.patch, HIVE-6330.8.patch Allow CREATE FUNCTION to add metastore entry for the created function, so that it only needs to be added to Hive once. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6416: --- Attachment: HIVE-6416.4.patch Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch, HIVE-6416.4.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905862#comment-13905862 ] Jitendra Nath Pandey commented on HIVE-6416: Patch re-based against the latest trunk. Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch, HIVE-6416.4.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6416) Vectorized mathematical functions for decimal type.
[ https://issues.apache.org/jira/browse/HIVE-6416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HIVE-6416: --- Status: Patch Available (was: Open) Vectorized mathematical functions for decimal type. --- Key: HIVE-6416 URL: https://issues.apache.org/jira/browse/HIVE-6416 Project: Hive Issue Type: Sub-task Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HIVE-6416.1.patch, HIVE-6416.2.patch, HIVE-6416.3.patch, HIVE-6416.4.patch Vectorized mathematical functions for decimal type. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5994) ORC RLEv2 encodes wrongly for large negative BIGINTs (64 bits )
[ https://issues.apache.org/jira/browse/HIVE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905871#comment-13905871 ] Prasanth J commented on HIVE-5994: -- Puneeth, This issue can happen with large positive values as well. The reason is that when the number of repetitions of a large number is between 3 and 10, SHORT_REPEAT encoding is used. https://github.com/apache/hive/blob/branch-0.12/ql/src/java/org/apache/hadoop/hive/ql/io/orc/RunLengthIntegerWriterV2.java#L35 This encoding zigzag encodes the repeating value. So in your case, when 470327563395383L is zigzag encoded, the MSB (64th bit) is set, and the value is then treated as negative because of this bug. I tested your test case with trunk and it works fine. Applying the patch attached to this JIRA should also work. ORC RLEv2 encodes wrongly for large negative BIGINTs (64 bits ) Key: HIVE-5994 URL: https://issues.apache.org/jira/browse/HIVE-5994 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-5994.1.patch For large negative BIGINTs, zigzag encoding will yield a large value (64-bit value) with the MSB set to 1. This value is interpreted as a negative value in the SerializationUtils.findClosestNumBits(long value) function. This resulted in a wrong computation of the total number of bits required, which results in wrong encoding/decoding of values. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
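The class of bug described in HIVE-5994 can be reproduced in a few lines. The sketch below is an illustration only and is not the code in SerializationUtils: ZigZagSketch and its methods are hypothetical names, and bitsNeededSignedNaive is a made-up stand-in for any bit-width computation that treats the zigzag result as a signed long.

```java
// Hypothetical illustration of zigzag encoding and a signed bit-width bug; not Hive code.
final class ZigZagSketch {
    /** Zigzag-encode a signed 64-bit value: 0, -1, 1, -2, ... map to 0, 1, 2, 3, ... */
    static long zigzagEncode(long v) {
        return (v << 1) ^ (v >> 63);
    }

    /** Inverse of zigzagEncode. */
    static long zigzagDecode(long z) {
        return (z >>> 1) ^ -(z & 1);
    }

    /** Bit width of z when z is treated as an unsigned 64-bit value. */
    static int bitsNeededUnsigned(long z) {
        return z == 0 ? 1 : 64 - Long.numberOfLeadingZeros(z);
    }

    /** Naive bit width that treats z as signed: wrong once the MSB is set. */
    static int bitsNeededSignedNaive(long z) {
        int bits = 0;
        while (z > 0) {        // a value with the MSB set is negative, so the loop never runs
            bits++;
            z >>= 1;
        }
        return Math.max(bits, 1);
    }

    public static void main(String[] args) {
        long encoded = zigzagEncode(Long.MAX_VALUE);          // MSB is set for large positive input
        System.out.println(bitsNeededUnsigned(encoded));      // unsigned view of the width
        System.out.println(bitsNeededSignedNaive(encoded));   // signed view underestimates it
    }
}
```

Because zigzag shifts the magnitude left by one bit, any input whose magnitude reaches 2^62 produces an encoded value with bit 64 set, which is why large positive values can be affected as well as large negative ones.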
[jira] [Updated] (HIVE-6344) Add DECIMAL support to vectorized group by operator
[ https://issues.apache.org/jira/browse/HIVE-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Remus Rusanu updated HIVE-6344: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) The patch for HIVE-6345 contains the fix for HIVE-6344 too, as much of the code was common. Add DECIMAL support to vectorized group by operator --- Key: HIVE-6344 URL: https://issues.apache.org/jira/browse/HIVE-6344 Project: Hive Issue Type: Sub-task Reporter: Remus Rusanu Assignee: Remus Rusanu Fix For: 0.13.0 Attachments: HIVE-6344.1.patch -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6455) Scalable dynamic partitioning and bucketing optimization
[ https://issues.apache.org/jira/browse/HIVE-6455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6455: - Attachment: HIVE-6455.4.patch Reuploading patch as HIVE QA did not run yesterday. Scalable dynamic partitioning and bucketing optimization Key: HIVE-6455 URL: https://issues.apache.org/jira/browse/HIVE-6455 Project: Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: optimization Attachments: HIVE-6455.1.patch, HIVE-6455.1.patch, HIVE-6455.2.patch, HIVE-6455.3.patch, HIVE-6455.4.patch, HIVE-6455.4.patch The current implementation of dynamic partitioning works by keeping at least one record writer open per dynamic partition directory. In the case of bucketing there can be multispray file writers, which further adds to the number of open record writers. The record writers of column-oriented file formats (like ORC, RCFile etc.) keep some sort of in-memory buffers (value buffers or compression buffers) open all the time to buffer up the rows and compress them before flushing them to disk. Since these buffers are maintained on a per-column basis, the amount of constant memory required at runtime increases as the number of partitions and the number of columns per partition increase. This often leads to OutOfMemory (OOM) exceptions in mappers or reducers, depending on the number of open record writers. Users often tune the JVM heap size (runtime memory) to get over such OOM issues. With this optimization, the dynamic partition columns and bucketing columns (in the case of bucketed tables) are sorted before being fed to the reducers. Since the partitioning and bucketing columns are sorted, each reducer can keep only one record writer open at any time, thereby reducing the memory pressure on the reducers. This optimization scales well as the number of partitions and the number of columns per partition increase, at the cost of sorting the columns.
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
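The effect of sorting on the number of open record writers, as described for HIVE-6455, can be sketched with a small simulation. It is an illustration under assumptions, not Hive's implementation: SortedPartitionWriterSketch and both methods are hypothetical names, and a "writer" is modeled as nothing more than an entry in a counter.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simulation of open-writer counts; not Hive code.
final class SortedPartitionWriterSketch {
    /** Peak open writers when one writer per partition stays open until the end. */
    static int peakWritersUnsorted(List<String> partitionKeys) {
        Set<String> open = new HashSet<>();
        for (String key : partitionKeys) {
            open.add(key);               // writer for this partition stays open
        }
        return open.size();
    }

    /** Peak open writers when keys arrive sorted and the previous writer is closed on a key change. */
    static int peakWritersSorted(List<String> sortedKeys) {
        String current = null;
        int open = 0;
        int peak = 0;
        for (String key : sortedKeys) {
            if (!key.equals(current)) {
                if (current != null) {
                    open--;              // close the writer of the finished partition
                }
                current = key;
                open++;                  // open a writer for the new partition
                peak = Math.max(peak, open);
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("ds=1", "ds=1", "ds=2", "ds=3", "ds=3");
        System.out.println(peakWritersUnsorted(keys));
        System.out.println(peakWritersSorted(keys));
    }
}
```

With sorted input the peak stays at one writer regardless of how many partitions there are, which is the memory saving the description refers to; the sort itself is the price paid for it.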
[jira] [Commented] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf
[ https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905905#comment-13905905 ] Remus Rusanu commented on HIVE-6459: HIVE-6345 just got in, which adds the decimal support for vectorized aggregates, including AVG. It is probably going to conflict with your patch, as vectorized AVG must match the intermediate sum (p,s). If necessary, I will look at your patch tomorrow (I'm on UTC+2) and see how it needs to consider the vectorized aggregate code (it should be a minor change). Change the precison/scale for intermediate sum result in the avg() udf --- Key: HIVE-6459 URL: https://issues.apache.org/jira/browse/HIVE-6459 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6459.patch The avg() udf, when applied to a decimal column, selects the precision/scale of the intermediate sum field as (p+4, s+4), which is the same as the precision/scale of the avg() result. However, the additional scale increase is unnecessary, and data overflow may occur. The requested change is that, for the intermediate sum result, the precision/scale is set to (p+10, s), which is consistent with the sum() udf. The avg() result still keeps its precision/scale. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf
[ https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905907#comment-13905907 ] Remus Rusanu commented on HIVE-6459: Thanks for fixing this, btw. Change the precison/scale for intermediate sum result in the avg() udf --- Key: HIVE-6459 URL: https://issues.apache.org/jira/browse/HIVE-6459 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6459.patch The avg() udf, when applied to a decimal column, selects the precision/scale of the intermediate sum field as (p+4, s+4), which is the same as the precision/scale of the avg() result. However, the additional scale increase is unnecessary, and data overflow may occur. The requested change is that, for the intermediate sum result, the precision/scale is set to (p+10, s), which is consistent with the sum() udf. The avg() result still keeps its precision/scale. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6459) Change the precison/scale for intermediate sum result in the avg() udf
[ https://issues.apache.org/jira/browse/HIVE-6459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-6459: -- Attachment: HIVE-6459.1.patch Patch #1 is rebased on the latest trunk. Change the precison/scale for intermediate sum result in the avg() udf --- Key: HIVE-6459 URL: https://issues.apache.org/jira/browse/HIVE-6459 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.13.0 Reporter: Xuefu Zhang Assignee: Xuefu Zhang Attachments: HIVE-6459.1.patch, HIVE-6459.patch The avg() udf, when applied to a decimal column, selects the precision/scale of the intermediate sum field as (p+4, s+4), which is the same as the precision/scale of the avg() result. However, the additional scale increase is unnecessary, and data overflow may occur. The requested change is that, for the intermediate sum result, the precision/scale is set to (p+10, s), which is consistent with the sum() udf. The avg() result still keeps its precision/scale. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
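The overflow concern in the HIVE-6459 description comes down to how many integer digits (precision minus scale) the intermediate sum has. The arithmetic can be sketched as below. This is a worked example only: AvgSumTypeSketch and its methods are hypothetical names, and the sketch ignores Hive's upper bound on decimal precision.

```java
// Hypothetical worked example of the (p, s) arithmetic; not Hive code.
final class AvgSumTypeSketch {
    /** Intermediate sum type before the change: (p+4, s+4). Returns {precision, scale}. */
    static int[] oldSumType(int p, int s) {
        return new int[] { p + 4, s + 4 };
    }

    /** Intermediate sum type requested by the change: (p+10, s). Returns {precision, scale}. */
    static int[] newSumType(int p, int s) {
        return new int[] { p + 10, s };
    }

    /** Digits available to the left of the decimal point. */
    static int integerDigits(int[] type) {
        return type[0] - type[1];
    }

    public static void main(String[] args) {
        int p = 10, s = 2;                                        // input column decimal(10,2)
        System.out.println(integerDigits(oldSumType(p, s)));      // headroom under (p+4, s+4)
        System.out.println(integerDigits(newSumType(p, s)));      // headroom under (p+10, s)
    }
}
```

For a decimal(10,2) column the old rule yields (14,6), which has the same 8 integer digits as the input, so summing even a few large values can overflow. The new rule yields (20,2), leaving 18 integer digits, that is, ten more digits of headroom for the running sum.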
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905932#comment-13905932 ] Lefty Leverenz commented on HIVE-5317: -- Off topic: This ticket has 100 watchers. Is that a record? Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the forms of the queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (e.g. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day, a small set (up to 100k rows) of records needs to be deleted for regulatory compliance. * Once an hour, a log of transactions is exported from an RDBMS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905960#comment-13905960 ] Brock Noland commented on HIVE-860: --- Hmm. I will take a look. Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch DistributedCache is shared across multiple jobs, if the hdfs file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve 2 different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in distributed cache. A2 has a bigger benefit in sharing but may raise a question on when Hive should clean it up in hdfs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement
[ https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6422: Status: Patch Available (was: Open) SQL std auth - revert change for view keyword in grant statement Key: HIVE-6422 URL: https://issues.apache.org/jira/browse/HIVE-6422 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6422.1.patch SQL standard does not support view keyword in grant statement. HIVE-6181 which was added as part of sql standard changes, needs to be reverted. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement
[ https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6422: Attachment: HIVE-6422.1.patch SQL std auth - revert change for view keyword in grant statement Key: HIVE-6422 URL: https://issues.apache.org/jira/browse/HIVE-6422 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6422.1.patch SQL standard does not support view keyword in grant statement. HIVE-6181 which was added as part of sql standard changes, needs to be reverted. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13905992#comment-13905992 ] Dmitriy V. Ryaboy commented on HIVE-860: [~brocknoland] please note my and Aniket's comments in PIG-2672 -- the real solution for this is YARN-1492 (which doesn't help existing hadoop installations, granted.. but you should plan on using that for future hadoop versions, as it will be more core, have better sharing across multiple users and tools, etc). Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch DistributedCache is shared across multiple jobs, if the hdfs file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve 2 different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in distributed cache. A2 has a bigger benefit in sharing but may raise a question on when Hive should clean it up in hdfs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6037) Synchronize HiveConf with hive-default.xml.template and support show conf
[ https://issues.apache.org/jira/browse/HIVE-6037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906031#comment-13906031 ] Ravi Prakash commented on HIVE-6037: I'm new to this so apologies in advance if I didn't do something right. Reverting the commit has not helped. I still see the same error as in https://issues.apache.org/jira/browse/HIVE-6037?focusedCommentId=13904349page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13904349 . I checked out the commit before this (5893677435f165bee81d1c5be4300321f9bf47fb) and it built fine. Synchronize HiveConf with hive-default.xml.template and support show conf - Key: HIVE-6037 URL: https://issues.apache.org/jira/browse/HIVE-6037 Project: Hive Issue Type: Improvement Components: Configuration Reporter: Navis Assignee: Navis Priority: Minor Fix For: 0.13.0 Attachments: CHIVE-6037.3.patch.txt, HIVE-6037.1.patch.txt, HIVE-6037.10.patch.txt, HIVE-6037.11.patch.txt, HIVE-6037.12.patch.txt, HIVE-6037.2.patch.txt, HIVE-6037.4.patch.txt, HIVE-6037.5.patch.txt, HIVE-6037.6.patch.txt, HIVE-6037.7.patch.txt, HIVE-6037.8.patch.txt, HIVE-6037.9.patch.txt, HIVE-6037.patch see HIVE-5879 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906032#comment-13906032 ] Brock Noland commented on HIVE-860: --- Thank you very much for bringing that up! Yes, I noted that earlier as well and should have mentioned it here. Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch DistributedCache is shared across multiple jobs, if the hdfs file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve 2 different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in distributed cache. A2 has a bigger benefit in sharing but may raise a question on when Hive should clean it up in hdfs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-860) Persistent distributed cache
[ https://issues.apache.org/jira/browse/HIVE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-860: -- Attachment: HIVE-860.patch This should fix it. The problem is that when ExecDriver is run as main(), the instance variable conf is null, so it has to be ignored and job used instead. Persistent distributed cache Key: HIVE-860 URL: https://issues.apache.org/jira/browse/HIVE-860 Project: Hive Issue Type: Improvement Affects Versions: 0.12.0 Reporter: Zheng Shao Assignee: Brock Noland Fix For: 0.13.0 Attachments: HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch, HIVE-860.patch DistributedCache is shared across multiple jobs, if the hdfs file name is the same. We need to make sure Hive puts the same file into the same location every time and does not overwrite it if the file content is the same. We can achieve 2 different results: A1. Files added with the same name, timestamp, and md5 in the same session will have a single copy in distributed cache. A2. Files added with the same name, timestamp, and md5 will have a single copy in distributed cache. A2 has a bigger benefit in sharing but may raise a question on when Hive should clean it up in hdfs. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18200: HIVE-860 - Persistent distributed cache
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18200/ --- (Updated Feb. 19, 2014, 8:35 p.m.) Review request for hive. Changes --- Latest update. Bugs: HIVE-860 https://issues.apache.org/jira/browse/HIVE-860 Repository: hive-git Description --- Caches auxiliary jars and remote runtime jars in /user/$user/.hiveJars by their sha1 hash. This results in: 1) faster queries 2) less distributed cache churn 3) a smaller/cleaner hive-exec jar Diffs (updated) - bin/hive 3bd949f common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 conf/hive-default.xml.template 0d08aa2 packaging/src/main/assembly/bin.xml a97ef7d ql/pom.xml 53d0b9e ql/src/java/org/apache/hadoop/hive/ql/exec/HiveAuxClasspathBuilder.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 288da8e ql/src/java/org/apache/hadoop/hive/ql/exec/mr/JarCache.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java 326654f shims/aggregator/pom.xml 7aa8c4c Diff: https://reviews.apache.org/r/18200/diff/ Testing --- Tested manually on a cluster. Thanks, Brock Noland
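The idea in the review description, caching jars under /user/$user/.hiveJars keyed by their sha1 hash, can be sketched roughly as follows. This is not the code from the patch: the class name, the helper names, and the "<sha1>-<jar name>" file naming are assumptions made for illustration. Only the directory and the use of SHA-1 come from the description.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: derive a content-addressed cache location for a jar.
final class JarCachePathSketch {

    // Hex-encoded SHA-1 digest of the given bytes.
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(content)) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Assumed naming scheme: /user/<user>/.hiveJars/<sha1>-<jarName>.
    static String cachePath(String user, String jarName, byte[] content) {
        return "/user/" + user + "/.hiveJars/" + sha1Hex(content) + "-" + jarName;
    }

    public static void main(String[] args) {
        byte[] v1 = "jar bytes, version 1".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "jar bytes, version 2".getBytes(StandardCharsets.UTF_8);
        // Same content maps to the same path, so an upload can be skipped
        // when the target already exists; changed content gets a new path.
        System.out.println(cachePath("alice", "udfs.jar", v1));
        System.out.println(cachePath("alice", "udfs.jar", v2));
    }
}
```

Because the path depends only on the user, the jar name, and the content hash, repeated queries that ship an unchanged jar resolve to the same location, which is what would reduce distributed cache churn.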
[jira] [Commented] (HIVE-6406) Introduce immutable-table table property and if set, disallow insert-into
[ https://issues.apache.org/jira/browse/HIVE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906047#comment-13906047 ] Ashutosh Chauhan commented on HIVE-6406: +1 Since this is a new protection mode, in addition to the existing ones (like NO_DROP, OFFLINE), it makes sense to have this new mode supported via syntax like the earlier ones. That's only syntactic sugar, which could be done in a follow-up. Introduce immutable-table table property and if set, disallow insert-into - Key: HIVE-6406 URL: https://issues.apache.org/jira/browse/HIVE-6406 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6406.2.patch, HIVE-6406.3.patch, HIVE-6406.patch As part of HIVE-6405's attempt to make HCatalog and Hive behave in similar ways with regards to immutable tables, this is a companion task to introduce the notion of an immutable table, wherein all tables are not immutable by default, and have this be a table property. If this property is set for a table, and we attempt to write to a table that already has data (or a partition), disallow INSERT INTO into it from hive(if destination directory is non-empty). This property being set will allow hive to mimic HCatalog's current immutable-table property. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement
[ https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906050#comment-13906050 ] Ashutosh Chauhan commented on HIVE-6422: It will be good to retain the test case here. SQL std auth - revert change for view keyword in grant statement Key: HIVE-6422 URL: https://issues.apache.org/jira/browse/HIVE-6422 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6422.1.patch SQL standard does not support view keyword in grant statement. HIVE-6181 which was added as part of sql standard changes, needs to be reverted. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6405) Support append feature for HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906058#comment-13906058 ] Sushanth Sowmyan commented on HIVE-6405: After a brief discussion with Ashutosh on the nature of HIVE-6406, I'm going to create another task to add ql grammar to support modification of the immutability property in a manner similar to existing grammar for NO_DROP/OFFLINE, so that this can be treated as another kind of data protection, and so that users will not have to deal with explicitly modifying TBLPROPERTIES. Support append feature for HCatalog --- Key: HIVE-6405 URL: https://issues.apache.org/jira/browse/HIVE-6405 Project: Hive Issue Type: Bug Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6405.patch HCatalog currently treats all tables as immutable - i.e. all tables and partitions can be written to only once, and not appended. The nuances of what this means is as follows: * A non-partitioned table can be written to, and data in it is never updated from then on unless you drop and recreate. * A partitioned table may support appending of a sort in a manner by adding new partitions to the table, but once written, the partitions themselves cannot have any new data added to them. Hive, on the other hand, does allow us to INSERT INTO into a table, thus allowing us append semantics. There is benefit to both of these models, and so, our goal is as follows: a) Introduce a notion of an immutable table, wherein all tables are not immutable by default, and have this be a table property. If this property is set for a table, and we attempt to write to a table that already has data (or a partition), disallow INSERT INTO into it from hive. This property being set will allow hive to mimic HCatalog's current immutable-table property. 
(I'm going to create a separate sub-task to cover this bit, and focus on the HCatalog-side on this jira) b) As long as that flag is not set, HCatalog should be changed to allow appends into it as well, and not simply error out if data already exists in a table. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6465) Introduce ql grammar for immutability property
Sushanth Sowmyan created HIVE-6465: -- Summary: Introduce ql grammar for immutability property Key: HIVE-6465 URL: https://issues.apache.org/jira/browse/HIVE-6465 Project: Hive Issue Type: Sub-task Reporter: Sushanth Sowmyan -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6465) Introduce ql grammar for immutability property
[ https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanth Sowmyan updated HIVE-6465: --- Description: HIVE-6406 introduces a notion of an immutable table in hive. In essence, it is a data protection feature similar to current hive protections like OFFLINE and NO_DROP. Thus, rather than having its interface being people mucking around TBLPROPERTIES, we should have ql grammar for it. Introduce ql grammar for immutability property -- Key: HIVE-6465 URL: https://issues.apache.org/jira/browse/HIVE-6465 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan HIVE-6406 introduces a notion of an immutable table in hive. In essence, it is a data protection feature similar to current hive protections like OFFLINE and NO_DROP. Thus, rather than having its interface being people mucking around TBLPROPERTIES, we should have ql grammar for it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5317) Implement insert, update, and delete in Hive with full ACID support
[ https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906059#comment-13906059 ] Alan Gates commented on HIVE-5317: -- MAPREDUCE-279, at 109, currently out scores us. There may be others, but it would be cool to have more watchers than Yarn. :) Implement insert, update, and delete in Hive with full ACID support --- Key: HIVE-5317 URL: https://issues.apache.org/jira/browse/HIVE-5317 Project: Hive Issue Type: New Feature Reporter: Owen O'Malley Assignee: Owen O'Malley Attachments: InsertUpdatesinHive.pdf Many customers want to be able to insert, update and delete rows from Hive tables with full ACID support. The use cases are varied, but the form of the queries that should be supported are: * INSERT INTO tbl SELECT … * INSERT INTO tbl VALUES ... * UPDATE tbl SET … WHERE … * DELETE FROM tbl WHERE … * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN ... * SET TRANSACTION LEVEL … * BEGIN/END TRANSACTION Use Cases * Once an hour, a set of inserts and updates (up to 500k rows) for various dimension tables (eg. customer, inventory, stores) needs to be processed. The dimension tables have primary keys and are typically bucketed and sorted on those keys. * Once a day a small set (up to 100k rows) of records need to be deleted for regulatory compliance. * Once an hour a log of transactions is exported from a RDBS and the fact tables need to be updated (up to 1m rows) to reflect the new data. The transactions are a combination of inserts, updates, and deletes. The table is partitioned and bucketed. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6382) PATCHED_BLOB encoding in ORC will corrupt data in some cases
[ https://issues.apache.org/jira/browse/HIVE-6382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6382: - Attachment: HIVE-6382.3.patch Added configuration to orc readers to support skipping of corrupted data. PATCHED_BLOB encoding in ORC will corrupt data in some cases Key: HIVE-6382 URL: https://issues.apache.org/jira/browse/HIVE-6382 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6382.1.patch, HIVE-6382.2.patch, HIVE-6382.3.patch In PATCHED_BLOB encoding (added in HIVE-4123), gapVsPatchList is an array of long that stores the gap (g) between the values that are patched and the patch value (p). The maximum distance of gap can be 511 that require 8 bits to encode. And patch values can take more than 56 bits. When patch values take more than 56 bits, p + g will need more than 64 bits, which cannot be packed into a long. This will result in data corruption in the case where patch values are wider than 56 bits. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
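The overflow described in HIVE-6382 can be reproduced with a small Java sketch. The bit layout used here (gap in the high bits, patch in the low bits of one long) and all names are assumptions for illustration, not the ORC writer's actual code. The sketch only demonstrates that an 8-bit gap plus a patch wider than 56 bits no longer fits in 64 bits.

```java
// Hypothetical sketch of packing a gap and a patch value into one long.
final class PatchPackingSketch {

    // Assumed layout: gap shifted above the patch bits, patch in the low bits.
    static long pack(long gap, long patch, int patchWidth) {
        return (gap << patchWidth) | patch;
    }

    // Recover the gap by discarding the low patchWidth bits.
    static long unpackGap(long packed, int patchWidth) {
        return packed >>> patchWidth;
    }

    // Recover the patch by masking off everything above patchWidth bits.
    static long unpackPatch(long packed, int patchWidth) {
        long mask = (1L << patchWidth) - 1;
        return packed & mask;
    }

    // True when both fields fit into a 64-bit long.
    static boolean fitsInLong(int gapWidth, int patchWidth) {
        return gapWidth + patchWidth <= 64;
    }

    public static void main(String[] args) {
        long gap = 200; // needs 8 bits

        // 8-bit gap + 56-bit patch = 64 bits: round-trips correctly.
        long p56 = (1L << 56) - 1;
        long ok = pack(gap, p56, 56);
        System.out.println(fitsInLong(8, 56) + " gap=" + unpackGap(ok, 56));

        // 8-bit gap + 57-bit patch = 65 bits: the top gap bit is shifted out.
        long p57 = (1L << 57) - 1;
        long bad = pack(gap, p57, 57);
        System.out.println(fitsInLong(8, 57) + " gap=" + unpackGap(bad, 57));
    }
}
```

In the second case the gap comes back as 72 instead of 200, because the highest bit of the gap is lost in the shift. Silent truncation of this kind is the sort of corruption the issue describes.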
[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property
[ https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906063#comment-13906063 ] Sushanth Sowmyan commented on HIVE-6465: Current grammar for data protections are as follows: {noformat} ALTER TABLE table_name [PARTITION partition_spec] ENABLE|DISABLE NO_DROP; ALTER TABLE table_name [PARTITION partition_spec] ENABLE|DISABLE OFFLINE; {noformat} Proposed new grammar for immutability: {noformat} ALTER TABLE table_name ENABLE|DISABLE IMMUTABILITY; {noformat} Introduce ql grammar for immutability property -- Key: HIVE-6465 URL: https://issues.apache.org/jira/browse/HIVE-6465 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan HIVE-6406 introduces a notion of an immutable table in hive. In essence, it is a data protection feature similar to current hive protections like OFFLINE and NO_DROP. Thus, rather than having its interface being people mucking around TBLPROPERTIES, we should have ql grammar for it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6406) Introduce immutable-table table property and if set, disallow insert-into
[ https://issues.apache.org/jira/browse/HIVE-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906071#comment-13906071 ] Sushanth Sowmyan commented on HIVE-6406: Thanks, Ashutosh. I've created HIVE-6465 for that. Introduce immutable-table table property and if set, disallow insert-into - Key: HIVE-6406 URL: https://issues.apache.org/jira/browse/HIVE-6406 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-6406.2.patch, HIVE-6406.3.patch, HIVE-6406.patch As part of HIVE-6405's attempt to make HCatalog and Hive behave in similar ways with regards to immutable tables, this is a companion task to introduce the notion of an immutable table, wherein all tables are not immutable by default, and have this be a table property. If this property is set for a table, and we attempt to write to a table that already has data (or a partition), disallow INSERT INTO into it from hive(if destination directory is non-empty). This property being set will allow hive to mimic HCatalog's current immutable-table property. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default
[ https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-5232: --- Description: [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217] # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229] # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230] # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441] There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel in a different thread and the original thread will now be able to detect that. was: [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217] # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229] # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230] # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441] There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This is likely going to provide performance benefits as a long running query need not keep the underlying TCP connection open for the entire duration. 
Make JDBC use the new HiveServer2 async execution API by default Key: HIVE-5232 URL: https://issues.apache.org/jira/browse/HIVE-5232 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217] # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229] # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230] # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441] There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel in a different thread and the original thread will now be able to detect that. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (HIVE-6466) Add support for pluggable authentication modules (PAM) in HiveServer2
Vaibhav Gumashta created HIVE-6466: -- Summary: Add support for pluggable authentication modules (PAM) in HiveServer2 Key: HIVE-6466 URL: https://issues.apache.org/jira/browse/HIVE-6466 Project: Hive Issue Type: New Feature Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 More on PAM in these articles: http://www.tuxradar.com/content/how-pam-works https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default
[ https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-5232: --- Description: [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217] # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229] # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230] # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441] There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. was: [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217] # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229] # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230] # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441] There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel in a different thread and the original thread will now be able to detect that. 
Make JDBC use the new HiveServer2 async execution API by default Key: HIVE-5232 URL: https://issues.apache.org/jira/browse/HIVE-5232 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217] # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229] # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230] # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441] There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement
[ https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906112#comment-13906112 ] Ashutosh Chauhan commented on HIVE-6422: +1 SQL std auth - revert change for view keyword in grant statement Key: HIVE-6422 URL: https://issues.apache.org/jira/browse/HIVE-6422 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6422.1.patch, HIVE-6422.2.patch SQL standard does not support view keyword in grant statement. HIVE-6181 which was added as part of sql standard changes, needs to be reverted. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6422) SQL std auth - revert change for view keyword in grant statement
[ https://issues.apache.org/jira/browse/HIVE-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-6422: Attachment: HIVE-6422.2.patch Good point about test coverage. I have enhanced the authorization_view_sqlstd.q test to add more coverage and also remove the view keyword used there. It now also tests the grant/revoke statements on views with and without the table keyword. SQL std auth - revert change for view keyword in grant statement Key: HIVE-6422 URL: https://issues.apache.org/jira/browse/HIVE-6422 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-6422.1.patch, HIVE-6422.2.patch SQL standard does not support view keyword in grant statement. HIVE-6181 which was added as part of sql standard changes, needs to be reverted. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property
[ https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906126#comment-13906126 ] Lefty Leverenz commented on HIVE-6465: -- What about CREATE TABLE? Would it also have ENABLE|DISABLE IMMUTABILITY or keep on using TBLPROPERTIES(immutable=true) as shown in HIVE-6406, or neither? (According to the wiki NO_DROP and OFFLINE only exist as ALTER TABLE options, but maybe the doc is incomplete.) * [DDL: Create Table |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTable] * [DDL: Alter Table/Partition Protections |https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-AlterTable/PartitionProtections] Introduce ql grammar for immutability property -- Key: HIVE-6465 URL: https://issues.apache.org/jira/browse/HIVE-6465 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan HIVE-6406 introduces a notion of an immutable table in hive. In essence, it is a data protection feature similar to current hive protections like OFFLINE and NO_DROP. Thus, rather than having its interface being people mucking around TBLPROPERTIES, we should have ql grammar for it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5958) SQL std auth - authorize statements that work with paths
[ https://issues.apache.org/jira/browse/HIVE-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906127#comment-13906127 ] Hive QA commented on HIVE-5958: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629792/HIVE-5958.7.patch {color:green}SUCCESS:{color} +1 5143 tests passed Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1416/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1416/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12629792 SQL std auth - authorize statements that work with paths Key: HIVE-5958 URL: https://issues.apache.org/jira/browse/HIVE-5958 Project: Hive Issue Type: Sub-task Components: Authorization Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-5958.1.patch, HIVE-5958.2.patch, HIVE-5958.3.patch, HIVE-5958.4.patch, HIVE-5958.5.patch, HIVE-5958.6.patch, HIVE-5958.7.patch Original Estimate: 72h Remaining Estimate: 72h Statements such as create table and alter table that specify a path URI should be allowed under the new authorization scheme only if the URI (path) specified has permissions including read/write and ownership of the file/dir and its children. Also, fix the issue of the database not getting set as output for create-table. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6405) Support append feature for HCatalog
[ https://issues.apache.org/jira/browse/HIVE-6405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906134#comment-13906134 ] Hive QA commented on HIVE-6405: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12629811/HIVE-6405.patch Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1418/testReport Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1418/console Messages: {noformat} This message was trimmed, see log for full details [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hbase-handler --- [INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ hive-hbase-handler --- [debug] execute contextualize [INFO] Using 'UTF-8' encoding to copy filtered resources. [INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/main/resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-hbase-handler --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-hbase-handler --- [INFO] Compiling 18 source files to /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/classes [WARNING] Note: Some input files use or override a deprecated API. [WARNING] Note: Recompile with -Xlint:deprecation for details. [INFO] [INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ hive-hbase-handler --- [debug] execute contextualize [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/src/test/resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hbase-handler --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp/conf [copy] Copying 5 files to /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-hbase-handler --- [INFO] Compiling 4 source files to /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/test-classes [WARNING] Note: Some input files use or override a deprecated API. [WARNING] Note: Recompile with -Xlint:deprecation for details. [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-hbase-handler --- [INFO] Tests are skipped. 
[INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-hbase-handler --- [INFO] Building jar: /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/hive-hbase-handler-0.13.0-SNAPSHOT.jar [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-hbase-handler --- [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/target/hive-hbase-handler-0.13.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-hbase-handler/0.13.0-SNAPSHOT/hive-hbase-handler-0.13.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-svn-trunk-source/hbase-handler/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-hbase-handler/0.13.0-SNAPSHOT/hive-hbase-handler-0.13.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive HCatalog 0.13.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-hcatalog --- [INFO] Deleting /data/hive-ptest/working/apache-svn-trunk-source/hcatalog (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-antrun-plugin:1.7:run (define-classpath) @ hive-hcatalog --- [INFO] Executing tasks main: [INFO] Executed tasks [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-hcatalog --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hcatalog/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hcatalog/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-svn-trunk-source/hcatalog/target/tmp/conf [copy]
Re: Review Request 15435: Add long polling to asynchronous execution in HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15435/#review34933 --- service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java https://reviews.apache.org/r/15435/#comment65313 It would be better to call the constructor with more arguments from this one. service/src/java/org/apache/hive/service/cli/operation/Operation.java https://reviews.apache.org/r/15435/#comment65311 I think we should make the runAsync field final, as it is not supposed to change. service/src/java/org/apache/hive/service/cli/operation/Operation.java https://reviews.apache.org/r/15435/#comment65312 I think it is cleaner to have this constructor call (instead of the other way around) this(parentSession, opType, false); That way initialization will be in one constructor, and it will be clear what every variable gets initialized to. - Thejas Nair On Feb. 18, 2014, 1:16 p.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/15435/ --- (Updated Feb. 18, 2014, 1:16 p.m.) Review request for hive, Carl Steinbach and Thejas Nair. Bugs: HIVE-5217 https://issues.apache.org/jira/browse/HIVE-5217 Repository: hive-git Description --- Add long polling to asynchronous execution in HiveServer2 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 service/src/java/org/apache/hive/service/cli/CLIService.java 56b357a service/src/java/org/apache/hive/service/cli/operation/ExecuteStatementOperation.java e973f83 service/src/java/org/apache/hive/service/cli/operation/Operation.java 58a28b6 service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 03a37c8 service/src/test/org/apache/hive/service/cli/CLIServiceTest.java 8ec8d43 Diff: https://reviews.apache.org/r/15435/diff/ Testing --- Thanks, Vaibhav Gumashta
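The constructor-chaining suggestion in the review above can be sketched roughly as follows. This is only an illustration: OperationSketch, its String-typed fields, and shouldRunAsync() are hypothetical stand-ins, not the actual HiveServer2 Operation class; only the names parentSession, opType, and runAsync and the call this(parentSession, opType, false) come from the review comments.

```java
// Hypothetical, simplified stand-in for the Operation class discussed in the review.
// The String field types are assumptions made to keep the sketch self-contained.
class OperationSketch {
    private final String parentSession; // stand-in for the real session object
    private final String opType;        // stand-in for the real operation type
    private final boolean runAsync;     // final: not supposed to change after construction

    // Convenience constructor: delegates to the full constructor,
    // defaulting to synchronous execution.
    OperationSketch(String parentSession, String opType) {
        this(parentSession, opType, false);
    }

    // The only place where fields are initialized.
    OperationSketch(String parentSession, String opType, boolean runAsync) {
        this.parentSession = parentSession;
        this.opType = opType;
        this.runAsync = runAsync;
    }

    boolean shouldRunAsync() {
        return runAsync;
    }

    String describe() {
        return opType + " on " + parentSession + " (async=" + runAsync + ")";
    }

    public static void main(String[] args) {
        System.out.println(new OperationSketch("session-1", "EXECUTE_STATEMENT").describe());
        System.out.println(new OperationSketch("session-1", "EXECUTE_STATEMENT", true).describe());
    }
}
```

Because every field is assigned in exactly one constructor, all of them can be declared final, which is what the second review comment asks for with runAsync.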
Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18250/ --- (Updated Feb. 19, 2014, 9:51 p.m.) Review request for hive. Changes --- Incorporated Thejas feedback. Also, added new -ve test. Bugs: HIVE-6433 https://issues.apache.org/jira/browse/HIVE-6433 Repository: hive-git Description --- SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java c1afaee ql/src/test/queries/clientnegative/authorization_role_grant.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION ql/src/test/results/clientnegative/authorization_role_grant.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_set_role_neg2.q.out eec684d ql/src/test/results/clientpositive/authorization_role_grant2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18250/diff/ Testing --- Added new test Thanks, Ashutosh Chauhan
Re: Review Request 18250: SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18250/ --- (Updated Feb. 19, 2014, 9:51 p.m.) Review request for hive. Bugs: HIVE-6433 https://issues.apache.org/jira/browse/HIVE-6433 Repository: hive-git Description --- SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Diffs - ql/src/java/org/apache/hadoop/hive/ql/security/authorization/plugin/sqlstd/SQLStdHiveAccessController.java c1afaee ql/src/test/queries/clientnegative/authorization_role_grant.q PRE-CREATION ql/src/test/queries/clientpositive/authorization_role_grant2.q PRE-CREATION ql/src/test/results/clientnegative/authorization_role_grant.q.out PRE-CREATION ql/src/test/results/clientnegative/authorization_set_role_neg2.q.out eec684d ql/src/test/results/clientpositive/authorization_role_grant2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18250/diff/ Testing --- Added new test Thanks, Ashutosh Chauhan
[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase
[ https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-2365: --- Attachment: HIVE-2365.2.patch.txt This patch separates HFile generation from the new completebulkload task. Also adds a couple more test cases. SQL support for bulk load into HBase Key: HIVE-2365 URL: https://issues.apache.org/jira/browse/HIVE-2365 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: John Sichi Assignee: Nick Dimiduk Fix For: 0.13.0 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch Support the as simple as this SQL for bulk load from Hive into HBase. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase
[ https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-2365: --- Status: Open (was: Patch Available) SQL support for bulk load into HBase Key: HIVE-2365 URL: https://issues.apache.org/jira/browse/HIVE-2365 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: John Sichi Assignee: Nick Dimiduk Fix For: 0.13.0 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch Support the as simple as this SQL for bulk load from Hive into HBase. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6433: --- Status: Patch Available (was: Open) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Attachments: HIVE-6433.1.patch, HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-2365) SQL support for bulk load into HBase
[ https://issues.apache.org/jira/browse/HIVE-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HIVE-2365: --- Status: Patch Available (was: Open) SQL support for bulk load into HBase Key: HIVE-2365 URL: https://issues.apache.org/jira/browse/HIVE-2365 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: John Sichi Assignee: Nick Dimiduk Fix For: 0.13.0 Attachments: HIVE-2365.2.patch.txt, HIVE-2365.WIP.00.patch, HIVE-2365.WIP.01.patch, HIVE-2365.WIP.01.patch Support the as simple as this SQL for bulk load from Hive into HBase. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6433: --- Status: Open (was: Patch Available) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Attachments: HIVE-6433.1.patch, HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6433) SQL std auth - allow grant/revoke roles if user has ADMIN OPTION
[ https://issues.apache.org/jira/browse/HIVE-6433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6433: --- Attachment: HIVE-6433.1.patch Incorporated Thejas feedback. SQL std auth - allow grant/revoke roles if user has ADMIN OPTION Key: HIVE-6433 URL: https://issues.apache.org/jira/browse/HIVE-6433 Project: Hive Issue Type: Sub-task Reporter: Thejas M Nair Assignee: Ashutosh Chauhan Attachments: HIVE-6433.1.patch, HIVE-6433.patch Follow up jira for HIVE-5952. If a user/role has admin option on a role, then user should be able to grant /revoke other users to/from the role. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default
[ https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5232: Description: [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: HIVE-5217 HIVE-5229 HIVE-5230 HIVE-5441 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. was: [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: # [HIVE-5217|https://issues.apache.org/jira/browse/HIVE-5217] # [HIVE-5229|https://issues.apache.org/jira/browse/HIVE-5229] # [HIVE-5230|https://issues.apache.org/jira/browse/HIVE-5230] # [HIVE-5441|https://issues.apache.org/jira/browse/HIVE-5441] There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. 
Make JDBC use the new HiveServer2 async execution API by default Key: HIVE-5232 URL: https://issues.apache.org/jira/browse/HIVE-5232 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: HIVE-5217 HIVE-5229 HIVE-5230 HIVE-5441 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: Review Request 18254: HIVE-6375 Implement CTAS and column rename for parquet
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18254/ --- (Updated Feb. 19, 2014, 9:56 p.m.) Review request for hive. Changes --- Incorporated review feedback. Updated the results of more explain CTAS test cases. It seems that the test table srcbucket, as a bucketed (multi-file) table, will give random results from a select query, so first insert into a staging table using sort by. Bugs: HIVE-6375 https://issues.apache.org/jira/browse/HIVE-6375 Repository: hive-git Description --- There is a Hive bug in SemanticAnalyzer that chooses different names for columns in the CreateTable task and the FileSink task. columnInfo.getInternalName() was used in one place, and fieldSchema still used columnInfo.getAlias() if it is available. This change makes both consistent, favoring columnInfo.getAlias if it is available. This was not revealed before because other file formats like RcFile seem to use column-ordinal position, and the Avro file stores the schema separately altogether. Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION ql/src/test/results/clientpositive/ctas.q.out 9668855 ql/src/test/results/clientpositive/ctas_hadoop20.q.out 0ec0af5 ql/src/test/results/clientpositive/merge3.q.out 3df75b7 ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18254/diff/ Testing --- Added parquet_ctas.q. Covers cases where the column name is taken directly from the input table (implied alias), where the name is auto-generated, where the name is specified as an alias, and a mix of the three. Thanks, Szehon Ho
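The naming rule described in the review request above (prefer the alias when one is available, otherwise use the internal name) can be illustrated with a small sketch. ColumnInfoSketch and chooseColumnName are hypothetical names made up for this example; only getInternalName() and getAlias() mirror accessors mentioned in the description, and the actual SemanticAnalyzer change may be structured differently.

```java
// Hypothetical sketch of the "favor the alias if it is available" rule.
class ColumnNameSketch {

    // Minimal stand-in for a column descriptor; not the real Hive ColumnInfo class.
    static final class ColumnInfoSketch {
        private final String internalName; // e.g. "_col0"
        private final String alias;        // may be null when no alias is available

        ColumnInfoSketch(String internalName, String alias) {
            this.internalName = internalName;
            this.alias = alias;
        }

        String getInternalName() {
            return internalName;
        }

        String getAlias() {
            return alias;
        }
    }

    // Single helper meant to be used by both the table-creation side and the
    // file-writing side, so the two can never disagree on a column's name.
    static String chooseColumnName(ColumnInfoSketch col) {
        String alias = col.getAlias();
        return (alias != null && !alias.isEmpty()) ? alias : col.getInternalName();
    }

    public static void main(String[] args) {
        System.out.println(chooseColumnName(new ColumnInfoSketch("_col0", "key")));
        System.out.println(chooseColumnName(new ColumnInfoSketch("_col1", null)));
    }
}
```

Routing both code paths through one helper is the design point here: formats that match columns by name, such as Parquet, are then written with the same names the table declares.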
[jira] [Updated] (HIVE-5232) Make JDBC use the new HiveServer2 async execution API by default
[ https://issues.apache.org/jira/browse/HIVE-5232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-5232: Description: HIVE-4617 provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: HIVE-5217 HIVE-5229 HIVE-5230 HIVE-5441 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. was: [HIVE-4617|https://issues.apache.org/jira/browse/HIVE-4617] provides support for async execution in HS2. There are some proposed improvements in followup JIRAs: HIVE-5217 HIVE-5229 HIVE-5230 HIVE-5441 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. Make JDBC use the new HiveServer2 async execution API by default Key: HIVE-5232 URL: https://issues.apache.org/jira/browse/HIVE-5232 Project: Hive Issue Type: Sub-task Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-5232.1.patch, HIVE-5232.2.patch HIVE-4617 provides support for async execution in HS2. 
There are some proposed improvements in followup JIRAs: HIVE-5217 HIVE-5229 HIVE-5230 HIVE-5441 There is also [HIVE-5060|https://issues.apache.org/jira/browse/HIVE-5060] which assumes that execute to be asynchronous by default. Once they are in, we can think of using the async API as the default for JDBC. This can enable the server to report back error sooner to the client. It can also be useful in cases where a statement.cancel is done in a different thread - the original thread will now be able to detect the cancel, as opposed to the use of the blocking execute calls, in which statement.cancel will be a no-op. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property
[ https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906159#comment-13906159 ] Sushanth Sowmyan commented on HIVE-6465: The Alter Table syntax acts as syntactic sugar for the background parameter, so the TBLPROPERTIES would also continue working. This is consistent with how OFFLINE/NO_DROP work as well, which are simply Properties at a Partition/Table level - PROTECT_MODE=OFFLINE/NO_DROP/NO_DROP_CASCADE/READ_ONLY would achieve the same thing for them. Introduce ql grammar for immutability property -- Key: HIVE-6465 URL: https://issues.apache.org/jira/browse/HIVE-6465 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan HIVE-6406 introduces a notion of an immutable table in hive. In essence, it is a data protection feature similar to current hive protections like OFFLINE and NO_DROP. Thus, rather than having its interface being people mucking around TBLPROPERTIES, we should have ql grammar for it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6465) Introduce ql grammar for immutability property
[ https://issues.apache.org/jira/browse/HIVE-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906163#comment-13906163 ] Sushanth Sowmyan commented on HIVE-6465: This, btw, might also indicate that maybe we should prefer a reuse of ProtectMode, and go with PROTECT_MODE=IMMUTABLE or WRITE_ONLY for this scenario. Introduce ql grammar for immutability property -- Key: HIVE-6465 URL: https://issues.apache.org/jira/browse/HIVE-6465 Project: Hive Issue Type: Sub-task Components: HCatalog, Metastore, Query Processor, Thrift API Reporter: Sushanth Sowmyan HIVE-6406 introduces a notion of an immutable table in hive. In essence, it is a data protection feature similar to current hive protections like OFFLINE and NO_DROP. Thus, rather than having its interface being people mucking around TBLPROPERTIES, we should have ql grammar for it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6375) Fix CTAS for parquet
[ https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-6375: Attachment: HIVE-6375.2.patch Thanks Xuefu for review. Incorporated feedback and fixed test output. Seems select from srcbucket has some randomness to the result, as it is a bucketed table. Fix CTAS for parquet Key: HIVE-6375 URL: https://issues.apache.org/jira/browse/HIVE-6375 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Brock Noland Assignee: Szehon Ho Priority: Critical Labels: Parquet Attachments: HIVE-6375.2.patch, HIVE-6375.patch More details here: https://github.com/Parquet/parquet-mr/issues/272 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6356) Dependency injection in hbase storage handler is broken
[ https://issues.apache.org/jira/browse/HIVE-6356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-6356: --- Status: Patch Available (was: Open) Marking this as Patch Available. Let's get this in. Code changes are rather small to bump up hbase version. Dependency injection in hbase storage handler is broken --- Key: HIVE-6356 URL: https://issues.apache.org/jira/browse/HIVE-6356 Project: Hive Issue Type: Bug Components: HBase Handler Reporter: Navis Priority: Minor Fix For: 0.13.0 Attachments: HIVE-6356.1.patch.txt, HIVE-6356.2.patch.txt, HIVE-6356.3.patch.txt, HIVE-6356.addendum.00.patch Dependent jars for hbase are not added to tmpjars, which is caused by the change of method signature (TableMapReduceUtil.addDependencyJars). -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (HIVE-6326) Split generation in ORC may generate wrong split boundaries because of unaccounted padded bytes
[ https://issues.apache.org/jira/browse/HIVE-6326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-6326: - Fix Version/s: 0.13.0 Split generation in ORC may generate wrong split boundaries because of unaccounted padded bytes --- Key: HIVE-6326 URL: https://issues.apache.org/jira/browse/HIVE-6326 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-6326.1.patch, HIVE-6326.2.patch, HIVE-6326.3.patch, HIVE-6326.4.patch HIVE-5091 added padding to ORC files to avoid ORC stripes straddling HDFS blocks. The length of these padded bytes is not stored in the stripe information. OrcInputFormat.getSplits() uses stripeInformation.getLength() for split computation. stripeInformation.getLength() is the sum of index length, data length and stripe footer length. It does not account for the length of the padded bytes, which may result in a wrong split boundary. The fix for this is to use the offset of the next stripe to determine the length of the current stripe, which then includes the padded bytes as well. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
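The fix described in the issue above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the attached patches: Stripe is a hypothetical two-field descriptor (the real ORC StripeInformation carries more), spans() is a made-up helper, and falling back to the recorded length for the last stripe is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: derive each stripe's on-disk span from the next stripe's
// offset, so that padding written between stripes is counted.
class StripeSpanSketch {

    // Minimal stand-in for a stripe descriptor.
    static final class Stripe {
        final long offset; // byte offset of the stripe in the file
        final long length; // index + data + footer length, excluding padding

        Stripe(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // For every stripe except the last, the span is the distance to the next
    // stripe's offset, which includes any padded bytes. The last stripe has no
    // successor, so this sketch falls back to its recorded length (an assumption).
    static long[] spans(List<Stripe> stripes) {
        long[] result = new long[stripes.size()];
        for (int i = 0; i < stripes.size(); i++) {
            Stripe current = stripes.get(i);
            if (i + 1 < stripes.size()) {
                result[i] = stripes.get(i + 1).offset - current.offset;
            } else {
                result[i] = current.length;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // First stripe: 100 recorded bytes starting at offset 3; the next stripe
        // starts at offset 128, so 25 bytes of padding sit in between.
        List<Stripe> stripes = Arrays.asList(new Stripe(3, 100), new Stripe(128, 90));
        System.out.println(Arrays.toString(spans(stripes)));
    }
}
```

With the recorded length alone, a split covering the first stripe would end at byte 103, while the second stripe only begins at byte 128; using the offset difference closes that gap.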
Re: Review Request 18254: HIVE-6375 Implement CTAS and column rename for parquet
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18254/#review34937 --- Ship it! - Mohammad Islam On Feb. 19, 2014, 9:56 p.m., Szehon Ho wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18254/ --- (Updated Feb. 19, 2014, 9:56 p.m.) Review request for hive. Bugs: HIVE-6375 https://issues.apache.org/jira/browse/HIVE-6375 Repository: hive-git Description --- There is a Hive bug in SemanticAnalyzer that chooses different names for columns in the CreateTable task and the FileSink task. columnInfo.getInternalName() was used in one place, and fieldSchema still used columnInfo.getAlias() if it is available. This change makes both consistent, favoring columnInfo.getAlias if it is available. This was not revealed before because other file formats like RcFile seem to use column-ordinal position, and the Avro file stores the schema separately altogether. Diffs - ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 77388dd ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION ql/src/test/results/clientpositive/ctas.q.out 9668855 ql/src/test/results/clientpositive/ctas_hadoop20.q.out 0ec0af5 ql/src/test/results/clientpositive/merge3.q.out 3df75b7 ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION Diff: https://reviews.apache.org/r/18254/diff/ Testing --- Added parquet_ctas.q. Covers cases where the column name is taken directly from the input table (implied alias), where the name is auto-generated, where the name is specified as an alias, and a mix of the three. Thanks, Szehon Ho
[jira] [Commented] (HIVE-6375) Fix CTAS for parquet
[ https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906215#comment-13906215 ] Mohammad Kamrul Islam commented on HIVE-6375: - +1 reviewed the patch. CTAS for avro doesn't work for the same reason (HIVE-5803). Hopefully, the patch will help avro as well. Fix CTAS for parquet Key: HIVE-6375 URL: https://issues.apache.org/jira/browse/HIVE-6375 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Brock Noland Assignee: Szehon Ho Priority: Critical Labels: Parquet Attachments: HIVE-6375.2.patch, HIVE-6375.patch More details here: https://github.com/Parquet/parquet-mr/issues/272 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5803) Support CTAS from a non-avro table to an avro table
[ https://issues.apache.org/jira/browse/HIVE-5803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906217#comment-13906217 ] Mohammad Kamrul Islam commented on HIVE-5803: - The linked Jira might solve this problem as well. Support CTAS from a non-avro table to an avro table --- Key: HIVE-5803 URL: https://issues.apache.org/jira/browse/HIVE-5803 Project: Hive Issue Type: Task Reporter: Mohammad Kamrul Islam Assignee: Carl Steinbach Hive currently does not work with HQL like : CREATE TABLE AVRO-BASE-TABLE as SELECT * from NON_AVRO_TABLE; Actually, the statement itself runs successfully. But when I run SELECT * from AVRO-BASED-TABLE .. it fails. This JIRA depends on HIVE-3159 that translates TypeInfo to Avro schema. Findings so far: CTAS uses internal column names (in place of using the column names provided in select) when creating the AVRO data file. In other words, the avro data file has column names of the form _col0, _col1, whereas the table column names are different. I tested with the following test cases and it failed: - verify 1) can create table using create table as select from non-avro table 2) LOAD avro data into new table and read data from the new table CREATE TABLE simple_kv_txt (key STRING, value STRING) STORED AS TEXTFILE; DESCRIBE simple_kv_txt; LOAD DATA LOCAL INPATH '../data/files/kv1.txt' INTO TABLE simple_kv_txt; SELECT * FROM simple_kv_txt ORDER BY KEY; CREATE TABLE copy_doctors ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat' as SELECT key as key, value as value FROM simple_kv_txt; DESCRIBE copy_doctors; SELECT * FROM copy_doctors; -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-6375) Fix CTAS for parquet
[ https://issues.apache.org/jira/browse/HIVE-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13906225#comment-13906225 ] Szehon Ho commented on HIVE-6375: - Yea, looks like a similar issue. Fix CTAS for parquet Key: HIVE-6375 URL: https://issues.apache.org/jira/browse/HIVE-6375 Project: Hive Issue Type: Bug Affects Versions: 0.13.0 Reporter: Brock Noland Assignee: Szehon Ho Priority: Critical Labels: Parquet Attachments: HIVE-6375.2.patch, HIVE-6375.patch More details here: https://github.com/Parquet/parquet-mr/issues/272 -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Review Request 18291: Add support for pluggable authentication modules (PAM) in HiveServer2
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18291/ --- Review request for hive and Thejas Nair. Bugs: HIVE-6466 https://issues.apache.org/jira/browse/HIVE-6466 Repository: hive-git Description --- Refer the jira: https://issues.apache.org/jira/browse/HIVE-6466 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a182cd7 pom.xml 9aef665 service/pom.xml b1002e2 service/src/java/org/apache/hive/service/auth/AuthenticationProviderFactory.java b92fd83 service/src/java/org/apache/hive/service/auth/PamAuthenticationProviderImpl.java PRE-CREATION Diff: https://reviews.apache.org/r/18291/diff/ Testing --- Thanks, Vaibhav Gumashta
[jira] [Updated] (HIVE-6466) Add support for pluggable authentication modules (PAM) in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-6466: --- Attachment: HIVE-6466.1.patch Rb link: https://reviews.apache.org/r/18291 Add support for pluggable authentication modules (PAM) in HiveServer2 - Key: HIVE-6466 URL: https://issues.apache.org/jira/browse/HIVE-6466 Project: Hive Issue Type: New Feature Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Fix For: 0.13.0 Attachments: HIVE-6466.1.patch More on PAM in these articles: http://www.tuxradar.com/content/how-pam-works https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Managing_Smart_Cards/Pluggable_Authentication_Modules.html -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HIVE-5636) Introduce getPartitionColumns() functionality from HCatInputFormat
[ https://issues.apache.org/jira/browse/HIVE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906251#comment-13906251 ] Sushanth Sowmyan commented on HIVE-5636: Note: None of the reported failures are due to this patch. With Daniel's +1, I'm going to go ahead and commit. Introduce getPartitionColumns() functionality from HCatInputFormat -- Key: HIVE-5636 URL: https://issues.apache.org/jira/browse/HIVE-5636 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.12.0 Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-5636.2.patch, HIVE-5636.patch As of HCat 0.5, we made the class InputJobInfo private for hcatalog use only, and we made it so that setInput would not modify the InputJobInfo being passed in. However, if a user of HCatInputFormat wants to find out which partitioning columns or data columns exist for the job, they cannot do so directly from HCatInputFormat and are forced to use InputJobInfo, which currently does not work. Thus, we need to expose this functionality.
[jira] [Created] (HIVE-6467) Metastore DBS.OWNER_TYPE value got spaces at the end
Jason Dere created HIVE-6467: Summary: Metastore DBS.OWNER_TYPE value got spaces at the end Key: HIVE-6467 URL: https://issues.apache.org/jira/browse/HIVE-6467 Project: Hive Issue Type: Bug Components: Metastore Reporter: Jason Dere
While tinkering with the metastore upgrade scripts, I ran the following steps on a brand new Derby DB:
From derby:
{noformat}
run 'hive-schema-0.12.0.derby.sql';
run 'upgrade-0.12.0-to-0.13.0.derby.sql';
{noformat}
From Hive:
{noformat}
show tables;
{noformat}
I then hit the error below. It appears that in the metastore DBS table, the row with defaultdb was created with the value ROLE padded with spaces at the end, where the unpadded value ROLE was expected.
{noformat}
2014-02-19 14:49:19,824 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - java.lang.IllegalArgumentException: No enum const class org.apache.hadoop.hive.metastore.api.PrincipalType.ROLE
    at java.lang.Enum.valueOf(Enum.java:196)
    at org.apache.hadoop.hive.metastore.api.PrincipalType.valueOf(PrincipalType.java:14)
    at org.apache.hadoop.hive.metastore.ObjectStore.getDatabase(ObjectStore.java:521)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
    at com.sun.proxy.$Proxy7.getDatabase(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_database(HiveMetaStore.java:753)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
    at com.sun.proxy.$Proxy8.get_database(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:895)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89)
    at com.sun.proxy.$Proxy9.getDatabase(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1150)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1139)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2372)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:354)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1566)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1339)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1010)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1000)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:687)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{noformat}
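The failure reported in HIVE-6467 comes from Enum.valueOf, which only accepts a string that matches a constant name exactly, so a stored value with trailing spaces is rejected. The following is a minimal sketch of that behavior, not Hive code: the class OwnerTypeDemo, its nested PrincipalType enum (a stand-in for org.apache.hadoop.hive.metastore.api.PrincipalType), the helper methods, and the amount of padding are all assumptions made for illustration. Trimming before the lookup is shown only as one possible workaround and is not necessarily how the issue was resolved.

```java
public class OwnerTypeDemo {
    // Hypothetical stand-in for the metastore's PrincipalType enum.
    public enum PrincipalType { USER, ROLE, GROUP }

    // Strict lookup: Enum.valueOf requires an exact match on the constant name,
    // so any leading or trailing whitespace causes IllegalArgumentException.
    public static PrincipalType parseStrict(String raw) {
        return PrincipalType.valueOf(raw);
    }

    // Tolerant lookup: strip surrounding whitespace before resolving the constant.
    public static PrincipalType parseTrimmed(String raw) {
        return PrincipalType.valueOf(raw.trim());
    }

    public static void main(String[] args) {
        // Assumed example of a space-padded value as read back from the DBS table.
        String padded = "ROLE      ";
        try {
            parseStrict(padded);
        } catch (IllegalArgumentException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
        System.out.println("trimmed lookup: " + parseTrimmed(padded));
    }
}
```

Trimming on read would hide the symptom on the client side; correcting the stored value or the upgrade script that wrote it would address the data itself.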