[jira] [Created] (HIVE-10637) Cleanup TestPassProperties changes introduced due to HIVE-8696
Thiruvel Thirumoolan created HIVE-10637:
-------------------------------------------

             Summary: Cleanup TestPassProperties changes introduced due to HIVE-8696
                 Key: HIVE-10637
                 URL: https://issues.apache.org/jira/browse/HIVE-10637
             Project: Hive
          Issue Type: Test
          Components: HCatalog, Tests
            Reporter: Thiruvel Thirumoolan
            Assignee: Thiruvel Thirumoolan
            Priority: Trivial

Follow-up JIRA to clean up the test case as per recommendations from Sushanth.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325511#comment-14325511 ]

Thiruvel Thirumoolan commented on HIVE-9582:
--------------------------------------------

The test failure is unrelated to this patch. Review request @ https://reviews.apache.org/r/31152/

> HCatalog should use IMetaStoreClient interface
> ----------------------------------------------
>
>                 Key: HIVE-9582
>                 URL: https://issues.apache.org/jira/browse/HIVE-9582
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HCatalog, Metastore
>    Affects Versions: 0.14.0, 0.13.1
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>              Labels: hcatalog, metastore, rolling_upgrade
>             Fix For: 0.14.1
>
>         Attachments: HIVE-9582.1.patch, HIVE-9582.2.patch, HIVE-9582.3.patch, HIVE-9583.1.patch
>
> Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. Hence, during a failure the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly, so failures are costly, especially if they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore server.
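For context, the retry behavior the issue refers to can be sketched with a plain Java dynamic proxy, which is the general technique RetryingMetaStoreClient is built on. This is an illustrative sketch only: the `MetaStoreClient` interface and `retryingProxy` helper below are simplified stand-ins invented for this example, not Hive's actual classes or signatures.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class RetrySketch {

    // Simplified stand-in for an IMetaStoreClient-style interface.
    public interface MetaStoreClient {
        String getTable(String name) throws Exception;
    }

    // Wraps a client behind a dynamic proxy so every call is retried on failure,
    // instead of failing the whole job on one transient connection error.
    public static MetaStoreClient retryingProxy(MetaStoreClient delegate, int maxAttempts) {
        InvocationHandler handler = (proxy, method, args) -> {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    // Unwrap the real failure (e.g. a dropped connection) and retry.
                    last = (Exception) e.getCause();
                }
            }
            throw last; // all attempts exhausted
        };
        return (MetaStoreClient) Proxy.newProxyInstance(
                MetaStoreClient.class.getClassLoader(),
                new Class<?>[] {MetaStoreClient.class}, handler);
    }
}
```

A caller keeps programming against the interface; only the construction site changes, which is why coding HCatalog against the interface (rather than the concrete HiveMetaStoreClient) makes retries easy to slot in.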
[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9582:
---------------------------------------
    Attachment: HIVE-9582.3.patch

Attaching a rebased patch for the precommit tests to run.
[jira] [Commented] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325344#comment-14325344 ]

Thiruvel Thirumoolan commented on HIVE-9508:
--------------------------------------------

Review request @ https://reviews.apache.org/r/31146/

> MetaStore client socket connection should have a lifetime
> ---------------------------------------------------------
>
>                 Key: HIVE-9508
>                 URL: https://issues.apache.org/jira/browse/HIVE-9508
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CLI, Metastore
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>              Labels: metastore, rolling_upgrade
>             Fix For: 1.2.0
>
>         Attachments: HIVE-9508.1.patch, HIVE-9508.2.patch, HIVE-9508.3.patch
>
> Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore server until the connection is closed or there is a problem. I would like to introduce the concept of a MetaStore client socket lifetime: the client will reconnect once the socket lifetime is reached. This will help during a rolling upgrade of the Metastore.
> When there are multiple Metastore servers behind a VIP (load balancer), it is easy to take one server out of rotation, wait 10+ minutes until all existing connections die down (if the lifetime is, say, 5 minutes), and then update the server.
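The idea described above can be sketched in a few lines: the client remembers when its connection was opened and transparently reconnects once a configured lifetime elapses, so connections to a drained server die off on their own. This is a minimal sketch of the proposed concept, not Hive's actual implementation; the `LifetimeClient` class and its methods are invented for illustration.

```java
public class LifetimeClient {
    private final long lifetimeMillis; // configured socket lifetime
    private long connectedAt;          // when the current connection was opened
    private int connectCount = 0;      // exposed so the behavior is observable

    public LifetimeClient(long lifetimeMillis) {
        this.lifetimeMillis = lifetimeMillis;
        connect();
    }

    private void connect() {
        // A real client would open a Thrift socket here, re-resolving the VIP
        // so a reconnect can land on a different server behind the load balancer.
        connectCount++;
        connectedAt = System.currentTimeMillis();
    }

    // Would be called at the top of every RPC method.
    public void reconnectIfExpired() {
        if (System.currentTimeMillis() - connectedAt >= lifetimeMillis) {
            connect();
        }
    }

    public int getConnectCount() {
        return connectCount;
    }
}
```

Reconnecting between RPCs (rather than killing live connections server-side) is what makes the rolling upgrade graceful: in-flight calls finish on the old server, and the next call after expiry lands wherever the VIP routes it.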
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9508:
---------------------------------------
    Attachment: HIVE-9508.3.patch

Uploading a patch with sane defaults.
[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9582:
---------------------------------------
    Attachment: HIVE-9582.2.patch

Updated patch.
[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9582:
---------------------------------------
    Attachment: HIVE-9582.1.patch

Uploading a patch that applies cleanly on trunk and with the right file name for the tests to run.
[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9582:
---------------------------------------
    Attachment: HIVE-9583.1.patch

Uploading a WIP patch.
[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9582:
---------------------------------------
        Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9508:
---------------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: HIVE-9583
[jira] [Updated] (HIVE-9582) HCatalog should use IMetaStoreClient interface
[ https://issues.apache.org/jira/browse/HIVE-9582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9582:
---------------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: HIVE-9583
[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-8696:
---------------------------------------
    Issue Type: Sub-task  (was: Bug)
        Parent: HIVE-9583

> HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
> -------------------------------------------------------------
>
>                 Key: HIVE-8696
>                 URL: https://issues.apache.org/jira/browse/HIVE-8696
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HCatalog, Metastore
>    Affects Versions: 0.12.0, 0.13.1
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>             Fix For: 1.2.0
>
>         Attachments: HIVE-8696.1.patch
>
> The HCatClientHMSImpl doesn't use a RetryingHiveMetastoreClient. Users of the HCatClient API that log in through keytabs will fail without retry when their TGTs expire.
> The fix is inbound.
[jira] [Created] (HIVE-9583) Rolling upgrade of Hive MetaStore Server
Thiruvel Thirumoolan created HIVE-9583:
---------------------------------------

             Summary: Rolling upgrade of Hive MetaStore Server
                 Key: HIVE-9583
                 URL: https://issues.apache.org/jira/browse/HIVE-9583
             Project: Hive
          Issue Type: Improvement
          Components: HCatalog, Metastore
    Affects Versions: 0.14.0
            Reporter: Thiruvel Thirumoolan
            Assignee: Thiruvel Thirumoolan
             Fix For: 1.2.0

This is an umbrella JIRA to track all rolling-upgrade JIRAs w.r.t. the MetaStore server. This will be helpful for users deploying the Metastore server and connecting to it with the HCatalog or Hive CLI interface.
[jira] [Created] (HIVE-9582) HCatalog should use IMetaStoreClient interface
Thiruvel Thirumoolan created HIVE-9582:
---------------------------------------

             Summary: HCatalog should use IMetaStoreClient interface
                 Key: HIVE-9582
                 URL: https://issues.apache.org/jira/browse/HIVE-9582
             Project: Hive
          Issue Type: Improvement
          Components: HCatalog, Metastore
    Affects Versions: 0.13.1, 0.14.0
            Reporter: Thiruvel Thirumoolan
            Assignee: Thiruvel Thirumoolan
             Fix For: 0.14.1

Hive uses IMetaStoreClient, which makes using RetryingMetaStoreClient easy. Hence, during a failure the client retries and possibly succeeds. But HCatalog has long been using HiveMetaStoreClient directly, so failures are costly, especially if they occur during the commit stage of a job. It's also not possible to do a rolling upgrade of the MetaStore server.
[jira] [Updated] (HIVE-8696) HCatClientHMSImpl doesn't use a Retrying-HiveMetastoreClient.
[ https://issues.apache.org/jira/browse/HIVE-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-8696:
---------------------------------------
    Fix Version/s: 1.2.0
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9508:
---------------------------------------
    Attachment: HIVE-9508.2.patch

Uploading another version of the patch with the functionality enabled and a minor bug fix.
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9508:
---------------------------------------
        Status: Patch Available  (was: Open)
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9508:
---------------------------------------
    Attachment: HIVE-9508.1.patch

Attaching a basic patch. The connection lifetime is disabled by default, so existing users should not be affected.
[jira] [Updated] (HIVE-9508) MetaStore client socket connection should have a lifetime
[ https://issues.apache.org/jira/browse/HIVE-9508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-9508:
---------------------------------------
    Fix Version/s:     (was: 0.15.0)
                   1.2.0
[jira] [Created] (HIVE-9508) MetaStore client socket connection should have a lifetime
Thiruvel Thirumoolan created HIVE-9508:
---------------------------------------

             Summary: MetaStore client socket connection should have a lifetime
                 Key: HIVE-9508
                 URL: https://issues.apache.org/jira/browse/HIVE-9508
             Project: Hive
          Issue Type: Improvement
          Components: CLI, Metastore
            Reporter: Thiruvel Thirumoolan
            Assignee: Thiruvel Thirumoolan
             Fix For: 0.15.0

Currently HiveMetaStoreClient (or SessionHMSC) stays connected to one Metastore server until the connection is closed or there is a problem. I would like to introduce the concept of a MetaStore client socket lifetime: the client will reconnect once the socket lifetime is reached. This will help during a rolling upgrade of the Metastore.

When there are multiple Metastore servers behind a VIP (load balancer), it is easy to take one server out of rotation, wait 10+ minutes until all existing connections die down (if the lifetime is, say, 5 minutes), and then update the server.
[jira] [Updated] (HIVE-6090) Audit logs for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-6090:
---------------------------------------
    Fix Version/s: 0.15.0
           Labels: audit hiveserver  (was: )
           Status: Patch Available  (was: Open)

> Audit logs for HiveServer2
> --------------------------
>
>                 Key: HIVE-6090
>                 URL: https://issues.apache.org/jira/browse/HIVE-6090
>             Project: Hive
>          Issue Type: Improvement
>          Components: Diagnosability, HiveServer2
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>              Labels: hiveserver, audit
>             Fix For: 0.15.0
>
>         Attachments: HIVE-6090.1.WIP.patch, HIVE-6090.1.patch, HIVE-6090.patch
>
> HiveMetastore has audit logs, and we would like to audit all queries or requests to HiveServer2 as well. This will help in understanding how the APIs were used, the queries submitted, the users, etc.
[jira] [Updated] (HIVE-6090) Audit logs for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thiruvel Thirumoolan updated HIVE-6090:
---------------------------------------
    Attachment: HIVE-6090.1.patch

Uploading the patch for unit tests to run. TestJdbcDriver2 passed with the changes.
[jira] [Created] (HIVE-8417) round(decimal, negative) errors out/wrong results with reduce side vectorization
Thiruvel Thirumoolan created HIVE-8417:
---------------------------------------

             Summary: round(decimal, negative) errors out/wrong results with reduce-side vectorization
                 Key: HIVE-8417
                 URL: https://issues.apache.org/jira/browse/HIVE-8417
             Project: Hive
          Issue Type: Bug
          Components: Vectorization
    Affects Versions: 0.14.0
            Reporter: Thiruvel Thirumoolan
            Assignee: Jitendra Nath Pandey
            Priority: Critical

With reduce-side vectorization enabled, the round UDF fails when it takes a decimal value and a negative argument. It passes when there is no reducer or when vectorization is turned off.

Simulated with:

  create table decimal_tbl (dec decimal(10,0));

Data: just one record, "101".

Query:

  select dec, round(dec, -1) from decimal_tbl order by dec;

This query fails with text and rcfile with an IndexOutOfBoundsException in Decimal128.toFormalString(), but returns "101 101" with orc. When the order by is removed, it returns "101 100" with orc and rc. When "order by dec" is replaced with "order by round(dec, -1)", it fails with the same exception with orc too.

Following is the exception thrown:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) [Error getting row data with exception java.lang.IndexOutOfBoundsException: start 0, end 3, s.length() 2
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:476)
	at java.lang.StringBuilder.append(StringBuilder.java:191)
	at org.apache.hadoop.hive.common.type.Decimal128.toFormalString(Decimal128.java:1858)
	at org.apache.hadoop.hive.common.type.Decimal128.toBigDecimal(Decimal128.java:1733)
	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$1.writeValue(VectorExpressionWriterFactory.java:469)
	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterDecimal.writeValue(VectorExpressionWriterFactory.java:310)
	at org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch.toString(VectorizedRowBatch.java:159)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:371)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:250)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:168)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:164)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:176)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:168)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:163)
	at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
	at java.util.concurrent.FutureTask.run(FutureTask.java:166)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:722) ]
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:376)
	at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:250)
	... 16 more
Caused by: java.lang.IndexOutOfBoundsException: start 0, end 3, s.length() 2
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:476)
	at java.lang.StringBuilder.append(StringBuilder.java:191)
	at org.apache.hadoop.hive.common.type.Decimal128.toFormalString(Decimal128.java:1858)
	at org.apache.hadoop.hive.common.type.Decimal128.toBigDecimal(Decimal128.java:1733)
	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$1.writeValue(VectorExpressionWriterFactory.java:469)
	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterDecimal.writeValue(VectorExpressionWriterFactory.java:310)
	at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpressionWriterFactory$VectorExpressionWriterSetter.writeValue(VectorExpressionWriterFactory.java:1153)
	at o
[jira] [Commented] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition
[ https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162873#comment-14162873 ]

Thiruvel Thirumoolan commented on HIVE-8371:
--------------------------------------------

Having a warehouse-level property that basically sets the default value of the immutable property will only help restore the HCatalog behavior. Isn't that going to flip the behavior of Hive? I am very concerned that the default HCatStorer behavior has changed after it has been out for a very long time.

> HCatStorer should fail by default when publishing to an existing partition
> --------------------------------------------------------------------------
>
>                 Key: HIVE-8371
>                 URL: https://issues.apache.org/jira/browse/HIVE-8371
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.13.0, 0.14.0, 0.13.1
>            Reporter: Thiruvel Thirumoolan
>            Assignee: Thiruvel Thirumoolan
>              Labels: hcatalog, partition
>
> In Hive 0.12 and before (and in previous HCatalog releases), HCatStorer would fail if the partition already exists (whether before launching the job or during commit, depending on the partitioning). HIVE-6406 changed that behavior and by default does an append. This causes data-quality issues since a rerun (or duplicate run) won't fail (when it used to) and will just append to the partition.
> A preferable approach would be to leave the HCatStorer behavior as is (fail during a duplicate publish) and support append through an option. Overwrite can also be implemented in a similar fashion. E.g.:
> store A into 'db.table' using org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append');
[jira] [Commented] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition
[ https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161227#comment-14161227 ]

Thiruvel Thirumoolan commented on HIVE-8371:
--------------------------------------------

[~sushanth] Let me know what you think about this.
[jira] [Assigned] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition
[ https://issues.apache.org/jira/browse/HIVE-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HIVE-8371: -- Assignee: Thiruvel Thirumoolan > HCatStorer should fail by default when publishing to an existing partition > -- > > Key: HIVE-8371 > URL: https://issues.apache.org/jira/browse/HIVE-8371 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 0.13.0, 0.14.0, 0.13.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Labels: hcatalog, partition > > In Hive-12 and before (on in previous HCatalog releases) HCatStorer would > fail if the partition already exists (whether before launching the job or > during commit depending on the partitioning). HIVE-6406 changed that behavior > and by default does an append. This causes data quality issues since an rerun > (or duplicate run) won't fail (when it used to) and will just append to the > partition. > A preferable approach would be to leave HCatStorer behavior as is (fail > during a duplicate publish) and support append through an option. Overwrite > also can be implemented in a similar fashion. Eg: > store A into 'db.table' using > org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append'); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8371) HCatStorer should fail by default when publishing to an existing partition
Thiruvel Thirumoolan created HIVE-8371: -- Summary: HCatStorer should fail by default when publishing to an existing partition Key: HIVE-8371 URL: https://issues.apache.org/jira/browse/HIVE-8371 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1, 0.13.0, 0.14.0 Reporter: Thiruvel Thirumoolan In Hive-12 and before (or in previous HCatalog releases) HCatStorer would fail if the partition already exists (whether before launching the job or during commit depending on the partitioning). HIVE-6406 changed that behavior and by default does an append. This causes data quality issues since a rerun (or duplicate run) won't fail (when it used to) and will just append to the partition. A preferable approach would be to leave HCatStorer behavior as is (fail during a duplicate publish) and support append through an option. Overwrite also can be implemented in a similar fashion. Eg: store A into 'db.table' using org.apache.hive.hcatalog.pig.HCatStorer('partspec', '', ' -append'); -- This message was sent by Atlassian JIRA (v6.3.4#6332)
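The fail-by-default semantics proposed above (with append and overwrite as explicit opt-ins) can be sketched as a small model. This is illustrative Python pseudologic, not HCatalog code; the `publish` function and its `mode` argument are hypothetical names for the behavior described in the report:

```python
# Illustrative model of the publish semantics proposed in HIVE-8371:
# by default a duplicate publish fails; append/overwrite must be requested.

class DuplicatePublishError(Exception):
    """Raised when a partition is published twice without an explicit mode."""

def publish(table, partspec, rows, mode="fail"):
    """Publish rows into table[partspec]; mode is 'fail', 'append', or 'overwrite'."""
    if partspec in table:
        if mode == "fail":
            raise DuplicatePublishError("partition %s already exists" % partspec)
        if mode == "append":
            table[partspec] = table[partspec] + list(rows)
        elif mode == "overwrite":
            table[partspec] = list(rows)
        else:
            raise ValueError("unknown mode: %s" % mode)
    else:
        table[partspec] = list(rows)
    return table[partspec]
```

With this default, a duplicate or accidentally re-run job surfaces as an error instead of silently appending, which is the data-quality concern raised in the description.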
[jira] [Resolved] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan resolved HIVE-8264. Resolution: Duplicate > Math UDFs in Reducer-with-vectorization fail with > ArrayIndexOutOfBoundsException > > > Key: HIVE-8264 > URL: https://issues.apache.org/jira/browse/HIVE-8264 > Project: Hive > Issue Type: Bug > Components: Tez, UDF, Vectorization >Affects Versions: 0.14.0 > Environment: Hive trunk - as of today > Tez - 0.5.0 > Hadoop - 2.5 >Reporter: Thiruvel Thirumoolan > Labels: mathfunction, tez, vectorization > > Following queries are representative of the exceptions we are seeing with > trunk. These queries pass if vectorization is disabled (or if limit is > removed, which means no reducer). > select name, log2(0) from (select name from mytable limit 1) t; > select name, rand() from (select name from mytable limit 1) t; > .. similar patterns with other Math UDFs'. > Exception: > ], TaskAttempt 3 failed, info=[Error: Failure while running > task:java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172) > at > 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154) > ... 14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242) > ... 16 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating > null > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347) > ... 
17 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125) > ... 22 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148981#comment-14148981 ] Thiruvel Thirumoolan commented on HIVE-8264: Thanks [~mmccline], it appears to fix the problem. After applying the patch in HIVE-8171, I don't see any exceptions and the query runs fine. > Math UDFs in Reducer-with-vectorization fail with > ArrayIndexOutOfBoundsException > > > Key: HIVE-8264 > URL: https://issues.apache.org/jira/browse/HIVE-8264 > Project: Hive > Issue Type: Bug > Components: Tez, UDF, Vectorization >Affects Versions: 0.14.0 > Environment: Hive trunk - as of today > Tez - 0.5.0 > Hadoop - 2.5 >Reporter: Thiruvel Thirumoolan > Labels: mathfunction, tez, vectorization > > Following queries are representative of the exceptions we are seeing with > trunk. These queries pass if vectorization is disabled (or if limit is > removed, which means no reducer). > select name, log2(0) from (select name from mytable limit 1) t; > select name, rand() from (select name from mytable limit 1) t; > .. similar patterns with other Math UDFs'. 
> Exception: > ], TaskAttempt 3 failed, info=[Error: Failure while running > task:java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154) > ... 
14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242) > ... 16 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating > null > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347) > ... 17 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125) > ... 22 more
[jira] [Commented] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14148537#comment-14148537 ] Thiruvel Thirumoolan commented on HIVE-8264: [~mmccline] Have you seen this problem before or will it be addressed by any of the open jiras you are looking into? > Math UDFs in Reducer-with-vectorization fail with > ArrayIndexOutOfBoundsException > > > Key: HIVE-8264 > URL: https://issues.apache.org/jira/browse/HIVE-8264 > Project: Hive > Issue Type: Bug > Components: Tez, UDF, Vectorization >Affects Versions: 0.14.0 > Environment: Hive trunk - as of today > Tez - 0.5.0 > Hadoop - 2.5 >Reporter: Thiruvel Thirumoolan > Labels: mathfunction, tez, vectorization > > Following queries are representative of the exceptions we are seeing with > trunk. These queries pass if vectorization is disabled (or if limit is > removed, which means no reducer). > select name, log2(0) from (select name from mytable limit 1) t; > select name, rand() from (select name from mytable limit 1) t; > .. similar patterns with other Math UDFs'. 
> Exception: > ], TaskAttempt 3 failed, info=[Error: Failure while running > task:java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154) > ... 
14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242) > ... 16 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating > null > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347) > ... 17 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125) > ... 22 more -- This message was sent by A
[jira] [Updated] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-8264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-8264: --- Tags: vectorization,math,udf,tez Assignee: (was: Jitendra Nath Pandey) Labels: mathfunction tez vectorization (was: ) > Math UDFs in Reducer-with-vectorization fail with > ArrayIndexOutOfBoundsException > > > Key: HIVE-8264 > URL: https://issues.apache.org/jira/browse/HIVE-8264 > Project: Hive > Issue Type: Bug > Components: Tez, UDF, Vectorization >Affects Versions: 0.14.0 > Environment: Hive trunk - as of today > Tez - 0.5.0 > Hadoop - 2.5 >Reporter: Thiruvel Thirumoolan > Labels: mathfunction, tez, vectorization > > Following queries are representative of the exceptions we are seeing with > trunk. These queries pass if vectorization is disabled (or if limit is > removed, which means no reducer). > select name, log2(0) from (select name from mytable limit 1) t; > select name, rand() from (select name from mytable limit 1) t; > .. similar patterns with other Math UDFs'. 
> Exception: > ], TaskAttempt 3 failed, info=[Error: Failure while running > task:java.lang.RuntimeException: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142) > at > org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172) > at > org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:722) > Caused by: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167) > at > org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154) > ... 
14 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime > Error while processing vector batch (tag=0) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242) > ... 16 more > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating > null > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) > at > org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347) > ... 17 more > Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102) > at > org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150) > at > org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125) > ... 22 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8264) Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException
Thiruvel Thirumoolan created HIVE-8264: -- Summary: Math UDFs in Reducer-with-vectorization fail with ArrayIndexOutOfBoundsException Key: HIVE-8264 URL: https://issues.apache.org/jira/browse/HIVE-8264 Project: Hive Issue Type: Bug Components: Tez, UDF, Vectorization Affects Versions: 0.14.0 Environment: Hive trunk - as of today Tez - 0.5.0 Hadoop - 2.5 Reporter: Thiruvel Thirumoolan Assignee: Jitendra Nath Pandey Following queries are representative of the exceptions we are seeing with trunk. These queries pass if vectorization is disabled (or if limit is removed, which means no reducer). select name, log2(0) from (select name from mytable limit 1) t; select name, rand() from (select name from mytable limit 1) t; .. similar patterns with other Math UDFs'. Exception: ], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:177) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:142) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:722) Caused by: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:254) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:167) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:154) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:360) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:242) ... 16 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating null at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:127) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) at org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator.processOp(VectorLimitOperator.java:47) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:801) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:139) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.processVectors(ReduceRecordSource.java:347) ... 
17 more Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:102) at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:150) at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:125) ... 22 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
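Since the report notes that these queries pass when vectorization is disabled, a session-level workaround until a fix (e.g. the patch from HIVE-8171 referenced above) is applied would be to turn vectorized execution off. `hive.vectorized.execution.enabled` is the standard Hive setting for this; the query below is one of the failing examples from the report:

```sql
-- Workaround sketch: the failing queries pass with vectorization disabled.
set hive.vectorized.execution.enabled=false;
select name, log2(0) from (select name from mytable limit 1) t;
```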
[jira] [Updated] (HIVE-6090) Audit logs for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-6090: --- Attachment: HIVE-6090.1.WIP.patch Uploading a work-in-progress patch that should apply cleanly. Will test against a live cluster (kerberos) and submit for precommit tests. > Audit logs for HiveServer2 > -- > > Key: HIVE-6090 > URL: https://issues.apache.org/jira/browse/HIVE-6090 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-6090.1.WIP.patch, HIVE-6090.patch > > > HiveMetastore has audit logs and would like to audit all queries or requests > to HiveServer2 also. This will help in understanding how the APIs were used, > queries submitted, users etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6090) Audit logs for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132902#comment-14132902 ] Thiruvel Thirumoolan commented on HIVE-6090: Thanks [~farisa], will rebase and upload. > Audit logs for HiveServer2 > -- > > Key: HIVE-6090 > URL: https://issues.apache.org/jira/browse/HIVE-6090 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-6090.patch > > > HiveMetastore has audit logs and would like to audit all queries or requests > to HiveServer2 also. This will help in understanding how the APIs were used, > queries submitted, users etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7604) Add Metastore API to fetch one or more partition names
[ https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-7604: --- Attachment: Design_HIVE_7604.1.txt Thanks [~ashutoshc], uploading a revised document with additional information for return values. Let me know if it's unclear. > Add Metastore API to fetch one or more partition names > -- > > Key: HIVE-7604 > URL: https://issues.apache.org/jira/browse/HIVE-7604 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.14.0 > > Attachments: Design_HIVE_7604.1.txt, Design_HIVE_7604.txt > > > We need a new API in Metastore to address the following use cases. Both use > cases arise from having tables with hundreds of thousands or in some cases > millions of partitions. > 1. It should be quick and easy to obtain distinct values of a partition. Eg: > Obtain all dates for which partitions are available. This can be used by > tools/frameworks programmatically to understand gaps in partitions before > reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to > obtain this information which is unfriendly and heavy weight. And for tables > which have large number of partitions, it takes a long time to run the > queries and it also requires large heap space. > 2. Typically users would like to know the list of partitions available and > would run queries that would only involve partition keys (select distinct > partkey1 from table) Or to obtain the latest date partition from a dimension > table to join against another fact table (select * from fact_table join > select max(dt) from dimension_table). Those queries (metadata only queries) > can be pushed to metastore and need not be run even locally in Hive. If the > queries can be converted into database based queries, the clients can be > light weight and need not fetch all partition names. 
The results can be > obtained much faster with less resources. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names
[ https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111437#comment-14111437 ] Thiruvel Thirumoolan commented on HIVE-7604: [~ashutoshc] Do you have any comments on the API? > Add Metastore API to fetch one or more partition names > -- > > Key: HIVE-7604 > URL: https://issues.apache.org/jira/browse/HIVE-7604 > Project: Hive > Issue Type: New Feature > Components: Metastore >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.14.0 > > Attachments: Design_HIVE_7604.txt > > > We need a new API in Metastore to address the following use cases. Both use > cases arise from having tables with hundreds of thousands or in some cases > millions of partitions. > 1. It should be quick and easy to obtain distinct values of a partition. Eg: > Obtain all dates for which partitions are available. This can be used by > tools/frameworks programmatically to understand gaps in partitions before > reprocessing them. Currently one has to run Hive queries (JDBC or CLI) to > obtain this information which is unfriendly and heavy weight. And for tables > which have large number of partitions, it takes a long time to run the > queries and it also requires large heap space. > 2. Typically users would like to know the list of partitions available and > would run queries that would only involve partition keys (select distinct > partkey1 from table) Or to obtain the latest date partition from a dimension > table to join against another fact table (select * from fact_table join > select max(dt) from dimension_table). Those queries (metadata only queries) > can be pushed to metastore and need not be run even locally in Hive. If the > queries can be converted into database based queries, the clients can be > light weight and need not fetch all partition names. The results can be > obtained much faster with less resources. -- This message was sent by Atlassian JIRA (v6.2#6252)
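Use case 1 above (finding gaps in date partitions before reprocessing) can be illustrated with a small client-side sketch. This is hypothetical Python, not the proposed metastore API: it assumes partition names of the form `dt=YYYY-MM-DD`, such as a name-listing call might return, and the function name `partition_gaps` is made up for illustration:

```python
from datetime import date, timedelta

def partition_gaps(partition_names, key="dt"):
    """Given names like 'dt=2014-08-01', return the dates missing from the range."""
    prefix = key + "="
    days = sorted(
        date.fromisoformat(name[len(prefix):])
        for name in partition_names
        if name.startswith(prefix)
    )
    if not days:
        return []
    have = set(days)
    missing, cur = [], days[0]
    # Walk day by day from the earliest to the latest observed partition.
    while cur <= days[-1]:
        if cur not in have:
            missing.append(cur.isoformat())
        cur += timedelta(days=1)
    return missing
```

A lightweight API that returns just the (distinct) partition names would let tools run this kind of check without issuing a Hive query or fetching full partition objects.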
[jira] [Commented] (HIVE-6093) table creation should fail when user does not have permissions on db
[ https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102368#comment-14102368 ] Thiruvel Thirumoolan commented on HIVE-6093: Thanks [~thejas] > table creation should fail when user does not have permissions on db > > > Key: HIVE-6093 > URL: https://issues.apache.org/jira/browse/HIVE-6093 > Project: Hive > Issue Type: Bug > Components: Authorization, HCatalog, Metastore >Affects Versions: 0.12.0, 0.13.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Labels: authorization, metastore, security > Fix For: 0.14.0 > > Attachments: HIVE-6093-1.patch, HIVE-6093.1.patch, HIVE-6093.1.patch, > HIVE-6093.patch > > > Its possible to create a table under a database where the user does not have > write permission. It can be done by specifying a LOCATION where the user has > write access (say /tmp/foo). This should be restricted. > HdfsAuthorizationProvider (which typically runs on client) checks the > database directory during table creation. But > StorageBasedAuthorizationProvider does not. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6093) table creation should fail when user does not have permissions on db
[ https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-6093: --- Fix Version/s: 0.14.0 Labels: authorization metastore security (was: ) Affects Version/s: 0.12.0 0.13.0 Release Note: One cannot create a table (whether or not they provide a LOCATION) if they do not have WRITE permission on the database directory. Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-6093) table creation should fail when user does not have permissions on db
[ https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097401#comment-14097401 ] Thiruvel Thirumoolan commented on HIVE-6093: Review request @ https://reviews.apache.org/r/24705/
[jira] [Commented] (HIVE-6089) Add metrics to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096508#comment-14096508 ] Thiruvel Thirumoolan commented on HIVE-6089: Hi [~jaideepdhok]. Sorry about the delay. I was hoping to get back on this. We have been using metrics internally and would like to update this patch with what we have learnt. > Add metrics to HiveServer2 > -- > > Key: HIVE-6089 > URL: https://issues.apache.org/jira/browse/HIVE-6089 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, HiveServer2 >Affects Versions: 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.14.0 > > Attachments: HIVE-6089_prototype.patch > > > Would like to collect metrics about HiveServer's usage, like active > connections, total requests etc. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names
[ https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096002#comment-14096002 ] Thiruvel Thirumoolan commented on HIVE-7604: Thanks [~sershe]. I will reuse as much as possible.
[jira] [Updated] (HIVE-7604) Add Metastore API to fetch one or more partition names
[ https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-7604: --- Attachment: Design_HIVE_7604.txt Attaching a file that describes the API and the rationale behind it. I have an alpha implementation which obtains distinct values of partition keys. To start with, this is ORM-only and its approach is very similar to ExpressionTree.java (using the substring and indexOf string functions). Tested this with a table containing about a million partitions, partitioned by 6 keys and using Oracle as the backend. It takes 2-4 seconds to obtain the unique values of a partition key. Hope this provides a rough idea of latency for large tables.
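The substring/indexOf approach mentioned in the comment above can be pictured with a small stand-alone sketch. This is illustrative only — not Hive's ObjectStore/ExpressionTree code, and the class and method names are made up. It shows how the distinct values of one partition key can be pulled out of partition names of the form `key1=val1/key2=val2` using nothing but the string functions a database could also evaluate:

```java
// Hypothetical stand-alone sketch -- NOT Hive's ObjectStore/ExpressionTree code.
// Extracts the distinct values of one partition key from partition names of
// the form "key1=val1/key2=val2", using only indexOf/substring, i.e. the kind
// of string functions the comment above says the database can evaluate.
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctPartitionValues {

    // Returns the distinct values of `key` across the given partition names.
    public static Set<String> distinctValues(List<String> partitionNames, String key) {
        Set<String> values = new LinkedHashSet<>();
        String marker = "/" + key + "=";
        for (String name : partitionNames) {
            String padded = "/" + name;          // so every key is preceded by '/'
            int idx = padded.indexOf(marker);
            if (idx < 0) {
                continue;                        // key not present in this name
            }
            int start = idx + marker.length();
            int end = padded.indexOf('/', start);
            values.add(end < 0 ? padded.substring(start) : padded.substring(start, end));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "dt=2014-08-01/country=US",
                "dt=2014-08-01/country=IN",
                "dt=2014-08-02/country=US");
        System.out.println(distinctValues(names, "dt"));      // [2014-08-01, 2014-08-02]
        System.out.println(distinctValues(names, "country")); // [US, IN]
    }
}
```

A real ORM implementation would express the same substring arithmetic in the datastore query so only the distinct values travel back to the client, which is what keeps the latency low for million-partition tables.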
[jira] [Updated] (HIVE-6093) table creation should fail when user does not have permissions on db
[ https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-6093: --- Attachment: HIVE-6093-1.patch Updated patch for trunk and also added unit tests. Unit tests TestMetastoreAuthorizationProvider and TestStorageBasedMetastoreAuthorizationProvider passed. Running complete suite.
[jira] [Commented] (HIVE-7604) Add Metastore API to fetch one or more partition names
[ https://issues.apache.org/jira/browse/HIVE-7604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085064#comment-14085064 ] Thiruvel Thirumoolan commented on HIVE-7604: [~ashutoshc] Thanks. I will post an API signature today.
[jira] [Created] (HIVE-7604) Add Metastore API to fetch one or more partition names
Thiruvel Thirumoolan created HIVE-7604: -- Summary: Add Metastore API to fetch one or more partition names Key: HIVE-7604 URL: https://issues.apache.org/jira/browse/HIVE-7604 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.14.0
[jira] [Commented] (HIVE-7558) HCatLoader reuses credentials across jobs
[ https://issues.apache.org/jira/browse/HIVE-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081476#comment-14081476 ] Thiruvel Thirumoolan commented on HIVE-7558: Thanks [~daijy]. Review link @ https://reviews.apache.org/r/24163/ > HCatLoader reuses credentials across jobs > - > > Key: HIVE-7558 > URL: https://issues.apache.org/jira/browse/HIVE-7558 > Project: Hive > Issue Type: Bug > Components: HCatalog >Affects Versions: 0.13.1 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.14.0 > > Attachments: HIVE-7558.patch > > > HCatLoader reuses credentials of stage1 in stage2 for some of the pig > queries. This causes stage-2 to fail if stage-2 runs for more than 10 mins. > Pig queries which load data using HCatLoader, filter only by partition > columns and do an order by will run into this problem. Exceptions will be > very similar to the following: > 2014-07-22 17:28:49,337 [main] ERROR org.apache.pig.tools.grunt.GruntParser - > ERROR 2997: Unable to recreate exception from backed error: > AttemptID: Info:RemoteTrace: > org.apache.hadoop.security.token.SecretManager$InvalidToken: token > (HDFS_DELEGATION_TOKEN token for ) can't be found in cache > at org.apache.hadoop.ipc.Client.call(Client.java:1095) > at > org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:195) > at $Proxy7.getFileInfo(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:601) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:102) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:67) > at $Proxy7.getFileInfo(Unknown Source) > at 
org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1305) > at > org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:734) > at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:176) > at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:51) > at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:284) > at org.apache.hadoop.yarn.util.FSDownload$1.run(FSDownload.java:282) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1300) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:281) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:51) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) > at java.util.concurrent.FutureTask.run(FutureTask.java:166) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > at java.lang.Thread.run(Thread.java:722) > at LocalTrace: > org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl: > token (HDFS_DELEGATION_TOKEN token for ) can't be found in > cache > at > org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.convertFromProtoFormat(LocalResourceStatusPBImpl.java:217) > at > org.apache.hadoop.yarn.server.nodemanager.api.protocolrecords.impl.pb.LocalResourceStatusPBImpl.getException(LocalResourceStatusPBImpl.java:147) > at > 
org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerRunner.update(ResourceLocalizationService.java:823) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService$LocalizerTracker.processHeartbeat(ResourceLocalizationService.java:497) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.heartbeat(ResourceLocalizationService.java:224) > at > org.apache.hadoop.yarn.server.nodemanager.api.impl.pb.service.LocalizationProtocolPBServiceImpl.heartbeat(LocalizationProtocolPBServiceImpl.java:46) > at > org.apache.hadoop.yarn.proto.LocalizationProtocol$LocalizationProtocolServic
[jira] [Updated] (HIVE-7558) HCatLoader reuses credentials across jobs
[ https://issues.apache.org/jira/browse/HIVE-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-7558: --- Attachment: HIVE-7558.patch Attaching patch. Do not copy job's credentials in HCatLoader's objects.
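The stale-token failure fixed here comes from snapshotting credentials that are later replaced. The sketch below is an illustrative stand-in — plain maps instead of Hadoop's `Credentials`, and the class names are invented, not the actual HCatLoader code. It contrasts the buggy copy-once pattern with the fix of always reading from the live job, so the token issued for stage 2 is actually the one used:

```java
// Illustrative stand-in -- NOT the actual HCatLoader/Hadoop code. Plain maps
// play the role of Hadoop's Credentials; class names are hypothetical. It
// contrasts the buggy "copy the job's credentials once" pattern with the fix
// of always reading from the live job, so a renewed stage-2 token is seen.
import java.util.HashMap;
import java.util.Map;

public class CredentialReuseSketch {

    // Stand-in for a job carrying delegation tokens.
    public static class Job {
        public final Map<String, String> credentials = new HashMap<>();
    }

    // Buggy pattern: snapshots the credentials at construction time and keeps
    // serving that stale copy even after the job's tokens are replaced.
    public static class SnapshottingLoader {
        private final Map<String, String> cached;
        public SnapshottingLoader(Job job) { this.cached = new HashMap<>(job.credentials); }
        public String token(String name) { return cached.get(name); }
    }

    // Fixed pattern: holds a reference to the live job and never copies.
    public static class LiveLoader {
        private final Job job;
        public LiveLoader(Job job) { this.job = job; }
        public String token(String name) { return job.credentials.get(name); }
    }

    public static void main(String[] args) {
        Job job = new Job();
        job.credentials.put("HDFS_DELEGATION_TOKEN", "stage1-token");
        SnapshottingLoader buggy = new SnapshottingLoader(job);
        LiveLoader fixed = new LiveLoader(job);

        // Stage 2 gets a fresh token; the snapshot is now stale.
        job.credentials.put("HDFS_DELEGATION_TOKEN", "stage2-token");
        System.out.println(buggy.token("HDFS_DELEGATION_TOKEN")); // stage1-token
        System.out.println(fixed.token("HDFS_DELEGATION_TOKEN")); // stage2-token
    }
}
```

When stage 1's token is cancelled after roughly 10 minutes, the snapshotting loader presents the cancelled token and gets exactly the `InvalidToken: ... can't be found in cache` failure quoted in the issue.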
[jira] [Assigned] (HIVE-7558) HCatLoader reuses credentials across jobs
[ https://issues.apache.org/jira/browse/HIVE-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HIVE-7558: -- Assignee: Thiruvel Thirumoolan
[jira] [Created] (HIVE-7558) HCatLoader reuses credentials across jobs
Thiruvel Thirumoolan created HIVE-7558: -- Summary: HCatLoader reuses credentials across jobs Key: HIVE-7558 URL: https://issues.apache.org/jira/browse/HIVE-7558 Project: Hive Issue Type: Bug Components: HCatalog Affects Versions: 0.13.1 Reporter: Thiruvel Thirumoolan Fix For: 0.14.0
[jira] [Commented] (HIVE-6089) Add metrics to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861841#comment-13861841 ] Thiruvel Thirumoolan commented on HIVE-6089: [~jaideepdhok] Thanks for the feedback. As this is the first metrics patch, I will add everything that's straightforward. Will add others in a followup JIRA.
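The metrics HIVE-6089 asks for (active connections, total requests) could start from something as small as the following sketch. The class and method names here are hypothetical — this is not the attached prototype's API, just an illustration of thread-safe counters a server could expose:

```java
// Hypothetical sketch of the counters such a metrics patch might expose --
// the class and method names are invented, not the attached prototype's API.
import java.util.concurrent.atomic.AtomicLong;

public class HiveServerMetricsSketch {
    private final AtomicLong activeConnections = new AtomicLong();
    private final AtomicLong totalRequests = new AtomicLong();

    // Called from the connection open/close and request-dispatch paths.
    public void connectionOpened() { activeConnections.incrementAndGet(); }
    public void connectionClosed() { activeConnections.decrementAndGet(); }
    public void requestReceived()  { totalRequests.incrementAndGet(); }

    public long activeConnections() { return activeConnections.get(); }
    public long totalRequests()     { return totalRequests.get(); }

    public static void main(String[] args) {
        HiveServerMetricsSketch m = new HiveServerMetricsSketch();
        m.connectionOpened();
        m.connectionOpened();
        m.requestReceived();
        m.connectionClosed();
        System.out.println(m.activeConnections()); // 1
        System.out.println(m.totalRequests());     // 1
    }
}
```

Counters like these would then be published through whatever reporting channel the server already has (JMX, logs, or an HTTP endpoint).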
[jira] [Assigned] (HIVE-6093) table creation should fail when user does not have permissions on db
[ https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HIVE-6093: -- Assignee: Thiruvel Thirumoolan > table creation should fail when user does not have permissions on db > > > Key: HIVE-6093 > URL: https://issues.apache.org/jira/browse/HIVE-6093 > Project: Hive > Issue Type: Bug > Components: Authorization, HCatalog, Metastore >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > > Its possible to create a table under a database where the user does not have > write permission. It can be done by specifying a LOCATION where the user has > write access (say /tmp/foo). This should be restricted. > HdfsAuthorizationProvider (which typically runs on client) checks the > database directory during table creation. But > StorageBasedAuthorizationProvider does not. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-6093) table creation should fail when user does not have permissions on db
[ https://issues.apache.org/jira/browse/HIVE-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-6093: --- Attachment: HIVE-6093.patch Attaching a first approach patch. This checks the permission of database directory even if a location was specified during table creation. > table creation should fail when user does not have permissions on db > > > Key: HIVE-6093 > URL: https://issues.apache.org/jira/browse/HIVE-6093 > Project: Hive > Issue Type: Bug > Components: Authorization, HCatalog, Metastore >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Attachments: HIVE-6093.patch > > > Its possible to create a table under a database where the user does not have > write permission. It can be done by specifying a LOCATION where the user has > write access (say /tmp/foo). This should be restricted. > HdfsAuthorizationProvider (which typically runs on client) checks the > database directory during table creation. But > StorageBasedAuthorizationProvider does not. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HIVE-6093) table creation should fail when user does not have permissions on db
Thiruvel Thirumoolan created HIVE-6093: -- Summary: table creation should fail when user does not have permissions on db Key: HIVE-6093 URL: https://issues.apache.org/jira/browse/HIVE-6093 Project: Hive Issue Type: Bug Components: Authorization, HCatalog, Metastore Reporter: Thiruvel Thirumoolan Priority: Minor Its possible to create a table under a database where the user does not have write permission. It can be done by specifying a LOCATION where the user has write access (say /tmp/foo). This should be restricted. HdfsAuthorizationProvider (which typically runs on client) checks the database directory during table creation. But StorageBasedAuthorizationProvider does not. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-6091) Empty pipeout files are created for connection create/close
[ https://issues.apache.org/jira/browse/HIVE-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-6091: --- Attachment: HIVE-6091.patch Starting off with a simple approach, to delete the file on connection close. > Empty pipeout files are created for connection create/close > --- > > Key: HIVE-6091 > URL: https://issues.apache.org/jira/browse/HIVE-6091 > Project: Hive > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Attachments: HIVE-6091.patch > > > Pipeout files are created when a connection is established and removed only > when data was produced. Instead we should create them only when data has to > be fetched or remove them whether data is fetched or not. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HIVE-6091) Empty pipeout files are created for connection create/close
Thiruvel Thirumoolan created HIVE-6091: -- Summary: Empty pipeout files are created for connection create/close Key: HIVE-6091 URL: https://issues.apache.org/jira/browse/HIVE-6091 Project: Hive Issue Type: Bug Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Priority: Minor Pipeout files are created when a connection is established and removed only when data was produced. Instead we should create them only when data has to be fetched or remove them whether data is fetched or not. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-6090) Audit logs for HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-6090: --- Attachment: HIVE-6090.patch Uploading a preliminary patch based on Hive12, to start with. The implementation is pretty similar to HMS. Should refactor based on trunk with the right methods. > Audit logs for HiveServer2 > -- > > Key: HIVE-6090 > URL: https://issues.apache.org/jira/browse/HIVE-6090 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-6090.patch > > > HiveMetastore has audit logs and would like to audit all queries or requests > to HiveServer2 also. This will help in understanding how the APIs were used, > queries submitted, users etc. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HIVE-6090) Audit logs for HiveServer2
Thiruvel Thirumoolan created HIVE-6090: -- Summary: Audit logs for HiveServer2 Key: HIVE-6090 URL: https://issues.apache.org/jira/browse/HIVE-6090 Project: Hive Issue Type: Improvement Components: Diagnosability, HiveServer2 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan HiveMetastore has audit logs and would like to audit all queries or requests to HiveServer2 also. This will help in understanding how the APIs were used, queries submitted, users etc. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Updated] (HIVE-6089) Add metrics to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-6089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-6089: --- Attachment: HIVE-6089_prototype.patch Here is a preliminary patch based on branch-12. More work has to be done to port this to trunk, this is just a start. > Add metrics to HiveServer2 > -- > > Key: HIVE-6089 > URL: https://issues.apache.org/jira/browse/HIVE-6089 > Project: Hive > Issue Type: Improvement > Components: Diagnosability, HiveServer2 >Affects Versions: 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.13.0 > > Attachments: HIVE-6089_prototype.patch > > > Would like to collect metrics about HiveServer's usage, like active > connections, total requests etc. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Created] (HIVE-6089) Add metrics to HiveServer2
Thiruvel Thirumoolan created HIVE-6089: -- Summary: Add metrics to HiveServer2 Key: HIVE-6089 URL: https://issues.apache.org/jira/browse/HIVE-6089 Project: Hive Issue Type: Improvement Components: Diagnosability, HiveServer2 Affects Versions: 0.12.0 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.13.0 Would like to collect metrics about HiveServer's usage, like active connections, total requests etc. -- This message was sent by Atlassian JIRA (v6.1.4#6159)
[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query
[ https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802192#comment-13802192 ] Thiruvel Thirumoolan commented on HIVE-5268: Thanks Brock and Carl for the comments. I posted the initial patch as sort of an approach I had for branch-10, it was only a first dig at this problem. The intention is to separate the physical disconnect and session timeout as Carl mentioned. > HiveServer2 accumulates orphaned OperationHandle objects when a client fails > while executing query > -- > > Key: HIVE-5268 > URL: https://issues.apache.org/jira/browse/HIVE-5268 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Thiruvel Thirumoolan > Fix For: 0.13.0 > > Attachments: HIVE-5268_prototype.patch > > > When queries are executed against the HiveServer2 an OperationHandle object > is stored in the OperationManager.handleToOperation HashMap. Currently its > the duty of the JDBC client to explicitly close to cleanup the entry in the > map. But if the client fails to close the statement then the OperationHandle > object is never cleaned up and gets accumulated in the server. > This can potentially cause OOM on the server over time. This also can be used > as a loophole by a malicious client to bring down the Hive server. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query
[ https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13801331#comment-13801331 ] Thiruvel Thirumoolan commented on HIVE-5268: [~vgumashta] Here it is https://reviews.apache.org/r/14809/ Let me dig in and come up with a design. > HiveServer2 accumulates orphaned OperationHandle objects when a client fails > while executing query > -- > > Key: HIVE-5268 > URL: https://issues.apache.org/jira/browse/HIVE-5268 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Thiruvel Thirumoolan > Fix For: 0.13.0 > > Attachments: HIVE-5268_prototype.patch > > > When queries are executed against the HiveServer2 an OperationHandle object > is stored in the OperationManager.handleToOperation HashMap. Currently its > the duty of the JDBC client to explicitly close to cleanup the entry in the > map. But if the client fails to close the statement then the OperationHandle > object is never cleaned up and gets accumulated in the server. > This can potentially cause OOM on the server over time. This also can be used > as a loophole by a malicious client to bring down the Hive server. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query
[ https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HIVE-5268: -- Assignee: Thiruvel Thirumoolan (was: Vaibhav Gumashta) > HiveServer2 accumulates orphaned OperationHandle objects when a client fails > while executing query > -- > > Key: HIVE-5268 > URL: https://issues.apache.org/jira/browse/HIVE-5268 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Thiruvel Thirumoolan > Fix For: 0.13.0 > > Attachments: HIVE-5268_prototype.patch > > > When queries are executed against the HiveServer2 an OperationHandle object > is stored in the OperationManager.handleToOperation HashMap. Currently its > the duty of the JDBC client to explicitly close to cleanup the entry in the > map. But if the client fails to close the statement then the OperationHandle > object is never cleaned up and gets accumulated in the server. > This can potentially cause OOM on the server over time. This also can be used > as a loophole by a malicious client to bring down the Hive server. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query
[ https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-5268: --- Attachment: HIVE-5268_prototype.patch Attaching a preliminary patch for branch 12. As mentioned before, this patch is aggressive (was a start) in cleaning up resources on the server side. As soon as a client disconnects, the resources are cleaned up on HS2 (if a query is running during disconnection, the resources are cleaned up at the end of the query). This approach was designed for Hive10 and I am working on porting it to trunk; a patch will be available for Hive12 too. The newer approach will handle disconnects during async query execution and also have timeouts after which handles/sessions will be cleaned up, instead of the existing aggressive approach. Vaibhav, can I assign this to myself if you aren't working on this? Thanks! > HiveServer2 accumulates orphaned OperationHandle objects when a client fails > while executing query > -- > > Key: HIVE-5268 > URL: https://issues.apache.org/jira/browse/HIVE-5268 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 0.13.0 > > Attachments: HIVE-5268_prototype.patch > > > When queries are executed against the HiveServer2 an OperationHandle object > is stored in the OperationManager.handleToOperation HashMap. Currently its > the duty of the JDBC client to explicitly close to cleanup the entry in the > map. But if the client fails to close the statement then the OperationHandle > object is never cleaned up and gets accumulated in the server. > This can potentially cause OOM on the server over time. This also can be used > as a loophole by a malicious client to bring down the Hive server. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5486) HiveServer2 should create base scratch directories at startup
[ https://issues.apache.org/jira/browse/HIVE-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13796260#comment-13796260 ] Thiruvel Thirumoolan commented on HIVE-5486: [~prasadm] Can we set the permission to 1777, with the sticky bit? What do you think? > HiveServer2 should create base scratch directories at startup > - > > Key: HIVE-5486 > URL: https://issues.apache.org/jira/browse/HIVE-5486 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 0.11.0, 0.12.0 >Reporter: Prasad Mujumdar >Assignee: Prasad Mujumdar > Attachments: HIVE-5486.2.patch, HIVE-5486.3.patch > > > With impersonation enabled, the same base directory is used by all > sessions/queries. For a new deployment, this directory gets created on first > invocation by the user running that session. This would cause directory > permission conflict for other users. > HiveServer2 should create the base scratch dirs if it doesn't exist. -- This message was sent by Atlassian JIRA (v6.1#6144)
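For context on the 1777 suggestion above: that is the /tmp-style mode, world-writable plus the sticky bit, so every user can create session directories under the shared scratch dir but only delete their own. A tiny self-contained check of what the octal constants mean (standard POSIX mode bits, not a Hadoop API):

```java
// Decomposing mode 01777: 01000 is the sticky bit, 0002 is world-write.
public class StickyBit {
    public static final short MODE = (short) 01777;

    public static boolean isSticky(short mode) {
        return (mode & 01000) != 0;
    }

    public static boolean isWorldWritable(short mode) {
        return (mode & 0002) != 0;
    }
}
```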
[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query
[ https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13794612#comment-13794612 ] Thiruvel Thirumoolan commented on HIVE-5268: Sorry, my bad. Let me upload what I have. > HiveServer2 accumulates orphaned OperationHandle objects when a client fails > while executing query > -- > > Key: HIVE-5268 > URL: https://issues.apache.org/jira/browse/HIVE-5268 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 0.13.0 > > > When queries are executed against the HiveServer2 an OperationHandle object > is stored in the OperationManager.handleToOperation HashMap. Currently its > the duty of the JDBC client to explicitly close to cleanup the entry in the > map. But if the client fails to close the statement then the OperationHandle > object is never cleaned up and gets accumulated in the server. > This can potentially cause OOM on the server over time. This also can be used > as a loophole by a malicious client to bring down the Hive server. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-5268) HiveServer2 accumulates orphaned OperationHandle objects when a client fails while executing query
[ https://issues.apache.org/jira/browse/HIVE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778106#comment-13778106 ] Thiruvel Thirumoolan commented on HIVE-5268: Thanks for raising this, Vaibhav. We have a similar patch which cleans up session-related info when network issues cause client disconnection or clients fail to close sessions. The patch is available for Hive-11 and I am porting it to Hive12 and trunk. Unfortunately I didn't create the JIRA earlier. The patch cleans up aggressively as soon as the client disconnects. Based on Carl's feedback from a hive meetup, we would like to have a session timeout after which all idle/disconnected sessions are cleaned. I was working towards that. Have you started working on this? If not, can I start by uploading the aggressive patch I have and then go forward with the improvements? > HiveServer2 accumulates orphaned OperationHandle objects when a client fails > while executing query > -- > > Key: HIVE-5268 > URL: https://issues.apache.org/jira/browse/HIVE-5268 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Fix For: 0.13.0 > > > When queries are executed against the HiveServer2 an OperationHandle object > is stored in the OperationManager.handleToOperation HashMap. Currently its > the duty of the JDBC client to explicitly close to cleanup the entry in the > map. But if the client fails to close the statement then the OperationHandle > object is never cleaned up and gets accumulated in the server. > This can potentially cause OOM on the server over time. This also can be used > as a loophole by a malicious client to bring down the Hive server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
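The timeout-based cleanup discussed in the comments above (keep disconnected or idle sessions around until a configurable timeout, then reap their handles) can be sketched roughly as follows. This is a hypothetical illustration only; the class and method names (SessionReaper, touch, sweep) are invented for the sketch and are not HiveServer2's actual SessionManager/OperationManager API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of an idle-session reaper: every client request refreshes the
// session's last-access time, and a periodic sweep closes only sessions
// idle longer than the timeout, instead of closing on disconnect.
public class SessionReaper {
    private final Map<String, Long> lastAccess = new ConcurrentHashMap<>();
    private final long timeoutMs;

    public SessionReaper(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Called on every request carrying this session's handle.
    public void touch(String sessionId) {
        lastAccess.put(sessionId, System.currentTimeMillis());
    }

    // Periodic sweep; real code would also free the session's
    // OperationHandles here. Returns how many sessions were reaped.
    public int sweep(long now) {
        int closed = 0;
        for (Map.Entry<String, Long> e : lastAccess.entrySet()) {
            if (now - e.getValue() > timeoutMs) {
                lastAccess.remove(e.getKey());
                closed++;
            }
        }
        return closed;
    }

    public int liveSessions() {
        return lastAccess.size();
    }
}
```

In a real server the sweep would run on a ScheduledExecutorService; the point of the design is that a transient network drop no longer kills a running async query, while abandoned handles still get freed eventually.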
[jira] [Updated] (HIVE-5214) Dynamic partitions/insert overwrite don't inherit groupname from table's directory
[ https://issues.apache.org/jira/browse/HIVE-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-5214: --- Description: When dynamic partitions are created or insert overwrite without partitions, the files/partition-dirs don't inherit the group name. The query (say, insert overwrite table select *) uses the scratch directory for creating the temporary data. The temporary data's perm/group is inherited from scratch directory. Finally, the MoveTask does a rename of the temporary dir/files to be the target directory and an explicit group/perm change does not happen. HIVE-3756 fixed it for Load data, dynamic partitions/inserts have to be handled. was: When dynamic partitions are created, the files/partitions don't inherit the group name. The query (say, insert overwrite table select *) uses the scratch directory for creating the temporary data. The temporary data's perm/group is inherited from scratch directory. Finally, the MoveTask does a rename of the temporary dir to be the target partition directory and an explicit group/perm change does not happen. HIVE-3756 fixed it for Load data, dynamic partitions has to be handled. Summary: Dynamic partitions/insert overwrite don't inherit groupname from table's directory (was: Dynamic partitions don't inherit groupname from table's directory) > Dynamic partitions/insert overwrite don't inherit groupname from table's > directory > -- > > Key: HIVE-5214 > URL: https://issues.apache.org/jira/browse/HIVE-5214 > Project: Hive > Issue Type: Bug > Components: Authorization, Security >Affects Versions: 0.12.0 >Reporter: Thiruvel Thirumoolan > > When dynamic partitions are created or insert overwrite without partitions, > the files/partition-dirs don't inherit the group name. > The query (say, insert overwrite table select *) uses the scratch directory > for creating the temporary data. The temporary data's perm/group is inherited > from scratch directory. 
Finally, the MoveTask does a rename of the temporary > dir/files to be the target directory and an explicit group/perm change does > not happen. > HIVE-3756 fixed it for Load data, dynamic partitions/inserts have to be > handled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5214) Dynamic partitions don't inherit groupname from table's directory
Thiruvel Thirumoolan created HIVE-5214: -- Summary: Dynamic partitions don't inherit groupname from table's directory Key: HIVE-5214 URL: https://issues.apache.org/jira/browse/HIVE-5214 Project: Hive Issue Type: Bug Components: Authorization, Security Affects Versions: 0.12.0 Reporter: Thiruvel Thirumoolan When dynamic partitions are created, the files/partitions don't inherit the group name. The query (say, insert overwrite table select *) uses the scratch directory for creating the temporary data. The temporary data's perm/group is inherited from scratch directory. Finally, the MoveTask does a rename of the temporary dir to be the target partition directory and an explicit group/perm change does not happen. HIVE-3756 fixed it for Load data, dynamic partitions has to be handled. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3591) set hive.security.authorization.enabled can be executed by any user
[ https://issues.apache.org/jira/browse/HIVE-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13746980#comment-13746980 ] Thiruvel Thirumoolan commented on HIVE-3591: [~lmccay] The first approach to authorization was client side. [~sushanth] has also enabled this on the server side (HCatalog/Metastore) through HIVE-3705. We enable these features on our HCatalog deployments. Even if the user unsets these properties, server side changes still take effect and the user can't drop tables etc. We have tested this for HDFS based authorization. The properties we used on the HCatalog server are: hive.security.metastore.authorization.manager org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider hive.security.metastore.authenticator.manager org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator hive.metastore.pre.event.listeners org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener > set hive.security.authorization.enabled can be executed by any user > --- > > Key: HIVE-3591 > URL: https://issues.apache.org/jira/browse/HIVE-3591 > Project: Hive > Issue Type: Bug > Components: Authorization, CLI, Clients, JDBC >Affects Versions: 0.7.1 > Environment: RHEL 5.6 > CDH U3 >Reporter: Dev Gupta > Labels: Authorization, Security > > The property hive.security.authorization.enabled can be set to true or false, > by any user on the CLI, thus circumventing any previously set grants and > authorizations. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
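For reference, the three properties quoted in the comment above, as they would appear in the metastore server's hive-site.xml (a sketch assembled from the values listed in the comment; verify against your Hive version's configuration reference):

```xml
<!-- Server-side (metastore) authorization, per the HCatalog deployment
     described above: takes effect even if a client unsets its own
     authorization properties. -->
<property>
  <name>hive.security.metastore.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
</property>
<property>
  <name>hive.security.metastore.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
</property>
<property>
  <name>hive.metastore.pre.event.listeners</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
</property>
```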
[jira] [Commented] (HIVE-4513) disable hivehistory logs by default
[ https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737720#comment-13737720 ] Thiruvel Thirumoolan commented on HIVE-4513: > Chris Drome had some comments on the patch. These fall in the vicinity but > should be addressed as a separate JIRA. Created HIVE-5071 to address thread safety issues. > disable hivehistory logs by default > --- > > Key: HIVE-4513 > URL: https://issues.apache.org/jira/browse/HIVE-4513 > Project: Hive > Issue Type: Bug > Components: Configuration, Logging >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, > HIVE-4513.4.patch, HIVE-4513.5.patch, HIVE-4513.6.patch > > > HiveHistory log files (hive_job_log_hive_*.txt files) store information about > hive query such as query string, plan , counters and MR job progress > information. > There is no mechanism to delete these files and as a result they get > accumulated over time, using up lot of disk space. > I don't think this is used by most people, so I think it would better to turn > this off by default. Jobtracker logs already capture most of this > information, though it is not as structured as history logs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-5071) Address thread safety issues with HiveHistoryUtil
Thiruvel Thirumoolan created HIVE-5071: -- Summary: Address thread safety issues with HiveHistoryUtil Key: HIVE-5071 URL: https://issues.apache.org/jira/browse/HIVE-5071 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Thiruvel Thirumoolan Priority: Minor Fix For: 0.12.0 HiveHistoryUtil.parseLine() is not thread safe, it could be used by multiple clients of HWA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4513) disable hivehistory logs by default
[ https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737711#comment-13737711 ] Thiruvel Thirumoolan commented on HIVE-4513: Thanks [~thejas]. +1. I guess we can close the duplicates HIVE-1708 and HIVE-3779. We back-ported this to Hive10 and it works as expected. [~cdrome] had some comments on the patch. These fall in the vicinity but should be addressed as a separate JIRA. 1. HiveHistoryViewer.java: It's good as "private void init()" 2. HiveHistoryUtil.java: parseLine() method is not thread-safe. It uses parseBuffer, which could be modified by multiple threads. Currently only HWA uses it. > disable hivehistory logs by default > --- > > Key: HIVE-4513 > URL: https://issues.apache.org/jira/browse/HIVE-4513 > Project: Hive > Issue Type: Bug > Components: Configuration, Logging >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, > HIVE-4513.4.patch, HIVE-4513.5.patch, HIVE-4513.6.patch > > > HiveHistory log files (hive_job_log_hive_*.txt files) store information about > hive query such as query string, plan , counters and MR job progress > information. > There is no mechanism to delete these files and as a result they get > accumulated over time, using up lot of disk space. > I don't think this is used by most people, so I think it would better to turn > this off by default. Jobtracker logs already capture most of this > information, though it is not as structured as history logs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4513) disable hivehistory logs by default
[ https://issues.apache.org/jira/browse/HIVE-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733021#comment-13733021 ] Thiruvel Thirumoolan commented on HIVE-4513: Hi Thejas, are you working on this patch? > disable hivehistory logs by default > --- > > Key: HIVE-4513 > URL: https://issues.apache.org/jira/browse/HIVE-4513 > Project: Hive > Issue Type: Bug > Components: Configuration, Logging >Reporter: Thejas M Nair >Assignee: Thejas M Nair > Attachments: HIVE-4513.1.patch, HIVE-4513.2.patch, HIVE-4513.3.patch, > HIVE-4513.4.patch > > > HiveHistory log files (hive_job_log_hive_*.txt files) store information about > hive query such as query string, plan , counters and MR job progress > information. > There is no mechanism to delete these files and as a result they get > accumulated over time, using up lot of disk space. > I don't think this is used by most people, so I think it would better to turn > this off by default. Jobtracker logs already capture most of this > information, though it is not as structured as history logs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4835) Methods in Metrics class could avoid throwing IOException
[ https://issues.apache.org/jira/browse/HIVE-4835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13703965#comment-13703965 ] Thiruvel Thirumoolan commented on HIVE-4835: I think the method should at least return a boolean indicating whether the increment succeeded, so a corresponding decrement can be avoided if the increment failed. An incr might succeed while the corresponding decr fails, but that is again best-effort consistency. > Methods in Metrics class could avoid throwing IOException > - > > Key: HIVE-4835 > URL: https://issues.apache.org/jira/browse/HIVE-4835 > Project: Hive > Issue Type: Bug >Affects Versions: 0.11.0 >Reporter: Arup Malakar >Priority: Minor > > I see that most of the methods in the Metrics class throws exception: > {code:java} > public void resetMetrics() throws IOException { > public void open() throws IOException { > public void close() throws IOException { > public void reopen() throws IOException { > public static void init() throws Exception { > public static Long incrementCounter(String name) throws IOException{ > public static Long incrementCounter(String name, long increment) throws > IOException{ > public static void set(String name, Object value) throws IOException{ > public static Object get(String name) throws IOException{ > public static void initializeScope(String name) throws IOException { > public static MetricsScope startScope(String name) throws IOException{ > public static MetricsScope getScope(String name) throws IOException { > public static void endScope(String name) throws IOException{ > {code} > I believe Metrics should be best effort and the Metrics system should just > log error messages in case it is unable to capture the Metrics. Throwing > exception makes the caller code unnecessarily lengthy. Also the caller would > never want to stop execution because of failure to capture metrics, so it > ends up just logging the exception. 
> The kind of code we see is like: > {code:java} > // Snippet from HiveMetaStore.java > try { > Metrics.startScope(function); > } catch (IOException e) { > LOG.debug("Exception when starting metrics scope" > + e.getClass().getName() + " " + e.getMessage()); > MetaStoreUtils.printStackTrace(e); > } > {code} > which could have been: > {code:java} > Metrics.startScope(function); > {code} > Thoughts? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
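A minimal sketch of the best-effort, boolean-returning variant suggested in the comment above. This is a hypothetical illustration, not Hive's actual Metrics class; the in-memory counter store stands in for whatever backend the real implementation uses.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Best-effort metrics: failures are swallowed and reported via the return
// value, so callers never need a try/catch and can skip a matching
// decrement when the increment did not land.
public class BestEffortMetrics {
    private static final ConcurrentHashMap<String, AtomicLong> counters =
            new ConcurrentHashMap<>();

    // Returns true if the increment was recorded, false on any failure.
    public static boolean incrementCounter(String name) {
        try {
            counters.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
            return true;
        } catch (RuntimeException e) {
            // Best effort: a real implementation would log here and move on.
            return false;
        }
    }

    public static long get(String name) {
        AtomicLong c = counters.get(name);
        return c == null ? 0L : c.get();
    }
}
```

With this shape, the HiveMetaStore snippet quoted above collapses to a single unguarded call, which is the point of the proposal.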
[jira] [Commented] (HIVE-4291) Test HiveServer2 crash based on max thrift threads
[ https://issues.apache.org/jira/browse/HIVE-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686259#comment-13686259 ] Thiruvel Thirumoolan commented on HIVE-4291: I wrote the test case based on the patches in THRIFT-692. I haven't had a chance to modify the tests based on THRIFT-1869. > Test HiveServer2 crash based on max thrift threads > -- > > Key: HIVE-4291 > URL: https://issues.apache.org/jira/browse/HIVE-4291 > Project: Hive > Issue Type: Test > Components: HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: TestHS2ThreadAllocation.java > > > This test case ensures HS2 does not shutdown/crash when the thrift threads > have been depleted. This is due to an issue fixed in THRIFT-1869. This test > should pass post HIVE-4224. This test case ensures the crash doesn't happen > due to any changes in Thrift behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4547) A complex create view statement fails with new Antlr 3.4
[ https://issues.apache.org/jira/browse/HIVE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673739#comment-13673739 ] Thiruvel Thirumoolan commented on HIVE-4547: Sure, will take a look. > A complex create view statement fails with new Antlr 3.4 > > > Key: HIVE-4547 > URL: https://issues.apache.org/jira/browse/HIVE-4547 > Project: Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.10.0 >Reporter: Prasad Mujumdar >Assignee: Prasad Mujumdar > Fix For: 0.12.0 > > Attachments: HIVE-4547-1.patch, HIVE-4547-repro.tar > > > A complex create view statement with CAST in join condition fails with > IllegalArgumentException error. This is exposed by the Antlr 3.4 upgrade > (HIVE-2439). The same statement works fine with Hive 0.9 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4467) HiveConnection does not handle failures correctly
[ https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659904#comment-13659904 ] Thiruvel Thirumoolan commented on HIVE-4467: [~cwsteinbach] Does the updated patch look good? > HiveConnection does not handle failures correctly > - > > Key: HIVE-4467 > URL: https://issues.apache.org/jira/browse/HIVE-4467 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.11.0, 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-4467_1.patch, HIVE-4467.patch > > > HiveConnection uses Utils.verifySuccess* routines to check if there is any > error from the server side. This is not handled well. In > Utils.verifySuccess() when withInfo is 'false', the condition evaluates to > 'false' and no SQLexception is thrown even though there could be a problem on > the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly
[ https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4467: --- Status: Patch Available (was: Open) > HiveConnection does not handle failures correctly > - > > Key: HIVE-4467 > URL: https://issues.apache.org/jira/browse/HIVE-4467 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.11.0, 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-4467_1.patch, HIVE-4467.patch > > > HiveConnection uses Utils.verifySuccess* routines to check if there is any > error from the server side. This is not handled well. In > Utils.verifySuccess() when withInfo is 'false', the condition evaluates to > 'false' and no SQLexception is thrown even though there could be a problem on > the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly
[ https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4467: --- Attachment: HIVE-4467_1.patch Updated patch on phabricator and https://reviews.facebook.net/D10629 and also uploaded here (HIVE-4467_1.patch). > HiveConnection does not handle failures correctly > - > > Key: HIVE-4467 > URL: https://issues.apache.org/jira/browse/HIVE-4467 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.11.0, 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-4467_1.patch, HIVE-4467.patch > > > HiveConnection uses Utils.verifySuccess* routines to check if there is any > error from the server side. This is not handled well. In > Utils.verifySuccess() when withInfo is 'false', the condition evaluates to > 'false' and no SQLexception is thrown even though there could be a problem on > the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4467) HiveConnection does not handle failures correctly
[ https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646996#comment-13646996 ] Thiruvel Thirumoolan commented on HIVE-4467: Uploaded patch to phabricator: https://reviews.facebook.net/D10629 > HiveConnection does not handle failures correctly > - > > Key: HIVE-4467 > URL: https://issues.apache.org/jira/browse/HIVE-4467 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.11.0, 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0, 0.12.0 > > Attachments: HIVE-4467.patch > > > HiveConnection uses Utils.verifySuccess* routines to check if there is any > error from the server side. This is not handled well. In > Utils.verifySuccess() when withInfo is 'false', the condition evaluates to > 'false' and no SQLexception is thrown even though there could be a problem on > the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly
[ https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4467: --- Status: Patch Available (was: Open) > HiveConnection does not handle failures correctly > - > > Key: HIVE-4467 > URL: https://issues.apache.org/jira/browse/HIVE-4467 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.11.0, 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0, 0.12.0 > > Attachments: HIVE-4467.patch > > > HiveConnection uses Utils.verifySuccess* routines to check if there is any > error from the server side. This is not handled well. In > Utils.verifySuccess() when withInfo is 'false', the condition evaluates to > 'false' and no SQLexception is thrown even though there could be a problem on > the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4467) HiveConnection does not handle failures correctly
[ https://issues.apache.org/jira/browse/HIVE-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4467: --- Attachment: HIVE-4467.patch Attaching patch. I have made the functions straightforward and have not preserved the boolean in the Utils.verifySuccess() methods. I am unsure where TStatusCode.SUCCESS_WITH_INFO_STATUS is set in the HS2 code and couldn't find any occurrences. Is there any reason/intention for checking for that status? > HiveConnection does not handle failures correctly > - > > Key: HIVE-4467 > URL: https://issues.apache.org/jira/browse/HIVE-4467 > Project: Hive > Issue Type: Bug > Components: HiveServer2, JDBC >Affects Versions: 0.11.0, 0.12.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0, 0.12.0 > > Attachments: HIVE-4467.patch > > > HiveConnection uses Utils.verifySuccess* routines to check if there is any > error from the server side. This is not handled well. In > Utils.verifySuccess() when withInfo is 'false', the condition evaluates to > 'false' and no SQLException is thrown even though there could be a problem on > the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4467) HiveConnection does not handle failures correctly
Thiruvel Thirumoolan created HIVE-4467: -- Summary: HiveConnection does not handle failures correctly Key: HIVE-4467 URL: https://issues.apache.org/jira/browse/HIVE-4467 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.11.0, 0.12.0 HiveConnection uses Utils.verifySuccess* routines to check if there is any error from the server side. This is not handled well. In Utils.verifySuccess(), when withInfo is 'false', the condition evaluates to 'false' and no SQLException is thrown even though there could be a problem on the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
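The shape of the fix discussed above can be sketched like this. The enum below is a stand-in for Thrift's TStatusCode (the constant names mirror those mentioned in the issue, but the type itself is hypothetical for this sketch), and the method shows the intended contract: anything that is not a success status, or a success-with-info status when that is permitted, must raise SQLException rather than being silently ignored.

```java
import java.sql.SQLException;

// Stand-in for Thrift's TStatusCode, hypothetical for this sketch.
enum StatusCode { SUCCESS_STATUS, SUCCESS_WITH_INFO_STATUS, ERROR_STATUS }

class VerifySketch {
    // Corrected shape of a verifySuccess check: only an explicit success
    // (or success-with-info, when withInfo allows it) passes; every other
    // status surfaces as SQLException instead of being swallowed.
    static void verifySuccess(StatusCode status, boolean withInfo) throws SQLException {
        boolean ok = status == StatusCode.SUCCESS_STATUS
            || (withInfo && status == StatusCode.SUCCESS_WITH_INFO_STATUS);
        if (!ok) {
            throw new SQLException("Server returned error status: " + status);
        }
    }
}
```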
[jira] [Updated] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.
[ https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-3620: --- Attachment: Hive-3620_HeapDump.jpg > Drop table using hive CLI throws error when the total number of partition in > the table is around 50K. > - > > Key: HIVE-3620 > URL: https://issues.apache.org/jira/browse/HIVE-3620 > Project: Hive > Issue Type: Bug >Reporter: Arup Malakar > Attachments: Hive-3620_HeapDump.jpg > > > hive> drop table load_test_table_2_0; > > FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: > java.net.SocketTimeoutException: Read timedout > > > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > The DB used is Oracle and hive had only one table: > select COUNT(*) from PARTITIONS; > 54839 > I can try and play around with the parameter > hive.metastore.client.socket.timeout if that is what is being used. But it is > 200 seconds as of now, and 200 seconds for a drop table calls seems high > already. > Thanks, > Arup -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.
[ https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13635676#comment-13635676 ] Thiruvel Thirumoolan commented on HIVE-3620: [~sho.shimauchi] Did you have any special parameters for datanucleus to get this working? I tried disabling the datanucleus cache and also setting up connection pools, but that does not seem to help. I will also post a snapshot of the memory dump I have. BTW, I tried dropping a table with 45k partitions with the batch size configured to 100 and 1000. > Drop table using hive CLI throws error when the total number of partition in > the table is around 50K. > - > > Key: HIVE-3620 > URL: https://issues.apache.org/jira/browse/HIVE-3620 > Project: Hive > Issue Type: Bug >Reporter: Arup Malakar > > hive> drop table load_test_table_2_0; > > FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: > java.net.SocketTimeoutException: Read timedout > > > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > The DB used is Oracle and hive had only one table: > select COUNT(*) from PARTITIONS; > 54839 > I can try and play around with the parameter > hive.metastore.client.socket.timeout if that is what is being used. But it is > 200 seconds as of now, and 200 seconds for a drop table calls seems high > already. > Thanks, > Arup -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3620) Drop table using hive CLI throws error when the total number of partition in the table is around 50K.
[ https://issues.apache.org/jira/browse/HIVE-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627283#comment-13627283 ] Thiruvel Thirumoolan commented on HIVE-3620: I have had this problem in the past (in my case 0.2 million partitions, while stress testing dynamic partitions). The metastore crashed badly, though maybe mine was an extreme situation. The workaround I used was to drop one level of the partition hierarchy: there were many partition keys, and I would drop the topmost one instead of dropping the table. Maybe it's worthwhile to revisit HIVE-3214 and see if there is anything we could do at the datanucleus end. > Drop table using hive CLI throws error when the total number of partition in > the table is around 50K. > - > > Key: HIVE-3620 > URL: https://issues.apache.org/jira/browse/HIVE-3620 > Project: Hive > Issue Type: Bug >Reporter: Arup Malakar > > hive> drop table load_test_table_2_0; > > FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: > java.net.SocketTimeoutException: Read timedout > > > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.DDLTask > The DB used is Oracle and hive had only one table: > select COUNT(*) from PARTITIONS; > 54839 > I can try and play around with the parameter > hive.metastore.client.socket.timeout if that is what is being used. But it is > 200 seconds as of now, and 200 seconds for a drop table calls seems high > already. > Thanks, > Arup -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
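The batch sizes mentioned in the comments above (100 and 1000 partitions per drop) amount to the generic pattern below. This is only an illustration of the batching logic; the Consumer stands in for a hypothetical "drop these partitions" metastore call and is not Hive's actual client API.

```java
import java.util.List;
import java.util.function.Consumer;

// Generic batching sketch: dropping partitions in fixed-size batches keeps
// each metastore call small, instead of one huge drop-table request that
// can time out. Only the batching logic is shown; dropBatch is hypothetical.
class PartitionBatcher {
    static int dropInBatches(List<String> partitions, int batchSize,
                             Consumer<List<String>> dropBatch) {
        int batches = 0;
        for (int i = 0; i < partitions.size(); i += batchSize) {
            int end = Math.min(i + batchSize, partitions.size());
            dropBatch.accept(partitions.subList(i, end)); // one small RPC per batch
            batches++;
        }
        return batches;
    }
}
```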
[jira] [Commented] (HIVE-4291) Test HiveServer2 crash based on max thrift threads
[ https://issues.apache.org/jira/browse/HIVE-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13623999#comment-13623999 ] Thiruvel Thirumoolan commented on HIVE-4291: Thanks Brock, I will also reduce the time delays so the entire test runs in less than 20 seconds. > Test HiveServer2 crash based on max thrift threads > -- > > Key: HIVE-4291 > URL: https://issues.apache.org/jira/browse/HIVE-4291 > Project: Hive > Issue Type: Test > Components: HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: TestHS2ThreadAllocation.java > > > This test case ensures HS2 does not shutdown/crash when the thrift threads > have been depleted. This is due to an issue fixed in THRIFT-1869. This test > should pass post HIVE-4224. This test case ensures, the crash doesnt happen > due to any changes in Thrift behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4291) Test HiveServer2 crash based on max thrift threads
[ https://issues.apache.org/jira/browse/HIVE-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4291: --- Attachment: TestHS2ThreadAllocation.java A WIP patch, will clean it up and post it on review board. I tested this with a custom built Thrift 0.9.0 library with THRIFT-692 changes, will retest with THRIFT-1.0 and update. > Test HiveServer2 crash based on max thrift threads > -- > > Key: HIVE-4291 > URL: https://issues.apache.org/jira/browse/HIVE-4291 > Project: Hive > Issue Type: Test > Components: HiveServer2 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: TestHS2ThreadAllocation.java > > > This test case ensures HS2 does not shutdown/crash when the thrift threads > have been depleted. This is due to an issue fixed in THRIFT-1869. This test > should pass post HIVE-4224. This test case ensures, the crash doesnt happen > due to any changes in Thrift behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4291) Test HiveServer2 crash based on max thrift threads
Thiruvel Thirumoolan created HIVE-4291: -- Summary: Test HiveServer2 crash based on max thrift threads Key: HIVE-4291 URL: https://issues.apache.org/jira/browse/HIVE-4291 Project: Hive Issue Type: Test Components: HiveServer2 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan This test case ensures HS2 does not shut down or crash when the thrift threads have been depleted. This is due to an issue fixed in THRIFT-1869. This test should pass post HIVE-4224. It also ensures the crash doesn't happen due to any changes in Thrift behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
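The condition the test targets can be reproduced in miniature with a plain ThreadPoolExecutor: when every worker thread is busy, a new request should be rejected cleanly rather than bringing the process down. The pool sizes and the blocking task below are illustrative only, not HiveServer2's actual thrift thread-pool configuration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Standalone sketch of worker-thread depletion: two workers, no request
// queue, so a third task cannot be accepted. A well-behaved server rejects
// it with an exception instead of crashing.
class ThreadDepletionSketch {
    static boolean survivesDepletion() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try { release.await(); } catch (InterruptedException ignored) { }
        };
        pool.execute(blocker); // occupies worker 1
        pool.execute(blocker); // occupies worker 2
        boolean rejectedCleanly = false;
        try {
            pool.execute(blocker); // all workers depleted
        } catch (RejectedExecutionException e) {
            rejectedCleanly = true; // graceful rejection, process keeps running
        }
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException ignored) { }
        return rejectedCleanly;
    }
}
```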
[jira] [Resolved] (HIVE-4049) local_mapred_error_cache.q with hadoop 23.x fails with additional warning messages
[ https://issues.apache.org/jira/browse/HIVE-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan resolved HIVE-4049. Resolution: Duplicate HIVE-3428 has already fixed this. > local_mapred_error_cache.q with hadoop 23.x fails with additional warning > messages > -- > > Key: HIVE-4049 > URL: https://issues.apache.org/jira/browse/HIVE-4049 > Project: Hive > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Thiruvel Thirumoolan > Fix For: 0.10.1 > > > When run on branch10 with 23.x, the test fails. An additional warning message > leads to failure. The test should be independent of these things. > Diff output: > [junit] 16d15 > [junit] < WARNING: org.apache.hadoop.metrics.jvm.EventCounter is > deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the > log4j.properties files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4228) Bump up hadoop2 version in trunk
[ https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4228: --- Status: Patch Available (was: Open) > Bump up hadoop2 version in trunk > > > Key: HIVE-4228 > URL: https://issues.apache.org/jira/browse/HIVE-4228 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 0.11.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0 > > Attachments: HIVE-4228.patch > > > Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. > Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix > any new failures due to this bump. [I am guessing this should also help > HCatalog]. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4228) Bump up hadoop2 version in trunk
[ https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614548#comment-13614548 ] Thiruvel Thirumoolan commented on HIVE-4228: Patch on Phabricator - https://reviews.facebook.net/D9723 > Bump up hadoop2 version in trunk > > > Key: HIVE-4228 > URL: https://issues.apache.org/jira/browse/HIVE-4228 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 0.11.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0 > > Attachments: HIVE-4228.patch > > > Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. > Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix > any new failures due to this bump. [I am guessing this should also help > HCatalog]. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4228) Bump up hadoop2 version in trunk
[ https://issues.apache.org/jira/browse/HIVE-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4228: --- Attachment: HIVE-4228.patch > Bump up hadoop2 version in trunk > > > Key: HIVE-4228 > URL: https://issues.apache.org/jira/browse/HIVE-4228 > Project: Hive > Issue Type: Improvement > Components: Build Infrastructure >Affects Versions: 0.11.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0 > > Attachments: HIVE-4228.patch > > > Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. > Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix > any new failures due to this bump. [I am guessing this should also help > HCatalog]. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4228) Bump up hadoop2 version in trunk
Thiruvel Thirumoolan created HIVE-4228: -- Summary: Bump up hadoop2 version in trunk Key: HIVE-4228 URL: https://issues.apache.org/jira/browse/HIVE-4228 Project: Hive Issue Type: Improvement Components: Build Infrastructure Affects Versions: 0.11.0 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.11.0 Hive builds with hadoop 2.0.0-alpha now. Bumping up to hadoop-2.0.3-alpha. Have raised JIRAs with hive10-hadoop23.6 unit tests. Most of them should fix any new failures due to this bump. [I am guessing this should also help HCatalog]. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4049) local_mapred_error_cache.q with hadoop 23.x fails with additional warning messages
Thiruvel Thirumoolan created HIVE-4049: -- Summary: local_mapred_error_cache.q with hadoop 23.x fails with additional warning messages Key: HIVE-4049 URL: https://issues.apache.org/jira/browse/HIVE-4049 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thiruvel Thirumoolan Fix For: 0.10.1 When run on branch10 with 23.x, the test fails. An additional warning message leads to failure. The test should be independent of these things. Diff output: [junit] 16d15 [junit] < WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4047) skewjoin.q unit test inconsistently fails with Hadoop 0.23.x on branch 10
Thiruvel Thirumoolan created HIVE-4047: -- Summary: skewjoin.q unit test inconsistently fails with Hadoop 0.23.x on branch 10 Key: HIVE-4047 URL: https://issues.apache.org/jira/browse/HIVE-4047 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Thiruvel Thirumoolan Assignee: Thiruvel Thirumoolan Fix For: 0.10.1 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3911) udaf_percentile_approx.q fails with Hadoop 0.23.5 when map-side aggr is disabled.
[ https://issues.apache.org/jira/browse/HIVE-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-3911: --- Fix Version/s: 0.10.1 Assignee: Thiruvel Thirumoolan > udaf_percentile_approx.q fails with Hadoop 0.23.5 when map-side aggr is > disabled. > - > > Key: HIVE-3911 > URL: https://issues.apache.org/jira/browse/HIVE-3911 > Project: Hive > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0, 0.10.1 > > Attachments: HIVE-3911_branch10.patch, HIVE-3911.patch > > > I am running Hive10 unit tests against Hadoop 0.23.5 and > udaf_percentile_approx.q fails with a different value when map-side aggr is > disabled and only when 3rd argument to this UDAF is 100. Matches expected > output when map-side aggr is enabled for the same arguments. > This test passes when hadoop.version is 1.1.1 and fails when its 0.23.x or > 2.0.0-alpha or 2.0.2-alpha. > [junit] 20c20 > [junit] < 254.083331 > [junit] --- > [junit] > 252.77 > [junit] 47c47 > [junit] < 254.083331 > [junit] --- > [junit] > 252.77 > [junit] 74c74 > [junit] < > [23.358,254.083331,477.0625,489.54667] > [junit] --- > [junit] > [24.07,252.77,476.9,487.82] > [junit] 101c101 > [junit] < > [23.358,254.083331,477.0625,489.54667] > [junit] --- > [junit] > [24.07,252.77,476.9,487.82] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3911) udaf_percentile_approx.q fails with Hadoop 0.23.5 when map-side aggr is disabled.
[ https://issues.apache.org/jira/browse/HIVE-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-3911: --- Attachment: HIVE-3911_branch10.patch Attaching HIVE-3911_branch10.patch. This should make it consistent. I have just removed the queries that produce the differing output and fail this test. > udaf_percentile_approx.q fails with Hadoop 0.23.5 when map-side aggr is > disabled. > - > > Key: HIVE-3911 > URL: https://issues.apache.org/jira/browse/HIVE-3911 > Project: Hive > Issue Type: Bug >Affects Versions: 0.10.0 >Reporter: Thiruvel Thirumoolan > Fix For: 0.11.0 > > Attachments: HIVE-3911_branch10.patch, HIVE-3911.patch > > > I am running Hive10 unit tests against Hadoop 0.23.5 and > udaf_percentile_approx.q fails with a different value when map-side aggr is > disabled and only when 3rd argument to this UDAF is 100. Matches expected > output when map-side aggr is enabled for the same arguments. > This test passes when hadoop.version is 1.1.1 and fails when its 0.23.x or > 2.0.0-alpha or 2.0.2-alpha. > [junit] 20c20 > [junit] < 254.083331 > [junit] --- > [junit] > 252.77 > [junit] 47c47 > [junit] < 254.083331 > [junit] --- > [junit] > 252.77 > [junit] 74c74 > [junit] < > [23.358,254.083331,477.0625,489.54667] > [junit] --- > [junit] > [24.07,252.77,476.9,487.82] > [junit] 101c101 > [junit] < > [23.358,254.083331,477.0625,489.54667] > [junit] --- > [junit] > [24.07,252.77,476.9,487.82] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4012) Unit test failures with Hadoop 23 due to HADOOP-8551
[ https://issues.apache.org/jira/browse/HIVE-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HIVE-4012: --- Assignee: Thiruvel Thirumoolan > Unit test failures with Hadoop 23 due to HADOOP-8551 > > > Key: HIVE-4012 > URL: https://issues.apache.org/jira/browse/HIVE-4012 > Project: Hive > Issue Type: Bug >Affects Versions: 0.10.0, 0.11.0 >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 0.11.0, 0.10.1 > > Attachments: HIVE-4012_branch10.patch > > > With HADOOP-8551 (>=23.3 or >=2.0.2-alpha), its not possible to do a dfs > -mkdir of foo/bar when foo does not exist. One has to use '-p' option (not > available in Hadoop 20.x). A bunch of our test cases rely on this feature and > this was to make it interoperable with Windows too (HIVE-3204). However, all > these unit tests fail when using Hadoop >=23.3 or >=2.0.2-alpha. Its also not > possible to use the '-p' option in the tests as thats not supported in Hadoop > 20.x. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
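The portability gap described above (post-HADOOP-8551 shells require `-p` to create nested paths, while Hadoop 20.x shells do not support that flag) can be bridged by creating each missing ancestor with a plain single-level mkdir, which both generations accept. A generic sketch of that idea, shown on java.io.File for illustration only (the Hadoop FsShell is not invoked here):

```java
import java.io.File;

// Emulates "mkdir -p" using only single-level mkdir calls, the way a test
// setup must when the "-p" flag is unavailable: create each missing
// ancestor top-down before creating the leaf directory.
class MkdirDashP {
    static boolean mkdirs(File dir) {
        if (dir.exists()) return dir.isDirectory();
        File parent = dir.getParentFile();
        if (parent != null && !mkdirs(parent)) return false;
        // plain, single-level mkdir; isDirectory() guards a concurrent create
        return dir.mkdir() || dir.isDirectory();
    }
}
```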
[jira] [Assigned] (HIVE-4013) Misc test failures on Hive10 with Hadoop 0.23.x
[ https://issues.apache.org/jira/browse/HIVE-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan reassigned HIVE-4013: -- Assignee: Thiruvel Thirumoolan > Misc test failures on Hive10 with Hadoop 0.23.x > --- > > Key: HIVE-4013 > URL: https://issues.apache.org/jira/browse/HIVE-4013 > Project: Hive > Issue Type: Bug >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Attachments: HIVE-4013_branch10.patch > > > Following fail with latest builds of Hadoop23 (tested with 0.23.5 and a build > of 0.23.6 also). Its more like making the tests deterministic, adding order > by to all the queries. > list_bucket_query_oneskew_3.q > list_bucket_query_multiskew_2.q > list_bucket_query_multiskew_3.q > list_bucket_query_multiskew_1.q > parenthesis_star_by.q -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira