[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15381074#comment-15381074 ] Lefty Leverenz commented on HIVE-13884: --- Thanks Sergio. I moved *hive.metastore.limit.partition.request* into the MetaStore section and put internal links on the crossreferences between it and *hive.limit.query.max.table.partition*. Here's the link to the new configuration parameter: * [ConfigurationProperties -- MetaStore -- hive.metastore.limit.partition.request | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.limit.partition.request] Removing the TODOC2.2 label. > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371054#comment-15371054 ] Sergio Peña commented on HIVE-13884: Thanks [~leftylev]. I added the the doc to the Wiki. > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15370264#comment-15370264 ] Lefty Leverenz commented on HIVE-13884: --- Okay, found it on the dev@hive list. You have edit privileges now. > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369968#comment-15369968 ] Sergio Peña commented on HIVE-13884: [~leftylev] Yes, thanks. I sent an email to user@ 2 days ago asking for wiki access, but I just found out that email was not sent. Just resent again. > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369306#comment-15369306 ] Lefty Leverenz commented on HIVE-13884: --- [~spena], did you see this reply? > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365723#comment-15365723 ] Lefty Leverenz commented on HIVE-13884: --- You need a Confluence account: * [About This Wiki -- How to get permission to edit | https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit] > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364415#comment-15364415 ] Sergio Peña commented on HIVE-13884: [~leftylev] How do I get an account for the wiki so that I can edit it? I tried my apache creds, but they did not work. Do I have to request one? > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362156#comment-15362156 ] Lefty Leverenz commented on HIVE-13884: --- Thanks [~spena]. *hive.metastore.limit.partition.request* belongs in the MetaStore section of Configuration Properties, and a deprecation notice should be added for *hive.limit.query.max.table.partition*: * [Configuration Properties -- MetaStore | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore] * [Configuration Properties -- hive.limit.query.max.table.partition | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.limit.query.max.table.partition] > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360244#comment-15360244 ] Sergio Peña commented on HIVE-13884: [~leftylev] Here's the description of the new parameter: *hive.metastore.limit.partition.request* This limits the number of partitions that can be requested from the metastore for a given table. A query will not be executed if it attempts to fetch more partitions per table than the limit configured. This parameter is preferred over hive.limit.query.max.table.partition (deprecated). The default value is -1 (unlimited). > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Labels: TODOC2.2 > Fix For: 2.2.0 > > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359837#comment-15359837 ] Hive QA commented on HIVE-13884: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12815550/HIVE-13884.10.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10292 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/340/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/340/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-340/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12815550 - PreCommit-HIVE-MASTER-Build > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, > HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, > HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, > HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356533#comment-15356533 ] Mohit Sabharwal commented on HIVE-13884: Thanks, +1 (non-binding) > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, > HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, > HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions
[ https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356282#comment-15356282 ] Hive QA commented on HIVE-13884: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12814194/HIVE-13884.9.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10291 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testListPartitionsWihtLimitEnabled org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testListPartitionsWihtLimitEnabled org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testListPartitionsWihtLimitEnabled org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testListPartitionsWihtLimitEnabled {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/318/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/318/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-318/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12814194 - PreCommit-HIVE-MASTER-Build > Disallow queries in HMS fetching more than a configured number of partitions > > > Key: HIVE-13884 > URL: https://issues.apache.org/jira/browse/HIVE-13884 > Project: Hive > Issue Type: Improvement >Reporter: Mohit Sabharwal >Assignee: Sergio Peña > Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, > HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, > HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch, HIVE-13884.9.patch > > > Currently the PartitionPruner requests either all partitions or partitions > based on filter expression. In either scenarios, if the number of partitions > accessed is large there can be significant memory pressure at the HMS server > end. > We already have a config {{hive.limit.query.max.table.partition}} that > enforces limits on number of partitions that may be scanned per operator. But > this check happens after the PartitionPruner has already fetched all > partitions. > We should add an option at PartitionPruner level to disallow queries that > attempt to access number of partitions beyond a configurable limit. > Note that {{hive.mapred.mode=strict}} disallow queries without a partition > filter in PartitionPruner, but this check accepts any query with a pruning > condition, even if partitions fetched are large. In multi-tenant > environments, admins could use more control w.r.t. number of partitions > allowed based on HMS memory capacity. > One option is to have PartitionPruner first fetch the partition names > (instead of partition specs) and throw an exception if number of partitions > exceeds the configured value. Otherwise, fetch the partition specs. > Looks like the existing {{listPartitionNames}} call could be used if extended > to take partition filter expressions like {{getPartitionsByExpr}} call does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)