[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-16 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15381074#comment-15381074
 ] 

Lefty Leverenz commented on HIVE-13884:
---

Thanks Sergio.  I moved *hive.metastore.limit.partition.request* into the 
MetaStore section and put internal links on the crossreferences between it and 
*hive.limit.query.max.table.partition*. 

Here's the link to the new configuration parameter:

* [ConfigurationProperties -- MetaStore -- 
hive.metastore.limit.partition.request | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.limit.partition.request]

Removing the TODOC2.2 label.

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15371054#comment-15371054
 ] 

Sergio Peña commented on HIVE-13884:


Thanks [~leftylev]. I added the the doc to the Wiki.

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-11 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15370264#comment-15370264
 ] 

Lefty Leverenz commented on HIVE-13884:
---

Okay, found it on the dev@hive list.  You have edit privileges now.

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-10 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369968#comment-15369968
 ] 

Sergio Peña commented on HIVE-13884:


[~leftylev] Yes, thanks.
I sent an email to user@ 2 days ago asking for wiki access, but I just found 
out that email was not sent. 
Just resent again.

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-09 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369306#comment-15369306
 ] 

Lefty Leverenz commented on HIVE-13884:
---

[~spena], did you see this reply?

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-07 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365723#comment-15365723
 ] 

Lefty Leverenz commented on HIVE-13884:
---

You need a Confluence account:

* [About This Wiki -- How to get permission to edit | 
https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit]

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364415#comment-15364415
 ] 

Sergio Peña commented on HIVE-13884:


[~leftylev] How do I get an account for the wiki so that I can edit it? I tried 
my apache creds, but they did not work.
Do I have to request one?

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-05 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362156#comment-15362156
 ] 

Lefty Leverenz commented on HIVE-13884:
---

Thanks [~spena].  

*hive.metastore.limit.partition.request* belongs in the MetaStore section of 
Configuration Properties, and a deprecation notice should be added for 
*hive.limit.query.max.table.partition*:

* [Configuration Properties -- MetaStore | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-MetaStore]
* [Configuration Properties -- hive.limit.query.max.table.partition | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.limit.query.max.table.partition]

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15360244#comment-15360244
 ] 

Sergio Peña commented on HIVE-13884:


[~leftylev] Here's the description of the new parameter:

*hive.metastore.limit.partition.request*
This limits the number of partitions that can be requested from the metastore 
for a given table. A query will not be executed if it attempts to fetch more 
partitions per table than the limit configured. This parameter is preferred 
over hive.limit.query.max.table.partition (deprecated). The default value is -1 
(unlimited). 

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-07-01 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359837#comment-15359837
 ] 

Hive QA commented on HIVE-13884:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12815550/HIVE-13884.10.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10292 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/340/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/340/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-340/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12815550 - PreCommit-HIVE-MASTER-Build

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.10.patch, 
> HIVE-13884.2.patch, HIVE-13884.3.patch, HIVE-13884.4.patch, 
> HIVE-13884.5.patch, HIVE-13884.6.patch, HIVE-13884.7.patch, 
> HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-06-29 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356533#comment-15356533
 ] 

Mohit Sabharwal commented on HIVE-13884:


Thanks, +1 (non-binding)

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, 
> HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, 
> HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13884) Disallow queries in HMS fetching more than a configured number of partitions

2016-06-29 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356282#comment-15356282
 ] 

Hive QA commented on HIVE-13884:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12814194/HIVE-13884.9.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10291 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testListPartitionsWihtLimitEnabled
org.apache.hadoop.hive.metastore.TestSetUGIOnBothClientServer.testListPartitionsWihtLimitEnabled
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyClient.testListPartitionsWihtLimitEnabled
org.apache.hadoop.hive.metastore.TestSetUGIOnOnlyServer.testListPartitionsWihtLimitEnabled
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/318/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/318/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-318/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12814194 - PreCommit-HIVE-MASTER-Build

> Disallow queries in HMS fetching more than a configured number of partitions
> 
>
> Key: HIVE-13884
> URL: https://issues.apache.org/jira/browse/HIVE-13884
> Project: Hive
>  Issue Type: Improvement
>Reporter: Mohit Sabharwal
>Assignee: Sergio Peña
> Attachments: HIVE-13884.1.patch, HIVE-13884.2.patch, 
> HIVE-13884.3.patch, HIVE-13884.4.patch, HIVE-13884.5.patch, 
> HIVE-13884.6.patch, HIVE-13884.7.patch, HIVE-13884.8.patch, HIVE-13884.9.patch
>
>
> Currently the PartitionPruner requests either all partitions or partitions 
> based on filter expression. In either scenarios, if the number of partitions 
> accessed is large there can be significant memory pressure at the HMS server 
> end.
> We already have a config {{hive.limit.query.max.table.partition}} that 
> enforces limits on number of partitions that may be scanned per operator. But 
> this check happens after the PartitionPruner has already fetched all 
> partitions.
> We should add an option at PartitionPruner level to disallow queries that 
> attempt to access number of partitions beyond a configurable limit.
> Note that {{hive.mapred.mode=strict}} disallow queries without a partition 
> filter in PartitionPruner, but this check accepts any query with a pruning 
> condition, even if partitions fetched are large. In multi-tenant 
> environments, admins could use more control w.r.t. number of partitions 
> allowed based on HMS memory capacity.
> One option is to have PartitionPruner first fetch the partition names 
> (instead of partition specs) and throw an exception if number of partitions 
> exceeds the configured value. Otherwise, fetch the partition specs.
> Looks like the existing {{listPartitionNames}} call could be used if extended 
> to take partition filter expressions like {{getPartitionsByExpr}} call does.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)