I've been thinking about these scenarios:

1.  Hash distributed tables with fixed number of buckets.
If the tables were built using the defaults, buckets = 6 * number of
nodes.  So you basically have 6 vsegs per host.  Multiply that by 16GB and
you only can use 96GB of the 256GB of RAM per node.

2.  A user has random tables but doesn't understand they can increase the
number of vsegs.
This will be common for users that come from Greenplum.  They again can
only set statement member to 16GB so they are stuck with a max of 96GB of
RAM usage.

3.  User increases vsegs and statement memory.
Possibly run out of memory if too aggressive with settings.

- I think we should be able to specify statement memory higher than 16GB.
Maybe the limit should be something much higher such as 1TB.

- The optimizer should limit the number of vsegs based on statement memory
setting to prevent OOM.  You could do the opposite too.  (limit memory and
use the vseg setting provided)  Greenplum can limit the amount of memory
but we have two dials to adjust with vsegs and memory.


Jon Roberts

On Sun, Jan 22, 2017 at 5:20 PM, Yi Jin <y...@pivotal.io> wrote:

> Hi Jon,
>
> That guc setting limit means for one virtual segment, the maximum
> consumable memory is 16GB, for one segment/node, there maybe multiple vsegs
> allocated to run queries, so if a node has 256gb expected to be consumed by
> HAWQ, it will have at most 16 vsegs running concurrently.
>
> hawq_rm_stmt_vseg_memory is for setting statement level vseg memory
> consumption, it is required to specify hawq_rm_stmt_nvseg as well, only
> when hawq_rm_stmt_nvseg is greater than 0, hawq_rm_stmt_vseg_memory is
> activated regardless the original target resource queue vseg resource quota
> definition. For example, you can set hawq_rm_stmt_vseg_memory as 16gb,
> hawq_rm_stmt_nvseg
> as 256, if you have a cluster having 256gb * 16 nodes and your target
> resource queue can use 100% cluster resource, you will have 16 vsegs
> running per node to consume all memory resource for this query.
>
> Best,
> Yi
>
> On Sat, Jan 21, 2017 at 3:40 PM, Lei Chang <lei_ch...@apache.org> wrote:
>
> > hawq_rm_stmt_vseg_memory and hawq_rm_stmt_nvseg need to be used together
> to
> > set the specific number of segments and the vseg memory. And
> > hawq_rm_stmt_nvseg should be less than hawq_rm_nvseg_perquery_perseg_
> > limit.
> >
> > set hawq_rm_stmt_vseg_memory = '2GB';set hawq_rm_stmt_nvseg = 6;
> >
> > looks 16GB is somewhat small for big dedicated machines: if 16GB is per
> > virtual segment memory, if 8 segment is used, it only use 128GB.
> >
> > Cheers
> > Lei
> >
> >
> > On Fri, Jan 20, 2017 at 9:11 PM, Jon Roberts <jrobe...@pivotal.io>
> wrote:
> >
> > > Why is there a limit of 16GB for hawq_rm_stmt_vseg_memory?  A cluster
> > with
> > > 256GB per node and dedicated for HAWQ may certainly want to utilize
> more
> > > memory per segment.  Is there something I'm missing regarding statement
> > > memory?
> > >
> > > Secondly, does the number of vsegs for a query get influenced by the
> > > statement memory or does it just look at the plan and
> > > hawq_rm_nvseg_perquery_perseg_limit?
> > >
> > >
> > > Jon Roberts
> > >
> >
>

Reply via email to