Re: [DISCUSSION] PIP 24: Simplify memory settings

2018-10-11 Thread Sijie Guo
On Wed, Oct 10, 2018 at 4:13 PM Matteo Merli  wrote:

> On Wed, Oct 10, 2018 at 4:03 PM Sijie Guo  wrote:
>
> > the proposal looks good to me.
> >
> > just out of curiosity, why PoolingPolicy only has `UnpooledHeap` and
> > `PooledDirect`?
> >
>
> I think these are the 2 that make sense (from my perspective):
>
>  1. Pooling heap memory still suffers from GC interactions and still
> involves copying the buffers multiple times when reading from or writing
> to the network.
>  Sure, it would reduce the allocation rate, but if one wants to pool
> memory it's preferable to use direct memory.
>
>  2. Using unpooled direct memory for each buffer would be very slow, so it
> would not be useful in practice.
>


Makes sense to me, +1 then

>
>
> --
> Matteo Merli
> 
>


Re: [DISCUSSION] PIP 24: Simplify memory settings

2018-10-10 Thread Matteo Merli
On Wed, Oct 10, 2018 at 4:03 PM Sijie Guo  wrote:

> the proposal looks good to me.
>
> just out of curiosity, why PoolingPolicy only has `UnpooledHeap` and
> `PooledDirect`?
>

I think these are the 2 that make sense (from my perspective):

 1. Pooling heap memory still suffers from GC interactions and still
involves copying the buffers multiple times when reading from or writing to
the network.
 Sure, it would reduce the allocation rate, but if one wants to pool
memory it's preferable to use direct memory.

 2. Using unpooled direct memory for each buffer would be very slow, so it
would not be useful in practice.
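
As a rough illustration of the heap vs. direct distinction the two policies map to, here is a small self-contained sketch (my addition, using plain `java.nio` buffers rather than Netty's allocator):

```java
import java.nio.ByteBuffer;

public class HeapVsDirect {
    public static void main(String[] args) {
        // Heap buffer: backed by a byte[] on the JVM heap and reclaimed by
        // GC. Writing it to a socket typically requires an extra copy into
        // a native buffer first.
        ByteBuffer heap = ByteBuffer.allocate(1024);

        // Direct buffer: allocated outside the heap, so the OS can read
        // from / write to it without that intermediate copy.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);

        System.out.println(heap.isDirect() + " " + heap.hasArray());     // false true
        System.out.println(direct.isDirect() + " " + direct.hasArray()); // true false
    }
}
```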


-- 
Matteo Merli



Re: [DISCUSSION] PIP 24: Simplify memory settings

2018-10-10 Thread Sijie Guo
the proposal looks good to me.

just out of curiosity, why PoolingPolicy only has `UnpooledHeap` and
`PooledDirect`?

- Sijie

On Wed, Oct 10, 2018 at 12:51 PM Matteo Merli  wrote:

> (CCed dev@bookkeeper.apache.org since this proposal is referring to both
> projects and I thought it might be good to have a single discussion thread)
>
> Wiki page is available at:
> https://github.com/apache/pulsar/wiki/PIP-24%3A-Simplify-memory-settings

[DISCUSSION] PIP 24: Simplify memory settings

2018-10-10 Thread Matteo Merli
(CCed dev@bookkeeper.apache.org since this proposal is referring to both
projects and I thought it might be good to have a single discussion thread)

Wiki page is available at:
https://github.com/apache/pulsar/wiki/PIP-24%3A-Simplify-memory-settings

Pasting here for convenience:




## Motivation

Configuring the correct JVM memory settings and cache sizes for a Pulsar
cluster should be simpler than it is today.

There are currently many knobs in Netty and JVM flags for different
components. While the system is very stable with a good setup, it is easy
to produce non-optimal configurations that result in OutOfMemory errors
under load.

Ideally, very minimal configuration should be required to bring up a
Pulsar cluster that works under a wide range of traffic loads. In any
case, we should prefer to automatically fall back to slower alternatives,
when possible, instead of throwing OOM exceptions.

## Goals

 1. Default settings should allow Pulsar to use all the memory configured
on the JVM, irrespective of direct vs. heap memory
 2. Automatically size the caches based on the amount of memory available
to the JVM
 3. Allow disabling pooling completely for environments where memory is
scarce
 4. Allow configuring different fallback policies for when the memory
quotas are reached


## Changes

### Netty Allocator Wrapper

Create an allocator wrapper that can be configured with different
behaviors. It will wrap the regular `PooledByteBufAllocator` and use a
configuration object to decide its behavior at runtime. It will also serve
as a way to group and simplify all the Netty allocator options, which are
currently spread across multiple system properties whose documentation is
not easily searchable.
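
For context, these are a few of the scattered Netty system properties the wrapper would subsume (names as in Netty 4.1; the exact set and defaults depend on the Netty version):

```
-Dio.netty.allocator.numDirectArenas=8
-Dio.netty.allocator.numHeapArenas=8
-Dio.netty.allocator.pageSize=8192
-Dio.netty.allocator.maxOrder=11
-Dio.netty.leakDetection.level=disabled
-Dio.netty.maxDirectMemory=0
```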

The wrapper will be configured and instantiated through a builder class:

```java
public interface ByteBufAllocatorBuilder {

    /**
     * Creates a new {@link ByteBufAllocatorBuilder}.
     */
    public static ByteBufAllocatorBuilder create() {
        return new ByteBufAllocatorBuilderImpl();
    }

    /**
     * Finalize the configured {@link ByteBufAllocator}.
     */
    ByteBufAllocator build();

    /**
     * Specify a custom allocator to which the allocation requests should be
     * forwarded.
     *
     * <p>Default is to use {@link PooledByteBufAllocator#DEFAULT} when
     * pooling is required, or {@link UnpooledByteBufAllocator} otherwise.
     */
    ByteBufAllocatorBuilder allocator(ByteBufAllocator allocator);

    /**
     * Define the memory pooling policy.
     *
     * <p>Default is {@link PoolingPolicy#PooledDirect}.
     */
    ByteBufAllocatorBuilder poolingPolicy(PoolingPolicy policy);

    /**
     * Controls the amount of concurrency for the memory pool.
     *
     * <p>Default is to have a number of allocator arenas equal to 2 * CPUs.
     *
     * <p>Decreasing this number will reduce the memory overhead, at the
     * expense of increased allocation contention.
     */
    ByteBufAllocatorBuilder poolingConcurrency(int poolingConcurrency);

    /**
     * Define the OutOfMemory handling policy.
     *
     * <p>Default is {@link OomPolicy#FallbackToHeap}.
     */
    ByteBufAllocatorBuilder oomPolicy(OomPolicy policy);

    /**
     * Enable leak detection for the allocator.
     *
     * <p>Default is {@link LeakDetectionPolicy#Disabled}.
     */
    ByteBufAllocatorBuilder leakDetectionPolicy(LeakDetectionPolicy leakDetectionPolicy);
}
```
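
For illustration, here is a self-contained toy sketch (my stand-in types, not the real Netty/Pulsar classes) of how code might configure such a builder, starting from the proposed defaults:

```java
// Toy stand-ins for the proposed types, purely to illustrate the
// configuration flow; the real builder would produce a Netty ByteBufAllocator.
enum PoolingPolicy { UnpooledHeap, PooledDirect }
enum OomPolicy { ThrowException, FallbackToHeap }
enum LeakDetectionPolicy { Disabled, Simple, Advanced, Paranoid }

class ToyAllocatorBuilder {
    // Proposed defaults from the PIP
    PoolingPolicy poolingPolicy = PoolingPolicy.PooledDirect;
    OomPolicy oomPolicy = OomPolicy.FallbackToHeap;
    LeakDetectionPolicy leakDetection = LeakDetectionPolicy.Disabled;
    int poolingConcurrency = 2 * Runtime.getRuntime().availableProcessors();

    ToyAllocatorBuilder poolingPolicy(PoolingPolicy p) { poolingPolicy = p; return this; }
    ToyAllocatorBuilder oomPolicy(OomPolicy p) { oomPolicy = p; return this; }
    ToyAllocatorBuilder poolingConcurrency(int n) { poolingConcurrency = n; return this; }
}

public class BuilderUsage {
    public static void main(String[] args) {
        // A memory-constrained deployment might opt out of pooling entirely
        ToyAllocatorBuilder b = new ToyAllocatorBuilder()
                .poolingPolicy(PoolingPolicy.UnpooledHeap)
                .oomPolicy(OomPolicy.ThrowException);
        System.out.println(b.poolingPolicy + " / " + b.oomPolicy);
    }
}
```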

The policies are defined as follows:

```java
/**
 * Define a policy for allocating buffers.
 */
public enum PoolingPolicy {

    /**
     * Allocate memory from the JVM heap without any pooling.
     *
     * <p>This option has the least overhead in terms of memory usage, since
     * the memory will be automatically reclaimed by the JVM GC, but it might
     * impose a performance penalty at high throughput.
     */
    UnpooledHeap,

    /**
     * Use direct memory for all buffers and pool the memory.
     *
     * <p>Direct memory will avoid the overhead of JVM GC and most memory
     * copies when reading from and writing to socket channels.
     *
     * <p>Pooling adds memory space overhead, due to fragmentation in the
     * allocator and to threads keeping a portion of memory as thread-local
     * to avoid contention when possible.
     */
    PooledDirect
}

/**
 * Represents the action to take when it's not possible to allocate memory.
 */
public enum OomPolicy {

    /**
     * Throw a regular OOM exception without taking additional actions.
     */
    ThrowException,

    /**
     * If it's not possible to allocate a buffer from direct memory, fall
     * back to allocating an unpooled buffer from the JVM heap.
     *
     * <p>This will help absorb memory allocation spikes, because the heap
     * allocations will naturally slow down the process and will result in
     * full GC cleanup if the heap itself is full.
     */
    FallbackToHeap
}
```
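
The FallbackToHeap behavior can be sketched in miniature with plain `java.nio` buffers (my illustration, not the actual wrapper code):

```java
import java.nio.ByteBuffer;

public class FallbackSketch {
    // Try direct memory first; if the direct memory limit is exhausted,
    // fall back to an unpooled heap buffer instead of failing the request.
    static ByteBuffer allocateWithFallback(int size) {
        try {
            return ByteBuffer.allocateDirect(size);
        } catch (OutOfMemoryError e) {
            // Heap allocations slow down under pressure and are reclaimed
            // by GC, absorbing the spike as described above.
            return ByteBuffer.allocate(size);
        }
    }

    public static void main(String[] args) {
        ByteBuffer buf = allocateWithFallback(1024);
        System.out.println("direct=" + buf.isDirect() + " capacity=" + buf.capacity());
    }
}
```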