[GitHub] flink pull request: [FLINK-3120] [runtime] Manually configure Nett...

2016-02-07 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/1593


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] flink pull request: [FLINK-3120] [runtime] Manually configure Nett...

2016-02-05 Thread uce
GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/1593

[FLINK-3120] [runtime] Manually configure Netty's ByteBufAllocator

tl;dr Change the default Netty configuration to be relative to the number of 
slots, i.e. configure one memory arena (in PooledByteBufAllocator) per slot and 
use one event loop thread per slot. This behaviour can still be manually 
overridden. With this change, we can expect 16 MB of direct memory to be 
allocated per task slot by Netty.

Problem: We were using Netty's default PooledByteBufAllocator instance, 
which is subject to behaviour changes between Netty versions (this happened 
between versions 4.0.27.Final and 4.0.28.Final, resulting in increased memory 
consumption) and whose default memory consumption depends on the number of 
available cores in the system. This can be problematic, for example, in YARN 
setups where users run one slot per task manager on machines with many cores, 
resulting in a relatively large amount of allocated memory.

Solution: We instantiate a PooledByteBufAllocator instance manually and 
wrap it in a NettyBufferPool. Our instance configures one arena per task slot 
by default. It is desirable to have the number of arenas match the number of 
event loop threads to minimize lock contention (Netty's defaults try to ensure 
this as well), hence the number of threads is also changed to match the number 
of slots by default. Both the number of threads and the number of arenas can 
still be configured manually.
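The 16 MB figure above follows from Netty's pooled-arena geometry: each arena 
initially reserves one chunk of pageSize << maxOrder bytes. A minimal sketch of 
that arithmetic, assuming Netty 4.0's default page size (8192 bytes) and 
maxOrder (11) — the constants here are illustrative, not read from Netty:

```java
public class NettyArenaMemory {

    // Assumed Netty 4.0 PooledByteBufAllocator defaults.
    static final int PAGE_SIZE = 8192; // bytes per page
    static final int MAX_ORDER = 11;   // chunk size = pageSize << maxOrder

    /** Direct memory initially reserved per arena: one chunk. */
    static long chunkSizeBytes() {
        return (long) PAGE_SIZE << MAX_ORDER; // 8192 * 2^11 = 16 MiB
    }

    /** With one arena per slot, expected direct memory reserved by Netty. */
    static long directMemoryForSlots(int numberOfSlots) {
        return numberOfSlots * chunkSizeBytes();
    }

    public static void main(String[] args) {
        System.out.println(chunkSizeBytes());        // 16777216 (16 MiB)
        System.out.println(directMemoryForSlots(4)); // 67108864 (64 MiB)
    }
}
```

Tying the arena count to slots rather than cores makes this footprint a 
function of the TaskManager's configuration instead of the machine it happens 
to run on.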

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/uce/flink 3120-buffers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/1593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1593


commit 613ed9cce07d36e7b229e444dad3996db1bdb8c6
Author: Ufuk Celebi 
Date:   2016-02-03T15:05:37Z

[FLINK-3120] [runtime] Manually configure Netty's ByteBufAllocator




[GitHub] flink pull request: [FLINK-3120] [runtime] Manually configure Nett...

2016-02-05 Thread uce
Github user uce commented on the pull request:

https://github.com/apache/flink/pull/1593#issuecomment-180305086
  
(I forgot to add that in a small set of experiments with both YARN and 
Standalone setups on 4 nodes, I did not notice a performance difference. For 
YARN, I tested both allocating many small containers and few large ones.)




[GitHub] flink pull request: [FLINK-3120] [runtime] Manually configure Nett...

2016-02-05 Thread StephanEwen
Github user StephanEwen commented on the pull request:

https://github.com/apache/flink/pull/1593#issuecomment-180364957
  
Looks very good, +1 to merge this!

