[ https://issues.apache.org/jira/browse/FLINK-33?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephan Ewen resolved FLINK-33.
-------------------------------
Resolution: Fixed
Fix Version/s: 0.6-incubating (was: pre-apache)
This is fixed in b4b633eab9a70e14d2e0dd5252f4b092a3689093
> [GitHub] Rework instance configuration.
> ---------------------------------------
>
> Key: FLINK-33
> URL: https://issues.apache.org/jira/browse/FLINK-33
> Project: Flink
> Issue Type: Bug
> Reporter: GitHub Import
> Labels: github-import
> Fix For: 0.6-incubating
>
>
> Right now, Nephele still uses the EC2-inspired instance configuration model.
> The Pact compiler connects to obtain information about these instances, such
> as how many are available and how much memory they have. This is error-prone
> to configure and also a bit buggy: it frequently leads to wrong memory
> bookkeeping if different instance types are configured.
> Do we need support for heterogeneous setups where different nodes have
> different capabilities and should be assigned a different amount of work? If
> we defer this to later, we can greatly simplify the logic and configuration:
> 1) No configuration for the instance type. The internal instance manager has
> a default profile which is okay for all cluster instances.
> 2) An explicit value of how many slots for parallel operators we have on each
> node (such as 8 on an eight core machine). There should be a default value in
> the config which could be overridden via query-specific parameters.
> 3) An explicit config entry that defines how much memory should be used for
> networking and how much for query processing. The query processing memory
> amount is used to initialize the MemoryManager and is also used by the
> Pact compiler to parameterize the memory available to the operators. That way
> we can also get rid of the communication between the compiler and the job
> manager on plan compilation. Eventually it would be good to run the compiler
> as a child process of the job manager anyway.
> In the long run, we want to merge query processing memory and network memory
> into one value (overall system memory; the rest is the UDF Java heap memory),
> which is shared for materialization in the network stack and the runtime operators.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/33
> Created by: [StephanEwen|https://github.com/StephanEwen]
> Labels:
> Created at: Wed Jun 12 02:58:27 CEST 2013
> State: open
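
For illustration, the simplified model proposed in points 1) to 3) of the quoted description could be sketched roughly as below. This is only a sketch under stated assumptions, not the code from the fixing commit: the names NodeConfig, effective_slots, and memory_per_slot_mb are hypothetical, the default values are placeholders, and the even split of query processing memory across slots is an assumed policy that the issue does not specify.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical names and defaults; these are not actual Nephele or Flink config keys.
@dataclass(frozen=True)
class NodeConfig:
    slots_per_node: int = 8           # point 2: e.g. 8 on an eight-core machine
    network_memory_mb: int = 64       # point 3: memory reserved for networking
    processing_memory_mb: int = 512   # point 3: memory used to initialize the MemoryManager


def effective_slots(config: NodeConfig, query_override: Optional[int] = None) -> int:
    """Return the slot count; a query-specific value overrides the config default."""
    if query_override is None:
        return config.slots_per_node
    if query_override < 1:
        raise ValueError("slot count must be at least 1")
    return query_override


def memory_per_slot_mb(config: NodeConfig, slots: int) -> int:
    """Split query processing memory evenly across slots (an assumed policy)."""
    return config.processing_memory_mb // slots


config = NodeConfig()
slots = effective_slots(config)
print(slots, memory_per_slot_mb(config, slots))
```

Because both values come from one static config object, a compiler could read them locally instead of asking the job manager, which is the simplification point 3) aims for.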