Github user uce commented on a diff in the pull request:
https://github.com/apache/incubator-flink/pull/246#discussion_r21106379
--- Diff: docs/config.md ---
@@ -266,3 +272,79 @@ So if `yarn.am.rpc.port` is configured to `10245` and the session's application
- `yarn.am.rpc.port`: The port that is being opened by the Application Master (AM) to
let the YARN client connect for an RPC service. (DEFAULT: Port 10245)
+
+
+## Background
+
+### Configuring the Network Buffers
+
+Network buffers are a critical resource for the communication layers. They are
+used to buffer records before transmission over a network, and to buffer
+incoming data before dissecting it into records and handing them to the
+application. A sufficient number of network buffers is critical to achieving
+good throughput.
+
+In general, configure the task manager to have so many buffers that each
+logical network connection you expect to be open at the same time has a
+dedicated buffer. A logical network connection exists for each point-to-point
+exchange of data over the network, which typically happens at repartitioning
+or broadcasting steps. In those steps, each parallel task inside the TaskManager
+has to be able to talk to all other parallel tasks. Hence, the required number
+of buffers on a task manager is *total-degree-of-parallelism* (number of
+targets) \* *intra-node-parallelism* (number of sources in one task manager)
+\* *n*. Here, *n* is a constant that defines how many repartitioning or
+broadcasting steps you expect to be active at the same time.
+
+Since the *intra-node-parallelism* is typically the number of cores, and more
+than 4 repartitioning or broadcasting channels are rarely active in parallel, it
+frequently boils down to *\#cores\^2\^* \* *\#machines* \* 4. To support, for
+example, a cluster of 20 8-core machines, you should use roughly 5000 network
+buffers for optimal throughput.
+
+Each network buffer has a default size of 64 KiBytes. In the above example, the
+system would allocate roughly 300 MiBytes for network buffers.
+
+The number and size of network buffers can be configured with the following
+parameters:
+
+- `taskmanager.network.numberOfBuffers`, and
+- `taskmanager.network.bufferSizeInBytes`.
+
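+For the 20-machine example above, the corresponding entries in `flink-conf.yaml`
+could look as follows (a sketch only; adapt the buffer count to your cluster, and
+the buffer size is shown at its 64 KiByte default):
+
+```
+# 8 cores^2 * 20 machines * 4 concurrent repartitioning/broadcasting steps = 5120 buffers
+taskmanager.network.numberOfBuffers: 5120
+# default buffer size of 64 KiBytes (5120 * 64 KiB = 320 MiBytes in total)
+taskmanager.network.bufferSizeInBytes: 65536
+```
+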
+### Configuring Temporary I/O Directories
+
+Although Flink aims to process as much data in main memory as possible,
+it is not uncommon that more data needs to be processed than fits into
+the available memory. Flink's runtime is designed to write temporary data
+to disk to handle these situations.
+
+The `taskmanager.tmp.dirs` parameter specifies a list of directories into which
+Flink writes temporary files. The paths of the directories need to be
+separated by ':' (colon character). Flink will concurrently write (or
+read) one temporary file to (from) each configured directory. This way,
+temporary I/O can be evenly distributed over multiple independent I/O devices
+such as hard disks to improve performance. To leverage fast I/O devices (e.g.,
+SSD, RAID, NAS), it is possible to specify a directory multiple times.
+
+If the `taskmanager.tmp.dirs` parameter is not explicitly specified,
+Flink writes temporary data to the temporary directory of the operating
+system, such as */tmp* on Linux systems.
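+
+As an illustration, a `flink-conf.yaml` entry that spreads temporary files over
+three separate disks could look like this (the paths are hypothetical examples):
+
+```
+# three colon-separated directories, ideally on independent physical disks
+taskmanager.tmp.dirs: /disk1/flink-tmp:/disk2/flink-tmp:/disk3/flink-tmp
+```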
+
+
+### Configuring TaskManager Processing Slots
+
+A processing slot allows Flink to execute a distributed DataSet transformation,
+such as a data source or a map transformation.
+
+Each Flink TaskManager provides processing slots in the cluster. The number of slots
+is typically proportional to the number of available CPU cores __of each__ TaskManager.
+As a general recommendation, the number of available CPU cores is a good default for
+`taskmanager.numberOfTaskSlots`.
+
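+For example, on a cluster of 8-core machines, the recommendation above translates
+into the following `flink-conf.yaml` entry (a sketch; use your actual core count):
+
+```
+# one processing slot per available CPU core (here: 8 cores per TaskManager)
+taskmanager.numberOfTaskSlots: 8
+```
+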
+When starting a Flink application, users can supply the default number of slots
+to use for that job. The corresponding command line option is `-p` (for
+parallelism). In addition, it is possible to [set the number of slots in the
+programming APIs](programming_guide.html#parallel-execution) for the whole
+application and for individual operators.
+
+Flink currently schedules an application to slots by "filling" them up.
+If the cluster has 20 machines with 2 slots each (40 slots in total) but the
+application is running with a parallelism of 20, only 10 machines are processing data.
--- End diff ---
are processing => will process