Re: [DISCUSS] FLIP-49: Unified Memory Configuration for TaskExecutors

Till Rohrmann Thu, 29 Aug 2019 04:35:38 -0700

What I forgot to add: we could work towards a fully specified configuration
incrementally, with the full specification as the desired end state.


On Thu, Aug 29, 2019 at 1:33 PM Till Rohrmann <trohrm...@apache.org> wrote:

> I think our goal should be that the configuration is fully specified when
> the process is started. If we treat the internal calculation step as
> validating existing values and calculating missing ones, these two
> proposals shouldn't even conflict (given determinism).
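
A minimal sketch of the "validate existing values and calculate missing ones" step described above. The three keys (total, heap, off_heap) and the rule total = heap + off_heap are assumptions made purely for illustration; they are not the FLIP's actual options or formula.

```python
def complete_memory_config(config):
    """Validate the given sizes (in MB) and derive the one that is missing.

    Hypothetical model: total = heap + off_heap.
    """
    keys = ("total", "heap", "off_heap")
    result = {k: config[k] for k in keys if k in config}
    if len(result) < 2:
        raise ValueError("need at least two of: total, heap, off_heap")

    # Calculate whichever value was not supplied.
    if "total" not in result:
        result["total"] = result["heap"] + result["off_heap"]
    elif "heap" not in result:
        result["heap"] = result["total"] - result["off_heap"]
    elif "off_heap" not in result:
        result["off_heap"] = result["total"] - result["heap"]

    # Validate: supplied and derived values must agree with each other.
    if result["total"] != result["heap"] + result["off_heap"]:
        raise ValueError("inconsistent memory configuration")
    if min(result.values()) < 0:
        raise ValueError("memory sizes must not be negative")
    return result


partial = {"total": 1024, "off_heap": 200}
full = complete_memory_config(partial)
print(full)
```

Because the function is deterministic, running it again on an already complete configuration returns the same values, so computing outside the process and re-validating inside it need not conflict.
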
>
> Since we don't want to change an existing flink-conf.yaml, specifying the
> full configuration would require passing in the options differently.
>
> One way could be the ENV variables approach. The reason why I'm trying to
> exclude this feature from the FLIP is that I believe it needs a bit more
> discussion. Just some questions which come to my mind: What would be the
> exact format (FLINK_KEY_NAME)? Would we support a dot separator which is
> supported by some systems (FLINK.KEY.NAME)? If we accept the dot
> separator what would be the order of precedence if there are two ENV
> variables defined (FLINK_KEY_NAME and FLINK.KEY.NAME)? What is the
> precedence of env variable vs. dynamic configuration value specified via -D?
>
> Another approach could be to pass in the dynamic configuration values via
> `-Dkey=value` to the Flink process. For that we don't have to change
> anything because the functionality already exists.
>
> Cheers,
> Till
>
> On Thu, Aug 29, 2019 at 12:50 PM Stephan Ewen <se...@apache.org> wrote:
>
>> I see. Under the assumption of strict determinism that should work.
>>
>> The original proposal had this point "don't compute inside the TM, compute
>> outside and supply a full config", because that sounded more intuitive.
>>
>> On Thu, Aug 29, 2019 at 12:15 PM Till Rohrmann <trohrm...@apache.org>
>> wrote:
>>
>> > My understanding was that before starting the Flink process we call a
>> > utility which calculates these values. I assume that this utility will do
>> > the calculation based on a set of configured values (process memory, Flink
>> > memory, network memory etc.). Assuming that these values don't differ from
>> > the values with which the JVM is started, it should be possible to
>> > recompute them in the Flink process in order to set the values.
>> >
>> >
>> >
>> > On Thu, Aug 29, 2019 at 11:29 AM Stephan Ewen <se...@apache.org> wrote:
>> >
>> > > When computing the values in the JVM process after it started, how would
>> > > you deal with values like Max Direct Memory, Metaspace size, native memory
>> > > reservation (reduce heap size), etc.? All the values that are parameters to
>> > > the JVM process and that need to be supplied at process startup?
>> > >
>> > > On Wed, Aug 28, 2019 at 4:46 PM Till Rohrmann <trohrm...@apache.org>
>> > > wrote:
>> > >
>> > > > Thanks for the clarification. I have some more comments:
>> > > >
>> > > > - I would actually split computing the process memory requirements and
>> > > > storing the values into two things. E.g. one could name the former
>> > > > TaskExecutorProcessUtility and the latter TaskExecutorProcessMemory. But
>> > > > we can discuss this on the PR since it's just a naming detail.
>> > > >
>> > > > - Generally, I'm not opposed to making configuration values overridable by
>> > > > ENV variables. I think this is a very good idea and makes the
>> > > > configurability of Flink processes easier. However, I think that adding
>> > > > this functionality should not be part of this FLIP because it would simply
>> > > > widen the scope unnecessarily.
>> > > >
>> > > > The reasons why I believe it is unnecessary are the following: For Yarn we
>> > > > already write a flink-conf.yaml which could be populated with the
>> > > > memory settings. For the other processes it should not make a difference
>> > > > whether the loaded Configuration is populated with the memory settings from
>> > > > ENV variables or by using TaskExecutorProcessUtility to compute the missing
>> > > > values from the loaded configuration. If the latter were not possible
>> > > > (wrong or missing configuration values), then we should not have been able
>> > > > to actually start the process in the first place.
>> > > >
>> > > > - Concerning the memory reservation: I agree with you that we need the
>> > > > memory reservation functionality to make streaming jobs work with "managed"
>> > > > memory. However, w/o this functionality the whole FLIP would already bring
>> > > > a good amount of improvements to our users when running batch jobs.
>> > > > Moreover, by keeping the scope smaller we can complete the FLIP faster.
>> > > > Hence, I would propose to address the memory reservation functionality as a
>> > > > follow-up FLIP (which Yu is working on if I'm not mistaken).
>> > > >
>> > > > Cheers,
>> > > > Till
>> > > >
>> > > > On Wed, Aug 28, 2019 at 11:43 AM Yang Wang <danrtsey...@gmail.com> wrote:
>> > > >
>> > > > > Just adding my 2 cents.
>> > > > >
>> > > > > Using environment variables to override the configuration for different
>> > > > > taskmanagers is better.
>> > > > > We do not need to generate a dedicated flink-conf.yaml for each taskmanager.
>> > > > > A common flink-conf.yaml and different environment variables are enough.
>> > > > > Reducing the number of files in the distributed cache could also make
>> > > > > launching a taskmanager faster.
>> > > > >
>> > > > > Stephan makes a good suggestion that we could move the logic into the
>> > > > > "GlobalConfiguration.loadConfig()" method.
>> > > > > Maybe the client could also benefit from this. Different users would not
>> > > > > have to export FLINK_CONF_DIR to update a few config options.
>> > > > >
>> > > > >
>> > > > > Best,
>> > > > > Yang
>> > > > >
>> > > > > On Wed, Aug 28, 2019 at 1:21 AM Stephan Ewen <se...@apache.org> wrote:
>> > > > >
>> > > > > > One note on the Environment Variables and Configuration discussion.
>> > > > > >
>> > > > > > My understanding is that passed ENV variables are added to the
>> > > > > > configuration in the "GlobalConfiguration.loadConfig()" method (or
>> > > > > > similar).
>> > > > > > For all the code inside Flink, it looks like the data was in the config to
>> > > > > > start with, just that the scripts that compute the variables can pass the
>> > > > > > values to the process without actually needing to write a file.
>> > > > > >
>> > > > > > For example the "GlobalConfiguration.loadConfig()" method would take any
>> > > > > > ENV variable prefixed with "flink" and add it as a config key.
>> > > > > > "flink_taskmanager_memory_size=2g" would become
>> > > > > > "taskmanager.memory.size: 2g".
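
A minimal sketch of the mapping Stephan describes, written as a standalone function rather than as part of "GlobalConfiguration.loadConfig()". The function name env_to_config, the exact prefix "flink_" and the case-insensitive matching are assumptions for illustration; the thread leaves the precise format open.

```python
def env_to_config(environ, prefix="flink_"):
    """Turn prefixed environment variables into dotted config keys.

    Hypothetical rule: strip the prefix, lower-case the rest and
    replace every underscore with a dot.
    """
    config = {}
    for name, value in environ.items():
        if name.lower().startswith(prefix):
            key = name[len(prefix):].lower().replace("_", ".")
            config[key] = value
    return config


example_env = {"flink_taskmanager_memory_size": "2g", "PATH": "/usr/bin"}
print(env_to_config(example_env))
```

Note that this naive rule cannot represent config keys that themselves contain an underscore, which is one of the format questions that would need to be settled before adopting such a scheme.
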
>> > > > > >
>> > > > > >
>> > > > > > On Tue, Aug 27, 2019 at 4:05 PM Xintong Song <tonysong...@gmail.com> wrote:
>> > > > > >
>> > > > > > > Thanks for the comments, Till.
>> > > > > > >
>> > > > > > > I've also seen your comments on the wiki page, but let's keep the
>> > > > > > > discussion here.
>> > > > > > >
>> > > > > > > - Regarding 'TaskExecutorSpecifics', what do you think about naming it
>> > > > > > > 'TaskExecutorResourceSpecifics'?
>> > > > > > > - Regarding passing memory configurations into task executors, I'm in
>> > > > > > > favor of doing it via environment variables rather than configurations,
>> > > > > > > for the following two reasons.
>> > > > > > >   - With environment variables it is easier to ensure that the memory
>> > > > > > > options, once calculated, are not changed afterwards.
>> > > > > > >   - I'm not sure whether we should write the configuration in startup
>> > > > > > > scripts. Writing changes into the configuration files when running the
>> > > > > > > startup scripts does not sound right to me. Or we could make a copy of
>> > > > > > > the configuration files per Flink cluster, make the task executor load
>> > > > > > > from the copy, and clean up the copy after the cluster is shut down,
>> > > > > > > which is complicated. (I think this is also what Stephan means in his
>> > > > > > > comment on the wiki page?)
>> > > > > > > - Regarding reserving memory, I think this change should be included in
>> > > > > > > this FLIP. I think a big part of the motivation of this FLIP is to unify
>> > > > > > > memory configuration for streaming / batch and make it easy to configure
>> > > > > > > RocksDB memory. If we don't support memory reservation, then streaming
>> > > > > > > jobs cannot use managed memory (neither on-heap nor off-heap), which
>> > > > > > > makes this FLIP incomplete.
>> > > > > > > - Regarding network memory, I think you are right. I think we probably
>> > > > > > > don't need to change the network stack from using direct memory to using
>> > > > > > > unsafe native memory. Network memory size is deterministic, cannot be
>> > > > > > > reserved the way managed memory can, and cannot be overused. I think it
>> > > > > > > also works if we simply keep using direct memory for network and include
>> > > > > > > it in the JVM max direct memory size.
>> > > > > > >
>> > > > > > > Thank you~
>> > > > > > >
>> > > > > > > Xintong Song
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Tue, Aug 27, 2019 at 8:12 PM Till Rohrmann <trohrm...@apache.org> wrote:
>> > > > > > >
>> > > > > > > > Hi Xintong,
>> > > > > > > >
>> > > > > > > > thanks for addressing the comments and adding a more detailed
>> > > > > > > > implementation plan. I have a couple of comments concerning the
>> > > > > > > > implementation plan:
>> > > > > > > >
>> > > > > > > > - The name `TaskExecutorSpecifics` is not really descriptive. Choosing a
>> > > > > > > > different name could help here.
>> > > > > > > > - I'm not sure whether I would pass the memory configuration to the
>> > > > > > > > TaskExecutor via environment variables. I think it would be better to
>> > > > > > > > write it into the configuration one uses to start the TM process.
>> > > > > > > > - If possible, I would exclude the memory reservation from this FLIP and
>> > > > > > > > add this as part of a dedicated FLIP.
>> > > > > > > > - If possible, then I would exclude changes to the network stack from
>> > > > > > > > this FLIP. Maybe we can simply say that the direct memory needed by the
>> > > > > > > > network stack is the framework direct memory requirement. Changing how
>> > > > > > > > the memory is allocated can happen in a second step. This would keep the
>> > > > > > > > scope of this FLIP smaller.
>> > > > > > > >
>> > > > > > > > Cheers,
>> > > > > > > > Till
>> > > > > > > >
>> > > > > > > > On Thu, Aug 22, 2019 at 2:51 PM Xintong Song <tonysong...@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > Hi everyone,
>> > > > > > > > >
>> > > > > > > > > I just updated the FLIP document on wiki [1], with the following changes.
>> > > > > > > > >
>> > > > > > > > >    - Removed open question regarding MemorySegment allocation. As
>> > > > > > > > >    discussed, we exclude this topic from the scope of this FLIP.
>> > > > > > > > >    - Updated content about JVM direct memory parameter according to recent
>> > > > > > > > >    discussions, and moved the other options to "Rejected Alternatives" for
>> > > > > > > > >    the moment.
>> > > > > > > > >    - Added implementation steps.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Thank you~
>> > > > > > > > >
>> > > > > > > > > Xintong Song
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > [1]
>> > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>> > > > > > > > >
>> > > > > > > > > On Mon, Aug 19, 2019 at 7:16 PM Stephan Ewen <se...@apache.org> wrote:
>> > > > > > > > >
>> > > > > > > > > > @Xintong: Concerning "wait for memory users before task dispose and
>> > > > > > > > > > memory release": I agree, that's how it should be. Let's try it out.
>> > > > > > > > > >
>> > > > > > > > > > @Xintong @Jingsong: Concerning "JVM does not wait for GC when allocating
>> > > > > > > > > > direct memory buffer": There seems to be pretty elaborate logic to free
>> > > > > > > > > > buffers when allocating new ones. See
>> > > > > > > > > > http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/file/tip/src/share/classes/java/nio/Bits.java#l643
>> > > > > > > > > >
>> > > > > > > > > > @Till: Maybe. If we assume that the JVM default works (like going with
>> > > > > > > > > > option 2 and not setting "-XX:MaxDirectMemorySize" at all), then I think
>> > > > > > > > > > it should be okay to set "-XX:MaxDirectMemorySize" to
>> > > > > > > > > > "off_heap_managed_memory + direct_memory" even if we use RocksDB. That
>> > > > > > > > > > is a big if, though, I honestly have no idea :D Would be good to
>> > > > > > > > > > understand this, though, because this would affect option (2) and option
>> > > > > > > > > > (1.2).
>> > > > > > > > > > On Mon, Aug 19, 2019 at 4:44 PM Xintong Song <tonysong...@gmail.com> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Thanks for the inputs, Jingsong.
>> > > > > > > > > > >
>> > > > > > > > > > > Let me try to summarize your points. Please correct me if I'm wrong.
>> > > > > > > > > > >
>> > > > > > > > > > >    - Memory consumers should always avoid returning memory segments to
>> > > > > > > > > > >    the memory manager while there are still un-cleaned structures /
>> > > > > > > > > > >    threads that may use the memory. Otherwise, it would cause serious
>> > > > > > > > > > >    problems by having multiple consumers trying to use the same memory
>> > > > > > > > > > >    segment.
>> > > > > > > > > > >    - The JVM does not wait for GC when allocating direct memory buffers.
>> > > > > > > > > > >    Therefore, even if we set a proper max direct memory size limit, we
>> > > > > > > > > > >    may still encounter a direct memory OOM if GC cleans up memory more
>> > > > > > > > > > >    slowly than direct memory is allocated.
>> > > > > > > > > > >
>> > > > > > > > > > > Am I understanding this correctly?
>> > > > > > > > > > >
>> > > > > > > > > > > Thank you~
>> > > > > > > > > > >
>> > > > > > > > > > > Xintong Song
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Mon, Aug 19, 2019 at 4:21 PM JingsongLee <lzljs3620...@aliyun.com.invalid> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi Stephan,
>> > > > > > > > > > > >
>> > > > > > > > > > > > About option 2:
>> > > > > > > > > > > >
>> > > > > > > > > > > > If additional threads are not cleanly shut down before we exit the task:
>> > > > > > > > > > > > With the current memory reuse, the task has already freed the memory it
>> > > > > > > > > > > > uses. If this memory is then used by other tasks while asynchronous
>> > > > > > > > > > > > threads of the exited task may still be writing to it, there will be
>> > > > > > > > > > > > concurrency safety problems, which can even lead to errors in users'
>> > > > > > > > > > > > computation results.
>> > > > > > > > > > > >
>> > > > > > > > > > > > So I think this is a serious and intolerable bug. No matter what the
>> > > > > > > > > > > > option is, it should be avoided.
>> > > > > > > > > > > >
>> > > > > > > > > > > > About direct memory cleaned by GC:
>> > > > > > > > > > > > I don't think it is a good idea. I've encountered many situations in
>> > > > > > > > > > > > which GC came too late and caused a DirectMemory OOM. How DirectMemory
>> > > > > > > > > > > > is released and allocated depends on the type of user job, which is
>> > > > > > > > > > > > often beyond our control.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best,
>> > > > > > > > > > > > Jingsong Lee
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > ------------------------------------------------------------------
>> > > > > > > > > > > > From: Stephan Ewen <se...@apache.org>
>> > > > > > > > > > > > Send Time: Mon, Aug 19, 2019 15:56
>> > > > > > > > > > > > To: dev <dev@flink.apache.org>
>> > > > > > > > > > > > Subject: Re: [DISCUSS] FLIP-49: Unified Memory Configuration for
>> > > > > > > > > > > > TaskExecutors
>> > > > > > > > > > > >
>> > > > > > > > > > > > My main concern with option 2 (manually release memory) is that
>> > > > > > > > > > > > segfaults in the JVM set off all sorts of alarms on the users' end. So
>> > > > > > > > > > > > we need to guarantee that this never happens.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The trickiness is in tasks that use data structures / algorithms with
>> > > > > > > > > > > > additional threads, like hash table spill/read and sorting threads. We
>> > > > > > > > > > > > need to ensure that these cleanly shut down before we can exit the task.
>> > > > > > > > > > > > I am not sure that we have that guaranteed already, that's why option 1.1
>> > > > > > > > > > > > seemed simpler to me.
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Mon, Aug 19, 2019 at 3:42 PM Xintong Song <tonysong...@gmail.com> wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Thanks for the comments, Stephan. Summarized in this way really makes
>> > > > > > > > > > > > > things easier to understand.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > I'm in favor of option 2, at least for the moment. I think it is not that
>> > > > > > > > > > > > > difficult to keep the memory manager segfault safe, as long as we always
>> > > > > > > > > > > > > de-allocate the memory segment when it is released by the memory
>> > > > > > > > > > > > > consumers. A segfault can then only happen if a memory consumer continues
>> > > > > > > > > > > > > using the buffer of a memory segment after releasing it, in which case we
>> > > > > > > > > > > > > do want the job to fail so that we detect the memory leak early.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > For option 1.2, I don't think this is a good idea. Not only because the
>> > > > > > > > > > > > > assumption (regular GC is enough to clean direct buffers) may not always
>> > > > > > > > > > > > > be true, but also because it makes it harder to find problems in cases of
>> > > > > > > > > > > > > memory overuse. E.g., the user configured some direct memory for the user
>> > > > > > > > > > > > > libraries. If a library actually uses more direct memory than configured,
>> > > > > > > > > > > > > which cannot be cleaned by GC because it is still in use, this may lead
>> > > > > > > > > > > > > to overuse of the total container memory. In that case, if it doesn't
>> > > > > > > > > > > > > reach the JVM default max direct memory limit, we cannot get a direct
>> > > > > > > > > > > > > memory OOM and it will become super hard to understand which part of the
>> > > > > > > > > > > > > configuration needs to be updated.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > For option 1.1, it has a similar problem as 1.2 if the exceeded direct
>> > > > > > > > > > > > > memory does not reach the max direct memory limit specified by the
>> > > > > > > > > > > > > dedicated parameter. I think it is slightly better than 1.2, only because
>> > > > > > > > > > > > > we can tune the parameter.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Mon, Aug 19, 2019 at 2:53 PM Stephan Ewen <se...@apache.org> wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > About the "-XX:MaxDirectMemorySize" discussion, maybe let me summarize it
>> > > > > > > > > > > > > > a bit differently:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > We have the following two options:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (1) We let MemorySegments be de-allocated by the GC. That makes it
>> > > > > > > > > > > > > > segfault safe. But then we need a way to trigger GC in case de-allocation
>> > > > > > > > > > > > > > and re-allocation of a bunch of segments happens quickly, which is often
>> > > > > > > > > > > > > > the case during batch scheduling or task restart.
>> > > > > > > > > > > > > >   - The "-XX:MaxDirectMemorySize" (option 1.1) is one way to do this
>> > > > > > > > > > > > > >   - Another way could be to have a dedicated bookkeeping in the
>> > > > > > > > > > > > > > MemoryManager (option 1.2), so that this is a number independent of the
>> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" parameter.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (2) We manually allocate and de-allocate the memory for the
>> > > > > > > > > > > > > > MemorySegments (option 2). That way we need not worry about triggering GC
>> > > > > > > > > > > > > > by some threshold or bookkeeping, but it is harder to prevent segfaults.
>> > > > > > > > > > > > > > We need to be very careful about when we release the memory segments
>> > > > > > > > > > > > > > (only in the cleanup phase of the main thread).
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > If we go with option 1.1, we probably need to set
>> > > > > > > > > > > > > > "-XX:MaxDirectMemorySize" to "off_heap_managed_memory + direct_memory"
>> > > > > > > > > > > > > > and have "direct_memory" as a separate reserved memory pool. Because if
>> > > > > > > > > > > > > > we just set "-XX:MaxDirectMemorySize" to
>> > > > > > > > > > > > > > "off_heap_managed_memory + jvm_overhead", then there will be times when
>> > > > > > > > > > > > > > that entire memory is allocated by direct buffers and we have nothing
>> > > > > > > > > > > > > > left for the JVM overhead. So we either need a way to compensate for that
>> > > > > > > > > > > > > > (again some safety margin cutoff value) or we will exceed container
>> > > > > > > > > > > > > > memory.
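
A minimal sketch of the option 1.1 sizing rule described above: the JVM flag is derived from the off-heap managed memory plus a separately reserved direct memory pool. The function name and parameter names are hypothetical, sizes are in MB, and the 64 MB default is only the starting value floated in this thread, not a recommendation.

```python
def max_direct_memory_flag(off_heap_managed_mb, direct_mb=64):
    """Build the JVM flag for option 1.1 (all sizes in MB).

    JVM overhead is deliberately NOT part of the sum, so that direct
    buffers can never use up the memory budgeted for the JVM overhead.
    """
    if off_heap_managed_mb < 0 or direct_mb < 0:
        raise ValueError("memory sizes must not be negative")
    return "-XX:MaxDirectMemorySize={}m".format(off_heap_managed_mb + direct_mb)


print(max_direct_memory_flag(256))
```
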
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > If we go with option 1.2, we need to be aware that it takes elaborate
>> > > > > > > > > > > > > > logic to push recycling of direct buffers without always triggering a
>> > > > > > > > > > > > > > full GC.
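
A rough sketch of what dedicated bookkeeping (option 1.2) could look like: a counter with its own limit that tries to reclaim memory once before rejecting a reservation. The class name DirectMemoryBudget and the reclaim callback are hypothetical. This is neither Flink's MemoryManager nor the JDK's Bits logic, and gc.collect only stands in for whatever mechanism would actually push recycling of direct buffers.

```python
import gc


class DirectMemoryBudget:
    """Tracks reserved bytes against a limit that is independent of JVM flags."""

    def __init__(self, limit_bytes, reclaim=gc.collect):
        self.limit = limit_bytes
        self.reserved = 0
        self._reclaim = reclaim  # hook that may free memory and call release()

    def reserve(self, size):
        if self.reserved + size > self.limit:
            # Over budget: give pending releases one chance to run.
            self._reclaim()
        if self.reserved + size > self.limit:
            raise MemoryError("direct memory budget exceeded")
        self.reserved += size

    def release(self, size):
        self.reserved = max(0, self.reserved - size)


budget = DirectMemoryBudget(100)
budget.reserve(40)
budget.release(40)
print(budget.reserved)
```

The single retry keeps the sketch short; the hard part noted above, pushing recycling without always forcing a full GC, is exactly what a real implementation would have to solve inside the reclaim step.
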
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > My first guess is that the options will be easiest to do in the
>> > > > > > > > > > > > > > following order:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >   - Option 1.1 with a dedicated direct_memory parameter, as discussed
>> > > > > > > > > > > > > > above. We would need to find a way to set the direct_memory parameter by
>> > > > > > > > > > > > > > default. We could start with 64 MB and see how it goes in practice. One
>> > > > > > > > > > > > > > danger I see is that setting this too low can cause a bunch of additional
>> > > > > > > > > > > > > > GCs compared to before (we need to watch this carefully).
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >   - Option 2. It is actually quite simple to implement, we could try how
>> > > > > > > > > > > > > > segfault safe we are at the moment.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >   - Option 1.2: We would not touch the "-XX:MaxDirectMemorySize"
>> > > > > > > > > > > > > > parameter at all and assume that all the direct memory allocations that
>> > > > > > > > > > > > > > the JVM and Netty do are infrequent enough to be cleaned up fast enough
>> > > > > > > > > > > > > > through regular GC. I am not sure if that is a valid assumption, though.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > > Stephan
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <tonysong...@gmail.com> wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Thanks for sharing your opinion Till.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was wondering whether we can avoid
>> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap managed memory and network memory
>> > > > > > > > > > > > > > > with alternative 3. But after giving it a second thought, I think even
>> > > > > > > > > > > > > > > for alternative 3 using direct memory for off-heap managed memory could
>> > > > > > > > > > > > > > > cause problems.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Hi Yang,
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Regarding your concern, I think what is proposed in this FLIP is to have
>> > > > > > > > > > > > > > > both off-heap managed memory and network memory allocated through
>> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are practically native memory and not
>> > > > > > > > > > > > > > > limited by JVM max direct memory. The only parts of memory limited by JVM
>> > > > > > > > > > > > > > > max direct memory are task off-heap memory and JVM overhead, which are
>> > > > > > > > > > > > > > > exactly what alternative 2 suggests to set the JVM max direct memory to.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann <trohrm...@apache.org> wrote:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I understand the two alternatives
>> > > > > > > > > > > > > > > > now.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > I would be in favour of option 2 because it makes things explicit. If we
>> > > > > > > > > > > > > > > > don't limit the direct memory, I fear that we might end up in a similar
>> > > > > > > > > > > > > > > > situation as we are currently in: The user might see that her process
>> > > > > > > > > > > > > > > > gets killed by the OS and does not know why this is the case.
>> > > > > > > > > > > > > > > > Consequently, she tries to decrease the process memory size (similar to
>> > > > > > > > > > > > > > > > increasing the cutoff ratio) in order to accommodate for the extra direct
>> > > > > > > > > > > > > > > > memory. Even worse, she tries to decrease memory budgets which are not
>> > > > > > > > > > > > > > > > fully used and hence won't change the overall memory consumption.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > Till
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong
>> Song <
>> > > > > > > > > > > > tonysong...@gmail.com
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Let me explain this with a concrete
>> example
>> > > Till.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Let's say we have the following scenario.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Total Process Memory: 1GB
>> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap Memory +
>> JVM
>> > > > > > > Overhead):
>> > > > > > > > > > 200MB
>> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM
>> Metaspace,
>> > > > > > Off-Heap
>> > > > > > > > > > Managed
>> > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > Network Memory): 800MB
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > For alternative 2, we set
>> > > -XX:MaxDirectMemorySize
>> > > > > to
>> > > > > > > > 200MB.
>> > > > > > > > > > > > > > > > > For alternative 3, we set
>> > > -XX:MaxDirectMemorySize
>> > > > > to
>> > > > > > a
>> > > > > > > > very
>> > > > > > > > > > > large
>> > > > > > > > > > > > > > > value,
>> > > > > > > > > > > > > > > > > let's say 1TB.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > If the actual direct memory usage of Task
>> > > > Off-Heap
>> > > > > > > Memory
>> > > > > > > > > and
>> > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > Overhead
>> > > > > > > > > > > > > > > > > does not exceed 200MB, then alternative 2
>> and
>> > > > > > > alternative 3
>> > > > > > > > > > > should
>> > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > same utility. Setting larger
>> > > > > -XX:MaxDirectMemorySize
>> > > > > > > will
>> > > > > > > > > not
>> > > > > > > > > > > > > reduce
>> > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > sizes of the other memory pools.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > If the actual direct memory usage of Task
>> > > > Off-Heap
>> > > > > > > Memory
>> > > > > > > > > and
>> > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > > Overhead potentially exceeds 200MB, then
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >    - Alternative 2 suffers from frequent
>> OOM.
>> > > To
>> > > > > > avoid
>> > > > > > > > > that,
>> > > > > > > > > > > the
>> > > > > > > > > > > > > only
>> > > > > > > > > > > > > > > > thing
>> > > > > > > > > > > > > > > > >    the user can do is to modify the
>> configuration
>> > > and
>> > > > > > > > increase
>> > > > > > > > > > JVM
>> > > > > > > > > > > > > Direct
>> > > > > > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > > >    (Task Off-Heap Memory + JVM Overhead).
>> > Let's
>> > > > say
>> > > > > > > that
>> > > > > > > > > user
>> > > > > > > > > > > > > > increases
>> > > > > > > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > >    Direct Memory to 250MB, this will
>> reduce
>> > the
>> > > > > total
>> > > > > > > > size
>> > > > > > > > > of
>> > > > > > > > > > > > other
>> > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > >    pools to 750MB, given the total process
>> > > memory
>> > > > > > > remains
>> > > > > > > > > > 1GB.
>> > > > > > > > > > > > > > > > >    - For alternative 3, there is no
>> chance of
>> > > > > direct
>> > > > > > > OOM.
>> > > > > > > > > > There
>> > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > chances
>> > > > > > > > > > > > > > > > >    of exceeding the total process memory
>> > limit,
>> > > > but
>> > > > > > > given
>> > > > > > > > > > that
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > process
>> > > > > > > > > > > > > > > > > may
>> > > > > > > > > > > > > > > > >    not use up all the reserved native
>> memory
>> > > > > > (Off-Heap
>> > > > > > > > > > Managed
>> > > > > > > > > > > > > > Memory,
>> > > > > > > > > > > > > > > > > Network
>> > > > > > > > > > > > > > > > >    Memory, JVM Metaspace), if the actual
>> > direct
>> > > > > > memory
>> > > > > > > > > usage
>> > > > > > > > > > is
>> > > > > > > > > > > > > > > slightly
>> > > > > > > > > > > > > > > > > above
>> > > > > > > > > > > > > > > > >    yet very close to 200MB, the user probably
>> does
>> > > not
>> > > > > need
>> > > > > > > to
>> > > > > > > > > > change
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > >    configurations.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Therefore, I think from the user's
>> > > perspective, a
>> > > > > > > > feasible
>> > > > > > > > > > > > > > > configuration
>> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower
>> resource
>> > > > > > > utilization
>> > > > > > > > > > > compared
>> > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > alternative 3.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till
>> > Rohrmann
>> > > <
>> > > > > > > > > > > > > trohrm...@apache.org
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > I guess you have to help me understand
>> the
>> > > > > > difference
>> > > > > > > > > > between
>> > > > > > > > > > > > > > > > > alternative 2
>> > > > > > > > > > > > > > > > > > and 3 wrt memory under utilization
>> > > Xintong.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > - Alternative 2: set
>> -XX:MaxDirectMemorySize
>> > > to
>> > > > > Task
>> > > > > > > > > > Off-Heap
>> > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > > > Overhead. Then there is the risk that
>> this
>> > > size
>> > > > > is
>> > > > > > > too
>> > > > > > > > > low
>> > > > > > > > > > > > > > resulting
>> > > > > > > > > > > > > > > > in a
>> > > > > > > > > > > > > > > > > > lot of garbage collection and
>> potentially
>> > an
>> > > > OOM.
>> > > > > > > > > > > > > > > > > > - Alternative 3: set
>> -XX:MaxDirectMemorySize
>> > > to
>> > > > > > > > something
>> > > > > > > > > > > larger
>> > > > > > > > > > > > > > than
>> > > > > > > > > > > > > > > > > > alternative 2. This would of course
>> reduce
>> > > the
>> > > > > > sizes
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > > > other
>> > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > types.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > How would alternative 2 now result in an
>> > > under
>> > > > > > > > > utilization
>> > > > > > > > > > of
>> > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > compared to alternative 3? If
>> alternative 3
>> > > > > > strictly
>> > > > > > > > > sets a
>> > > > > > > > > > > > > higher
>> > > > > > > > > > > > > > > max
>> > > > > > > > > > > > > > > > > > direct memory size and we use only
>> little,
>> > > > then I
>> > > > > > > would
>> > > > > > > > > > > expect
>> > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > alternative 3 results in memory under
>> > > > > utilization.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > > > Till
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang
>> Wang <
>> > > > > > > > > > > > danrtsey...@gmail.com
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Hi Xintong, Till
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Native and Direct Memory
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > My point is setting a very large max
>> > direct
>> > > > > > memory
>> > > > > > > > size
>> > > > > > > > > > > when
>> > > > > > > > > > > > we
>> > > > > > > > > > > > > > do
>> > > > > > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > > > > differentiate direct and native
>> memory.
>> > If
>> > > > the
>> > > > > > > direct
>> > > > > > > > > > > > > > > > memory, including
>> > > > > > > > > > > > > > > > > > user
>> > > > > > > > > > > > > > > > > > > direct memory and framework direct
>> > > > memory, could
>> > > > > > be
>> > > > > > > > > > > calculated
>> > > > > > > > > > > > > > > > > > > correctly, then
>> > > > > > > > > > > > > > > > > > > I am in favor of setting direct memory
>> > with
>> > > > > fixed
>> > > > > > > > > value.
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Memory Calculation
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > I agree with Xintong. For Yarn and
>> K8s, we
>> > > > need
>> > > > > to
>> > > > > > > > check
>> > > > > > > > > > the
>> > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > configurations in client to avoid
>> > > submitting
>> > > > > > > > > successfully
>> > > > > > > > > > > and
>> > > > > > > > > > > > > > > failing
>> > > > > > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > > > > the Flink master.
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Yang
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Xintong Song <tonysong...@gmail.com
>> > > > > > wrote on Tue, Aug 13, 2019
>> > > > > > > > > > at 22:07:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Thanks for replying, Till.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you are
>> > > right
>> > > > > that
>> > > > > > > we
>> > > > > > > > > > should
>> > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > include
>> > > > > > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > issue in the scope of this FLIP.
>> This
>> > > FLIP
>> > > > > > should
>> > > > > > > > > > > > concentrate
>> > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > how
>> > > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > configure memory pools for
>> > TaskExecutors,
>> > > > > with
>> > > > > > > > > minimum
>> > > > > > > > > > > > > > > involvement
>> > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > how
>> > > > > > > > > > > > > > > > > > > > memory consumers use it.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > About direct memory, I think
>> > alternative
>> > > 3
>> > > > > may
>> > > > > > > not
>> > > > > > > > > > have
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > same
>> > > > > > > > > > > > > > > > > over
>> > > > > > > > > > > > > > > > > > > > reservation issue that alternative 2
>> > > does,
>> > > > > but
>> > > > > > at
>> > > > > > > > the
>> > > > > > > > > > > cost
>> > > > > > > > > > > > of
>> > > > > > > > > > > > > > > risk
>> > > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > > over
>> > > > > > > > > > > > > > > > > > > > using memory at the container level,
>> > > which
>> > > > is
>> > > > > > not
>> > > > > > > > > good.
>> > > > > > > > > > > My
>> > > > > > > > > > > > > > point
>> > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and "JVM
>> > > > > Overhead"
>> > > > > > > are
>> > > > > > > > > not
>> > > > > > > > > > > easy
>> > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > config.
>> > > > > > > > > > > > > > > > > > > For
>> > > > > > > > > > > > > > > > > > > > alternative 2, users might configure
>> > them
>> > > > > > higher
>> > > > > > > > than
>> > > > > > > > > > > what is
>> > > > > > > > > > > > > > > actually
>> > > > > > > > > > > > > > > > > > > needed,
>> > > > > > > > > > > > > > > > > > > > just to avoid getting a direct OOM.
>> For
>> > > > > > > alternative
>> > > > > > > > > 3,
>> > > > > > > > > > > > users
>> > > > > > > > > > > > > do
>> > > > > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > > get
>> > > > > > > > > > > > > > > > > > > > direct OOM, so they may not config
>> the
>> > > two
>> > > > > > > options
>> > > > > > > > > > > > > aggressively
>> > > > > > > > > > > > > > > > high.
>> > > > > > > > > > > > > > > > > > But
>> > > > > > > > > > > > > > > > > > > > the consequences are risks of
>> overall
>> > > > > container
>> > > > > > > > > memory
>> > > > > > > > > > > > usage
>> > > > > > > > > > > > > > > > exceeding
>> > > > > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > > > budget.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM Till
>> > > > > Rohrmann <
>> > > > > > > > > > > > > > > > trohrm...@apache.org>
>> > > > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP
>> > Xintong.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > All in all I think it already
>> looks
>> > > quite
>> > > > > > good.
>> > > > > > > > > > > > Concerning
>> > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > first
>> > > > > > > > > > > > > > > > > > > open
>> > > > > > > > > > > > > > > > > > > > > question about allocating memory
>> > > > segments,
>> > > > > I
>> > > > > > > was
>> > > > > > > > > > > > wondering
>> > > > > > > > > > > > > > > > whether
>> > > > > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in the
>> > context
>> > > > of
>> > > > > > this
>> > > > > > > > > FLIP
>> > > > > > > > > > or
>> > > > > > > > > > > > > > whether
>> > > > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > could
>> > > > > > > > > > > > > > > > > > > > > be done as a follow up? Without
>> > knowing
>> > > > all
>> > > > > > > > > details,
>> > > > > > > > > > I
>> > > > > > > > > > > > > would
>> > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > > concerned
>> > > > > > > > > > > > > > > > > > > > > that we would widen the scope of
>> this
>> > > > FLIP
>> > > > > > too
>> > > > > > > > much
>> > > > > > > > > > > > because
>> > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > would
>> > > > > > > > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > > > > > > > to touch all the existing call
>> sites
>> > of
>> > > > the
>> > > > > > > > > > > MemoryManager
>> > > > > > > > > > > > > > where
>> > > > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > > > > allocate
>> > > > > > > > > > > > > > > > > > > > > memory segments (this should
>> mainly
>> > be
>> > > > > batch
>> > > > > > > > > > > operators).
>> > > > > > > > > > > > > The
>> > > > > > > > > > > > > > > > > addition
>> > > > > > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > > > > the memory reservation call to the
>> > > > > > > MemoryManager
>> > > > > > > > > > should
>> > > > > > > > > > > > not
>> > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > affected
>> > > > > > > > > > > > > > > > > > > > by
>> > > > > > > > > > > > > > > > > > > > > this and I would hope that this is
>> > the
>> > > > only
>> > > > > > > point
>> > > > > > > > > of
>> > > > > > > > > > > > > > > interaction
>> > > > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > > > > > > streaming job would have with the
>> > > > > > > MemoryManager.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Concerning the second open
>> question
>> > > about
>> > > > > > > setting
>> > > > > > > > > or
>> > > > > > > > > > > not
>> > > > > > > > > > > > > > > setting
>> > > > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > > > max
>> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would also
>> be
>> > > > > > interested
>> > > > > > > > why
>> > > > > > > > > > > Yang
>> > > > > > > > > > > > > Wang
>> > > > > > > > > > > > > > > > > thinks
>> > > > > > > > > > > > > > > > > > > > > leaving it open would be best. My
>> > > concern
>> > > > > > about
>> > > > > > > > > this
>> > > > > > > > > > > > would
>> > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > > > > would
>> > > > > > > > > > > > > > > > > > > > > be in a similar situation as we
>> are
>> > now
>> > > > > with
>> > > > > > > the
>> > > > > > > > > > > > > > > > > RocksDBStateBackend.
>> > > > > > > > > > > > > > > > > > > If
>> > > > > > > > > > > > > > > > > > > > > the different memory pools are not
>> > > > clearly
>> > > > > > > > > separated
>> > > > > > > > > > > and
>> > > > > > > > > > > > > can
>> > > > > > > > > > > > > > > > spill
>> > > > > > > > > > > > > > > > > > over
>> > > > > > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > > a different pool, then it is quite
>> > hard
>> > > > to
>> > > > > > > > > understand
>> > > > > > > > > > > > what
>> > > > > > > > > > > > > > > > exactly
>> > > > > > > > > > > > > > > > > > > > causes a
>> > > > > > > > > > > > > > > > > > > > > process to get killed for using
>> too
>> > > much
>> > > > > > > memory.
>> > > > > > > > > This
>> > > > > > > > > > > > could
>> > > > > > > > > > > > > > > then
>> > > > > > > > > > > > > > > > > > easily
>> > > > > > > > > > > > > > > > > > > > > lead to a situation similar to what
>> we
>> > > have
>> > > > > with
>> > > > > > > the
>> > > > > > > > > > > > > > cutoff-ratio.
>> > > > > > > > > > > > > > > > So
>> > > > > > > > > > > > > > > > > > why
>> > > > > > > > > > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > > > > > > set a sane default value for
>> max
>> > > > direct
>> > > > > > > > memory
>> > > > > > > > > > and
>> > > > > > > > > > > > > giving
>> > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > user
>> > > > > > > > > > > > > > > > > > > an
>> > > > > > > > > > > > > > > > > > > > > option to increase it if he runs
>> into
>> > > an
>> > > > > OOM?
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > @Xintong, how would alternative 2
>> > lead
>> > > to
>> > > > > > lower
>> > > > > > > > > > memory
>> > > > > > > > > > > > > > > > utilization
>> > > > > > > > > > > > > > > > > > than
>> > > > > > > > > > > > > > > > > > > > > alternative 3 where we set the
>> direct
>> > > > > memory
>> > > > > > > to a
>> > > > > > > > > > > higher
>> > > > > > > > > > > > > > value?
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > > > > > > Till
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM
>> > Xintong
>> > > > > Song <
>> > > > > > > > > > > > > > > > tonysong...@gmail.com
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, Yang.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Regarding your comments:
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory*
>> > > > > > > > > > > > > > > > > > > > > > I think setting a very large max
>> > > direct
>> > > > > > > memory
>> > > > > > > > > size
>> > > > > > > > > > > > > > > definitely
>> > > > > > > > > > > > > > > > > has
>> > > > > > > > > > > > > > > > > > > some
>> > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not
>> worry
>> > > about
>> > > > > > > direct
>> > > > > > > > > OOM,
>> > > > > > > > > > > and
>> > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > don't
>> > > > > > > > > > > > > > > > > > even
>> > > > > > > > > > > > > > > > > > > > > need
>> > > > > > > > > > > > > > > > > > > > > > to allocate managed / network
>> > memory
>> > > > with
>> > > > > > > > > > > > > > Unsafe.allocate().
>> > > > > > > > > > > > > > > > > > > > > > However, there are also some
>> down
>> > > sides
>> > > > > of
>> > > > > > > > doing
>> > > > > > > > > > > this.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > >    - One thing I can think of is
>> > that
>> > > > if
>> > > > > a
>> > > > > > > task
>> > > > > > > > > > > > executor
>> > > > > > > > > > > > > > > > > container
>> > > > > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > > > > >    killed due to overusing
>> memory,
>> > it
>> > > > > could
>> > > > > > > be
>> > > > > > > > > hard
>> > > > > > > > > > > for
>> > > > > > > > > > > > > users
>> > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > know
>> > > > > > > > > > > > > > > > > > > > which
>> > > > > > > > > > > > > > > > > > > > > > part
>> > > > > > > > > > > > > > > > > > > > > >    of the memory is overused.
>> > > > > > > > > > > > > > > > > > > > > >    - Another down side is that
>> the
>> > > JVM
>> > > > > > never
>> > > > > > > > > > triggers
>> > > > > > > > > > > GC
>> > > > > > > > > > > > > due
>> > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > reaching
>> > > > > > > > > > > > > > > > > > > > > max
>> > > > > > > > > > > > > > > > > > > > > >    direct memory limit, because
>> the
>> > > > limit
>> > > > > > is
>> > > > > > > > too
>> > > > > > > > > > high
>> > > > > > > > > > > > to
>> > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > reached.
>> > > > > > > > > > > > > > > > > > > > That
>> > > > > > > > > > > > > > > > > > > > > >    means we kind of rely on
>> heap
>> > > > memory
>> > > > > to
>> > > > > > > > > trigger
>> > > > > > > > > > > GC
>> > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > release
>> > > > > > > > > > > > > > > > > > > > direct
>> > > > > > > > > > > > > > > > > > > > > >    memory. That could be a
>> problem
>> > in
>> > > > > cases
>> > > > > > > > where
>> > > > > > > > > > we
>> > > > > > > > > > > > have
>> > > > > > > > > > > > > > > more
>> > > > > > > > > > > > > > > > > > direct
>> > > > > > > > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > > > >    usage but not enough heap
>> > activity
>> > > > to
>> > > > > > > > trigger
>> > > > > > > > > > the
>> > > > > > > > > > > > GC.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Maybe you can share your reasons
>> > for
>> > > > > > > preferring
>> > > > > > > > > > > > setting a
>> > > > > > > > > > > > > > > very
>> > > > > > > > > > > > > > > > > > large
>> > > > > > > > > > > > > > > > > > > > > value,
>> > > > > > > > > > > > > > > > > > > > > > if there is anything else I
>> > > > overlooked.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation*
>> > > > > > > > > > > > > > > > > > > > > > If there is any conflict between
>> > > > multiple
>> > > > > > > > > > > configurations
>> > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > user
>> > > > > > > > > > > > > > > > > > > > > > explicitly specified, I think we
>> > > should
>> > > > > > throw
>> > > > > > > > an
>> > > > > > > > > > > error.
>> > > > > > > > > > > > > > > > > > > > > > I think doing checking on the
>> > client
>> > > > side
>> > > > > > is
>> > > > > > > a
>> > > > > > > > > good
>> > > > > > > > > > > > idea,
>> > > > > > > > > > > > > > so
>> > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > > Yarn /
>> > > > > > > > > > > > > > > > > > > > > > K8s we can discover the problem
>> > > before
>> > > > > > > > submitting
>> > > > > > > > > > the
>> > > > > > > > > > > > > Flink
>> > > > > > > > > > > > > > > > > > cluster,
>> > > > > > > > > > > > > > > > > > > > > which
>> > > > > > > > > > > > > > > > > > > > > > is always a good thing.
>> > > > > > > > > > > > > > > > > > > > > > But we cannot rely only on the
>> > > client
>> > > > > side
>> > > > > > > > > > checking,
>> > > > > > > > > > > > > > because
>> > > > > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > > > > > > > standalone cluster TaskManagers
>> on
>> > > > > > different
>> > > > > > > > > > machines
>> > > > > > > > > > > > may
>> > > > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > > > > > > different
>> > > > > > > > > > > > > > > > > > > > > > configurations and the client
>> does not
>> > > see
>> > > > > > that.
>> > > > > > > > > > > > > > > > > > > > > > What do you think?
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 PM
>> Yang
>> > > > Wang
>> > > > > <
>> > > > > > > > > > > > > > > > danrtsey...@gmail.com>
>> > > > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > Hi Xintong,
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed
>> > proposal.
>> > > > > After
>> > > > > > > all
>> > > > > > > > > the
>> > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > configurations
>> > > > > > > > > > > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > > > > > > > > introduced, it will be more
>> > > powerful
>> > > > to
>> > > > > > > > control
>> > > > > > > > > > the
>> > > > > > > > > > > > > Flink
>> > > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > > > usage. I
>> > > > > > > > > > > > > > > > > > > > > > > just have a few questions about
>> it.
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >    - Native and Direct Memory
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > We do not differentiate user
>> > direct
>> > > > > > memory
>> > > > > > > > and
>> > > > > > > > > > > native
>> > > > > > > > > > > > > > > memory.
>> > > > > > > > > > > > > > > > > > They
>> > > > > > > > > > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > > > > > > > all
>> > > > > > > > > > > > > > > > > > > > > > > included in task off-heap
>> memory.
>> > > > > Right?
>> > > > > > > So I
>> > > > > > > > > > don’t
>> > > > > > > > > > > > > think
>> > > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > > could
>> > > > > > > > > > > > > > > > > > > > > > set
>> > > > > > > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize
>> > > > properly. I
>> > > > > > > > prefer
>> > > > > > > > > > > > leaving
>> > > > > > > > > > > > > > it a
>> > > > > > > > > > > > > > > > > very
>> > > > > > > > > > > > > > > > > > > > large
>> > > > > > > > > > > > > > > > > > > > > > > value.
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >    - Memory Calculation
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > If the sum of the fine-grained
>> > > > > > > memory(network
>> > > > > > > > > > > memory,
>> > > > > > > > > > > > > > > managed
>> > > > > > > > > > > > > > > > > > > memory,
>> > > > > > > > > > > > > > > > > > > > > > etc.)
>> > > > > > > > > > > > > > > > > > > > > > > is larger than total process
>> > > memory,
>> > > > > how
>> > > > > > do
>> > > > > > > > we
>> > > > > > > > > > deal
>> > > > > > > > > > > > > with
>> > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > > situation?
>> > > > > > > > > > > > > > > > > > > > > > Do
>> > > > > > > > > > > > > > > > > > > > > > > we need to check the memory
>> > > > > configuration
>> > > > > > > in
>> > > > > > > > > > > client?
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > Xintong Song <
>> > > tonysong...@gmail.com>
>> > > > > > > > > > wrote on Wed, Aug 7, 2019
>> > > > > > > > > > > > > > > at 10:14 PM:
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > We would like to start a
>> > > discussion
>> > > > > > > thread
>> > > > > > > > on
>> > > > > > > > > > > > > "FLIP-49:
>> > > > > > > > > > > > > > > > > Unified
>> > > > > > > > > > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > > > > > > > > > > Configuration for
>> > > > TaskExecutors"[1],
>> > > > > > > where
>> > > > > > > > we
>> > > > > > > > > > > > > describe
>> > > > > > > > > > > > > > > how
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > improve
>> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory
>> > > configurations.
>> > > > > The
>> > > > > > > > FLIP
>> > > > > > > > > > > > document
>> > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > mostly
>> > > > > > > > > > > > > > > > > > > > based
>> > > > > > > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > > > > an
>> > > > > > > > > > > > > > > > > > > > > > > > early design "Memory
>> Management
>> > > and
>> > > > > > > > > > Configuration
>> > > > > > > > > > > > > > > > > Reloaded"[2]
>> > > > > > > > > > > > > > > > > > by
>> > > > > > > > > > > > > > > > > > > > > > > Stephan,
>> > > > > > > > > > > > > > > > > > > > > > > > with updates from follow-up
>> > > > > discussions
>> > > > > > > > both
>> > > > > > > > > > > online
>> > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > offline.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses several
>> > > > > > shortcomings
>> > > > > > > of
>> > > > > > > > > > > current
>> > > > > > > > > > > > > > > (Flink
>> > > > > > > > > > > > > > > > > 1.9)
>> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory
>> > > configuration.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >    - Different configuration
>> > for
>> > > > > > > Streaming
>> > > > > > > > > and
>> > > > > > > > > > > > Batch.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Complex and difficult
>> > > > > > configuration
>> > > > > > > of
>> > > > > > > > > > > RocksDB
>> > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > > > Streaming.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Complicated, uncertain
>> and
>> > > > hard
>> > > > > to
>> > > > > > > > > > > understand.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the
>> > problems
>> > > > can
>> > > > > > be
>> > > > > > > > > > > summarized
>> > > > > > > > > > > > > as
>> > > > > > > > > > > > > > > > > follows.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >    - Extend memory manager
>> to
>> > > also
>> > > > > > > account
>> > > > > > > > > for
>> > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > usage
>> > > > > > > > > > > > > > > > > by
>> > > > > > > > > > > > > > > > > > > > state
>> > > > > > > > > > > > > > > > > > > > > > > >    backends.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Modify how TaskExecutor
>> > > memory
>> > > > > is
>> > > > > > > > > > > partitioned and
>> > > > > > > > > > > > > > > > accounted for as
>> > > > > > > > > > > > > > > > > > > > > individual
>> > > > > > > > > > > > > > > > > > > > > > > >    memory reservations and
>> > pools.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Simplify memory
>> > > configuration
>> > > > > > > options
>> > > > > > > > > and
>> > > > > > > > > > > > > > > calculation
>> > > > > > > > > > > > > > > > > > > logic.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Please find more details in
>> the
>> > > > FLIP
>> > > > > > wiki
>> > > > > > > > > > > document
>> > > > > > > > > > > > > [1].
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > (Please note that the early
>> > > design
>> > > > > doc
>> > > > > > > [2]
>> > > > > > > > is
>> > > > > > > > > > out
>> > > > > > > > > > > > of
>> > > > > > > > > > > > > > > sync,
>> > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > it
>> > > > > > > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > > > > > > > appreciated to have the
>> > > discussion
>> > > > in
>> > > > > > > this
>> > > > > > > > > > > mailing
>> > > > > > > > > > > > > list
>> > > > > > > > > > > > > > > > > > thread.)
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your
>> > > feedback.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > [2]
>> https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 2:16 PM Xintong Song <
>> > > > > > > > > > tonysong...@gmail.com>
>> > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Thanks for sharing your opinion Till.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > I'm also in favor of alternative 2. I was
>> > wondering
>> > > > > > whether
>> > > > > > > > we
>> > > > > > > > > > can
>> > > > > > > > > > > > > avoid
>> > > > > > > > > > > > > > > using Unsafe.allocate() for off-heap managed
>> > memory
>> > > > and
>> > > > > > > > network
>> > > > > > > > > > > > memory
>> > > > > > > > > > > > > > with
>> > > > > > > > > > > > > > > alternative 3. But after giving it a second
>> > > thought,
>> > > > I
>> > > > > > > think
>> > > > > > > > > even
>> > > > > > > > > > > for
>> > > > > > > > > > > > > > > alternative 3 using direct memory for off-heap
>> > > > managed
>> > > > > > > memory
>> > > > > > > > > > could
>> > > > > > > > > > > > > cause
>> > > > > > > > > > > > > > > problems.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Hi Yang,
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Regarding your concern, I think what is proposed
>> in
>> > > this
>> > > > > > FLIP
>> > > > > > > is
>> > > > > > > > > to
>> > > > > > > > > > > have
>> > > > > > > > > > > > > > both
>> > > > > > > > > > > > > > > off-heap managed memory and network memory
>> > > allocated
>> > > > > > > through
>> > > > > > > > > > > > > > > Unsafe.allocate(), which means they are
>> > practically
>> > > > > > native
>> > > > > > > > > memory
>> > > > > > > > > > > and
>> > > > > > > > > > > > > not
>> > > > > > > > > > > > > > > limited by JVM max direct memory. The only
>> parts
>> > of
>> > > > > > memory
>> > > > > > > > > > limited
>> > > > > > > > > > > by
>> > > > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > max direct memory are task off-heap memory and
>> > JVM
>> > > > > > > overhead,
>> > > > > > > > > > which
>> > > > > > > > > > > > are
>> > > > > > > > > > > > > > > exactly what alternative 2 suggests setting the JVM
>> max
>> > > > > direct
>> > > > > > > > memory
>> > > > > > > > > > to.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 1:48 PM Till Rohrmann
>> <
>> > > > > > > > > > > trohrm...@apache.org>
>> > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Thanks for the clarification Xintong. I
>> > > understand
>> > > > > the
>> > > > > > > two
>> > > > > > > > > > > > > alternatives
>> > > > > > > > > > > > > > > > now.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > I would be in favour of option 2 because it
>> > makes
>> > > > > > things
>> > > > > > > > > > > explicit.
>> > > > > > > > > > > > If
>> > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > don't limit the direct memory, I fear that
>> we
>> > > might
>> > > > > end
>> > > > > > > up
>> > > > > > > > > in a
>> > > > > > > > > > > > > similar
>> > > > > > > > > > > > > > > > situation as we are currently in: The user
>> > might
>> > > > see
>> > > > > > that
>> > > > > > > > her
>> > > > > > > > > > > > process
>> > > > > > > > > > > > > > > gets
>> > > > > > > > > > > > > > > > killed by the OS and does not know why this
>> is
>> > > the
>> > > > > > case.
>> > > > > > > > > > > > > Consequently,
>> > > > > > > > > > > > > > > she
>> > > > > > > > > > > > > > > > tries to decrease the process memory size
>> > > (similar
>> > > > to
>> > > > > > > > > > increasing
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > cutoff
>> > > > > > > > > > > > > > > > ratio) in order to accommodate for the extra
>> > > direct
>> > > > > > > memory.
>> > > > > > > > > > Even
>> > > > > > > > > > > > > worse,
>> > > > > > > > > > > > > > > she
>> > > > > > > > > > > > > > > > tries to decrease memory budgets which are
>> not
>> > > > fully
>> > > > > > used
>> > > > > > > > and
>> > > > > > > > > > > hence
>> > > > > > > > > > > > > > won't
>> > > > > > > > > > > > > > > > change the overall memory consumption.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > Till
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 11:01 AM Xintong
>> Song <
>> > > > > > > > > > > > tonysong...@gmail.com
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Let me explain this with a concrete
>> example
>> > > Till.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Let's say we have the following scenario.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Total Process Memory: 1GB
>> > > > > > > > > > > > > > > > > JVM Direct Memory (Task Off-Heap Memory +
>> JVM
>> > > > > > > Overhead):
>> > > > > > > > > > 200MB
>> > > > > > > > > > > > > > > > > Other Memory (JVM Heap Memory, JVM
>> Metaspace,
>> > > > > > Off-Heap
>> > > > > > > > > > Managed
>> > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > Network Memory): 800MB
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > For alternative 2, we set
>> > > -XX:MaxDirectMemorySize
>> > > > > to
>> > > > > > > > 200MB.
>> > > > > > > > > > > > > > > > > For alternative 3, we set
>> > > -XX:MaxDirectMemorySize
>> > > > > to
>> > > > > > a
>> > > > > > > > very
>> > > > > > > > > > > large
>> > > > > > > > > > > > > > > value,
>> > > > > > > > > > > > > > > > > let's say 1TB.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > If the actual direct memory usage of Task
>> > > > Off-Heap
>> > > > > > > Memory
>> > > > > > > > > and
>> > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > Overhead
>> > > > > > > > > > > > > > > > > does not exceed 200MB, then alternative 2
>> and
>> > > > > > > alternative 3
>> > > > > > > > > > > should
>> > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > same utility. Setting a larger
>> > > > > -XX:MaxDirectMemorySize
>> > > > > > > will
>> > > > > > > > > not
>> > > > > > > > > > > > > reduce
>> > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > sizes of the other memory pools.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > If the actual direct memory usage of Task
>> > > > Off-Heap
>> > > > > > > Memory
>> > > > > > > > > and
>> > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > > Overhead may exceed 200MB, then
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >    - Alternative 2 suffers from frequent
>> OOM.
>> > > To
>> > > > > > avoid
>> > > > > > > > > that,
>> > > > > > > > > > > the
>> > > > > > > > > > > > > only
>> > > > > > > > > > > > > > > > thing
>> > > > > > > > > > > > > > > > >    the user can do is to modify the
>> configuration
>> > > and
>> > > > > > > > increase
>> > > > > > > > > > JVM
>> > > > > > > > > > > > > Direct
>> > > > > > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > > >    (Task Off-Heap Memory + JVM Overhead).
>> > Let's
>> > > > say
>> > > > > > > that
>> > > > > > > > > user
>> > > > > > > > > > > > > > increases
>> > > > > > > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > >    Direct Memory to 250MB, this will
>> reduce
>> > the
>> > > > > total
>> > > > > > > > size
>> > > > > > > > > of
>> > > > > > > > > > > > other
>> > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > >    pools to 750MB, given the total process
>> > > memory
>> > > > > > > remains
>> > > > > > > > > > 1GB.
>> > > > > > > > > > > > > > > > >    - For alternative 3, there is no
>> chance of
>> > > > > direct
>> > > > > > > OOM.
>> > > > > > > > > > There
>> > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > chances
>> > > > > > > > > > > > > > > > >    of exceeding the total process memory
>> > limit,
>> > > > but
>> > > > > > > given
>> > > > > > > > > > that
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > process
>> > > > > > > > > > > > > > > > > may
>> > > > > > > > > > > > > > > > >    not use up all the reserved native
>> memory
>> > > > > > (Off-Heap
>> > > > > > > > > > Managed
>> > > > > > > > > > > > > > Memory,
>> > > > > > > > > > > > > > > > > Network
>> > > > > > > > > > > > > > > > >    Memory, JVM Metaspace), if the actual
>> > direct
>> > > > > > memory
>> > > > > > > > > usage
>> > > > > > > > > > is
>> > > > > > > > > > > > > > > slightly
>> > > > > > > > > > > > > > > > > above
>> > > > > > > > > > > > > > > > >    yet very close to 200MB, the user probably
>> does
>> > > not
>> > > > > need
>> > > > > > > to
>> > > > > > > > > > change
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > >    configurations.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Therefore, I think from the user's
>> > > perspective, a
>> > > > > > > > feasible
>> > > > > > > > > > > > > > > configuration
>> > > > > > > > > > > > > > > > > for alternative 2 may lead to lower
>> resource
>> > > > > > > utilization
>> > > > > > > > > > > compared
>> > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > alternative 3.
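
The arithmetic in the example above can be sketched as follows. This is
only an illustration under stated assumptions: the class and method names
are hypothetical and not part of Flink, 1GB is treated as 1000MB to match
the numbers in the example, and only the -XX:MaxDirectMemorySize flag
itself is a real JVM option.

```java
public class DirectMemoryBudgetSketch {

    /**
     * Memory left for the other pools (heap, metaspace, off-heap managed,
     * network) when the total process memory is fixed. Hypothetical helper.
     */
    static long otherPoolsMb(long totalProcessMb, long jvmDirectMb) {
        if (jvmDirectMb < 0 || jvmDirectMb > totalProcessMb) {
            throw new IllegalArgumentException(
                "JVM direct memory must be between 0 and the total process memory");
        }
        return totalProcessMb - jvmDirectMb;
    }

    /**
     * Alternative 2: the JVM limit equals the configured budget
     * (Task Off-Heap Memory + JVM Overhead).
     */
    static String alternative2Flag(long jvmDirectMb) {
        return "-XX:MaxDirectMemorySize=" + jvmDirectMb + "m";
    }

    public static void main(String[] args) {
        long totalProcessMb = 1000; // "1GB" in the example above

        // Initial configuration: 200MB direct, 800MB for everything else.
        System.out.println(alternative2Flag(200)
            + " leaves " + otherPoolsMb(totalProcessMb, 200) + "MB for other pools");

        // After the user raises the direct budget to avoid OOMs under alternative 2.
        System.out.println(alternative2Flag(250)
            + " leaves " + otherPoolsMb(totalProcessMb, 250) + "MB for other pools");
    }
}
```

Under alternative 3 the flag would be set far above the budget, so the
subtraction above would still use the configured 200MB and the other pools
would keep their 800MB.
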
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > On Fri, Aug 16, 2019 at 10:28 AM Till
>> > Rohrmann
>> > > <
>> > > > > > > > > > > > > trohrm...@apache.org
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > I guess you have to help me understand
>> the
>> > > > > > difference
>> > > > > > > > > > between
>> > > > > > > > > > > > > > > > > alternative 2
>> > > > > > > > > > > > > > > > > > and 3 wrt memory under utilization
>> > > Xintong.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > - Alternative 2: set
>> -XX:MaxDirectMemorySize
>> > > to
>> > > > > Task
>> > > > > > > > > > Off-Heap
>> > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > JVM
>> > > > > > > > > > > > > > > > > > Overhead. Then there is the risk that
>> this
>> > > size
>> > > > > is
>> > > > > > > too
>> > > > > > > > > low
>> > > > > > > > > > > > > > resulting
>> > > > > > > > > > > > > > > > in a
>> > > > > > > > > > > > > > > > > > lot of garbage collection and
>> potentially
>> > an
>> > > > OOM.
>> > > > > > > > > > > > > > > > > > - Alternative 3: set
>> -XX:MaxDirectMemorySize
>> > > to
>> > > > > > > > something
>> > > > > > > > > > > larger
>> > > > > > > > > > > > > > than
>> > > > > > > > > > > > > > > > > > alternative 2. This would of course
>> reduce
>> > > the
>> > > > > > sizes
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > > > other
>> > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > types.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > How would alternative 2 now result in an
>> > > under
>> > > > > > > > > utilization
>> > > > > > > > > > of
>> > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > compared to alternative 3? If
>> alternative 3
>> > > > > > strictly
>> > > > > > > > > sets a
>> > > > > > > > > > > > > higher
>> > > > > > > > > > > > > > > max
>> > > > > > > > > > > > > > > > > > direct memory size and we use only
>> little,
>> > > > then I
>> > > > > > > would
>> > > > > > > > > > > expect
>> > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > alternative 3 results in memory under
>> > > > > utilization.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > > > Till
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 4:19 PM Yang
>> Wang <
>> > > > > > > > > > > > danrtsey...@gmail.com
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Hi Xintong, Till
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Native and Direct Memory
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > My point is setting a very large max
>> > direct
>> > > > > > memory
>> > > > > > > > size
>> > > > > > > > > > > when
>> > > > > > > > > > > > we
>> > > > > > > > > > > > > > do
>> > > > > > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > > > > differentiate direct and native
>> memory.
>> > If
>> > > > the
>> > > > > > > direct
>> > > > > > > > > > > > > > > > memory, including
>> > > > > > > > > > > > > > > > > > user
>> > > > > > > > > > > > > > > > > > > direct memory and framework direct
>> > > > memory, could
>> > > > > > be
>> > > > > > > > > > > calculated
>> > > > > > > > > > > > > > > > > > > correctly, then
>> > > > > > > > > > > > > > > > > > > I am in favor of setting direct memory
>> > with a
>> > > > > fixed
>> > > > > > > > > value.
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Memory Calculation
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > I agree with Xintong. For Yarn and
>> k8s, we
>> > > > need
>> > > > > to
>> > > > > > > > check
>> > > > > > > > > > the
>> > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > configurations in the client to avoid
>> > > submitting
>> > > > > > > > > successfully
>> > > > > > > > > > > and
>> > > > > > > > > > > > > > > failing
>> > > > > > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > > > > the Flink master.
>> > > > > > > > > > > > > > > > > > >
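
A client-side check of the kind discussed here could look roughly like the
sketch below. It is a minimal illustration, not Flink code: the class name,
method name and component names are hypothetical, and the only rule it
encodes is the one raised in this thread, namely that the fine-grained
components must not add up to more than the total process memory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MemoryConfigCheckSketch {

    /**
     * Fails fast if the configured components exceed the total process memory.
     * Sizes are in MB; the component names are illustrative only.
     */
    static void validate(long totalProcessMb, Map<String, Long> componentsMb) {
        long sumMb = 0;
        for (Map.Entry<String, Long> entry : componentsMb.entrySet()) {
            if (entry.getValue() < 0) {
                throw new IllegalArgumentException(
                    "Negative size for component: " + entry.getKey());
            }
            sumMb += entry.getValue();
        }
        if (sumMb > totalProcessMb) {
            throw new IllegalArgumentException(
                "Configured components need " + sumMb
                    + "MB but total process memory is " + totalProcessMb + "MB");
        }
    }

    public static void main(String[] args) {
        Map<String, Long> components = new LinkedHashMap<>();
        components.put("network", 100L);
        components.put("managed", 300L);
        components.put("task-off-heap-and-jvm-overhead", 200L);

        // Passes: 600MB of components fit into 1000MB.
        validate(1000, components);

        // Would be rejected on the client before anything is submitted.
        try {
            validate(500, components);
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Running the check before submission means an inconsistent configuration is
reported to the user directly instead of surfacing later as a failure in the
cluster.
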
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Yang
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Xintong Song <tonysong...@gmail.com
>> > > > > > wrote on Tue, Aug 13, 2019
>> > > > > > > > > > at 22:07:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Thanks for replying, Till.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > About MemorySegment, I think you are
>> > > right
>> > > > > that
>> > > > > > > we
>> > > > > > > > > > should
>> > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > include
>> > > > > > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > issue in the scope of this FLIP.
>> This
>> > > FLIP
>> > > > > > should
>> > > > > > > > > > > > concentrate
>> > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > how
>> > > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > configure memory pools for
>> > TaskExecutors,
>> > > > > with
>> > > > > > > > > minimum
>> > > > > > > > > > > > > > > involvement
>> > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > how
>> > > > > > > > > > > > > > > > > > > > memory consumers use it.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > About direct memory, I think
>> > alternative
>> > > 3
>> > > > > may
>> > > > > > > not
>> > > > > > > > > > have
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > same
>> > > > > > > > > > > > > > > > > over
>> > > > > > > > > > > > > > > > > > > > reservation issue that alternative 2
>> > > does,
>> > > > > but
>> > > > > > at
>> > > > > > > > the
>> > > > > > > > > > > cost
>> > > > > > > > > > > > of
>> > > > > > > > > > > > > > > risk
>> > > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > > over
>> > > > > > > > > > > > > > > > > > > > using memory at the container level,
>> > > which
>> > > > is
>> > > > > > not
>> > > > > > > > > good.
>> > > > > > > > > > > My
>> > > > > > > > > > > > > > point
>> > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > > > both "Task Off-Heap Memory" and "JVM
>> > > > > Overhead"
>> > > > > > > are
>> > > > > > > > > not
>> > > > > > > > > > > easy
>> > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > config.
>> > > > > > > > > > > > > > > > > > > For
>> > > > > > > > > > > > > > > > > > > > alternative 2, users might configure
>> > them
>> > > > > > higher
>> > > > > > > > than
>> > > > > > > > > > > what
>> > > > > > > > > > > > > > > is actually
>> > > > > > > > > > > > > > > > > > > needed,
>> > > > > > > > > > > > > > > > > > > > just to avoid getting a direct OOM.
>> For
>> > > > > > > alternative
>> > > > > > > > > 3,
>> > > > > > > > > > > > users
>> > > > > > > > > > > > > do
>> > > > > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > > get
>> > > > > > > > > > > > > > > > > > > > direct OOM, so they may not config
>> the
>> > > two
>> > > > > > > options
>> > > > > > > > > > > > > aggressively
>> > > > > > > > > > > > > > > > high.
>> > > > > > > > > > > > > > > > > > But
>> > > > > > > > > > > > > > > > > > > > the consequences are risks of
>> overall
>> > > > > container
>> > > > > > > > > memory
>> > > > > > > > > > > > usage
>> > > > > > > > > > > > > > > > exceeding
>> > > > > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > > > budget.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > On Tue, Aug 13, 2019 at 9:39 AM Till
>> > > > > Rohrmann <
>> > > > > > > > > > > > > > > > trohrm...@apache.org>
>> > > > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Thanks for proposing this FLIP
>> > Xintong.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > All in all I think it already
>> looks
>> > > quite
>> > > > > > good.
>> > > > > > > > > > > > Concerning
>> > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > first
>> > > > > > > > > > > > > > > > > > > open
>> > > > > > > > > > > > > > > > > > > > > question about allocating memory
>> > > > segments,
>> > > > > I
>> > > > > > > was
>> > > > > > > > > > > > wondering
>> > > > > > > > > > > > > > > > whether
>> > > > > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > > > > strictly necessary to do in the
>> > context
>> > > > of
>> > > > > > this
>> > > > > > > > > FLIP
>> > > > > > > > > > or
>> > > > > > > > > > > > > > whether
>> > > > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > could
>> > > > > > > > > > > > > > > > > > > > > be done as a follow up? Without
>> > knowing
>> > > > all
>> > > > > > > > > details,
>> > > > > > > > > > I
>> > > > > > > > > > > > > would
>> > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > > concerned
>> > > > > > > > > > > > > > > > > > > > > that we would widen the scope of
>> this
>> > > > FLIP
>> > > > > > too
>> > > > > > > > much
>> > > > > > > > > > > > because
>> > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > would
>> > > > > > > > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > > > > > > > to touch all the existing call
>> sites
>> > of
>> > > > the
>> > > > > > > > > > > MemoryManager
>> > > > > > > > > > > > > > where
>> > > > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > > > > allocate
>> > > > > > > > > > > > > > > > > > > > > memory segments (this should
>> mainly
>> > be
>> > > > > batch
>> > > > > > > > > > > operators).
>> > > > > > > > > > > > > The
>> > > > > > > > > > > > > > > > > addition
>> > > > > > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > > > > the memory reservation call to the
>> > > > > > > MemoryManager
>> > > > > > > > > > should
>> > > > > > > > > > > > not
>> > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > affected
>> > > > > > > > > > > > > > > > > > > > by
>> > > > > > > > > > > > > > > > > > > > > this and I would hope that this is
>> > the
>> > > > only
>> > > > > > > point
>> > > > > > > > > of
>> > > > > > > > > > > > > > > interaction
>> > > > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > > > > > > streaming job would have with the
>> > > > > > > MemoryManager.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Concerning the second open
>> question
>> > > about
>> > > > > > > setting
>> > > > > > > > > or
>> > > > > > > > > > > not
>> > > > > > > > > > > > > > > setting
>> > > > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > > > max
>> > > > > > > > > > > > > > > > > > > > > direct memory limit, I would also
>> be
>> > > > > > interested
>> > > > > > > > why
>> > > > > > > > > > > Yang
>> > > > > > > > > > > > > Wang
>> > > > > > > > > > > > > > > > > thinks
>> > > > > > > > > > > > > > > > > > > > > leaving it open would be best. My
>> > > concern
>> > > > > > about
>> > > > > > > > > this
>> > > > > > > > > > > > would
>> > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > > > > would
>> > > > > > > > > > > > > > > > > > > > > be in a similar situation as we
>> are
>> > now
>> > > > > with
>> > > > > > > the
>> > > > > > > > > > > > > > > > > RocksDBStateBackend.
>> > > > > > > > > > > > > > > > > > > If
>> > > > > > > > > > > > > > > > > > > > > the different memory pools are not
>> > > > clearly
>> > > > > > > > > separated
>> > > > > > > > > > > and
>> > > > > > > > > > > > > can
>> > > > > > > > > > > > > > > > spill
>> > > > > > > > > > > > > > > > > > over
>> > > > > > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > > a different pool, then it is quite
>> > hard
>> > > > to
>> > > > > > > > > understand
>> > > > > > > > > > > > what
>> > > > > > > > > > > > > > > > exactly
>> > > > > > > > > > > > > > > > > > > > causes a
>> > > > > > > > > > > > > > > > > > > > > process to get killed for using
>> too
>> > > much
>> > > > > > > memory.
>> > > > > > > > > This
>> > > > > > > > > > > > could
>> > > > > > > > > > > > > > > then
>> > > > > > > > > > > > > > > > > > easily
>> > > > > > > > > > > > > > > > > > > > > lead to a situation similar to what
>> we
>> > > have
>> > > > > with
>> > > > > > > the
>> > > > > > > > > > > > > > cutoff-ratio.
>> > > > > > > > > > > > > > > > So
>> > > > > > > > > > > > > > > > > > why
>> > > > > > > > > > > > > > > > > > > > not
>> > > > > > > > > > > > > > > > > > > > > set a sane default value for
>> max
>> > > > direct
>> > > > > > > > memory
>> > > > > > > > > > and
>> > > > > > > > > > > > > giving
>> > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > user
>> > > > > > > > > > > > > > > > > > > an
>> > > > > > > > > > > > > > > > > > > > > option to increase it if he runs
>> into
>> > > an
>> > > > > OOM?
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > @Xintong, how would alternative 2
>> > lead
>> > > to
>> > > > > > lower
>> > > > > > > > > > memory
>> > > > > > > > > > > > > > > > utilization
>> > > > > > > > > > > > > > > > > > than
>> > > > > > > > > > > > > > > > > > > > > alternative 3 where we set the
>> direct
>> > > > > memory
>> > > > > > > to a
>> > > > > > > > > > > higher
>> > > > > > > > > > > > > > value?
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > > > > > > Till
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > On Fri, Aug 9, 2019 at 9:12 AM
>> > Xintong
>> > > > > Song <
>> > > > > > > > > > > > > > > > tonysong...@gmail.com
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback, Yang.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Regarding your comments:
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > *Native and Direct Memory*
>> > > > > > > > > > > > > > > > > > > > > > I think setting a very large max
>> > > direct
>> > > > > > > memory
>> > > > > > > > > size
>> > > > > > > > > > > > > > > definitely
>> > > > > > > > > > > > > > > > > has
>> > > > > > > > > > > > > > > > > > > some
>> > > > > > > > > > > > > > > > > > > > > > good sides. E.g., we do not
>> worry
>> > > about
>> > > > > > > direct
>> > > > > > > > > OOM,
>> > > > > > > > > > > and
>> > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > don't
>> > > > > > > > > > > > > > > > > > even
>> > > > > > > > > > > > > > > > > > > > > need
>> > > > > > > > > > > > > > > > > > > > > > to allocate managed / network
>> > memory
>> > > > with
>> > > > > > > > > > > > > > Unsafe.allocate().
>> > > > > > > > > > > > > > > > > > > > > > However, there are also some
>> down
>> > > sides
>> > > > > of
>> > > > > > > > doing
>> > > > > > > > > > > this.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > >    - One thing I can think of is
>> > that
>> > > > if
>> > > > > a
>> > > > > > > task
>> > > > > > > > > > > > executor
>> > > > > > > > > > > > > > > > > container
>> > > > > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > > > > >    killed due to overusing
>> memory,
>> > it
>> > > > > could
>> > > > > > > be
>> > > > > > > > > hard
>> > > > > > > > > > > for
>> > > > > > > > > > > > > us
>> > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > know
>> > > > > > > > > > > > > > > > > > > > which
>> > > > > > > > > > > > > > > > > > > > > > part
>> > > > > > > > > > > > > > > > > > > > > >    of the memory is overused.
>> > > > > > > > > > > > > > > > > > > > > >    - Another down side is that
>> the
>> > > JVM
>> > > > > > never
>> > > > > > > > > > triggers
>> > > > > > > > > > > GC
>> > > > > > > > > > > > > due
>> > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > reaching
>> > > > > > > > > > > > > > > > > > > > > max
>> > > > > > > > > > > > > > > > > > > > > >    direct memory limit, because
>> the
>> > > > limit
>> > > > > > is
>> > > > > > > > too
>> > > > > > > > > > high
>> > > > > > > > > > > > to
>> > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > reached.
>> > > > > > > > > > > > > > > > > > > > That
>> > > > > > > > > > > > > > > > > > > > > >    means we kind of rely on
>> heap
>> > > > memory
>> > > > > to
>> > > > > > > > > trigger
>> > > > > > > > > > > GC
>> > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > release
>> > > > > > > > > > > > > > > > > > > > direct
>> > > > > > > > > > > > > > > > > > > > > >    memory. That could be a
>> problem
>> > in
>> > > > > cases
>> > > > > > > > where
>> > > > > > > > > > we
>> > > > > > > > > > > > have
>> > > > > > > > > > > > > > > more
>> > > > > > > > > > > > > > > > > > direct
>> > > > > > > > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > > > >    usage but not enough heap
>> > activity
>> > > > to
>> > > > > > > > trigger
>> > > > > > > > > > the
>> > > > > > > > > > > > GC.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Maybe you can share your reasons
>> > for
>> > > > > > > preferring
>> > > > > > > > > > > > setting a
>> > > > > > > > > > > > > > > very
>> > > > > > > > > > > > > > > > > > large
>> > > > > > > > > > > > > > > > > > > > > value,
>> > > > > > > > > > > > > > > > > > > > > > if there is anything else I
>> > > > overlooked.
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > *Memory Calculation*
>> > > > > > > > > > > > > > > > > > > > > > If there is any conflict between
>> > > > multiple
>> > > > > > > > > > > configuration
>> > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > user
>> > > > > > > > > > > > > > > > > > > > > > explicitly specified, I think we
>> > > should
>> > > > > > throw
>> > > > > > > > an
>> > > > > > > > > > > error.
>> > > > > > > > > > > > > > > > > > > > > > I think doing checking on the
>> > client
>> > > > side
>> > > > > > is
>> > > > > > > a
>> > > > > > > > > good
>> > > > > > > > > > > > idea,
>> > > > > > > > > > > > > > so
>> > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > > Yarn /
>> > > > > > > > > > > > > > > > > > > > > > K8s we can discover the problem
>> > > before
>> > > > > > > > submitting
>> > > > > > > > > > the
>> > > > > > > > > > > > > Flink
>> > > > > > > > > > > > > > > > > > cluster,
>> > > > > > > > > > > > > > > > > > > > > which
>> > > > > > > > > > > > > > > > > > > > > > is always a good thing.
>> > > > > > > > > > > > > > > > > > > > > > But we cannot rely only on the
>> > > client
>> > > > > side
>> > > > > > > > > > checking,
>> > > > > > > > > > > > > > because
>> > > > > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > > > > > > > standalone cluster TaskManagers
>> on
>> > > > > > different
>> > > > > > > > > > machines
>> > > > > > > > > > > > may
>> > > > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > > > > > > different
>> > > > > > > > > > > > > > > > > > > > > > configurations and the client
>> does not
>> > > see
>> > > > > > that.
>> > > > > > > > > > > > > > > > > > > > > > What do you think?
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > On Thu, Aug 8, 2019 at 5:09 PM
>> Yang
>> > > > Wang
>> > > > > <
>> > > > > > > > > > > > > > > > danrtsey...@gmail.com>
>> > > > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > Hi Xintong,
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > Thanks for your detailed
>> > proposal.
>> > > > > After
>> > > > > > > all
>> > > > > > > > > the
>> > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > configuration
>> > > > > > > > > > > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > > > > > > > > introduced, it will be more
>> > > powerful
>> > > > to
>> > > > > > > > control
>> > > > > > > > > > the
>> > > > > > > > > > > > > flink
>> > > > > > > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > > > > > > > usage. I
>> > > > > > > > > > > > > > > > > > > > > > > just have a few questions about
>> it.
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >    - Native and Direct Memory
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > We do not differentiate user
>> > direct
>> > > > > > memory
>> > > > > > > > and
>> > > > > > > > > > > native
>> > > > > > > > > > > > > > > memory.
>> > > > > > > > > > > > > > > > > > They
>> > > > > > > > > > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > > > > > > > all
>> > > > > > > > > > > > > > > > > > > > > > > included in task off-heap
>> memory.
>> > > > > Right?
>> > > > > > > So I
>> > > > > > > > > > don’t
>> > > > > > > > > > > > > think
>> > > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > > could
>> > > > > > > > > > > > > > > > > > > > > > set
>> > > > > > > > > > > > > > > > > > > > > > > the -XX:MaxDirectMemorySize
>> > > > properly. I
>> > > > > > > > prefer
>> > > > > > > > > > > > leaving
>> > > > > > > > > > > > > > it a
>> > > > > > > > > > > > > > > > > very
>> > > > > > > > > > > > > > > > > > > > large
>> > > > > > > > > > > > > > > > > > > > > > > value.
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >    - Memory Calculation
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > If the sum of the fine-grained
>> > > > > > > memory(network
>> > > > > > > > > > > memory,
>> > > > > > > > > > > > > > > managed
>> > > > > > > > > > > > > > > > > > > memory,
>> > > > > > > > > > > > > > > > > > > > > > etc.)
>> > > > > > > > > > > > > > > > > > > > > > > is larger than total process
>> > > memory,
>> > > > > how
>> > > > > > do
>> > > > > > > > we
>> > > > > > > > > > deal
>> > > > > > > > > > > > > with
>> > > > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > > > > > > > situation?
>> > > > > > > > > > > > > > > > > > > > > > Do
>> > > > > > > > > > > > > > > > > > > > > > > we need to check the memory
>> > > > > configuration
>> > > > > > > in
>> > > > > > > > > > > client?
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > Xintong Song <
>> > > tonysong...@gmail.com>
>> > > > > > > > > > wrote on Wed, Aug 7, 2019 at
>> > > > > > > > > > > > > > > 10:14 PM:
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > We would like to start a
>> > > discussion
>> > > > > > > thread
>> > > > > > > > on
>> > > > > > > > > > > > > "FLIP-49:
>> > > > > > > > > > > > > > > > > Unified
>> > > > > > > > > > > > > > > > > > > > > Memory
>> > > > > > > > > > > > > > > > > > > > > > > > Configuration for
>> > > > TaskExecutors"[1],
>> > > > > > > where
>> > > > > > > > we
>> > > > > > > > > > > > > describe
>> > > > > > > > > > > > > > > how
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > improve
>> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory
>> > > configurations.
>> > > > > The
>> > > > > > > > FLIP
>> > > > > > > > > > > > document
>> > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > mostly
>> > > > > > > > > > > > > > > > > > > > based
>> > > > > > > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > > > > an
>> > > > > > > > > > > > > > > > > > > > > > > > early design "Memory
>> Management
>> > > and
>> > > > > > > > > > Configuration
>> > > > > > > > > > > > > > > > > Reloaded"[2]
>> > > > > > > > > > > > > > > > > > by
>> > > > > > > > > > > > > > > > > > > > > > > Stephan,
>> > > > > > > > > > > > > > > > > > > > > > > > with updates from follow-up
>> > > > > discussions
>> > > > > > > > both
>> > > > > > > > > > > online
>> > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > offline.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > This FLIP addresses several
>> > > > > > shortcomings
>> > > > > > > of
>> > > > > > > > > > > current
>> > > > > > > > > > > > > > > (Flink
>> > > > > > > > > > > > > > > > > 1.9)
>> > > > > > > > > > > > > > > > > > > > > > > > TaskExecutor memory
>> > > configuration.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >    - Different configuration
>> > for
>> > > > > > > Streaming
>> > > > > > > > > and
>> > > > > > > > > > > > Batch.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Complex and difficult
>> > > > > > configuration
>> > > > > > > of
>> > > > > > > > > > > RocksDB
>> > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > > > Streaming.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Complicated, uncertain
>> and
>> > > > hard
>> > > > > to
>> > > > > > > > > > > understand.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Key changes to solve the
>> > problems
>> > > > can
>> > > > > > be
>> > > > > > > > > > > summarized
>> > > > > > > > > > > > > as
>> > > > > > > > > > > > > > > > > follows.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >    - Extend memory manager
>> to
>> > > also
>> > > > > > > account
>> > > > > > > > > for
>> > > > > > > > > > > > memory
>> > > > > > > > > > > > > > > usage
>> > > > > > > > > > > > > > > > > by
>> > > > > > > > > > > > > > > > > > > > state
>> > > > > > > > > > > > > > > > > > > > > > > >    backends.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Modify how TaskExecutor
>> > > memory
>> > > > > is
>> > > > > > > > > > > partitioned
>> > > > > > > > > > > > > > > > accounted
>> > > > > > > > > > > > > > > > > > > > > individual
>> > > > > > > > > > > > > > > > > > > > > > > >    memory reservations and
>> > pools.
>> > > > > > > > > > > > > > > > > > > > > > > >    - Simplify memory
>> > > configuration
>> > > > > > > options
>> > > > > > > > > and
>> > > > > > > > > > > > > > > calculation
>> > > > > > > > > > > > > > > > > > > logic.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Please find more details in
>> the
>> > > > FLIP
>> > > > > > wiki
>> > > > > > > > > > > document
>> > > > > > > > > > > > > [1].
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > (Please note that the early
>> > > design
>> > > > > doc
>> > > > > > > [2]
>> > > > > > > > is
>> > > > > > > > > > out
>> > > > > > > > > > > > of
>> > > > > > > > > > > > > > > sync,
>> > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > it
>> > > > > > > > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > > > > > > > appreciated to have the
>> > > discussion
>> > > > in
>> > > > > > > this
>> > > > > > > > > > > mailing
>> > > > > > > > > > > > > list
>> > > > > > > > > > > > > > > > > > thread.)
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Looking forward to your
>> > > feedback.
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Thank you~
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > [2]
>> https://docs.google.com/document/d/1o4KvyyXsQMGUastfPin3ZWeUXWsJgoL7piqp1fFYJvA/edit?usp=sharing
>
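
The "Memory Calculation" question in the quoted thread (what to do when the explicitly configured fine-grained sizes add up to more than the total process memory) could be illustrated by a small check like the one below. This is only a sketch in Python, not Flink code: the class and field names (MemorySpec, network_mb, managed_mb, task_off_heap_mb) are hypothetical and do not correspond to actual Flink option keys. It assumes the behaviour suggested in the thread, namely that a conflict between explicitly specified values should raise an error rather than be silently adjusted.

```python
from dataclasses import dataclass


class MemoryConfigError(ValueError):
    """Raised when explicitly configured memory sizes contradict each other."""


@dataclass(frozen=True)
class MemorySpec:
    # All sizes are in megabytes. Field names are illustrative assumptions,
    # not real Flink configuration keys.
    total_process_mb: int
    network_mb: int
    managed_mb: int
    task_off_heap_mb: int


def validate(spec: MemorySpec) -> int:
    """Return the memory left for the remaining pools, or raise on a conflict."""
    fine_grained = spec.network_mb + spec.managed_mb + spec.task_off_heap_mb
    if fine_grained > spec.total_process_mb:
        # Fail fast, as proposed in the thread, instead of shrinking a pool.
        raise MemoryConfigError(
            f"fine-grained components need {fine_grained} MB, "
            f"but total process memory is {spec.total_process_mb} MB"
        )
    return spec.total_process_mb - fine_grained
```

Because the check is a pure function of the configuration, the same routine could run on the client before submission and again when each TaskExecutor starts, which would cover the standalone case where machines may carry different configurations.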
