Alright, thanks for the feedback. I also agree with it. Then this is resolved.
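
For the record, the agreed behaviour (recommend a minimum heap, warn below it, do not fail) could be sketched roughly as below. This is only an illustration in Python, not Flink code: the function names are hypothetical, the "heap is what remains of the total" model is a simplifying assumption, and 128 MB is just the example value mentioned in the thread.

```python
import logging

logger = logging.getLogger("jm_memory_sketch")

# Example value floated in the discussion, not a confirmed default.
RECOMMENDED_MIN_HEAP_MB = 128


def derive_heap_mb(total_process_mb, off_heap_mb, metaspace_mb, jvm_overhead_mb):
    """Assumed simplified model: JVM heap is what is left of the total
    process memory after the other components are subtracted."""
    heap_mb = total_process_mb - off_heap_mb - metaspace_mb - jvm_overhead_mb
    if heap_mb < 0:
        raise ValueError("configured components exceed total process memory")
    return heap_mb


def check_heap(heap_mb, recommended_min_mb=RECOMMENDED_MIN_HEAP_MB):
    """Log a warning (but do not fail) if the derived heap is below the
    recommended minimum. Returns True when the heap is large enough."""
    if heap_mb < recommended_min_mb:
        logger.warning(
            "Derived JVM heap (%d MB) is below the recommended minimum (%d MB)",
            heap_mb,
            recommended_min_mb,
        )
        return False
    return True
```

With these assumptions, a derived heap of zero is still accepted but triggers the warning. Escalating the warning into a failure later would only mean raising an error in check_heap instead of returning False.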
> On 19 Mar 2020, at 14:14, Till Rohrmann <trohrm...@apache.org> wrote:
>
> I agree with Xintong's proposal. If we see that many users run into this
> problem, then one could think about escalating the warning message into a
> failure.
>
> Cheers,
> Till
>
> On Thu, Mar 19, 2020 at 4:23 AM Xintong Song <tonysong...@gmail.com> wrote:
>
>> I think recommending a minimum value in the docs and logging a warning if
>> the heap size is too small should be good enough.
>> Not sure about failing the job if the min heap is not fulfilled. As already
>> mentioned, it would be hard to determine the min heap size. And if we make
>> the min heap configurable, then in any case where users need to configure
>> the min heap, they can configure the heap size directly.
>>
>> Thank you~
>>
>> Xintong Song
>>
>> On Wed, Mar 18, 2020 at 10:55 PM Andrey Zagrebin <azagre...@apache.org>
>> wrote:
>>
>>> Hi all,
>>>
>>> One more thing to mention: the current calculations can lead to an
>>> arbitrarily small JVM Heap, maybe even zero.
>>> I suggest introducing a check where we at least recommend setting the JVM
>>> heap to e.g. 128 MB.
>>>
>>> Additionally, we can demand some minimum value to function and fail if it
>>> is not fulfilled.
>>> We could experiment with what the working minimum is, but it is hard to
>>> come up with this limit because it again can depend on the job and
>>> environment.
>>>
>>> Best,
>>> Andrey
>>>
>>> On Wed, Mar 18, 2020 at 5:03 PM Andrey Zagrebin <azagre...@apache.org>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Thanks for the feedback, Xintong and Till.
>>>>
>>>>> rename jobmanager.memory.direct.size into jobmanager.memory.off-heap.size
>>>>
>>>> I am ok with that to align it with TM and avoid further complications for
>>>> users.
>>>> I will adjust the FLIP.
>>>>
>>>>> change the default value of JM Metaspace size to 256 MB
>>>>
>>>> Indeed, no reason to assume that the user code would need less Metaspace
>>>> in JM.
>>>> I will change it unless a better argument is reported for another value.
>>>>
>>>> I think all concerns have been resolved, so I am starting the voting in a
>>>> separate thread.
>>>>
>>>> Best,
>>>> Andrey
>>>>
>>>> On Tue, Mar 17, 2020 at 6:16 PM Till Rohrmann <trohrm...@apache.org>
>>>> wrote:
>>>>
>>>>> Thanks for creating this FLIP, Andrey.
>>>>>
>>>>> I agree with Xintong that we should rename jobmanager.memory.direct.size
>>>>> into jobmanager.memory.off-heap.size, which accounts for native and
>>>>> direct memory usage. I think it should be good enough and is easier to
>>>>> understand for the user.
>>>>>
>>>>> Concerning the default value for the metaspace size: did we take the
>>>>> lessons learned from the TM metaspace size into account? IIRC we are
>>>>> about to change the default value to 256 MB.
>>>>>
>>>>> Feel free to start a vote once these last two questions have been
>>>>> resolved.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Thu, Mar 12, 2020 at 4:25 AM Xintong Song <tonysong...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks Andrey for kicking this discussion off.
>>>>>>
>>>>>> Regarding "direct" vs. "off-heap", I'm personally in favor of renaming
>>>>>> the "direct" memory in the current FLIP-116 [1] to "off-heap" memory,
>>>>>> and making it also account for user native memory usage.
>>>>>>
>>>>>> On one hand, I think it would be good that JM & TM provide consistent
>>>>>> concepts and terminologies to users. IIUC, this is exactly the purpose
>>>>>> of this FLIP. For TMs, we already have "off-heap" memory accounting for
>>>>>> both direct and native memory usages, and we did this so that users do
>>>>>> not need to understand the differences between the two kinds.
>>>>>>
>>>>>> On the other hand, while for TMs it is hard to tell which kind of
>>>>>> memory is mostly needed, due to the variety of applications, I believe
>>>>>> for JM the major memory consumption is heap memory in most cases. That
>>>>>> means we probably can rely on the heap activities to trigger GC in most
>>>>>> cases, and the max direct memory limit can act as a safety net.
>>>>>> Moreover, I think the cases where we need native memory for user code
>>>>>> should be very rare. Therefore, we probably should not break the JM/TM
>>>>>> consistency for potential risks in such rare cases.
>>>>>>
>>>>>> WDYT?
>>>>>>
>>>>>> Thank you~
>>>>>>
>>>>>> Xintong Song
>>>>>>
>>>>>> [1]
>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP+116%3A+Unified+Memory+Configuration+for+Job+Managers
>>>>>>
>>>>>> On Wed, Mar 11, 2020 at 8:56 PM Andrey Zagrebin <azagre...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>
>>>>>>> As you may have noticed, the 1.10 release included extensive
>>>>>>> improvements to memory management and configuration of Task Managers,
>>>>>>> FLIP-49: [1]. The memory configuration of Job Managers has not been
>>>>>>> touched in 1.10.
>>>>>>>
>>>>>>> Although the Job Manager's memory model does not look as sophisticated
>>>>>>> as for Task Managers, it makes sense to align the Job Manager memory
>>>>>>> model and settings with Task Managers. Therefore, we propose to
>>>>>>> reconsider it as well in 1.11 and I prepared FLIP 116 [2] for that.
>>>>>>>
>>>>>>> Any feedback is appreciated.
>>>>>>>
>>>>>>> So far, there is one discussion point about how to address native
>>>>>>> non-direct memory usage of user code. The user code can be run e.g. in
>>>>>>> certain job submission scenarios within the JM process.
>>>>>>> For simplicity, the FLIP suggests only an option for direct memory,
>>>>>>> which is translated into the setting of the JVM direct memory limit.
>>>>>>> Although we documented for TM that the similar parameters can also
>>>>>>> address native non-direct memory usage [3], this can lead to wrong
>>>>>>> functioning of the JVM direct memory limit. The direct memory option in
>>>>>>> JM could also be named in a more general way, e.g. off-heap memory, but
>>>>>>> this naming would somewhat hide its nature as the JVM direct memory
>>>>>>> limit.
>>>>>>> On the other hand, JVM Overhead does not suffer from this problem and
>>>>>>> affects only the container/worker memory size, which is the most
>>>>>>> important matter to address for the native non-direct memory
>>>>>>> consumption. The caveat here is that JVM Overhead was not supposed to
>>>>>>> be used by any Flink or user components.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Andrey
>>>>>>>
>>>>>>> [1]
>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
>>>>>>> [2]
>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP+116%3A+Unified+Memory+Configuration+for+Job+Managers
>>>>>>> [3]
>>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/memory/mem_detail.html#overview