Hi Ufuk,

Thanks for the explanation.
Yes, our jobs are all streaming jobs.

Cheers

On Tue, Feb 23, 2016 at 2:48 PM, Ufuk Celebi <u...@apache.org> wrote:
> The new default is equivalent to the previous "streaming mode". The
> community decided to get rid of this distinction because it was
> confusing to users.
>
> The difference between "streaming mode" and "batch mode" was how
> Flink's managed memory was allocated: either lazily when required
> ("streaming mode") or eagerly on task manager start-up ("batch mode").
> Now it's lazy by default.
>
> This is not something you need to worry about, but if you are mostly
> using the DataSet API, where pre-allocation has benefits, you can get
> the "batch mode" behaviour with the following configuration key:
>
> taskmanager.memory.preallocate: true
>
> But you are using the DataStream API anyway, right?
>
> – Ufuk
>
>
> On Tue, Feb 23, 2016 at 6:36 AM, Welly Tambunan <if05...@gmail.com> wrote:
> > Hi Fabian,
> >
> > Previously, when using Flink 0.9-0.10, we started the cluster in
> > streaming mode or batch mode. I see that this distinction is gone in
> > the Flink 1.0 snapshot? So this is now taken care of and optimized by
> > the Flink runtime?
> >
> > On Mon, Feb 22, 2016 at 5:26 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>
> >> Hi Welly,
> >>
> >> sorry for the late response.
> >>
> >> The number of network buffers primarily depends on the maximum
> >> parallelism of your job.
> >> The given formula assumes a specific cluster configuration (1 task
> >> manager per machine, one parallel task per CPU).
> >> The formula can be translated to:
> >>
> >> taskmanager.network.numberOfBuffers: p ^ 2 * t * 4
> >>
> >> where p is the maximum parallelism of the job and t is the number of
> >> task managers.
> >> You can process more than one parallel task per TM if you configure
> >> more than one processing slot per machine (taskmanager.numberOfTaskSlots).
> >> The TM will divide its memory among all its slots.
> >> So it would be possible to start one TM for each machine, with 100GB+
> >> memory and 48 slots each.
> >>
> >> We can compute the number of network buffers if you give a few more
> >> details about your setup:
> >> - How many task managers do you start? I assume more than one TM per
> >>   machine, given that you assign only 4GB of memory out of 128GB to
> >>   each TM.
> >> - What is the maximum parallelism of your program?
> >> - How many processing slots do you configure for each TM?
> >>
> >> In general, pipelined shuffles with a high parallelism require a lot
> >> of memory.
> >> If you configure batch instead of pipelined transfer, the memory
> >> requirement goes down
> >> (ExecutionConfig.setExecutionMode(ExecutionMode.BATCH)).
> >>
> >> Eventually, we want to merge the network buffer and the managed
> >> memory pools, so the "taskmanager.network.numberOfBuffers"
> >> configuration will hopefully disappear at some point in the future.
> >>
> >> Best, Fabian
> >>
> >> 2016-02-19 9:34 GMT+01:00 Welly Tambunan <if05...@gmail.com>:
> >>>
> >>> Hi All,
> >>>
> >>> We are trying to run our job on a cluster with the following specs:
> >>>
> >>> 1. # of machines: 16
> >>> 2. memory: 128 GB
> >>> 3. # of cores: 48
> >>>
> >>> However, when we try to run it we get this exception:
> >>>
> >>> "insufficient number of network buffers. 48 required but only 10
> >>> available. the total number of network buffers is currently set to 2048"
> >>>
> >>> After looking at the documentation, we set the configuration based
> >>> on the docs:
> >>>
> >>> taskmanager.network.numberOfBuffers: # cores ^ 2 * # machines * 4
> >>>
> >>> However, we then hit another error from the JVM:
> >>>
> >>> java.io.IOException: Cannot allocate network buffer pool: Could not
> >>> allocate enough memory segments for NetworkBufferPool (required (Mb):
> >>> 2304, allocated (Mb): 698, missing (Mb): 1606). Cause: Java heap space
> >>>
> >>> We fiddled with taskmanager.heap.mb: 4096.
> >>>
> >>> Finally, the cluster is running.
> >>>
> >>> However, I'm still not sure the numberOfBuffers configuration and
> >>> the task manager heap fiddling are really fine-tuned. So my
> >>> questions are:
> >>>
> >>> Am I doing it right for numberOfBuffers?
> >>> How much should we allocate for taskmanager.heap.mb given the
> >>> information above?
> >>> Any suggestion on which configuration we need to set to make it
> >>> optimal for the cluster?
> >>> Is there any chance that this will get automatically resolved by
> >>> the memory/network buffer manager?
> >>>
> >>> Thanks a lot for the help.
> >>>
> >>> Cheers
> >>>
> >>> --
> >>> Welly Tambunan
> >>> Triplelands
> >>>
> >>> http://weltam.wordpress.com
> >>> http://www.triplelands.com

--
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com
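For anyone who lands on this thread later, Fabian's rule of thumb can be sketched numerically. The 32 KiB buffer size below is an assumption (Flink's default memory segment size at the time); verify the segment size for your Flink version before relying on the memory figures.

```python
# A quick check of the sizing formula from the thread:
#   taskmanager.network.numberOfBuffers = p^2 * t * 4
# where p is the job's maximum parallelism and t is the number of
# task managers. The 32 KiB buffer size is an assumption, not taken
# from this thread.

BUFFER_SIZE_BYTES = 32 * 1024  # assumed default network buffer size


def number_of_buffers(max_parallelism: int, num_task_managers: int) -> int:
    """Rule-of-thumb buffer count: p^2 * t * 4."""
    return max_parallelism ** 2 * num_task_managers * 4


def buffer_pool_mb(num_buffers: int) -> int:
    """Heap memory (MiB) the network buffer pool would need."""
    return num_buffers * BUFFER_SIZE_BYTES // (1024 * 1024)


# The cluster from the thread: 16 machines with 48 cores each.
# Taking p = 48 (one TM per machine, one slot per core) and t = 16:
buffers = number_of_buffers(48, 16)
print(buffers)                  # 147456 buffers
print(buffer_pool_mb(buffers))  # 4608 MiB
```

This is only a sketch of the arithmetic, not Flink API code, but it shows why the buffer pool alone can need several GiB of heap: taskmanager.heap.mb has to be large enough to hold the whole pool, which is why the "Cannot allocate network buffer pool" error goes away once the heap is raised.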