Hi Ufuk and Fabian,

Is it better to start 48 task managers (one slot each) on one machine than to have a single task manager with 48 slots? Are there any trade-offs we should know about?
Cheers

On Tue, Feb 23, 2016 at 3:03 PM, Welly Tambunan <if05...@gmail.com> wrote:

> Hi Ufuk,
>
> Thanks for the explanation.
>
> Yes, our jobs are all streaming jobs.
>
> Cheers
>
> On Tue, Feb 23, 2016 at 2:48 PM, Ufuk Celebi <u...@apache.org> wrote:
>
>> The new default is equivalent to the previous "streaming mode". The
>> community decided to get rid of this distinction because it was
>> confusing to users.
>>
>> The difference between "streaming mode" and "batch mode" was how
>> Flink's managed memory was allocated: either lazily when required
>> ("streaming mode") or eagerly on task manager start-up ("batch mode").
>> Now it is lazy by default.
>>
>> This is not something you need to worry about, but if you are mostly
>> using the DataSet API, where pre-allocation has benefits, you can get
>> the "batch mode" behaviour with the following configuration key:
>>
>> taskmanager.memory.preallocate: true
>>
>> But you are using the DataStream API anyway, right?
>>
>> – Ufuk
>>
>> On Tue, Feb 23, 2016 at 6:36 AM, Welly Tambunan <if05...@gmail.com> wrote:
>> > Hi Fabian,
>> >
>> > Previously, when using Flink 0.9/0.10, we started the cluster in
>> > streaming mode or batch mode. I see that this is gone in the Flink 1.0
>> > snapshot? So is this now taken care of and optimized by the Flink
>> > runtime?
>> >
>> > On Mon, Feb 22, 2016 at 5:26 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>> >>
>> >> Hi Welly,
>> >>
>> >> sorry for the late response.
>> >>
>> >> The number of network buffers primarily depends on the maximum
>> >> parallelism of your job.
>> >> The given formula assumes a specific cluster configuration (one task
>> >> manager per machine, one parallel task per CPU).
>> >> The formula can be translated to:
>> >>
>> >> taskmanager.network.numberOfBuffers: p ^ 2 * t * 4
>> >>
>> >> where p is the maximum parallelism of the job and t is the number of
>> >> task managers.
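[Editor's note] To make the formula concrete, here is a back-of-the-envelope sketch in plain Java (not Flink API; the class name and all values are illustrative) applying p ^ 2 * t * 4 to the cluster described later in this thread, assuming one task manager per machine, one slot per core, and Flink's documented default network buffer segment size of 32 KB:

```java
// Back-of-the-envelope check of the p^2 * t * 4 formula for a cluster of
// 16 machines with 48 cores each, assuming one task manager per machine,
// one slot per core, and the default 32 KB buffer segment size.
public class NetworkBufferEstimate {
    public static void main(String[] args) {
        int p = 48;                      // parallel tasks per machine (one per core)
        int t = 16;                      // number of task managers (one per machine)
        long segmentBytes = 32 * 1024;   // default network buffer segment size (32 KB)

        long numberOfBuffers = (long) p * p * t * 4;
        long bufferMemoryMb = numberOfBuffers * segmentBytes / (1024 * 1024);

        System.out.println("taskmanager.network.numberOfBuffers: " + numberOfBuffers);
        System.out.println("network buffer memory (MB): " + bufferMemoryMb);
        // 147456 buffers -> 4608 MB, which has to fit inside the task manager's
        // heap; with only 4096 MB of heap this is exactly the kind of setting
        // that triggers the "Cannot allocate network buffer pool" error seen
        // later in the thread.
    }
}
```

This is why the two errors in the thread pull in opposite directions: a small buffer count fails the job at runtime, while a large one fails the task manager at start-up unless taskmanager.heap.mb grows with it.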
>> >> You can process more than one parallel task per TM if you configure
>> >> more than one processing slot per machine
>> >> (taskmanager.numberOfTaskSlots). The TM will divide its memory among
>> >> all of its slots. So it would be possible to start one TM per machine
>> >> with 100GB+ memory and 48 slots each.
>> >>
>> >> We can compute the number of network buffers if you give a few more
>> >> details about your setup:
>> >> - How many task managers do you start? I assume more than one TM per
>> >> machine, given that you assign only 4GB of memory out of 128GB to
>> >> each TM.
>> >> - What is the maximum parallelism of your program?
>> >> - How many processing slots do you configure for each TM?
>> >>
>> >> In general, pipelined shuffles with a high parallelism require a lot
>> >> of memory.
>> >> If you configure batch instead of pipelined transfer, the memory
>> >> requirement goes down
>> >> (ExecutionConfig.setExecutionMode(ExecutionMode.BATCH)).
>> >>
>> >> Eventually, we want to merge the network buffer and managed memory
>> >> pools, so the "taskmanager.network.numberOfBuffers" configuration
>> >> will hopefully disappear at some point in the future.
>> >>
>> >> Best, Fabian
>> >>
>> >> 2016-02-19 9:34 GMT+01:00 Welly Tambunan <if05...@gmail.com>:
>> >>>
>> >>> Hi All,
>> >>>
>> >>> We are trying to run our job in a cluster with the following specs:
>> >>>
>> >>> 1. # of machines: 16
>> >>> 2. memory: 128 GB
>> >>> 3. # of cores: 48
>> >>>
>> >>> However, when we try to run it we get an exception:
>> >>>
>> >>> "insufficient number of network buffers. 48 required but only 10
>> >>> available.
>> >>> the total number of network buffers is currently set to 2048"
>> >>>
>> >>> After looking at the documentation, we set the configuration based
>> >>> on the docs:
>> >>>
>> >>> taskmanager.network.numberOfBuffers: # cores ^ 2 * # machines * 4
>> >>>
>> >>> However, we then faced another error, this time from the JVM:
>> >>>
>> >>> java.io.IOException: Cannot allocate network buffer pool: Could not
>> >>> allocate enough memory segments for NetworkBufferPool (required (Mb):
>> >>> 2304, allocated (Mb): 698, missing (Mb): 1606). Cause: Java heap space
>> >>>
>> >>> We fiddled with taskmanager.heap.mb: 4096, and finally the cluster
>> >>> is running.
>> >>>
>> >>> However, I'm still not sure the configuration and the fiddling with
>> >>> the task manager heap are really well tuned. So my questions are:
>> >>>
>> >>> Am I doing it right for numberOfBuffers?
>> >>> How much should we allocate for taskmanager.heap.mb, given the
>> >>> information above?
>> >>> Any suggestions on which configuration we need to set to make it
>> >>> optimal for the cluster?
>> >>> Is there any chance that this will get automatically resolved by the
>> >>> memory/network buffer manager?
>> >>>
>> >>> Thanks a lot for the help.
>> >>>
>> >>> Cheers
>> >>>
>> >>> --
>> >>> Welly Tambunan
>> >>> Triplelands
>> >>>
>> >>> http://weltam.wordpress.com
>> >>> http://www.triplelands.com

--
Welly Tambunan
Triplelands

http://weltam.wordpress.com
http://www.triplelands.com
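[Editor's note] Fabian's point that a task manager divides its memory among its slots can be sketched numerically. This is a rough illustration only (the class is hypothetical; 0.7 is Flink's documented default for taskmanager.memory.fraction, the share of heap used as managed memory, and real numbers depend on the actual configuration and JVM overhead):

```java
// Rough sketch of how one large task manager's memory is split across slots,
// for the "one TM per machine with 100GB+ memory and 48 slots" option.
// Assumes Flink's default managed-memory fraction of 0.7; illustrative only.
public class SlotMemoryEstimate {
    public static void main(String[] args) {
        long tmHeapMb = 100 * 1024;    // one big TM per machine with ~100 GB heap
        int slots = 48;                // taskmanager.numberOfTaskSlots: one per core
        double managedFraction = 0.7;  // default taskmanager.memory.fraction

        long managedMb = (long) (tmHeapMb * managedFraction);
        long perSlotMb = managedMb / slots;

        System.out.println("managed memory per slot (MB): " + perSlotMb);
        // roughly 1.5 GB of managed memory per slot under these assumptions
    }
}
```

By contrast, 48 one-slot task managers on the same machine would each get a 48th of the memory up front, trading flexible sharing within one JVM for isolation between many small JVMs.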
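[Editor's note] Pulling together the configuration keys mentioned in this thread, a flink-conf.yaml fragment for this kind of setup might look like the following. All values are illustrative placeholders, not tuned recommendations; the one hard constraint discussed above is that numberOfBuffers * 32 KB has to fit inside taskmanager.heap.mb:

```yaml
# flink-conf.yaml -- illustrative values only, not a tuned recommendation
taskmanager.heap.mb: 6144                    # TM heap; the network buffer pool is allocated from it
taskmanager.numberOfTaskSlots: 48            # one slot per core when running one TM per machine
taskmanager.network.numberOfBuffers: 73728   # 73728 * 32 KB = 2304 MB must fit in the heap
taskmanager.memory.preallocate: false        # lazy allocation (the new default, former "streaming mode")
```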