Hi Dillon, I do think there is a setting where, once YARN sets up the containers, they are not deallocated; I had used it previously in Hive, and it saves processing time on container allocation. That said, I am still trying to understand how we determine that one YARN container = one executor in Spark.
Regards,
Gourav

On Tue, Oct 9, 2018 at 9:04 PM Dillon Dukek <dillon.du...@placed.com.invalid> wrote:

> I'm still not sure exactly what you mean by saying that you have 6 YARN
> containers. YARN should just be aware of the total available resources in
> your cluster and then be able to launch containers based on the executor
> requirements you set when you submit your job. If you can, I think it
> would be helpful to send me the command you're using to launch your Spark
> process. You should also be able to use the logs and/or the Spark UI to
> determine how many executors are running.
>
> On Tue, Oct 9, 2018 at 12:57 PM Gourav Sengupta <gourav.sengu...@gmail.com> wrote:
>
>> hi,
>>
>> maybe I am not quite clear in my head on this one. But how do we know
>> that 1 YARN container = 1 executor?
>>
>> Regards,
>> Gourav Sengupta
>>
>> On Tue, Oct 9, 2018 at 8:53 PM Dillon Dukek <dillon.du...@placed.com.invalid> wrote:
>>
>>> Can you send how you are launching your streaming process? Also, what
>>> environment is this cluster running in (EMR, GCP, self-managed, etc.)?
>>>
>>> On Tue, Oct 9, 2018 at 10:21 AM kant kodali <kanth...@gmail.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am using Spark 2.3.1 with YARN as the cluster manager.
>>>>
>>>> I currently have:
>>>>
>>>> 1) 6 YARN containers (executors = 6) with 4 executor cores per
>>>> container.
>>>> 2) 6 Kafka partitions from one topic.
>>>> 3) You can assume every other configuration is set to its default
>>>> value.
>>>>
>>>> I spawned a simple streaming query and I see all the tasks get
>>>> scheduled on one YARN container. Am I missing any config?
>>>>
>>>> Thanks!
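For reference, a spark-submit invocation matching the setup described in the original question (6 executors, 4 cores each) might look like the sketch below. The application jar, main class, and memory setting are placeholders, not taken from the thread. In YARN cluster mode, each Spark executor runs inside its own YARN container, so this launch should produce 6 executor containers plus one container for the ApplicationMaster:

```shell
# Hypothetical launch command for the setup in the thread: 6 executors
# with 4 cores each. The jar name, class name, and --executor-memory value
# are placeholders. On YARN, one executor = one YARN container, so this
# requests 6 executor containers (plus one for the ApplicationMaster).
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 6 \
  --executor-cores 4 \
  --executor-memory 4g \
  --class com.example.StreamingApp \
  streaming-app.jar
```

Note that if dynamic allocation is on (spark.dynamicAllocation.enabled=true, which some distributions such as EMR enable by default), the executor count can change at runtime, so it is worth confirming the actual number of running executors in the Executors tab of the Spark UI, as Dillon suggests.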