Re: Spark UI consuming lots of memory
Hi Nicholas,

I think you are right about the issue relating to SPARK-11126; I'm seeing it
as well. Did you find any workaround? Looking at the pull request for the
fix, it doesn't look possible.

Best regards,
Patrick

On 15 October 2015 at 19:40, Nicholas Pritchard <nicholas.pritch...@falkonry.com> wrote:
> Thanks for your help, most likely this is the memory leak you are fixing
> in https://issues.apache.org/jira/browse/SPARK-11126.
> -Nick
Re: Spark UI consuming lots of memory
Thanks for your help, most likely this is the memory leak you are fixing in
https://issues.apache.org/jira/browse/SPARK-11126.

-Nick

On Mon, Oct 12, 2015 at 9:00 PM, Shixiong Zhu <zsxw...@gmail.com> wrote:
> In addition, you cannot turn off JobListener and SQLListener now...
>
> Best Regards,
> Shixiong Zhu
Spark UI consuming lots of memory
Hi,

In my application, the Spark UI is consuming a lot of memory, especially the
SQL tab. I have set the following configurations to reduce the memory
consumption:
- spark.ui.retainedJobs=20
- spark.ui.retainedStages=40
- spark.sql.ui.retainedExecutions=0

However, I still get OOM errors in the driver process with the default 1GB
heap size. The following link is a screenshot of a heap dump report, showing
the SQLListener instance having a retained size of 600MB.

https://cloud.githubusercontent.com/assets/5124612/10404379/20fbdcfc-6e87-11e5-9415-27e25193a25c.png

Rather than just increasing the allotted heap size, does anyone have any
other ideas? Is it possible to disable the SQL tab specifically? I also
thought about serving the UI from disk rather than memory with
"spark.eventLog.enabled=true" and "spark.ui.enabled=false". Has anyone tried
this before?

Thanks,
Nick

--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-UI-consuming-lots-of-memory-tp25033.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Re: Spark UI consuming lots of memory
As an update, I did try disabling the UI with "spark.ui.enabled=false", but
the JobListener and SQLListener still consume a lot of memory, leading to an
OOM error. Has anyone encountered this before? Is the only solution just to
increase the driver heap size?

Thanks,
Nick

On Mon, Oct 12, 2015 at 8:42 PM, Nicholas Pritchard <nicholas.pritch...@falkonry.com> wrote:
> I set those configurations by passing them to the spark-submit script:
> "bin/spark-submit --conf spark.ui.retainedJobs=20 ...". I have verified
> that these configurations are being passed correctly because they are
> listed in the Environment tab and also by counting the number of
> jobs/stages that are listed. [...]
Re: Spark UI consuming lots of memory
I set those configurations by passing them to the spark-submit script:
"bin/spark-submit --conf spark.ui.retainedJobs=20 ...". I have verified that
these configurations are being passed correctly because they are listed in
the Environment tab and also by counting the number of jobs/stages that are
listed. The "spark.sql.ui.retainedExecutions=0" setting only applies to the
number of "completed" executions; there will always be a "running"
execution. For some reason, I have one execution that consumes an excessive
amount of memory.

Actually, I am not interested in the SQL UI, as I find the Jobs/Stages UI to
have sufficient information. I am also using the Spark Standalone cluster
manager, so I have not had to use the history server.

On Mon, Oct 12, 2015 at 8:17 PM, Shixiong Zhu <zsxw...@gmail.com> wrote:
> Could you show how you set the configurations? You need to set these
> configurations before creating the SparkContext and SQLContext.
>
> Moreover, the history server doesn't support the SQL UI, so
> "spark.eventLog.enabled=true" doesn't work now.
Re: Spark UI consuming lots of memory
In addition, you cannot turn off JobListener and SQLListener now...

Best Regards,
Shixiong Zhu

2015-10-13 11:59 GMT+08:00 Shixiong Zhu <zsxw...@gmail.com>:
> Is your query very complicated? Could you provide the output of `explain`
> for the query that consumes an excessive amount of memory? If this is a
> small query, there may be a bug that leaks memory in SQLListener.
>
> Best Regards,
> Shixiong Zhu
Re: Spark UI consuming lots of memory
Could you show how you set the configurations? You need to set these
configurations before creating the SparkContext and SQLContext.

Moreover, the history server doesn't support the SQL UI, so
"spark.eventLog.enabled=true" doesn't work now.

Best Regards,
Shixiong Zhu

2015-10-13 2:01 GMT+08:00 pnpritchard <nicholas.pritch...@falkonry.com>:
> Hi,
>
> In my application, the Spark UI is consuming a lot of memory, especially
> the SQL tab. I have set the following configurations to reduce the memory
> consumption:
> - spark.ui.retainedJobs=20
> - spark.ui.retainedStages=40
> - spark.sql.ui.retainedExecutions=0
>
> However, I still get OOM errors in the driver process with the default
> 1GB heap size. [...]
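The ordering point above can be sketched as follows, for anyone reading the
archive: a minimal example of setting the retention limits programmatically
before the contexts exist, assuming the Spark 1.5-era Scala API (the app
name and retention values are only illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// The listener retention limits must be in the SparkConf before the
// SparkContext is constructed; setting them on a live context has no
// effect, because the listeners read them at creation time.
val conf = new SparkConf()
  .setAppName("ui-memory-example") // hypothetical app name
  .set("spark.ui.retainedJobs", "20")
  .set("spark.ui.retainedStages", "40")
  .set("spark.sql.ui.retainedExecutions", "0")

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc) // inherits sc's configuration
```

Passing the same keys via `spark-submit --conf`, as Nicholas did, also
applies them before the SparkContext is created, so both routes are
equivalent for these settings.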
Re: Spark UI consuming lots of memory
Is your query very complicated? Could you provide the output of `explain`
for the query that consumes an excessive amount of memory? If this is a
small query, there may be a bug that leaks memory in SQLListener.

Best Regards,
Shixiong Zhu

2015-10-13 11:44 GMT+08:00 Nicholas Pritchard <nicholas.pritch...@falkonry.com>:
> As an update, I did try disabling the UI with "spark.ui.enabled=false",
> but the JobListener and SQLListener still consume a lot of memory,
> leading to an OOM error. Has anyone encountered this before? Is the only
> solution just to increase the driver heap size?
>
> Thanks,
> Nick