Hi, this is the output on the host where the driver is running:

~$ jps
9895 DriverWrapper
24057 Jps
3531 Worker
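As a rough sketch of why this matters: in standalone cluster mode the driver itself runs on a worker (the DriverWrapper above shares a host with a Worker), so an over-sized driver can leave no room for an executor. All the numbers below are hypothetical, not taken from this cluster:

```shell
# Hypothetical numbers: the driver's memory comes out of the same worker
# pool that executors need.
WORKER_MEM_GB=8      # memory the worker advertises
DRIVER_MEM_GB=6      # memory granted to the driver placed on that worker
EXECUTOR_MEM_GB=4    # memory each executor asks for

FREE_GB=$((WORKER_MEM_GB - DRIVER_MEM_GB))
if [ "$FREE_GB" -lt "$EXECUTOR_MEM_GB" ]; then
  echo "no room for a ${EXECUTOR_MEM_GB}g executor: only ${FREE_GB}g left on the worker"
fi
# → no room for a 4g executor: only 2g left on the worker
```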
In that host, I had given the driver too much memory, so no executor could be placed on that worker.

2016-08-01 18:06 GMT-03:00 Mich Talebzadeh <mich.talebza...@gmail.com>:

> OK
>
> Can you run jps on the hostname where the driver program is running and
> send the output please?
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On 1 August 2016 at 22:03, Maximiliano Patricio Méndez <mmen...@despegar.com> wrote:
>
>> Hi, thanks again for the answer.
>>
>> Looking a little closer into that, I found that the DriverWrapper
>> process was not running on the hostname the log reported. It is running,
>> but on another host. A mystery.
>>
>> If I manually go to the host that has the DriverWrapper running on it,
>> on port 4040 I can see the Spark UI without problems, but if I go through
>> master > applicationUI it tries to send me to the wrong host (the one the
>> driver reports in its log).
>>
>> The hostname the driver reports is the same one from which I send the
>> submit request.
>>
>> 2016-08-01 17:27 GMT-03:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>>
>>> Fine.
>>>
>>> In that case, which process is your driver program (from the jps output)?
>>>
>>> Thanks
>>>
>>> Dr Mich Talebzadeh
>>>
>>> On 1 August 2016 at 21:23, Maximiliano Patricio Méndez <mmen...@despegar.com> wrote:
>>>
>>>> Hi Mich, thanks for replying.
>>>>
>>>> I'm deploying with --deploy-mode cluster, from the same instance where
>>>> I showed the logs and commands.
>>>>
>>>> The SparkSubmit process only appears while the bin/spark-submit binary
>>>> is active. When the application starts and the driver takes control, the
>>>> SparkSubmit process dies.
>>>>
>>>> 2016-08-01 16:07 GMT-03:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>>>>
>>>>> OK, I can see the Worker (19286 Worker) and the executor (6548
>>>>> CoarseGrainedExecutorBackend) running on it.
>>>>>
>>>>> Where is spark-submit? Did you submit your job from another node, or
>>>>> use another method to run it?
>>>>>
>>>>> HTH
>>>>>
>>>>> Dr Mich Talebzadeh
>>>>>
>>>>> On 1 August 2016 at 19:08, Maximiliano Patricio Méndez <mmen...@despegar.com> wrote:
>>>>>
>>>>>> I just tried again; port 4040 is not in use. And even if it were, I
>>>>>> think the log would reflect that by trying the following port (4041),
>>>>>> as you mentioned.
>>>>>>
>>>>>> This is what the driver log says:
>>>>>>
>>>>>> 16/08/01 13:55:56 INFO Utils: Successfully started service 'SparkUI' on port 4040.
>>>>>> 16/08/01 13:55:56 INFO SparkUI: Started SparkUI at http://hostname:4040
>>>>>>
>>>>>> If I go to {hostname}:
>>>>>>
>>>>>> ~$ jps
>>>>>> 6548 CoarseGrainedExecutorBackend
>>>>>> 19286 Worker
>>>>>> 6843 Jps
>>>>>> 19182 Master
>>>>>>
>>>>>> ~$ netstat -nltp
>>>>>> Active Internet connections (only servers)
>>>>>> Proto Recv-Q Send-Q Local Address          Foreign Address  State   PID/Program name
>>>>>> tcp6  0      0      192.168.22.245:43037   :::*             LISTEN  6548/java
>>>>>> tcp6  0      0      192.168.22.245:56929   :::*             LISTEN  19286/java
>>>>>> tcp6  0      0      192.168.22.245:7077    :::*             LISTEN  19182/java
>>>>>> tcp6  0      0      :::33296               :::*             LISTEN  6548/java
>>>>>> tcp6  0      0      :::8080                :::*             LISTEN  19182/java
>>>>>> tcp6  0      0      :::8081                :::*             LISTEN  19286/java
>>>>>> tcp6  0      0      192.168.22.245:6066    :::*             LISTEN  19182/java
>>>>>>
>>>>>> ~$ netstat -nltap | grep 4040
>>>>>>
>>>>>> I'm really lost here and don't know much about Spark yet, but
>>>>>> shouldn't there be a DriverWrapper process holding the bind on port
>>>>>> 4040?
>>>>>>
>>>>>> 2016-08-01 13:49 GMT-03:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>>>>>>
>>>>>>> Can you check whether port 4040 is actually in use?
>>>>>>> If it were in use, the next available one would be 4041. For
>>>>>>> example, below Zeppelin is using it:
>>>>>>>
>>>>>>> netstat -plten | grep 4040
>>>>>>> tcp  0  0  :::4040  :::*  LISTEN  1005  73372882  10699/java
>>>>>>> ps aux | grep 10699
>>>>>>> hduser 10699 0.1 3.8 3172308 952932 pts/3 SNl Jul30 5:57 /usr/java/latest/bin/java -cp /data6/hduser/zeppelin-0.6.0/ ...
>>>>>>>
>>>>>>> HTH
>>>>>>>
>>>>>>> Dr Mich Talebzadeh
>>>>>>>
>>>>>>> On 1 August 2016 at 17:44, Maximiliano Patricio Méndez <mmen...@despegar.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Thanks for the answers.
>>>>>>>>
>>>>>>>> @Jacek: To verify whether the UI is up, I log in to every worker
>>>>>>>> node of my cluster and run netstat -nltp | grep 4040, with no
>>>>>>>> result. The driver log tells me on which server and port the Spark
>>>>>>>> UI should be up, but it isn't.
>>>>>>>>
>>>>>>>> @Mich: I've tried specifying spark.ui.port=nnn, but I only manage
>>>>>>>> to change the log, which then reports that the driver should be on
>>>>>>>> another port.
>>>>>>>>
>>>>>>>> The UI has no problem starting on that port (4040) when I run my
>>>>>>>> application in client mode.
>>>>>>>>
>>>>>>>> Could there be a network issue making the UI fail silently?
>>>>>>>> I've read some of the code behind those parts of the driver log,
>>>>>>>> but couldn't find anything weird.
>>>>>>>>
>>>>>>>> 2016-07-29 19:45 GMT-03:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>>>>>>>>
>>>>>>>>> Why chance it? Best to explicitly specify in spark-submit (or
>>>>>>>>> whatever) which port to listen on:
>>>>>>>>>
>>>>>>>>> --conf "spark.ui.port=nnn"
>>>>>>>>>
>>>>>>>>> and see if it works.
>>>>>>>>>
>>>>>>>>> HTH
>>>>>>>>>
>>>>>>>>> Dr Mich Talebzadeh
>>>>>>>>>
>>>>>>>>> On 29 July 2016 at 23:37, Jacek Laskowski <ja...@japila.pl> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm curious about "For some reason, sometimes the SparkUI does not
>>>>>>>>>> appear to be bound on port 4040 (or any other) but the application
>>>>>>>>>> runs perfectly and finishes giving the expected answer." How do
>>>>>>>>>> you check that the web UI listens on port 4040?
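One direct way to answer that question is to probe the port from the connecting side, as a cross-check on netstat. A hedged sketch, assuming bash (the /dev/tcp redirection is a bash feature, not plain sh), where "driver-host" is a placeholder for whatever host the driver log reported:

```shell
# Returns success only if something accepts a TCP connection on $1:$2.
# Relies on bash's /dev/tcp pseudo-device; plain sh does not support it.
ui_listening() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# "driver-host" is a hypothetical placeholder for the host in the driver log.
if ui_listening driver-host 4040; then
  echo "something is bound on port 4040"
else
  echo "nothing listening on port 4040"
fi
```

Note that a firewall silently dropping packets would make the connection attempt hang rather than fail fast, which is worth keeping in mind for the "network issue failing silently" theory.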
>>>>>>>>>>
>>>>>>>>>> Pozdrawiam,
>>>>>>>>>> Jacek Laskowski
>>>>>>>>>> ----
>>>>>>>>>> https://medium.com/@jaceklaskowski/
>>>>>>>>>> Mastering Apache Spark http://bit.ly/mastering-apache-spark
>>>>>>>>>> Follow me at https://twitter.com/jaceklaskowski
>>>>>>>>>>
>>>>>>>>>> On Thu, Jul 28, 2016 at 11:37 PM, Maximiliano Patricio Méndez
>>>>>>>>>> <mmen...@despegar.com> wrote:
>>>>>>>>>> > Hi,
>>>>>>>>>> >
>>>>>>>>>> > I'm having some trouble trying to submit an application to my Spark
>>>>>>>>>> > cluster. For some reason, sometimes the SparkUI does not appear to
>>>>>>>>>> > be bound on port 4040 (or any other), but the application runs
>>>>>>>>>> > perfectly and finishes giving the expected answer.
>>>>>>>>>> >
>>>>>>>>>> > I don't know why, but if I restart all the workers at once,
>>>>>>>>>> > sometimes it begins to work and sometimes it doesn't.
>>>>>>>>>> >
>>>>>>>>>> > In the driver logs, when it fails to start the SparkUI, I see these
>>>>>>>>>> > lines:
>>>>>>>>>> >
>>>>>>>>>> > 16/07/28 16:13:37 INFO Utils: Successfully started service 'SparkUI' on port 4040.
>>>>>>>>>> > 16/07/28 16:13:37 INFO SparkUI: Started SparkUI at http://hostname-00:4040
>>>>>>>>>> >
>>>>>>>>>> > but nothing is running on those ports.
>>>>>>>>>> >
>>>>>>>>>> > I'm attaching the full driver log, in which I've set jetty logging
>>>>>>>>>> > to DEBUG, but couldn't find anything.
>>>>>>>>>> >
>>>>>>>>>> > The only properties I'm not leaving at their defaults are
>>>>>>>>>> > SPARK_PUBLIC_DNS=$(hostname), SPARK_WORKER_CORES and
>>>>>>>>>> > SPARK_WORKER_MEMORY.
>>>>>>>>>> >
>>>>>>>>>> > Has anyone faced something similar?
>>>>>>>>>> >
>>>>>>>>>> > ---------------------------------------------------------------------
>>>>>>>>>> > To unsubscribe e-mail: user-unsubscr...@spark.apache.org
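Pulling the thread's suggestions together, a sketch of what the submit command might look like: pin spark.ui.port as suggested above, and keep --driver-memory small enough that an executor still fits on the worker hosting the driver. The master URL, class name, jar path and memory sizes here are all hypothetical, not taken from the thread:

```shell
bin/spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --driver-memory 1g \
  --executor-memory 4g \
  --conf "spark.ui.port=4045" \
  --class com.example.MyApp \
  /path/to/app.jar
```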