Hi,

Will downloading 0.7.0 and running the R tutorial notebook repeatedly reproduce the problem? Otherwise, can someone clarify the instructions to reproduce it?

Thanks,
moon

On Sat, Feb 18, 2017 at 5:45 AM xyun...@simuwell.com <xyun...@simuwell.com> wrote:

> Within the Scala REPL everything works fine: even if your application session is down, running the same code again creates a new job. But Zeppelin has this problem. As a test, run a notebook and let the job finish at the end of the notebook, then re-run the same notebook; it will get stuck. Running Spark code in the Scala REPL and in Zeppelin are different.
>
> From: Paul Brenner <pbren...@placeiq.com>
> Date: 2017-02-17 12:37
> To: users <users@zeppelin.apache.org>
> Subject: Re: Re: Zeppelin unable to respond after some time
>
> I don't believe that this explains my issue. Running the Scala REPL also keeps a session alive for as long as the REPL is running. I've had REPLs open for days (shhhhh, don't tell anyone) that have correspondingly kept sessions alive for the same period of time with no problem. I only see this issue in Zeppelin.
>
> We run Zeppelin on a server and allow multiple users to connect, each with their own interpreters. We also find that Zeppelin memory usage on the server will steadily creep up over time. Executing sys.exit in a Spark paragraph, restarting the interpreter, and using yarn application -kill will often cause Zeppelin to end the related interpreter process, but not always. So over time we find that many zombie processes pile up and eat up resources.
>
> The only way to keep on top of this is to regularly log in to the Zeppelin server and kill zombie jobs. Here is a command that I've found helpful.
> When you know that a specific user has no active Zeppelin interpreters running, execute the following:
>
>     ps aux | grep zeppelin | grep "2BSGYY7S8" | grep java | awk '{print $2}' | xargs sudo -u yarn kill -9
>
> where "2BSGYY7S8" is the interpreter id (found in interpreter.json) and "yarn" is the name of the user that originally started Zeppelin with:
>
>     zeppelin-daemon.sh start
>
> To kill every interpreter except a specific user's, flip it around with:
>
>     ps aux | grep zeppelin | grep -v "2BSGYY7S8" | grep -v "zeppelin.server.ZeppelinServer" | grep java | awk '{print $2}' | xargs sudo -u yarn kill -9
>
> If I do this every few days, Zeppelin keeps humming along pretty smoothly most of the time.
>
> Paul Brenner
> DATA SCIENTIST, PlaceIQ
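The pipeline above can be wrapped in a small helper so the grep/awk filtering can be checked before anything is killed. A sketch, assuming ps output is piped on stdin; the function name and the sample process lines are made up for illustration, and "2BSGYY7S8" is the interpreter id from the mail above:

```shell
#!/bin/sh
# Print PIDs of Zeppelin interpreter processes matching an interpreter id.
# Reads ps output on stdin, so the filter can be dry-run without killing anything.
zeppelin_interp_pids() {
  grep zeppelin | grep "$1" | grep java | awk '{print $2}'
}

# Dry run against two fabricated ps lines; only the first matches the id,
# the second is the main Zeppelin server and must survive.
printf '%s\n' \
  'zeppelin  4242  java ... 2BSGYY7S8 RemoteInterpreterServer' \
  'zeppelin  4243  java ... zeppelin.server.ZeppelinServer' \
  | zeppelin_interp_pids 2BSGYY7S8
# prints: 4242
```

Once the dry run prints only the PIDs you expect, the destructive step is the same as in the mail: `ps aux | zeppelin_interp_pids 2BSGYY7S8 | xargs sudo -u yarn kill -9`.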
> On Fri, Feb 17, 2017 at 3:23 PM "xyun...@simuwell.com" <xyun...@simuwell.com> wrote:
>
> The problem could be not only the resources but the session. If you run a chunk of Spark code, you should see a running application in the Spark UI; but if your code shuts the session down after the job is finished, the Spark UI will show the job as finished. Within Zeppelin, each notebook starts the Spark session only once (different interpreter modes can be set depending on whether you want notebooks to share the session or not); if you close it, it will never be restarted. The only way to get the same code to work again is to restart the interpreter or restart Zeppelin. I'm not sure if I explained this clearly, but I hope it helps.
>
> From: Paul Brenner <pbren...@placeiq.com>
> Date: 2017-02-17 12:14
> To: users <users@zeppelin.apache.org>
> Subject: Re: Re: Zeppelin unable to respond after some time
>
> I've definitely had this problem with jobs that don't take all the resources on the cluster.
> Also, my experience matches what others have reported: just restarting Zeppelin and re-running the stuck paragraph solves the issue.
>
> I've also experienced this problem with for loops. Some for loops which write to disk, but absolutely don't have any variables that are increasing in size, will hang in Zeppelin. If I run the exact same code in the Scala REPL it goes through without problem.
> On Fri, Feb 17, 2017 at 2:12 PM "xyun...@simuwell.com" <xyun...@simuwell.com> wrote:
>
> I have solved a similar issue before. Check the Spark UI and you will probably see that a single job is taking all the resources, so further jobs submitted to the same cluster just hang. When you restart Zeppelin, the old job is killed and all the resources it took are released.
>
> xyun...@simuwell.com
>
> From: RUSHIKESH RAUT <rushikeshraut...@gmail.com>
> Date: 2017-02-17 02:29
> To: users <users@zeppelin.apache.org>
> Subject: Re: Zeppelin unable to respond after some time
>
> Yes, it happens with R and Spark code frequently.
>
> On Feb 17, 2017 3:25 PM, "小野圭二" <onoke...@gmail.com> wrote:
>
> Yes, almost every time. There are no special operations; I just run the tutorial demos. From my experience, it happens in the R demo frequently.
>
> 2017-02-17 18:50 GMT+09:00 Jeff Zhang <zjf...@gmail.com>:
>
> Is it easy to reproduce it?
>
> 小野圭二 <onoke...@gmail.com> wrote on Fri, Feb 17, 2017 at 5:47 PM:
>
> I am facing the same issue now.
>
> 2017-02-17 18:25 GMT+09:00 RUSHIKESH RAUT <rushikeshraut...@gmail.com>:
>
> Hi all,
>
> I am facing an issue while using Zeppelin. I am trying to load some data (not that big) into Zeppelin and then build some visualizations on it.
> The problem is that when I run the code the first time it works, but after some time the same code doesn't work. It remains in the running state in the GUI, but no logs are generated in the Zeppelin logs, and all further tasks hang in the pending state.
>
> As soon as I restart Zeppelin it works again, so I am guessing it's a memory issue. I have read that Zeppelin stores data in memory, so it is possible that it runs out of memory after some time.
>
> How do I debug this issue? How much memory does Zeppelin take by default at start? Is there any way I can run Zeppelin with a specified amount of memory, so that I can start the process with more? It doesn't make sense to restart Zeppelin every half hour.
>
> Thanks,
> Rushikesh Raut
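On the memory question: the Zeppelin daemon reads its JVM options from conf/zeppelin-env.sh, so the heap for both the server and the interpreter processes can be raised there, and the same file is one place to pass Spark options addressing the "one job takes all the resources" symptom discussed earlier. A sketch of that file, not a definitive recommendation: the sizes are illustrative, and the dynamic-allocation lines assume a YARN cluster with the external shuffle service enabled.

```shell
# conf/zeppelin-env.sh — illustrative values, tune for your cluster.

# Heap for the Zeppelin server itself:
export ZEPPELIN_MEM="-Xms1024m -Xmx4096m -XX:MaxMetaspaceSize=512m"

# Heap for each interpreter process (Spark, R, ...):
export ZEPPELIN_INTP_MEM="-Xms1024m -Xmx4096m"

# Let idle executors be released back to the cluster instead of one
# long-lived Zeppelin session holding every executor:
export SPARK_SUBMIT_OPTIONS="--conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.maxExecutors=10"
```

After editing the file, restart with `zeppelin-daemon.sh restart` for the settings to take effect.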