Re: how to setup notebook storage path
Ideally each user should use the /home//zeppelin/notebooks folder. Is there a way to do this? Thank you.

From: Manuel Sopena Ballesteros
Sent: Wednesday, 17 June 2020 1:32:16 AM
To: users
Subject: Re: how to setup notebook storage path

Thank you Jeff, do we need to put the full path?

From: Jeff Zhang
Sent: Tuesday, 16 June 2020 4:49:49 PM
To: users
Subject: Re: how to setup notebook storage path

zeppelin.notebook.dir in zeppelin-site.xml is the notebook location for VFSNotebookRepo.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Tuesday, 16 June 2020 at 2:43 PM:

Dear Zeppelin community,

I am using Zeppelin 0.8.0 deployed by HDP/Ambari. By default it uses FileSystemNotebookRepo as the notebook storage, with path /user/. I would like to change it to VFSNotebookRepo instead of Hadoop. I can change zeppelin.notebook.storage in the zeppelin-site configuration file, so my questions are:

* Which is the default location where Zeppelin will store the notebooks under VFSNotebookRepo?
* How can I specify the location of the notebooks?

Thank you very much,
Manuel

NOTICE: Please consider the environment before printing this email. This message and any attachments are intended for the addressee named and may contain legally privileged/confidential/copyright information. If you are not the intended recipient, you should not read, use, disclose, copy or distribute this communication. If you have received this message in error please notify us at once by return email and then delete both messages. We accept no liability for the distribution of viruses or similar in electronic communications. This notice should not be removed.

--
Best Regards

Jeff Zhang
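Putting Jeff's answer into concrete form: switching to VFSNotebookRepo in conf/zeppelin-site.xml would look roughly like the sketch below. The class name is the standard 0.8.x repo implementation; the notebook directory value is an illustrative assumption, not taken from this thread.

```xml
<!-- Sketch for conf/zeppelin-site.xml. The directory value below is an
     illustrative assumption; adjust it to your environment. -->
<property>
  <name>zeppelin.notebook.storage</name>
  <value>org.apache.zeppelin.notebook.repo.VFSNotebookRepo</value>
</property>
<property>
  <name>zeppelin.notebook.dir</name>
  <value>/opt/zeppelin/notebook</value>
</property>
```

With VFSNotebookRepo a relative zeppelin.notebook.dir is resolved against the Zeppelin installation directory, so an absolute path is the safer choice when storing notes outside the install tree.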
how to setup path to store notes
Dear Zeppelin community,

I am using Zeppelin 0.8.0 deployed by HDP/Ambari. By default it uses FileSystemNotebookRepo as the notebook storage, with path /user/. I would like to change it to VFSNotebookRepo instead of Hadoop. I can change zeppelin.notebook.storage in the zeppelin-site configuration file, so my questions are:

* Which is the default location where Zeppelin will store the notebooks under VFSNotebookRepo?
* How can I specify the location of the notebooks?

Thank you very much,
Manuel
Re: Zeppelin context crashing
I'm using 0.8.0; it works for spark2.pyspark and spark2.r, so far it only fails in spark2.scala.

From: Jeff Zhang
Sent: Wednesday, 20 May 2020 11:57:22 AM
To: users
Subject: Re: Zeppelin context crashing

Which version of zeppelin are you using? I remember this is a bug of 0.8, but it is fixed in 0.8.2.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wednesday, 20 May 2020 at 9:28 AM:

This is what I can see from the zeppelin logs:

DEBUG [2020-05-20 11:25:01,509] ({Exec Stream Pumper} RemoteInterpreterManagedProcess.java[processLine]:298) - 20/05/20 11:25:01 INFO Client: Application report for application_1587693971329_0042 (state: RUNNING)
 INFO [2020-05-20 11:25:01,753] ({pool-2-thread-74} SchedulerFactory.java[jobStarted]:109) - Job 20160223-144701_1698149301 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-anaconda3:mansop:-shared_session
 INFO [2020-05-20 11:25:01,754] ({pool-2-thread-74} Paragraph.java[jobRun]:380) - Run paragraph [paragraph_id: 20160223-144701_1698149301, interpreter: anaconda3.spark, note_id: 2BWJFTXKJ, user: mansop]
DEBUG [2020-05-20 11:25:01,754] ({pool-2-thread-74} Paragraph.java[jobRun]:433) - RUN : z.input("name", "sun")
DEBUG [2020-05-20 11:25:01,754] ({pool-2-thread-74} RemoteInterpreter.java[interpret]:207) - st: z.input("name", "sun")
DEBUG [2020-05-20 11:25:01,758] ({Thread-1602} RemoteInterpreterEventPoller.java[run]:114) - Receive message from RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_UPDATE_ALL, data:{"messages":[],"noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})
DEBUG [2020-05-20 11:25:01,768] ({Thread-1602} RemoteInterpreterEventPoller.java[run]:114) - Receive message from RemoteInterpreter Process: RemoteInterpreterEvent(type:META_INFOS, data:{"message":"Spark UI enabled","url":"http://zeta-6-13-mlx.mlx:39578"})
DEBUG [2020-05-20 11:25:01,770] ({Thread-1602} RemoteInterpreterEventPoller.java[run]:114) - Receive message from RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_UPDATE_ALL, data:{"messages":[],"noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})
DEBUG [2020-05-20 11:25:01,795] ({Thread-1602} RemoteInterpreterEventPoller.java[run]:114) - Receive message from RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_UPDATE, data:{"data":"","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301","type":"TEXT"})
DEBUG [2020-05-20 11:25:01,796] ({Thread-1602} RemoteInterpreterEventPoller.java[run]:114) - Receive message from RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_APPEND, data:{"data":"\u003cconsole\u003e:24: error: not found: value z\n","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})
DEBUG [2020-05-20 11:25:01,796] ({pool-26-thread-1} AppendOutputRunner.java[run]:91) - Processing time for append-output took 0 milliseconds
DEBUG [2020-05-20 11:25:01,796] ({pool-26-thread-1} AppendOutputRunner.java[run]:107) - Processing size for append-output is 40 characters
DEBUG [2020-05-20 11:25:01,796] ({Thread-1602} RemoteInterpreterEventPoller.java[run]:114) - Receive message from RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_APPEND, data:{"data":" z.input(\"name\", \"sun\")\n","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})
DEBUG [2020-05-20 11:25:01,796] ({Thread-1602} RemoteInterpreterEventPoller.java[run]:114) - Receive message from RemoteInterpreter Process: RemoteInterpreterEvent(type:OUTPUT_APPEND, data:{"data":" ^\n","index":"0","noteId":"2BWJFTXKJ","paragraphId":"20160223-144701_1698149301"})
DEBUG [2020-05-20 11:25:01,801] ({pool-2-thread-74} RemoteScheduler.java[run]:328) - Job Error, 20160223-144701_1698149301, null
DEBUG [2020-05-20 11:25:01,896] ({pool-26-thread-1} AppendOutputRunner.java[run]:91) - Processing time for append-output took 0 milliseconds
DEBUG [2020-05-20 11:25:01,896] ({pool-26-thread-1} AppendOutputRunner.java[run]:107) - Processing size for append-output is 39 characters
 INFO [2020-05-20 11:25:01,911] ({pool-2-thread-74} SchedulerFactory.java[jobFinished]:115) - Job 20160223-144701_1698149301 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-anaconda3:mansop:-shared_session
DEBUG [2020-05-20 11:25:02,411] ({Exec Stream Pumper} RemoteInterpreterManagedP
Zeppelin context crashing
Dear Zeppelin community,

For some reason my Zeppelin is not aware of the Zeppelin context. Paragraph:

%spark2.spark
z.input("name", "sun")

output:

<console>:24: error: not found: value z
       z.input("name", "sun")
       ^

Any thoughts?

Thank you very much,
Manuel
predefined notes to new users
Dear Zeppelin community,

We are using Zeppelin through Hortonworks Data Platform. We realised that Zeppelin provides a set of predefined tutorial notes (e.g. Getting Started / Apache Spark in 5 Minutes) that are available to all new users. We would like to:

- Delete those notes.
- Create new notes as tutorials and make them available to all new users.

How can we do that?

Thank you very much,
Manuel
Re: error restarting interpreter if shiro [url] /api/interpreter/** = authc is commented
Thank you, this works like a charm!

From: meilfo...@gmx.net
Sent: Wednesday, 29 April 2020 4:14:41 PM
To: users@zeppelin.apache.org
Cc: users
Subject: Aw: error restarting interpreter if shiro [url] /api/interpreter/** = authc is commented

Hi,

try this:

# Will allow all authenticated users to restart interpreters
/api/interpreter/setting/restart/** = authc
# Will only allow the role "admin" to access/change interpreter settings
/api/interpreter/** = authc, roles[admin]

Also change the interpreter mode to perUser (or perNote) and isolated, as otherwise if userA restarts an interpreter it will have an impact on userB's interpreter instance (his instance is also gone). Also, sometimes (when the interpreter crashed before, e.g. because a Spark YARN app ran out of memory) you need to click on "Restart Interpreter" twice, as you get an error during the first attempt but the second attempt/click will work.

Regards,
Tom

Sent: Wednesday, 29 April 2020 at 04:44
From: "Manuel Sopena Ballesteros"
To: "users"
Subject: error restarting interpreter if shiro [url] /api/interpreter/** = authc is commented

I have restricted access to the interpreter configuration page by editing the shiro [urls] section as follows:

[urls]
# This section is used for url-based security.
# You can secure interpreter, configuration and credential information by urls.
# Comment or uncomment the below urls that you want to hide.
# anon means the access is anonymous.
# authc means Form based Auth Security
# To enforce security, comment the line below and uncomment the next one
/api/version = anon
/api/interpreter/** = authc, roles[admin]
#/api/interpreter/** = authc
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
#/** = anon
/** = authc

I keep getting "Error restart interpreter." when I try to restart the interpreter... How can I fix this so I can restart the interpreter while access to the interpreter configuration section is still not allowed?

Thank you
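For clarity, here is Tom's suggestion merged into the original [urls] section as one sketch. The ordering is the whole point: Shiro evaluates URL filter chains top-down and the first match wins, so the more specific restart path must appear before the broader /api/interpreter/** rule or it will never be reached.

```ini
[urls]
# Shiro matches these chains top-down, first match wins.
/api/version = anon
# Any authenticated user may restart interpreters...
/api/interpreter/setting/restart/** = authc
# ...but only the "admin" role may view/change interpreter settings.
/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
/** = authc
```

If the two interpreter rules were swapped, the roles[admin] chain would match the restart URL first and non-admin users would be back to "Error restart interpreter."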
how to speedup AD authentication
Hi,

Sometimes when a user tries to log in to Zeppelin it takes a few minutes... Is there a way to speed this up?

Thank you,

Manuel Sopena Ballesteros
Big Data Engineer | Kinghorn Centre for Clinical Genomics
a: 384 Victoria Street, Darlinghurst NSW 2010
p: +61 2 9355 5760 | +61 4 12 123 123
e: manuel...@garvan.org.au
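No answer appeared in this thread, but a common knob for slow LDAP/AD logins is Shiro's realm caching, which avoids re-querying the directory on every request. The sketch below uses standard Shiro class and property names (MemoryConstrainedCacheManager, authorizationCachingEnabled); it assumes the realm in shiro.ini is named activeDirectoryRealm, so verify against your own config before relying on it.

```ini
[main]
# Cache authentication/authorization lookups in memory so repeated
# requests do not each round-trip to Active Directory.
cacheManager = org.apache.shiro.cache.MemoryConstrainedCacheManager
securityManager.cacheManager = $cacheManager
# Enable authorization caching on the AD realm (an AuthorizingRealm property).
activeDirectoryRealm.authorizationCachingEnabled = true
```

If logins are still slow on the first attempt, the delay is usually in the directory lookup itself (searchBase too broad, unreachable secondary controllers), which caching cannot fix.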
RE: Hiding shiro.ini and other sensitive files from end users
Have you set up impersonation in the Spark interpreter?

Manuel

From: Tony Primerano [mailto:primer...@tonycode.com]
Sent: Thursday, November 21, 2019 12:35 PM
To: users@zeppelin.apache.org
Subject: Re: Hiding shiro.ini and other sensitive files from end users

I am currently running in Spark stand-alone mode.

On Wed, Nov 20, 2019, 6:25 PM Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote:

Hi Tony,

Are you running a YARN cluster?

Thanks,
Manuel

From: Tony Primerano [mailto:primer...@tonycode.com]
Sent: Thursday, November 21, 2019 9:08 AM
To: users@zeppelin.apache.org
Subject: Hiding shiro.ini and other sensitive files from end users

Is there a recommended way to hide secrets contained in shiro.ini and other files? I made my shell interpreter run as a different user to prevent access to configuration files, but from a python interpreter you can run shell commands as the Zeppelin process user. Is there a way to prevent this?

Thanks,
Tony
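The impersonation suggestion above pairs naturally with a filesystem-level step: once interpreter processes run as the logged-in end user rather than as the Zeppelin service account, the config files can be made readable only by that service account. A sketch, assuming the daemon user is `zeppelin` and the HDP-style config path shown (both are assumptions, adjust to your install):

```shell
# Assumption: the Zeppelin daemon runs as user "zeppelin" and config
# lives under /etc/zeppelin/conf. With interpreter impersonation on,
# user-launched shell/python code no longer runs as this owner and so
# cannot read the file.
chown zeppelin:zeppelin /etc/zeppelin/conf/shiro.ini
chmod 600 /etc/zeppelin/conf/shiro.ini   # owner read/write only
```

Note this does not help in Tony's situation as described: while interpreters still run as the Zeppelin process user, any code they execute can read whatever that user can read.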
RE: restrict interpreters to users
Rather than an exception, I get an HTTP ERROR 503 when I hardcode a user in the shiro config.

Manuel

From: Manuel Sopena Ballesteros
Sent: Wednesday, November 20, 2019 11:37 AM
To: users@zeppelin.apache.org
Subject: RE: restrict interpreters to users

Unfortunately, zeppelin will throw an exception if I change the [users] section in the shiro configuration. I guess this is because I am using AD integration, hence local users are not allowed? Please advise.

Manuel

From: iamabug [mailto:18133622...@163.com]
Sent: Tuesday, November 19, 2019 4:54 PM
To: users@zeppelin.apache.org
Subject: Re: restrict interpreters to users

I think you misconfigured the [roles] and [users] paragraphs. Suppose you want mansop to be an admin and alice to be a plain user without access to the `interpreter` menu; you can try this:

[users]
mansop = password_for_mansop, admin
alice = password_for_alice

[roles]
role1 = *
role2 = *
role3 = *
admin = *

Note that alice is not an admin or any other special role, so she can only use basic features. I think the [roles] paragraph should be about role names and their permissions, but I am not aware of any specific permissions and the documentation needs to provide more details. Just to be clear, if the configuration above is used, role1, role2 and role3 have the same permissions as admin does. Please let me know if it works.

On 11/19/2019 13:17, Manuel Sopena Ballesteros wrote:

We are using shiro to authenticate against Active Directory. I changed the shiro configuration like this:

[roles]
role1 = *
role2 = *
role3 = *
admin = mansop

However, users other than mansop can still see and edit interpreters. NOTE: mansop is an AD login. I would like to restrict users from editing or viewing interpreters. Any thoughts?

Thank you,
Manuel

From: iamabug [mailto:18133622...@163.com]
Sent: Tuesday, November 19, 2019 12:31 PM
To: users@zeppelin.apache.org
Subject: Re: restrict interpreters to users

Do you mean anonymous login by `by default`? If yes, enabling Shiro authentication can change this. Please refer to https://zeppelin.apache.org/docs/0.8.2/setup/security/shiro_authentication.html

On 11/19/2019 09:28, Manuel Sopena Ballesteros wrote:

Dear Zeppelin community,

By default interpreter configurations can be changed by any user. Is there a way to avoid this? I would like to hide some interpreters so people can't change them.

Thank you very much,
Manuel Sopena Ballesteros
Big Data Engineer | Kinghorn Centre for Clinical Genomics
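Since the accounts in this thread come from Active Directory rather than a local [users] section, the usual way to grant an `admin` role is to map an AD group onto it at the realm level and then guard the interpreter URLs with that role. A sketch in the shiro.ini style used above; the group DN is a hypothetical placeholder and the realm name assumes the common activeDirectoryRealm setup:

```ini
[main]
# Map members of an AD group onto the Shiro role "admin".
# The DN below is a placeholder; use your real group's DN.
activeDirectoryRealm.groupRolesMap = "CN=zeppelin-admins,OU=groups,DC=example,DC=org":"admin"
activeDirectoryRealm.authorizationCachingEnabled = true

[urls]
# Only the mapped "admin" role may reach the interpreter settings API.
/api/interpreter/** = authc, roles[admin]
/** = authc
```

This avoids hardcoding AD accounts in [users] (which, as seen above, leads to errors or a 503): group membership in AD controls who gets the role.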
RE: restrict interpreters to users
Unfortunately, zeppelin will throw an exception if I change the [user] section in shiro configuration. I guess this is because I am using AD integration hence local users are not allowed? Please advise Manuel From: iamabug [mailto:18133622...@163.com] Sent: Tuesday, November 19, 2019 4:54 PM To: users@zeppelin.apache.org Subject: Re: restrict interpreters to users I think you misconfigure [roles] paragraph and [users] paragraph. Suppose you want mansop to be an admin and alice to be a plain user without access to `interpreter` menu, you can try this: [users] mansop = password_for_mansop, admin alice = password_for_alice [roles] role1 = * role2 = * role3 = * admin = * note that alice is not an admin or any other special role so she can only use basic features. I think [roles] paragraph should be about role name and their permissions but I am not aware of any specific permissions and the documentation needs to provide more details. Just to be clear, if the configuration above is used, role1, role2, role3 have the same permissions as admin does. Please let me know if it works. On 11/19/2019 13:17,Manuel Sopena Ballesteros<mailto:manuel...@garvan.org.au> wrote: We are using shiro to authenticate against Active Directory. I changed the shiro configuration like this [roles] role1 = * role2 = * role3 = * admin = mansop however other users different than mansop can see and edit interpreters. NOTE: mansop is an AD login I would like to restrict users from editing or viewing interpreters. Any thoughts? Thank you Manuel From: iamabug [mailto:18133622...@163.com<mailto:18133622...@163.com>] Sent: Tuesday, November 19, 2019 12:31 PM To: users@zeppelin.apache.org<mailto:users@zeppelin.apache.org> Subject: Re:restrict interpreters to users Do you mean anonymous login by `by default` ? If yes, enabling Shiro authentication can change this ? 
Please refer to https://zeppelin.apache.org/docs/0.8.2/setup/security/shiro_authentication.html

On 11/19/2019 09:28, Manuel Sopena Ballesteros wrote:

Dear Zeppelin community,

By default the interpreters configuration can be changed by any user. Is there a way to avoid this? I would like to hide some interpreters so people can't change them.

Thank you very much

Manuel Sopena Ballesteros
Big Data Engineer | Kinghorn Centre for Clinical Genomics
a: 384 Victoria Street, Darlinghurst NSW 2010
p: +61 2 9355 5760 | +61 4 12 123 123
e: manuel...@garvan.org.au
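Beyond the [users]/[roles] discussion above, the way Zeppelin's Shiro setup normally locks down the interpreter page is the [urls] section, which gates the REST endpoints by role. A sketch of the relevant part of conf/shiro.ini (role and ordering follow the Shiro documentation; with AD integration the admin role would usually come from a group mapping rather than the [users] section):

```ini
# Sketch: restrict interpreter, configuration and credential pages to the
# "admin" role. URL patterns are matched top-down; /** must come last.
[urls]
/api/version = anon
/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
/** = authc
```

With this in place, users without the admin role get an error page instead of the interpreter settings, which matches what the question is asking for.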
RE: Re:restrict interpreters to users
We are using shiro to authenticate against Active Directory. I changed the shiro configuration like this:

    [roles]
    role1 = *
    role2 = *
    role3 = *
    admin = mansop

however users other than mansop can still see and edit interpreters. NOTE: mansop is an AD login. I would like to restrict users from editing or viewing interpreters. Any thoughts?

Thank you
Manuel

From: iamabug <18133622...@163.com>
Sent: Tuesday, November 19, 2019 12:31 PM
To: users@zeppelin.apache.org
Subject: Re: restrict interpreters to users

Do you mean anonymous login by `by default`? If yes, enabling Shiro authentication can change this. Please refer to https://zeppelin.apache.org/docs/0.8.2/setup/security/shiro_authentication.html

On 11/19/2019 09:28, Manuel Sopena Ballesteros wrote:

Dear Zeppelin community,

By default the interpreters configuration can be changed by any user. Is there a way to avoid this? I would like to hide some interpreters so people can't change them.

Thank you very much
Manuel Sopena Ballesteros
restrict interpreters to users
Dear Zeppelin community,

By default the interpreters configuration can be changed by any user. Is there a way to avoid this? I would like to hide some interpreters so people can't change them.

Thank you very much

Manuel Sopena Ballesteros
Big Data Engineer | Kinghorn Centre for Clinical Genomics
RE: send parameters to pyspark
Thank you very much, that worked. What about passing the --conf flag to pyspark?

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Friday, November 15, 2019 12:35 PM
To: users
Subject: Re: send parameters to pyspark

You can set the property spark.jars.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, 15 Nov 2019 at 9:30 AM:

Dear zeppelin community,

I need to send some parameters to pyspark so it can find extra jars. This is an example of the parameters I need to send to pyspark:

    pyspark \
      --jars /share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar \
      --conf spark.driver.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar \
      --conf spark.executor.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      --conf spark.kryo.registrator=is.hail.kryo.HailKryoRegistrator

How could I configure my spark interpreter to do this? Thank you very much

--
Best Regards
Jeff Zhang
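Following Jeff's hint, each --conf flag maps one-to-one onto a Spark interpreter property (set on Zeppelin's Interpreter settings page), and --jars maps onto spark.jars. A sketch of the resulting properties, using the paths from the question:

```properties
# Sketch: Spark interpreter properties equivalent to the pyspark CLI flags.
# Property names are standard Spark configuration keys; values are the
# jar paths from the question above.
spark.jars=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
spark.driver.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
spark.executor.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=is.hail.kryo.HailKryoRegistrator
```

After saving, restart the interpreter so a new Spark context picks up the settings.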
send parameters to pyspark
Dear zeppelin community,

I need to send some parameters to pyspark so it can find extra jars. This is an example of the parameters I need to send to pyspark:

    pyspark \
      --jars /share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar \
      --conf spark.driver.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar \
      --conf spark.executor.extraClassPath=/share/ClusterShare/anaconda3/envs/python37/lib/python3.7/site-packages/hail/hail-all-spark.jar \
      --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
      --conf spark.kryo.registrator=is.hail.kryo.HailKryoRegistrator

How could I configure my spark interpreter to do this? Thank you very much
RE: spark r interpreter resets working directory
Sorry, I got confused with the terminology (I meant paragraph instead of note). My interpreter is configured per user + isolated, which means the same interpreter (JVM) process for the same user.

First paragraph:

    %anaconda3.r
    setwd("/home/mansop")
    getwd()

    output: [1] "/home/mansop"

Second paragraph:

    %anaconda3.r
    getwd()

    output: [1] "/d0/hadoop/yarn/local/usercache/mansop/appcache/application_1572410115474_0106/container_e16_1572410115474_0106_01_01"

Why does R not carry the working directory over to the second paragraph, even though both run in the same interpreter process?

Thank you
Manuel

From: Manuel Sopena Ballesteros <manuel...@garvan.org.au>
Sent: Wednesday, November 13, 2019 2:32 PM
To: users@zeppelin.apache.org
Subject: spark r interpreter resets working directory

Dear Zeppelin community,

I am testing the spark r interpreter and realised it does not keep the working directory across notes.

[screenshot omitted]

What is the reason behind this behavior? Thank you very much
RE: spark r interpreter resets working directory
Ok, what should I do in order to be able to reuse variables across different notes?

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, November 13, 2019 4:57 PM
To: users
Subject: Re: spark r interpreter resets working directory

In that case, each user uses a different interpreter process. In your second note, the current working directory is the yarn container location, which is expected.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 13 Nov 2019 at 1:50 PM:

Yarn cluster using impersonation (per user + isolated). I guess that means each note uses a different interpreter?

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, November 13, 2019 2:35 PM
To: users
Subject: Re: spark r interpreter resets working directory

Do your different notes share the same interpreter? I suspect you are using per-note isolated or scoped mode. It looks like you are using local or yarn-client mode for the first note, but yarn-cluster mode for the second note.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 13 Nov 2019 at 11:31 AM:

Dear Zeppelin community,

I am testing the spark r interpreter and realised it does not keep the working directory across notes.

[screenshot omitted]

What is the reason behind this behavior? Thank you very much

--
Best Regards
Jeff Zhang
RE: spark r interpreter resets working directory
Yarn cluster using impersonation (per user + isolated). I guess that means each note uses a different interpreter?

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, November 13, 2019 2:35 PM
To: users
Subject: Re: spark r interpreter resets working directory

Do your different notes share the same interpreter? I suspect you are using per-note isolated or scoped mode. It looks like you are using local or yarn-client mode for the first note, but yarn-cluster mode for the second note.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 13 Nov 2019 at 11:31 AM:

Dear Zeppelin community,

I am testing the spark r interpreter and realised it does not keep the working directory across notes.

[screenshot omitted]

What is the reason behind this behavior? Thank you very much

--
Best Regards
Jeff Zhang
spark r interpreter resets working directory
Dear Zeppelin community,

I am testing the spark r interpreter and realised it does not keep the working directory across notes.

[screenshot omitted]

What is the reason behind this behavior? Thank you very much
python interpreter not installing (directory already exists)
Hi,

For some reason the python interpreter is missing from the interpreter list, so I am trying to reinstall it:

    $ sudo /usr/hdp/3.1.0.0-78/zeppelin/bin/install-interpreter.sh -n python
    Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/zeppelin/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/zeppelin/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/hdp/3.1.0.0-78/zeppelin/lib/slf4j-simple-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    Directory /usr/hdp/3.1.0.0-78/zeppelin/interpreter/python already exists
    Skipped

Question: is it ok to delete /usr/hdp/3.1.0.0-78/zeppelin/interpreter/python and reinstall?

Thank you
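Since install-interpreter.sh skips any interpreter whose directory already exists, a safer variant of the delete-and-reinstall idea is to move the directory aside so you keep a backup. The sketch below demonstrates the rename step in a scratch directory; on the HDP install in the question, ZEPPELIN_HOME would be /usr/hdp/3.1.0.0-78/zeppelin and the installer line would be run for real:

```shell
# Dry run of the backup-then-reinstall procedure in a throwaway directory.
# Replace ZEPPELIN_HOME with your real install path before using this.
ZEPPELIN_HOME="$(mktemp -d)"
mkdir -p "$ZEPPELIN_HOME/interpreter/python"

# The installer reports "already exists ... Skipped" when the target
# directory is present, so rename it first instead of deleting it:
mv "$ZEPPELIN_HOME/interpreter/python" "$ZEPPELIN_HOME/interpreter/python.bak"

# Then rerun the installer (shown as a comment; needs a real install):
#   sudo "$ZEPPELIN_HOME/bin/install-interpreter.sh" -n python
echo "backed up to $ZEPPELIN_HOME/interpreter/python.bak"
```

If the reinstall works, the .bak directory can be removed; if not, it can be moved back.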
RE: can't plot
    :37,711] ({pool-6-thread-2} Interpreter.java[getProperty]:222) - key: zeppelin.interpreter.localRepo, value: /usr/hdp/current/zeppelin-server/local-repo/mansop

So I am confused, because it says that the ipython prerequisites are met but it still fails to start the ipython interpreter. What is involved in the process of starting the ipython interpreter, from Zeppelin's point of view?

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, October 30, 2019 5:10 PM
To: users
Subject: Re: can't plot

It might be due to another reason; you can set the interpreter log level to DEBUG to get more info. Add the following to log4j.properties:

    log4j.logger.org.apache.zeppelin.interpreter=DEBUG

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 30 Oct 2019 at 1:51 PM:

Ok, one more question. I am getting an error when I force ipython:

    %mansop.ipyspark
    print("Hello world!")

    java.io.IOException: Fail to launch IPython Kernel in 30 seconds
        at org.apache.zeppelin.python.IPythonInterpreter.launchIPythonKernel(IPythonInterpreter.java:297)
        at org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:154)
        at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
        at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

According to https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#ipython-support both grpcio and jupyter are installed. Any idea?

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, October 30, 2019 12:53 PM
To: users
Subject: Re: can't plot

Based on the error message, you are still using python instead of ipython. It is hard to tell what's wrong. One suggestion is to try 0.8.2, which is the latest release.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 30 Oct 2019 at 9:47 AM:

It didn't like %matplotlib inline:

    Traceback (most recent call last):
      File "/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0083/container_e15_1570749574365_0083_01_01/tmp/zeppelin_pyspark-2736590645623350055.py", line 364, in
        code = compile('\n'.join(stmts), '', 'exec', ast.PyCF_ONLY_AST, 1)
      File "", line 1
        %matplotlib inline
        ^
    SyntaxError: invalid syntax

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, October 30, 2019 12:43 PM
To: users
Subject: Re: can't plot

Try this:

    %pyspark
    %matplotlib inline
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot([1, 2, 3])

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 30 Oct 2019 at 9:39 AM:

Another example:

    %pyspark
    import matplotlib.pyplot as plt
    plt.plot([1, 2, 3])
    z.show(plt)
    plt.close()

According to the documentation https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration, am I right assuming that I can use z.show in %pyspark?

Thank you
Manuel

From: Manuel Sopena Ballesteros <manuel...@garvan.org.au>
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to print a plot using the pyspark interpreter. This is my notebook:

    %pyspark
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot([1, 2, 3])

And this is the output:

    []

Any idea?
RE: can't plot
Ok, one more question. I am getting an error when I force ipython:

    %mansop.ipyspark
    print("Hello world!")

    java.io.IOException: Fail to launch IPython Kernel in 30 seconds
        at org.apache.zeppelin.python.IPythonInterpreter.launchIPythonKernel(IPythonInterpreter.java:297)
        at org.apache.zeppelin.python.IPythonInterpreter.open(IPythonInterpreter.java:154)
        at org.apache.zeppelin.spark.IPySparkInterpreter.open(IPySparkInterpreter.java:66)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
        at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

According to https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#ipython-support both grpcio and jupyter are installed. Any idea?

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, October 30, 2019 12:53 PM
To: users
Subject: Re: can't plot

Based on the error message, you are still using python instead of ipython. It is hard to tell what's wrong. One suggestion is to try 0.8.2, which is the latest release.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 30 Oct 2019 at 9:47 AM:

It didn't like %matplotlib inline:

    Traceback (most recent call last):
      File "/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0083/container_e15_1570749574365_0083_01_01/tmp/zeppelin_pyspark-2736590645623350055.py", line 364, in
        code = compile('\n'.join(stmts), '', 'exec', ast.PyCF_ONLY_AST, 1)
      File "", line 1
        %matplotlib inline
        ^
    SyntaxError: invalid syntax

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, October 30, 2019 12:43 PM
To: users
Subject: Re: can't plot

Try this:

    %pyspark
    %matplotlib inline
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot([1, 2, 3])

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 30 Oct 2019 at 9:39 AM:

Another example:

    %pyspark
    import matplotlib.pyplot as plt
    plt.plot([1, 2, 3])
    z.show(plt)
    plt.close()

According to the documentation https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration, am I right assuming that I can use z.show in %pyspark?

Thank you
Manuel

From: Manuel Sopena Ballesteros <manuel...@garvan.org.au>
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to print a plot using the pyspark interpreter. This is my notebook:

    %pyspark
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot([1, 2, 3])

And this is the output:

    []

Any idea?

--
Best Regards
Jeff Zhang
RE: can't plot
It didn't like %matplotlib inline:

    Traceback (most recent call last):
      File "/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0083/container_e15_1570749574365_0083_01_01/tmp/zeppelin_pyspark-2736590645623350055.py", line 364, in
        code = compile('\n'.join(stmts), '', 'exec', ast.PyCF_ONLY_AST, 1)
      File "", line 1
        %matplotlib inline
        ^
    SyntaxError: invalid syntax

Manuel

From: Jeff Zhang <zjf...@gmail.com>
Sent: Wednesday, October 30, 2019 12:43 PM
To: users
Subject: Re: can't plot

Try this:

    %pyspark
    %matplotlib inline
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot([1, 2, 3])

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, 30 Oct 2019 at 9:39 AM:

Another example:

    %pyspark
    import matplotlib.pyplot as plt
    plt.plot([1, 2, 3])
    z.show(plt)
    plt.close()

According to the documentation https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration, am I right assuming that I can use z.show in %pyspark?

Thank you
Manuel

From: Manuel Sopena Ballesteros <manuel...@garvan.org.au>
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to print a plot using the pyspark interpreter. This is my notebook:

    %pyspark
    import matplotlib.pyplot as plt
    plt.figure()
    plt.plot([1, 2, 3])

And this is the output:

    []

Any idea?
RE: can't plot
Another example:

%pyspark
import matplotlib.pyplot as plt
plt.plot([1, 2, 3])
z.show(plt)
plt.close()

According to the documentation (https://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#matplotlib-integration), am I right in assuming that I can use z.show in %pyspark?

Thank you
Manuel

From: Manuel Sopena Ballesteros [mailto:manuel...@garvan.org.au]
Sent: Wednesday, October 30, 2019 12:12 PM
To: users@zeppelin.apache.org
Subject: can't plot

Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to print a plot using the pyspark interpreter. This is my notebook:

%pyspark
import matplotlib.pyplot as plt
plt.figure()
plt.plot([1, 2, 3])

And this is the output:

[]

Any idea?
can't plot
Dear Zeppelin user community,

I am running Zeppelin 0.8.0 and I am not able to print a plot using the pyspark interpreter. This is my notebook:

%pyspark
import matplotlib.pyplot as plt
plt.figure()
plt.plot([1, 2, 3])

And this is the output:

[]

Any idea?
RE: error starting interpreter in yarn cluster mode
Zeppelin version is 0.8.0. No changes to the source code, but this Zeppelin is installed by HDP.

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, October 18, 2019 5:48 PM
To: users
Subject: Re: error starting interpreter in yarn cluster mode

The error seems a little weird. What version of zeppelin do you use? Did you make any change to the source code?

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Friday, October 18, 2019 at 2:36 PM:

Dear Zeppelin community,

I am running the script below in Zeppelin yarn-cluster mode:

%pyspark
print("Hello world!")

output:

:5: error: object zeppelin is not a member of package org.apache
       var value: org.apache.zeppelin.spark.SparkZeppelinContext = _
                             ^
:6: error: object zeppelin is not a member of package org.apache
       def set(x: Any) = value = x.asInstanceOf[org.apache.zeppelin.spark.SparkZeppelinContext]
                                                           ^
/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0038/container_e15_1570749574365_0038_01_01/tmp/zeppelin_pyspark-5060717441683949247.py:179: UserWarning: Unable to load inline matplotlib backend, falling back to Agg
  warnings.warn("Unable to load inline matplotlib backend, "
Hello world!

Any idea why I am getting these errors?

Thank you

--
Best Regards
Jeff Zhang
error starting interpreter in yarn cluster mode
Dear Zeppelin community,

I am running the script below in Zeppelin yarn-cluster mode:

%pyspark
print("Hello world!")

output:

:5: error: object zeppelin is not a member of package org.apache
       var value: org.apache.zeppelin.spark.SparkZeppelinContext = _
                             ^
:6: error: object zeppelin is not a member of package org.apache
       def set(x: Any) = value = x.asInstanceOf[org.apache.zeppelin.spark.SparkZeppelinContext]
                                                           ^
/d1/hadoop/yarn/local/usercache/mansop/appcache/application_1570749574365_0038/container_e15_1570749574365_0038_01_01/tmp/zeppelin_pyspark-5060717441683949247.py:179: UserWarning: Unable to load inline matplotlib backend, falling back to Agg
  warnings.warn("Unable to load inline matplotlib backend, "
Hello world!

Any idea why I am getting these errors?

Thank you
RE: thrift.transport.TTransportException
Hi Jeff,

Sorry for the late response. I ran yarn-cluster mode with this setup:

%spark2.conf
master yarn
spark.submit.deployMode cluster
zeppelin.pyspark.python /home/mansop/anaconda2/bin/python
spark.driver.memory 10g

I added `log4j.logger.org.apache.zeppelin.interpreter=DEBUG` to the `log4j_yarn_cluster.properties` file, but nothing has changed; in fact the `zeppelin-interpreter-spark2-mansop-root-zama-mlx.mlx.log` file is not updated after running my notes.

This code works:

%pyspark
print("Hello world!")

However this one does not:

%pyspark
a = "bigword"
aList = []
for i in range(1000):
    aList.append(i**i*a)
#print aList
for word in aList:
    print word

which means I am still getting:

org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)

and the Spark logs say:

ERROR [2019-10-09 12:15:16,454] ({SIGTERM handler} SignalUtils.scala[apply$mcZ$sp]:43) - RECEIVED SIGNAL TERM
…
ERROR [2019-10-09 12:15:16,609] ({Reporter} Logging.scala[logError]:91) - Exception from Reporter thread.
org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException: Application attempt appattempt_1570490897819_0013_01 doesn't exist in ApplicationMasterService cache.

Any idea?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, October 4, 2019 5:12 PM
To: users
Subject: Re: thrift.transport.TTransportException

Then it looks like something is wrong with the python process. Do you run it in yarn-cluster mode or yarn-client mode?

Try adding the following line to log4j.properties for yarn-client mode, or to log4j_yarn_cluster.properties for yarn-cluster mode:

log4j.logger.org.apache.zeppelin.interpreter=DEBUG

And try it again. This time you will get more log info; I suspect the python process fails to start.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Friday, October 4, 2019 at 9:09 AM:

Sorry for the late response. Yes, I have successfully run a few simple scala codes using the %spark interpreter in zeppelin. What should I do next?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, October 1, 2019 5:44 PM
To: users
Subject: Re: thrift.transport.TTransportException

It looks like you are using pyspark. Could you try just starting the scala spark interpreter via `%spark`? First let's figure out whether it is related to pyspark.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Tuesday, October 1, 2019 at 3:29 PM:

Dear Zeppelin community,

I would like to ask for advice regarding an error I am having with thrift. I am getting quite a lot of these errors while running my notebooks:

org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:274)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:258)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:233)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:229)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:228)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:437)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

And this is the Spark driver application log (truncated):

…
===============================================================================
YARN executor launch context:
  env:
    CLASSPATH -> {{PWD}}{{PWD}}/__spark_con
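[Editor's note] One reading of the failing paragraph above, not confirmed in the thread: the only difference between the working and failing code is the volume of output. `(i ** i) * a` repeats the string i**i times, so elements grow astronomically (at i = 999 the repeat count has roughly 3,000 digits), and flooding the interpreter's output can kill the remote interpreter process, which then surfaces on the Zeppelin side as TTransportException on a closed socket. A bounded sketch of the same loop:

```python
# Sketch: same shape as the failing paragraph, but with the range kept small
# and each printed line truncated so the output stays bounded.
a = "bigword"
a_list = [(i ** i) * a for i in range(7)]  # largest element repeats "bigword" 6**6 times

for word in a_list:
    print(word[:40])  # print a prefix instead of the whole (huge) string
```

If the bounded version runs fine, the problem is output volume rather than thrift itself.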
RE: thrift.transport.TTransportException
Sorry for the late response. Yes, I have successfully run a few simple scala codes using the %spark interpreter in zeppelin. What should I do next?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, October 1, 2019 5:44 PM
To: users
Subject: Re: thrift.transport.TTransportException

It looks like you are using pyspark. Could you try just starting the scala spark interpreter via `%spark`? First let's figure out whether it is related to pyspark.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Tuesday, October 1, 2019 at 3:29 PM:

Dear Zeppelin community,

I would like to ask for advice regarding an error I am having with thrift. I am getting quite a lot of these errors while running my notebooks:

org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:274)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:258)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:233)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:229)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:228)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:437)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

And this is the Spark driver application log (placeholders such as <CPS> and <LOG_DIR> restored where the archive stripped them as HTML):

…
===============================================================================
YARN executor launch context:
  env:
    CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>/usr/hdp/3.1.0.0-78/hadoop/*<CPS>/usr/hdp/3.1.0.0-78/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/3.1.0.0-78/hadoop/lib/hadoop-lzo-0.6.0.3.1.0.0-78.jar:/etc/hadoop/conf/secure<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
    SPARK_YARN_STAGING_DIR -> hdfs://gl-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1568954689585_0052
    SPARK_USER -> mansop
    PYTHONPATH -> /usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client/python/:{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.7-src.zip
  command:
    LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH" \
      {{JAVA_HOME}}/bin/java \
      -server \
      -Xmx1024m \
      '-XX:+UseNUMA' \
      -Djava.io.tmpdir={{PWD}}/tmp \
      '-Dspark.history.ui.port=18081' \
      -Dspark.yarn.app.container.log.dir=<LOG_DIR> \
      -XX:OnOutOfMemoryError='kill %p' \
      org.apache.spark.executor.CoarseGrainedExecutorBackend \
      --driver-url \
      spark://coarsegrainedschedu...@r640-1-12-mlx.mlx:35602 \
      --executor-id \
      <executorId> \
      --hostname \
      <hostname> \
      --cores \
      1 \
      --app-id \
      application_1568954689585_0052 \
      --user-class-path \
      file:$PWD/__app__.jar \
      1><LOG_DIR>/stdout \
      2><LOG_DIR>/stderr
  resources:
    __app__.jar -> resource { scheme: "hdfs"
thrift.transport.TTransportException
Dear Zeppelin community,

I would like to ask for advice regarding an error I am having with thrift. I am getting quite a lot of these errors while running my notebooks:

org.apache.thrift.transport.TTransportException
        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_interpret(RemoteInterpreterService.java:274)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.interpret(RemoteInterpreterService.java:258)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:233)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter$4.call(RemoteInterpreter.java:229)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.callRemoteFunction(RemoteInterpreterProcess.java:135)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.interpret(RemoteInterpreter.java:228)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:437)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:307)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

And this is the Spark driver application log (placeholders such as <CPS> and <LOG_DIR> restored where the archive stripped them as HTML):

...
===============================================================================
YARN executor launch context:
  env:
    CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>/usr/hdp/3.1.0.0-78/hadoop/*<CPS>/usr/hdp/3.1.0.0-78/hadoop/lib/*<CPS>/usr/hdp/current/hadoop-hdfs-client/*<CPS>/usr/hdp/current/hadoop-hdfs-client/lib/*<CPS>/usr/hdp/current/hadoop-yarn-client/*<CPS>/usr/hdp/current/hadoop-yarn-client/lib/*<CPS>$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/3.1.0.0-78/hadoop/lib/hadoop-lzo-0.6.0.3.1.0.0-78.jar:/etc/hadoop/conf/secure<CPS>{{PWD}}/__spark_conf__/__hadoop_conf__
    SPARK_YARN_STAGING_DIR -> hdfs://gl-hdp-ctrl01-mlx.mlx:8020/user/mansop/.sparkStaging/application_1568954689585_0052
    SPARK_USER -> mansop
    PYTHONPATH -> /usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip:/usr/hdp/current/spark2-client/python/:{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-0.10.7-src.zip
  command:
    LD_LIBRARY_PATH="/usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64:$LD_LIBRARY_PATH" \
      {{JAVA_HOME}}/bin/java \
      -server \
      -Xmx1024m \
      '-XX:+UseNUMA' \
      -Djava.io.tmpdir={{PWD}}/tmp \
      '-Dspark.history.ui.port=18081' \
      -Dspark.yarn.app.container.log.dir=<LOG_DIR> \
      -XX:OnOutOfMemoryError='kill %p' \
      org.apache.spark.executor.CoarseGrainedExecutorBackend \
      --driver-url \
      spark://coarsegrainedschedu...@r640-1-12-mlx.mlx:35602 \
      --executor-id \
      <executorId> \
      --hostname \
      <hostname> \
      --cores \
      1 \
      --app-id \
      application_1568954689585_0052 \
      --user-class-path \
      file:$PWD/__app__.jar \
      1><LOG_DIR>/stdout \
      2><LOG_DIR>/stderr
  resources:
    __app__.jar -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx" port: 8020 file: "/user/mansop/.sparkStaging/application_1568954689585_0052/spark-interpreter-0.8.0.3.1.0.0-78.jar" } size: 20433040 timestamp: 1569804142906 type: FILE visibility: PRIVATE
    __spark_conf__ -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx" port: 8020 file: "/user/mansop/.sparkStaging/application_1568954689585_0052/__spark_conf__.zip" } size: 277725 timestamp: 1569804143239 type: ARCHIVE visibility: PRIVATE
    sparkr -> resource { scheme: "hdfs" host: "gl-hdp-ctrl01-mlx.mlx" port: 8020 file: "/user/mansop/.sparkStaging/application_1568954689585_0052/sparkr.zip" } size:
conda interpreter
Dear Zeppelin user community,

I have a situation where I can't install R packages through zeppelin because:

1. R expects me to give some feedback, like choosing a repository or agreeing to compile and install a package from source code.
2. I need to be able to create multiple environments to keep different versions of python and R for each project.

For 1), I don't think zeppelin provides capabilities for user interaction. Am I right in assuming this?

For 2), how should I manage this? The documentation says I can use conda, but this will only work for python... what about if I want to run my environment in Spark? What would you recommend?

Thank you very much
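[Editor's note] For question 2), one pattern the thread does not spell out (treat it as an assumption, not a verified recipe) is to pack a per-project conda environment into an archive and let YARN distribute it, so the driver and executors all see the same Python. With a hypothetical archive built by `conda pack -n proj-a -o proj-a.tar.gz`, the Spark interpreter properties would look roughly like:

```properties
# Hypothetical paths; proj-a.tar.gz is a conda-pack archive of the project env,
# unpacked by YARN into a directory aliased "condaenv" in each container.
spark.yarn.dist.archives=/home/mansop/envs/proj-a.tar.gz#condaenv
spark.yarn.appMasterEnv.PYSPARK_PYTHON=./condaenv/bin/python
spark.executorEnv.PYSPARK_PYTHON=./condaenv/bin/python
```

This only covers Python; for per-project R libraries, a per-user library directory via the standard `R_LIBS_USER` environment variable is the usual analogue.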
interactive notebook
Dear Zeppelin community,

I am trying to install the following library:

[inline image from the original email, not preserved in the archive]

However when I run the command above, `install.packages('Seurat')`, in a zeppelin notebook, it freezes, I guess because R is waiting for the user to select an option. I know this is a silly example, but this issue may happen in other situations. Is there a way I can set up zeppelin to run in interactive mode?

Thank you very much
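[Editor's note] Zeppelin paragraphs are non-interactive, so the usual workaround is to make the R call itself non-interactive by passing the choices R would otherwise prompt for. A sketch (the mirror URL is an example, not one from this thread):

```r
# Pin a CRAN mirror explicitly so R does not prompt to choose one.
install.packages('Seurat', repos = 'https://cloud.r-project.org')
```

Prompts about compiling newer source versions can likewise often be avoided by installing non-interactively from a shell on the Zeppelin host rather than from a paragraph.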
RE: conda and pyspark interpreter
This relates to the python interpreter. How would it work if I need to use pyspark?

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Thursday, August 22, 2019 12:01 PM
To: users
Subject: Re: conda and pyspark interpreter

See the zeppelin doc: http://zeppelin.apache.org/docs/0.8.0/interpreter/python.html#conda

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Thursday, August 22, 2019 at 9:57 AM:

Hi,

Is there a way to integrate conda with the pyspark interpreter so users can create, list and activate environments?

Thank you very much
Manuel

--
Best Regards
Jeff Zhang
conda and pyspark interpreter
Hi,

Is there a way to integrate conda with the pyspark interpreter so users can create, list and activate environments?

Thank you very much
Manuel
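[Editor's note] The conda support linked earlier in this thread applies to the %python interpreter; the thread does not establish an equivalent for %pyspark in 0.8.0. A common workaround (an assumption about the setup, with an example env path) is to point the spark interpreter directly at a conda environment's python:

```properties
# Interpreter page -> spark -> edit; the env path below is an example.
zeppelin.pyspark.python=/home/mansop/.conda/envs/proj-a/bin/python
# In yarn-cluster mode the same path must also exist on the worker nodes
# (e.g. an NFS-mounted home), or the env must be shipped as an archive.
spark.yarn.appMasterEnv.PYSPARK_PYTHON=/home/mansop/.conda/envs/proj-a/bin/python
```

Switching environments then means changing the property (or overriding it per note with the generic configuration interpreter) rather than `conda activate`.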
spark interpreter "master" parameter always resets to yarn-client after restart zeppelin
Dear Zeppelin user community,

I have a zeppelin installation with spark integration, and the "master" parameter in the spark interpreter configuration always resets its value from "yarn" to "yarn-client" after a zeppelin service restart. How can I stop that?

Thank you
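[Editor's note] If this Zeppelin is managed by Ambari/HDP (as elsewhere in this archive), a plausible cause, though not confirmed in this thread, is that the management tool re-templates the interpreter settings on every service restart, overwriting changes made in the Zeppelin UI; the durable fix is then to change the value in the management tool's configuration rather than in Zeppelin. Note also that `yarn-client` is the deprecated single-value spelling; the modern equivalent splits it into two properties:

```properties
master=yarn
spark.submit.deployMode=client
```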
python virtual environment on spark interpreter
Dear Zeppelin user community,

I have a zeppelin installation connected to a Spark cluster. I set up Zeppelin to submit jobs in yarn-cluster mode, and impersonation is also enabled. Now I would like to be able to use a python virtual environment instead of the system one. Is there a way I could specify the python parameter in the spark interpreter settings so it can point to a folder under the user's home (e.g. /home/{user_home}/python_virt_env/python) instead of a system one? If not, how should I achieve what I want?

Thank you
Manuel
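[Editor's note] As far as I know, Zeppelin 0.8 does not expand per-user placeholders like `{user_home}` in interpreter properties. With the interpreter scoped per user, though, each user can override the python path for their own session via the generic configuration interpreter, the same `%spark2.conf` mechanism used elsewhere in this archive. A sketch with an example path:

```
%spark2.conf
zeppelin.pyspark.python /home/mansop/python_virt_env/bin/python
spark.yarn.appMasterEnv.PYSPARK_PYTHON /home/mansop/python_virt_env/bin/python
```

In yarn-cluster mode this assumes the home directory is visible on the worker nodes (e.g. NFS-mounted); otherwise the virtualenv has to be shipped with the application.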
users needs to install their own python and R libraries
Dear Zeppelin user community,

I am trying to set up python and R to submit jobs through the Spark cluster. This is already done, but now I need to enable the users to install their own libraries. I was thinking to ask the users to set up conda in their home directory and modify `zeppelin.pyspark.python` to the full conda python path. Then users should be able to enable either python2 or 3 using the `generic configuration interpreter`. Is this the right way of doing what I am trying to do?

Thank you very much
RE: multiple interpreters for spark python2 and 3
Hi,

Do I need to create 2 spark interpreter groups, or can I just create a new py3spark interpreter inside the existing spark interpreter group, like the example below?

…
{
  "group": "spark",
  "name": "pyspark",
  "className": "org.apache.zeppelin.spark.PySparkInterpreter",
  "properties": {
    "zeppelin.pyspark.python": {
      "envName": "PYSPARK_PYTHON",
      "propertyName": null,
      "defaultValue": "python",
      "description": "Python command to run pyspark with",
      "type": "string"
    },
    "zeppelin.pyspark.useIPython": {
      "envName": null,
      "propertyName": "zeppelin.pyspark.useIPython",
      "defaultValue": true,
      "description": "whether use IPython when it is available",
      "type": "checkbox"
    }
  },
  "editor": {
    "language": "python",
    "editOnDblClick": false,
    "completionKey": "TAB",
    "completionSupport": true
  }
},
{
  "group": "spark",
  "name": "py3spark",
  "className": "org.apache.zeppelin.spark.PySparkInterpreter",
  "properties": {
    "zeppelin.py3spark.python": {
      "envName": "PYSPARK_PYTHON",
      "propertyName": null,
      "defaultValue": "python3.6",
      "description": "Python3.6 command to run pyspark with",
      "type": "string"
    },
    "zeppelin.pyspark.useIPython": {
      "envName": null,
      "propertyName": "zeppelin.pyspark.useIPython",
      "defaultValue": true,
      "description": "whether use IPython when it is available",
      "type": "checkbox"
    }
  },
  "editor": {
    "language": "python",
    "editOnDblClick": false,
    "completionKey": "TAB",
    "completionSupport": true
  }
},
…

Thank you
Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Monday, August 12, 2019 5:46 PM
To: users
Subject: Re: multiple interpreters for spark python2 and 3

2 approaches:
1. create 2 spark interpreters, one with python2 and another with python3
2. use the generic configuration interpreter: https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Monday, August 12, 2019 at 3:41 PM:

Dear Zeppelin community,

I have a zeppelin installation and a spark cluster. I need to provide options for users to run either python2 or 3 code using pyspark. At the moment the only way of doing this is by editing the spark interpreter and changing `zeppelin.pyspark.python` from python to python3.6. Is there a way to copy/duplicate the spark interpreter, one with python2 and the other with python3, so I can choose which one to use without leaving the notebook?

Thank you

--
Best Regards
Jeff Zhang
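[Editor's note] One detail worth flagging in the interpreter-setting sketch above (my reading of the setting format, not something confirmed in the thread): PySparkInterpreter reads the property named `zeppelin.pyspark.python`, so renaming the key to `zeppelin.py3spark.python` in the second entry would most likely be ignored and leave the python binary at its default. The py3spark entry should keep the original property name and change only its default value:

```json
{
  "group": "spark",
  "name": "py3spark",
  "className": "org.apache.zeppelin.spark.PySparkInterpreter",
  "properties": {
    "zeppelin.pyspark.python": {
      "envName": "PYSPARK_PYTHON",
      "propertyName": null,
      "defaultValue": "python3.6",
      "description": "Python command to run pyspark with",
      "type": "string"
    }
  }
}
```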
spark jobs in spark history
Dear Zeppelin community,

I have a Zeppelin installation connected to Spark. I noticed that Zeppelin starts a Spark job when it launches, but I can't see the individual jobs submitted through Zeppelin notebooks. Is this the expected behavior by design? Is there a way to see the different submissions from a Zeppelin notebook in the Spark history server?

Thank you very much
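As far as I understand, Zeppelin keeps one long-running SparkContext per interpreter process, so individual paragraphs show up as jobs inside a single Spark application rather than as separate submissions. For finished applications to appear in the Spark history server at all, event logging also has to be enabled for the interpreter. A hedged sketch using standard Spark properties; the log directory shown is only illustrative and must match whatever the history server is configured to read:

```properties
# Add to the spark interpreter settings or spark-defaults.conf.
# Property names are standard Spark config; the path is illustrative.
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark2-history/
spark.history.fs.logDirectory    hdfs:///spark2-history/
```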
multiple interpreters for spark python2 and 3
Dear Zeppelin community,

I have a Zeppelin installation and a Spark cluster. I need to give users the option to run either Python 2 or Python 3 code through pyspark. At the moment the only way to do this is to edit the spark interpreter and change `zeppelin.pyspark.python` from python to python3.6. Is there a way to copy/duplicate the spark interpreter, one copy with python2 and the other with python3, so I can choose which one to use without leaving the notebook?

Thank you
RE: can't use @spark2.r interpreter
correct

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 28, 2019 12:41 PM
To: users
Subject: Re: can't use @spark2.r interpreter

Are you using HDP?

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, Jun 28, 2019 at 10:32 AM:

Dear Zeppelin community,

I am trying to set up the Spark R interpreter in Zeppelin, however I can't make it work. This is my notebook:

%spark2.r
1 + 1

And this is the output:

Error in dev.control(displaylist = if (record) "enable" else "inhibit"): dev.control() called without an open graphics device

Any idea?

Thank you

Manuel Sopena Ballesteros
Big Data Engineer | Kinghorn Centre for Clinical Genomics
a: 384 Victoria Street, Darlinghurst NSW 2010
p: +61 2 9355 5760 | +61 4 12 123 123
e: manuel...@garvan.org.au
Like us on Facebook<http://www.facebook.com/garvaninstitute> | Follow us on Twitter<http://twitter.com/GarvanInstitute> and LinkedIn<http://www.linkedin.com/company/garvan-institute-of-medical-research>

--
Best Regards
Jeff Zhang
can't use @spark2.r interpreter
Dear Zeppelin community,

I am trying to set up the Spark R interpreter in Zeppelin, however I can't make it work. This is my notebook:

%spark2.r
1 + 1

And this is the output:

Error in dev.control(displaylist = if (record) "enable" else "inhibit"): dev.control() called without an open graphics device

Any idea?

Thank you
error trying to run r script in spark2.r interpreter
java:174)
    at org.apache.zeppelin.spark.SparkRInterpreter.open(SparkRInterpreter.java:106)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

any advice?
RE: python interpreter not working
Same error without impersonation --> The interpreter will be instantiated "globally" in "shared" process

[root@gl-hdp-ctrl01 zeppelin]# python -V
Python 2.7.5

Thank you
Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Wednesday, June 5, 2019 12:49 PM
To: users
Subject: Re: python interpreter not working

Which zeppelin version do you use? Does it work without impersonation?

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Wed, Jun 5, 2019 at 10:38 AM:

Dear Zeppelin community,

I am trying to set up the python interpreter. Installation is successful, however I can't get any Python code to run. This is what I can see in the logs:

INFO [2019-06-05 12:35:07,788] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:109) - Job 20190605-122140_1966429456 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session
INFO [2019-06-05 12:35:07,789] ({pool-2-thread-2} Paragraph.java[jobRun]:380) - Run paragraph [paragraph_id: 20190605-122140_1966429456, interpreter: python, note_id: 2EBKSAFA9, user: mansop]
WARN [2019-06-05 12:35:17,799] ({pool-2-thread-2} NotebookServer.java[afterStatusChange]:2302) - Job 20190605-122140_1966429456 is finished, status: ERROR, exception: null, result: %text python is not responding
INFO [2019-06-05 12:35:17,841] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:115) - Job 20190605-122140_1966429456 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session

My python interpreter is set up with impersonation.

Any thoughts? Thank you very much

--
Best Regards
Jeff Zhang
python interpreter not working
Dear Zeppelin community,

I am trying to set up the python interpreter. Installation is successful, however I can't get any Python code to run. This is what I can see in the logs:

INFO [2019-06-05 12:35:07,788] ({pool-2-thread-2} SchedulerFactory.java[jobStarted]:109) - Job 20190605-122140_1966429456 started by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session
INFO [2019-06-05 12:35:07,789] ({pool-2-thread-2} Paragraph.java[jobRun]:380) - Run paragraph [paragraph_id: 20190605-122140_1966429456, interpreter: python, note_id: 2EBKSAFA9, user: mansop]
WARN [2019-06-05 12:35:17,799] ({pool-2-thread-2} NotebookServer.java[afterStatusChange]:2302) - Job 20190605-122140_1966429456 is finished, status: ERROR, exception: null, result: %text python is not responding
INFO [2019-06-05 12:35:17,841] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:115) - Job 20190605-122140_1966429456 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:mansop:-shared_session

My python interpreter is set up with impersonation.

Any thoughts? Thank you very much
RE: how to load pandas into pyspark (centos 6 with python 2.6)
Ok, this is what I am getting:

$ /tmp/pythonvenv/bin/pip install pandas
The directory '/home/zeppelin/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
The directory '/home/zeppelin/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting pandas
  Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
  Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/pandas/
Could not find a version that satisfies the requirement pandas (from versions: )
No matching distribution found for pandas
Could not fetch URL https://pypi.python.org/simple/pandas/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.python.org', port=443): Max retries exceeded with url: /simple/pandas/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)) - skipping

Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 2:54 PM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

Just find pip in your python 3.6 folder and run pip using its full path, e.g. /tmp/Python-3.6.5/pip install pandas

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, Jun 8, 2018 at 12:47 PM:

Sorry for the stupid question, but how can I use pip? Zeppelin will run pip through the shell interpreter, but my system global python is 2.6…

thanks
Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 1:45 PM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

pip should be available under your python3.6.5; you can use that to install pandas.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, Jun 8, 2018 at 11:40 AM:

Hi Jeff,

Thank you very much for your quick response. My Zeppelin is deployed using HDP (Hortonworks platform), so I already have Spark/YARN integration, and I am using zeppelin.pyspark.python to tell pyspark to run Python 3.6:

zeppelin.pyspark.python --> /tmp/Python-3.6.5/python

I do have root access to the machine, but the OS is CentOS 6 (system Python is 2.6), hence pip is not available.

Thank you
Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 11:47 AM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

First, I would suggest you use Python 2.7 or Python 3.x, because Spark 2.x has dropped support for Python 2.6. Second, you need to configure PYSPARK_PYTHON in the spark interpreter settings to point to the Python that you installed. (I don't know what you mean by not being able to install pandas system wide; do you mean you are not root and don't have permission to install Python packages?)

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, Jun 8, 2018 at 9:26 AM:

Dear Zeppelin community,

I am trying to load pandas into my Zeppelin %spark2.pyspark interpreter. The system I am using is CentOS 6 with Python 2.6, so I can't install pandas system wide through pip as suggested in the documentation. What can I do if I want to add modules to the %spark2.pyspark interpreter?

Thank you very much
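The root cause of the pip failure above ("SSL module is not available") is usually that the hand-built Python 3.6 was compiled on a box without the OpenSSL headers, so its ssl module was never built. A minimal diagnostic sketch (plain Python; the fix itself, installing openssl-devel and rebuilding the Python source tree, happens outside Python):

```python
def ssl_available():
    """Return True if this Python build can import the ssl module."""
    try:
        import ssl  # missing when Python was compiled without OpenSSL headers
        return True
    except ImportError:
        return False

# A Python built after installing openssl-devel prints True; the
# failing build above would print False, and its pip cannot reach PyPI.
print(ssl_available())
```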
RE: how to load pandas into pyspark (centos 6 with python 2.6)
Sorry for the stupid question, but how can I use pip? Zeppelin will run pip through the shell interpreter, but my system global python is 2.6…

thanks
Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 1:45 PM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

pip should be available under your python3.6.5; you can use that to install pandas.

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, Jun 8, 2018 at 11:40 AM:

Hi Jeff,

Thank you very much for your quick response. My Zeppelin is deployed using HDP (Hortonworks platform), so I already have Spark/YARN integration, and I am using zeppelin.pyspark.python to tell pyspark to run Python 3.6:

zeppelin.pyspark.python --> /tmp/Python-3.6.5/python

I do have root access to the machine, but the OS is CentOS 6 (system Python is 2.6), hence pip is not available.

Thank you
Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 11:47 AM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

First, I would suggest you use Python 2.7 or Python 3.x, because Spark 2.x has dropped support for Python 2.6. Second, you need to configure PYSPARK_PYTHON in the spark interpreter settings to point to the Python that you installed. (I don't know what you mean by not being able to install pandas system wide; do you mean you are not root and don't have permission to install Python packages?)

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, Jun 8, 2018 at 9:26 AM:

Dear Zeppelin community,

I am trying to load pandas into my Zeppelin %spark2.pyspark interpreter. The system I am using is CentOS 6 with Python 2.6, so I can't install pandas system wide through pip as suggested in the documentation. What can I do if I want to add modules to the %spark2.pyspark interpreter?

Thank you very much

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research
The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 2010
T: + 61 (0)2 9355 5760 | F: +61 (0)2 9295 8507 | E: manuel...@garvan.org.au
RE: how to load pandas into pyspark (centos 6 with python 2.6)
Hi Jeff,

Thank you very much for your quick response. My Zeppelin is deployed using HDP (Hortonworks platform), so I already have Spark/YARN integration, and I am using zeppelin.pyspark.python to tell pyspark to run Python 3.6:

zeppelin.pyspark.python --> /tmp/Python-3.6.5/python

I do have root access to the machine, but the OS is CentOS 6 (system Python is 2.6), hence pip is not available.

Thank you
Manuel

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Friday, June 8, 2018 11:47 AM
To: users@zeppelin.apache.org
Subject: Re: how to load pandas into pyspark (centos 6 with python 2.6)

First, I would suggest you use Python 2.7 or Python 3.x, because Spark 2.x has dropped support for Python 2.6. Second, you need to configure PYSPARK_PYTHON in the spark interpreter settings to point to the Python that you installed. (I don't know what you mean by not being able to install pandas system wide; do you mean you are not root and don't have permission to install Python packages?)

Manuel Sopena Ballesteros <manuel...@garvan.org.au> wrote on Fri, Jun 8, 2018 at 9:26 AM:

Dear Zeppelin community,

I am trying to load pandas into my Zeppelin %spark2.pyspark interpreter. The system I am using is CentOS 6 with Python 2.6, so I can't install pandas system wide through pip as suggested in the documentation. What can I do if I want to add modules to the %spark2.pyspark interpreter?

Thank you very much
how to load pandas into pyspark (centos 6 with python 2.6)
Dear Zeppelin community,

I am trying to load pandas into my Zeppelin %spark2.pyspark interpreter. The system I am using is CentOS 6 with Python 2.6, so I can't install pandas system wide through pip as suggested in the documentation. What can I do if I want to add modules to the %spark2.pyspark interpreter?

Thank you very much

Manuel Sopena Ballesteros | Big data Engineer
Garvan Institute of Medical Research
The Kinghorn Cancer Centre, 370 Victoria Street, Darlinghurst, NSW 2010
T: + 61 (0)2 9355 5760 | F: +61 (0)2 9295 8507 | E: manuel...@garvan.org.au