Re: Issue when running toree, with jupyterhub, and ambari

2016-11-29 Thread Ian.Maloney
I find it strange that toree calls /etc/hadoop/conf/topology_script.py every second, even when nothing is being executed. Is there a way to turn that off in toree? Thanks, Ian On 11/18/16, 9:38 AM, "Maloney, Ian" wrote: >Like I mentioned below, JupyterHub is in a python 3 environment. I
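
A minimal way to confirm which rack-awareness script the session has actually picked up, assuming a live SparkContext named sc in a %%pyspark cell and that Hadoop's standard net.topology.script.file.name property is what drives the calls (sketch only, not a confirmed fix):

    # Print the topology script Hadoop resolved for this Spark session.
    hadoop_conf = sc._jsc.hadoopConfiguration()
    print(hadoop_conf.get("net.topology.script.file.name"))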

Re: Issue when running toree, with jupyterhub, and ambari

2016-11-18 Thread Ian.Maloney
Like I mentioned below, JupyterHub is in a python 3 environment. I have a pyspark kernel and a toree kernel, both pointing to the same python. The pyspark kernel works fine. The toree kernel will print errors about the topology_script.py over and over. So it is specific to toree. Ian Maloney Plat
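
A quick, standard-library-only check to confirm both kernels really resolve to the same interpreter and that the worker Python matches (run it in each kernel and compare the output):

    import os, sys
    print(sys.executable)                     # Python running this kernel/driver
    print(os.environ.get("PYSPARK_PYTHON"))   # Python the Spark workers will use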

Re: Issue when running toree, with jupyterhub, and ambari

2016-11-17 Thread Ian.Maloney
I'd prefer not to change those scripts, that's the issue. I'm wondering why toree is running them, but not my pyspark notebook. Ian Maloney Platform Architect Advanced Analytics Internal: 828716 Office: (734) 623-8716 Mobile: (313) 910-9272 On 11/17/16, 3:15 PM, "Marius van Niekerk" wrot

Issue when running toree, with jupyterhub, and ambari

2016-11-17 Thread Ian.Maloney
Hi, I’m experiencing a strange issue when running a toree kernel with jupyterhub. The python version used for spark in the kernel.json is 2.7, I verified that in the notebook itself, but in the jupyterhub logs, I see errors from two python files, created by ambari: /usr/bin/hdp-select /etc/had
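
A hedged sketch of a Toree kernel spec that pins the Python used by PySpark, so the driver and the workers agree on the interpreter; the paths, display name, and run.sh location below are illustrative, not taken from the original message:

    import json
    spec = {
        "display_name": "Toree - Spark (Python 2.7)",
        "language": "scala",
        "argv": ["/usr/local/share/jupyter/kernels/apache_toree/bin/run.sh",
                 "--profile", "{connection_file}"],
        "env": {
            "PYSPARK_PYTHON": "/usr/bin/python2.7",
            "PYSPARK_DRIVER_PYTHON": "/usr/bin/python2.7",
        },
    }
    print(json.dumps(spec, indent=2))   # candidate contents for kernel.json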

No module named pyspark

2016-11-08 Thread Ian.Maloney
Hi, I recently switched from using Toree with a local spark setup to using it with a yarn client setup. It seems like this may have caused an issue with pyspark. Now when I use anything from MLlib, I get this: Error from python worker: /app/hdp_app/anaconda/bin/python: No module named pyspark
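
In yarn-client mode this error usually means the Python launched on the executors cannot import the pyspark/py4j modules. A hedged experiment is to put Spark's Python libraries on the executors' PYTHONPATH via the standard spark.executorEnv.* property (the paths and py4j version below are illustrative), and to sanity-check what the driver itself sees:

    # Executor side is set through Spark configuration, for example:
    #   --conf spark.executorEnv.PYTHONPATH=/usr/hdp/current/spark-client/python:/usr/hdp/current/spark-client/python/lib/py4j-0.9-src.zip
    import os
    print(os.environ.get("SPARK_HOME"))    # driver-side view
    print(os.environ.get("PYTHONPATH"))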

Re: matplotlib in Toree

2016-11-07 Thread Ian.Maloney
I've been having some issues with this logic. Once I get to sizable datasets (~40,000), it no longer works, while with smaller ones (~1,000) it works. I'm thinking the HTML becomes too much, since this works fine using seaborn/matplotlib alone: %%pyspark # seaborn stuff sns.set(style='darkgrid') s
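
One hedged workaround for the large-dataset case: render the figure to a PNG and hand Toree a small base64 <img> tag rather than a big inline HTML payload. This assumes a pyspark DataFrame named df with columns x and y, and uses the kernel.display().html(...) call reported working in the message below:

    import base64, io
    import matplotlib
    matplotlib.use("Agg")                     # headless backend, must precede pyplot
    import matplotlib.pyplot as plt

    pdf = df.sample(False, 0.05).toPandas()   # down-sample before plotting
    fig, ax = plt.subplots()
    ax.scatter(pdf["x"], pdf["y"], s=2)
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    img = base64.b64encode(buf.getvalue()).decode("ascii")
    kernel.display().html('<img src="data:image/png;base64,%s"/>' % img)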

Re: matplotlib in Toree

2016-11-07 Thread Ian.Maloney
Thanks Marius! This ended up working in python: kernel.display().html("some html") On 11/2/16, 7:46 PM, "Marius van Niekerk" wrote: >So matplotlib isn't very well supported at the moment. Toree's pyspark >kernel is not ipykernel -- so the magics that work with that (and >matplotlib int
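
As a small usage note on the same call, any HTML fragment can be rendered this way, for instance a pandas table built from a small Spark summary (a sketch that assumes a pyspark DataFrame named df):

    %%pyspark
    summary = df.describe().toPandas()         # keep the HTML payload small
    kernel.display().html(summary.to_html())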

matplotlib in Toree

2016-11-02 Thread Ian.Maloney
Hi, I just noticed that simple plots with matplotlib do not work in Toree. I get this error in the UI: Magic pyspark failed to execute with error: null was reset! In the logs I see: 16/11/02 14:12:15 ERROR PySparkProcessHandler: null process failed: org.apache.commons.exec.ExecuteException:

Re: Share Object across interpreters

2016-11-02 Thread Ian.Maloney
Chip, you’re right, this did the trick: %%pyspark print kernel.data().get("x") Thanks so much for the help! On 11/2/16, 1:26 PM, "Chip Senkbeil" wrote: >I just did that using the RC3 version of Toree for the 0.1.x branch. If >you're on master, maybe it doesn't require _jvm_kernel. I ju

Re: Share Object across interpreters

2016-11-02 Thread Ian.Maloney
That is not working for me in the release I have 0.1.0… %%pyspark print dir(kernel._jvm_kernel) ['__call__', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr

Re: Share Object across interpreters

2016-11-02 Thread Ian.Maloney
Thanks Chip, now I understand how to work with it from the JVM side. Any chance you have a snippet of how to get a value from the map in python? Ian Maloney Platform Architect Advanced Analytics Internal: 828716 Office: (734) 623-8716 Mobile: (313) 910-9272 On 11/2/16, 11:39 AM, "Chip Sen

Share Object across interpreters

2016-11-02 Thread Ian.Maloney
Hi, I’m working primarily using the default scala/spark interpreter. It works great, except when I need to plot something. Is there a way I can take a scala object or spark data frame I’ve created in a scala cell and pass it off to a pyspark cell for plotting? This documentation issue, might b
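
For reference, the approach this thread converges on is the shared kernel.data map (see the kernel.data().get(...) snippet above). A hedged round-trip sketch, with the Scala side shown as comments and under the assumption that the same map is exposed as kernel.data in the Scala interpreter:

    # In a plain Scala cell (default interpreter):
    #     kernel.data.put("x", 42)
    # Then, in a Python cell:
    %%pyspark
    print kernel.data().get("x")   # prints 42 (Python 2 print, matching the thread)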