Re: Spark for client

2016-03-01 Thread Mohannad Ali
Jupyter (http://jupyter.org/) also supports Spark, and generally it's a beast that allows you to do so much more. On Mar 1, 2016 00:25, "Mich Talebzadeh" wrote: > Thank you very much, both. > > Zeppelin looks promising. Basically, as I understand it, it runs an agent on a > given port
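
A minimal sketch of attaching PySpark to a Jupyter notebook via the findspark helper (findspark, SPARK_HOME, and the app name are assumptions, not details from the thread):

    # In a notebook cell: make the local Spark install importable,
    # then create a SparkContext as usual.
    import findspark
    findspark.init()  # locates Spark via SPARK_HOME and patches sys.path

    from pyspark import SparkContext

    sc = SparkContext(appName="notebook-demo")
    print(sc.parallelize(range(10)).sum())  # smoke test: prints 45
    sc.stop()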

Re: Support virtualenv in PySpark

2016-03-01 Thread Mohannad Ali
Hello Jeff, Well, this would also mean that you have to manage the same virtualenv (same path) on all nodes and install your packages into it the same way you would install them into the default Python path. In any case, at the moment you can already do what you proposed by
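
A hedged sketch of that workaround: point the workers at an identical virtualenv path on every node before the SparkContext starts. The path is hypothetical, and the virtualenv must already exist at that path on each worker:

    import os

    # Must be set before the SparkContext is created; executors will
    # launch Python from this (hypothetical) virtualenv path.
    os.environ["PYSPARK_PYTHON"] = "/opt/venvs/myjob/bin/python"

    from pyspark import SparkConf, SparkContext

    sc = SparkContext(conf=SparkConf().setAppName("venv-demo"))
    # Tasks now run under the virtualenv interpreter on each node.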

Re: Using functional programming rather than SQL

2016-02-24 Thread Mohannad Ali
e (assuming datasource is Hive or > some RDBMS) > > Regards > Sab > On 24-Feb-2016 11:49 pm, "Mohannad Ali" <man...@gmail.com> wrote: > >> That is incorrect. HiveContext does not need a Hive instance to run. >> On Feb 24, 2016 19:15,

Re: Using functional programming rather than SQL

2016-02-24 Thread Mohannad Ali
That is incorrect. HiveContext does not need a Hive instance to run. On Feb 24, 2016 19:15, "Sabarish Sasidharan" <sabarish.sasidha...@manthan.com> wrote: > Yes > > Regards > Sab > On 24-Feb-2016 9:15 pm, "Koert Kuipers" wrote: > >> are you saying that HiveContext.sql(...)
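
To illustrate the claim, a hedged Spark 1.x sketch: HiveContext comes up without any external Hive installation, creating a local embedded metastore on first use (the data and table name here are made up):

    from pyspark import SparkContext
    from pyspark.sql import HiveContext

    sc = SparkContext(appName="hivecontext-demo")
    sqlContext = HiveContext(sc)  # no Hive server or metastore needed

    df = sqlContext.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
    df.registerTempTable("t")
    sqlContext.sql("SELECT COUNT(*) FROM t").show()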

Re: Spark Job Hanging on Join

2016-02-23 Thread Mohannad Ali
f auto > > broadcast and go with CartesianProduct in 1.6 > > > > On Mon, Feb 22, 2016 at 1:45 AM, Mohannad Ali <man...@gmail.com> wrote: > >> Hello everyone, > >> > >> I'm working with Tamara and I wanted to give you guys an update on the
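
A hedged sketch of the suggestion being quoted: turning off the automatic broadcast join (threshold -1) so the Spark 1.6 planner can fall back to another strategy such as CartesianProduct:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="join-demo")
    sqlContext = SQLContext(sc)

    # -1 disables auto-broadcast; the planner then picks a different
    # join strategy (e.g. CartesianProduct for non-equi joins in 1.6).
    sqlContext.setConf("spark.sql.autoBroadcastJoinThreshold", "-1")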

Re: spark.driver.maxResultSize doesn't work in conf-file

2016-02-22 Thread Mohannad Ali
In spark-defaults.conf you put values like "spark.driver.maxResultSize 0" instead of "spark.driver.maxResultSize=0", I think. On Sat, Feb 20, 2016 at 3:40 PM, AlexModestov wrote: > I have the line spark.driver.maxResultSize=0 in spark-defaults.conf. > But I get
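
To make the difference concrete, a hedged sketch: the conf-file form separates key and value with whitespace, and the same setting can also be made programmatically (0 means no limit on the collected result size):

    # spark-defaults.conf expects whitespace, not '=':
    #
    #   spark.driver.maxResultSize   0
    #
    # Programmatic equivalent:
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().set("spark.driver.maxResultSize", "0")
    sc = SparkContext(conf=conf)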

Re: Spark Job Hanging on Join

2016-02-22 Thread Mohannad Ali
Hello everyone, I'm working with Tamara and I wanted to give you guys an update on the issue: 1. Here is the output of .explain(): > Project >
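
For readers following along, a hedged sketch of how such plan output is produced: .explain() prints the physical plan of a DataFrame. The toy DataFrames are stand-ins for the real ones in the thread:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="explain-demo")
    sqlContext = SQLContext(sc)

    a = sqlContext.createDataFrame([(1, "x")], ["id", "v1"])
    b = sqlContext.createDataFrame([(1, "y")], ["id", "v2"])
    a.join(b, "id").explain()  # prints the physical plan (Project, Join, ...)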

Re: [Example] : read custom schema from file

2016-02-22 Thread Mohannad Ali
Hello Divya, What kind of file? Best Regards, Mohannad On Mon, Feb 22, 2016 at 8:40 AM, Divya Gehlot wrote: > Hi, > Can anybody help me by providing an example of how we can read the schema of the > data set from a file? > > Thanks, > Divya >
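
Since the thread never settles on a file format, here is just one hedged option: persist a DataFrame schema as JSON and rebuild it later with StructType.fromJson. The file name and usage line are hypothetical:

    import json

    from pyspark.sql.types import StructType

    # Assume schema.json holds the output of df.schema.jsonValue().
    with open("schema.json") as f:
        schema = StructType.fromJson(json.load(f))

    # The rebuilt schema can then be handed to a reader, e.g.:
    #   sqlContext.read.schema(schema).json("data.json")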

Re: Submit custom python packages from current project

2016-02-16 Thread Mohannad Ali
n-client /path/to/project/main_script.py > > Regards, > Ram > > On 16 February 2016 at 15:33, Mohannad Ali <man...@gmail.com> wrote: > >> Hello Everyone, >> >> I have code inside my project organized in packages and modules; however, >> I keep getting the

Re: reading spark dataframe in python

2016-02-16 Thread Mohannad Ali
I think you need to consider using something like this: http://sparklingpandas.com/ On Tue, Feb 16, 2016 at 10:59 AM, Devesh Raj Singh wrote: > Hi, > > I want to read a Spark DataFrame using Python and then convert the Spark > DataFrame to a pandas DataFrame, then convert
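
Worth noting alongside SparklingPandas: plain PySpark already has a direct conversion. A hedged sketch (it collects the whole DataFrame to the driver, so it only suits small results):

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="topandas-demo")
    sqlContext = SQLContext(sc)

    sdf = sqlContext.createDataFrame([(1, "a"), (2, "b")], ["id", "val"])
    pdf = sdf.toPandas()  # pandas.DataFrame materialized on the driver
    print(pdf.head())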

Submit custom python packages from current project

2016-02-16 Thread Mohannad Ali
Hello Everyone, I have code inside my project organized in packages and modules; however, I keep getting the error "ImportError: No module named " when I run Spark on YARN. My directory structure is something like this:

    project/
        package/
            module.py
            __init__.py
        bin/
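
A hedged sketch of the usual fix for this kind of ImportError: ship the package to the executors as a zip. All names mirror the hypothetical layout above; the zip would be built beforehand, e.g. (cd project && zip -r package.zip package):

    from pyspark import SparkContext

    sc = SparkContext(appName="pyfiles-demo")
    sc.addPyFile("package.zip")  # shipped to executors and added to sys.path

    from package import module  # hypothetical names; now importable cluster-wide

    # Equivalently, pass the zip at submit time:
    #   spark-submit --py-files package.zip bin/main_script.py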