Contributing to Spark in GSoC 2017

2016-11-09 Thread Krishna Kalyan
Hello,
I am Krishna, a second-year master's student (MSc in Data Mining)
studying at the Universitat Politècnica de Catalunya in Barcelona.
I know it's a little early for GSoC, but I wanted to get a head start
working with the Spark community.
Is there anyone who will be mentoring for GSoC 2017?
Could anyone please guide me on how to go about it?

Related Experience:
My master's is mostly focused on data mining and machine learning
techniques. Before my master's, I was a data engineer with IBM (India),
responsible for managing a 50-node Hadoop cluster for more than a year.
Most of my time was spent writing and optimising ETL (Apache Pig) jobs. Our
daily batch job aggregated more than 30 GB of CDR and weblog data in our cluster.

I am most comfortable with Python and R. (I am not a Scala expert, but I am
sure I can pick it up quickly.)

My CV can be viewed at the link below.
(https://github.com/krishnakalyan3/Resume/raw/master/Resume.pdf)

My Spark pull requests:
(https://github.com/apache/spark/pulls?utf8=%E2%9C%93=is%3Apr%20author%3Akrishnakalyan3%20)

Thank you so much,
Krishna


Re: Running Unit Tests in pyspark failure

2016-11-03 Thread Krishna Kalyan
I resolved this by passing the argument below:
 ./python/run-tests --python-executables=python2.7
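For context, the likely reason the earlier `pip install` did not help is that pip installs into the site-packages of whichever interpreter runs it; here pip belonged to Python 2.7 while run-tests defaulted to python2.6. A minimal sketch (not from the thread) for checking which interpreter and site-packages directory a given Python actually uses:

```python
import sys
import sysconfig

# Print the interpreter path and its site-packages ("purelib") directory.
# If pip reports a package under a *different* interpreter's site-packages,
# the test runner using this interpreter will not see that package.
print(sys.executable)
print(sysconfig.get_paths()["purelib"])
```

Running this with each interpreter (e.g. `python2.6 check.py` vs `python2.7 check.py`) would show whether the two Pythons share a site-packages directory.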

Thanks,
Krishna

On Thu, Nov 3, 2016 at 4:16 PM, Krishna Kalyan <krishnakaly...@gmail.com>
wrote:

> Hello,
> I am trying to run unit tests on pyspark.
>
> When I run the unit tests, I get the following errors:
> krishna@Krishna:~/Experiment/spark$ ./python/run-tests
> Running PySpark tests. Output is in /Users/krishna/Experiment/spark/python/unit-tests.log
> Will test against the following Python executables: ['python2.6']
> Will test the following Python modules: ['pyspark-core', 'pyspark-ml',
> 'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
> Please install unittest2 to test with Python 2.6 or earlier
> Had test failures in pyspark.sql.tests with python2.6; see logs.
>
> When I try to install unittest2, it says the requirement is already satisfied:
>
> krishna@Krishna:~/Experiment/spark$ sudo pip install --upgrade unittest2
> Password:
> Requirement already up-to-date: unittest2 in /usr/local/lib/python2.7/site-packages
> Requirement already up-to-date: argparse in /usr/local/lib/python2.7/site-packages (from unittest2)
> Requirement already up-to-date: six>=1.4 in /usr/local/lib/python2.7/site-packages (from unittest2)
> Requirement already up-to-date: traceback2 in /usr/local/lib/python2.7/site-packages (from unittest2)
> Requirement already up-to-date: linecache2 in /usr/local/lib/python2.7/site-packages (from traceback2->unittest2)
>
> Help!
>
> Thanks,
> Krishna


Running Unit Tests in pyspark failure

2016-11-03 Thread Krishna Kalyan
Hello,
I am trying to run unit tests on pyspark.

When I run the unit tests, I get the following errors:
krishna@Krishna:~/Experiment/spark$ ./python/run-tests
Running PySpark tests. Output is in /Users/krishna/Experiment/spark/python/unit-tests.log
Will test against the following Python executables: ['python2.6']
Will test the following Python modules: ['pyspark-core', 'pyspark-ml',
'pyspark-mllib', 'pyspark-sql', 'pyspark-streaming']
Please install unittest2 to test with Python 2.6 or earlier
Had test failures in pyspark.sql.tests with python2.6; see logs.

When I try to install unittest2, it says the requirement is already satisfied:

krishna@Krishna:~/Experiment/spark$ sudo pip install --upgrade unittest2
Password:
Requirement already up-to-date: unittest2 in /usr/local/lib/python2.7/site-packages
Requirement already up-to-date: argparse in /usr/local/lib/python2.7/site-packages (from unittest2)
Requirement already up-to-date: six>=1.4 in /usr/local/lib/python2.7/site-packages (from unittest2)
Requirement already up-to-date: traceback2 in /usr/local/lib/python2.7/site-packages (from unittest2)
Requirement already up-to-date: linecache2 in /usr/local/lib/python2.7/site-packages (from traceback2->unittest2)

Help!

Thanks,
Krishna


Contributing to PySpark

2016-10-18 Thread Krishna Kalyan
Hello,
I am a master's student. Could someone please let me know how to set up my
development environment to contribute to PySpark.
Questions I have:
a) Should I use IntelliJ IDEA or PyCharm?
b) How do I test my changes?

Regards,
Krishna