Re: Is it possible to pass additional parameters to a python function when used inside RDD.filter method?

2015-12-04 Thread Praveen Chundi
Passing a lambda function should work: my_rdd.filter(lambda x: myfunc(x, newparam)). Best regards, Praveen Chundi

On 04.12.2015 13:19, Abhishek Shivkumar wrote: Hi, I am using Spark with Python and I have a filter constraint as follows: my_rdd.filter(my_func), where my_func is a method I
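The suggestion above relies on the lambda closing over the extra parameter. A minimal sketch of that closure pattern in plain Python (using the built-in filter, so no Spark cluster is needed); the names myfunc and newparam mirror the thread, and the threshold logic is illustrative:

```python
# A predicate that needs an extra parameter beyond the element itself.
def myfunc(x, threshold):
    return x > threshold

newparam = 2
data = [1, 2, 3, 4]

# Equivalent in spirit to my_rdd.filter(lambda x: myfunc(x, newparam)):
# the lambda captures newparam from the enclosing scope, so filter still
# sees a one-argument function.
filtered = list(filter(lambda x: myfunc(x, newparam), data))
# → [3, 4]
```

The same closure technique works for RDD.filter because PySpark serializes the lambda together with the captured value and ships it to the executors.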

Re: merge 3 different types of RDDs in one

2015-12-01 Thread Praveen Chundi
cogroup could be useful to you, since all three are PairRDDs. https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.PairRDDFunctions Best Regards, Praveen

On 01.12.2015 10:47, Shams ul Haque wrote: Hi All, I have made 3 RDDs of 3 different datasets; all RDDs are
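To show what cogroup produces for three keyed datasets, here is a hedged sketch of its semantics in plain Python (no Spark): for each key, a tuple of the value lists from each dataset, empty where a dataset lacks the key. The sample data is illustrative.

```python
def cogroup(*datasets):
    """Emulate rdd1.cogroup(rdd2, rdd3) on lists of (key, value) pairs."""
    keys = set()
    for ds in datasets:
        keys.update(k for k, _ in ds)
    grouped = {}
    for k in keys:
        # One value list per input dataset, in order.
        grouped[k] = tuple([v for kk, v in ds if kk == k] for ds in datasets)
    return grouped

a = [("x", 1), ("y", 2)]
b = [("x", 10)]
c = [("y", 20), ("x", 30)]
result = cogroup(a, b, c)
# result["x"] == ([1], [10], [30]); result["y"] == ([2], [], [20])
```

In PySpark the call would be rdd_a.cogroup(rdd_b, rdd_c), which yields the same per-key grouping (as iterables rather than lists) and is the usual building block for merging several pair RDDs.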

Re: Distributing Python code packaged as tar balls

2015-11-17 Thread Praveen Chundi
Davies Liu wrote: Python does not support libraries as tar balls, so PySpark may also not support that.

On Wed, Nov 4, 2015 at 5:40 AM, Praveen Chundi <mail.chu...@gmail.com> wrote: Hi, Pyspark/spark-submit offers a --py-files flag to distribute Python code for execution. Currently(versi

Distributing Python code packaged as tar balls

2015-11-04 Thread Praveen Chundi
Hi, Pyspark/spark-submit offers a --py-files flag to distribute Python code for execution. Currently (version 1.5) only zip files seem to be supported; I have tried distributing tar balls unsuccessfully. Is it worth adding support for tar balls? Best regards, Praveen Chundi
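Since --py-files does accept zip archives, code that would otherwise be tar-balled can be repackaged as a zip before submitting. A minimal sketch using the standard-library zipfile module; the module name mylib and the file layout are illustrative, not from the thread:

```python
import os
import tempfile
import zipfile

# Create a throwaway module to stand in for the code to distribute.
workdir = tempfile.mkdtemp()
module_path = os.path.join(workdir, "mylib.py")
with open(module_path, "w") as f:
    f.write("def greet():\n    return 'hello'\n")

# Pack it as a zip, which --py-files accepts.
archive = os.path.join(workdir, "mylib.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.write(module_path, arcname="mylib.py")

# Then submit with, e.g.:
#   spark-submit --py-files mylib.zip my_job.py
# and `import mylib` works on the executors.
```

The arcname argument keeps the archive paths relative, so the modules sit at the zip root where Python's import machinery can find them.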