Re: PySpark working with Generators

2017-07-05 Thread Saatvik Shah
> Wouldn’t this work if you load the files in HDFS and let the partitions be equal to the amount of parallelism you want?
> Hey Ayan, ...

Re: PySpark working with Generators

2017-06-30 Thread Jörn Franke
> Wouldn’t this work if you load the files in HDFS and let the partitions be equal to the amount of parallelism you want?

Re: PySpark working with Generators

2017-06-30 Thread Saatvik Shah
Hey Ayan,

This isn't a typical text file - it's a proprietary data format for which a native Spark reader ...
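A minimal PySpark sketch of the generator-based approach being discussed, assuming a hypothetical read_records(path) generator that parses one file of the proprietary format and that every worker can open the paths directly (e.g. a shared filesystem). Spark serializes the reader function itself, not a generator object, so flatMap can consume the generator lazily on the executors:

    from pyspark import SparkContext

    sc = SparkContext(appName="generator-reader-sketch")

    # Hypothetical parser for the proprietary format: yields one record at a
    # time so a whole file is never held in memory.
    def read_records(path):
        with open(path, "rb") as f:                        # replace with the real parser
            for chunk in iter(lambda: f.read(4096), b""):
                yield chunk

    file_paths = ["/data/part-0001.bin", "/data/part-0002.bin"]   # example paths

    # Only the function read_records is pickled and shipped; each task calls it
    # on its paths and iterates the resulting generator lazily.
    records = sc.parallelize(file_paths, numSlices=len(file_paths)).flatMap(read_records)
    print(records.count())

The point of the sketch is that executor memory stays bounded by one record at a time, while the number of slices controls how many cores share the reading.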

RE: PySpark working with Generators

2017-06-29 Thread Mahesh Sawaiker
Wouldn’t this work if you load the files in HDFS and let the partitions be equal to the amount of parallelism you want?
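A rough sketch of this suggestion, assuming the files sit in HDFS and are small enough to be read whole (binaryFiles loads each file as a single byte string); the HDFS path, the target parallelism, and parse_bytes are all placeholders:

    from pyspark import SparkContext

    sc = SparkContext(appName="hdfs-partitions-sketch")

    target_parallelism = 64                  # e.g. roughly the number of cores to keep busy

    # binaryFiles yields (path, contents) pairs; minPartitions is only a hint,
    # so repartition explicitly to match the desired parallelism.
    raw = sc.binaryFiles("hdfs:///data/proprietary/", minPartitions=target_parallelism)
    raw = raw.repartition(target_parallelism)

    # Placeholder parser for the proprietary format.
    def parse_bytes(pair):
        path, data = pair
        yield (path, len(data))              # ...decode `data` into records here...

    records = raw.flatMap(parse_bytes)
    print(records.getNumPartitions(), records.count())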

Re: PySpark working with Generators

2017-06-29 Thread ayan guha
> ... more cores and a lower memory. I'm not sure how to tackle this since generators cannot be pickled and thus I'm not sure how to distribute the work of reading each file_path on the RDD?
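For context on the constraint quoted here: it is generator *objects* that cannot be pickled; the functions that create them can be serialized and shipped to executors (PySpark serializes closures with cloudpickle). A tiny local illustration of the difference:

    import pickle

    def numbers(n):
        for i in range(n):
            yield i

    gen = numbers(5)

    try:
        pickle.dumps(gen)                     # generator object: not picklable
    except TypeError as err:
        print("generator object:", err)

    print("generator function picklable:", len(pickle.dumps(numbers)) > 0)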

Re: PySpark working with Generators

2017-06-29 Thread Saatvik Shah
> ... I'd like to now do something similar but with the generator, so that I can work with more cores and a lower memory. I'm not sure how to tackle this since generators cannot be pickled and thus I'm not sure how to distribute the work of reading each file_path on the RDD?
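One common pattern for the trade-off described above (more cores, bounded memory) is to parallelize the list of file paths and stream records per partition with mapPartitions; a sketch under the same assumptions as the flatMap example further up (read_records is a hypothetical per-file generator and the paths are reachable from every worker):

    from pyspark import SparkContext

    sc = SparkContext(appName="mapPartitions-generator-sketch")

    def read_records(path):
        # hypothetical parser: yield records from one file of the proprietary format
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(4096), b""):
                yield chunk

    def read_partition(paths):
        # `paths` is an iterator over this partition's file paths; yielding keeps
        # at most one record in executor memory at a time
        for path in paths:
            for record in read_records(path):
                yield record

    file_paths = ["/data/f1.bin", "/data/f2.bin", "/data/f3.bin"]   # example
    rdd = sc.parallelize(file_paths, numSlices=3).mapPartitions(read_partition)
    print(rdd.count())

Compared with flatMap, mapPartitions hands the reader the whole iterator of paths, which helps when per-file setup (opening a connection, loading a codec) should happen once per task rather than once per element.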

Re: PySpark working with Generators

2017-06-29 Thread ayan guha

PySpark working with Generators

2017-06-29 Thread saatvikshah1994
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/PySpark-working-with-Generators-tp28810.html