*From:* Saatvik Shah [mailto:saatvikshah1...@gmail.com]
*Sent:* Friday, June 30, 2017 8:55 AM
*To:* ayan guha
*Cc:* user
*Subject:* Re: PySpark working with Generators

Hey Ayan,

This isn't a typical text file - it's a proprietary data format for which a
native Spark reader is not available.

ayan guha wrote:
> Wouldn't this work if you load the files in HDFS and let the partitions be
> equal to the amount of parallelism you want?
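The partition count is what controls the parallelism here. A small pure-Python sketch of roughly how `sc.parallelize(paths, numSlices=n)` slices a local list into contiguous partitions (the `/data/...` paths are hypothetical stand-ins for files already loaded into HDFS):

```python
def slice_work(items, parallelism):
    """Split `items` into `parallelism` contiguous slices, one per
    partition - roughly what Spark's parallelize(..., numSlices=n)
    does when distributing a local collection."""
    n = len(items)
    return [
        items[(i * n) // parallelism:((i + 1) * n) // parallelism]
        for i in range(parallelism)
    ]

# Hypothetical file paths standing in for files already in HDFS:
paths = [f"/data/part-{i:05d}.bin" for i in range(10)]
partitions = slice_work(paths, 4)  # 4-way parallelism -> 4 partitions

# In PySpark itself this is simply (sketch, assuming a SparkContext `sc`):
#   rdd = sc.parallelize(paths, numSlices=4)
```

Each slice then corresponds to one task, so choosing `numSlices` equal to the available cores gives one unit of work per core.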
Saatvik Shah wrote in the original post:

>> I'd like to now do something similar but with the generator, so that I can
>> work with more cores and a lower memory. I'm not sure how to tackle this
>> since generators cannot be pickled, and thus I'm not sure how to distribute
>> the work of reading each file_path on the RDD?
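A common workaround is to ship the generator *function* rather than a generator instance: the callable passed to `flatMap` pickles fine, and each executor then builds and drains its own generator locally when it calls the function on a path. A runnable sketch - the `read_records` reader and its line-based "format" are hypothetical stand-ins for the proprietary parser:

```python
import pickle
import tempfile

def read_records(path):
    """Hypothetical reader for the proprietary format: a generator
    function that lazily yields one record at a time from `path`."""
    with open(path) as f:
        for line in f:              # stand-in for the real record parser
            yield line.rstrip("\n")

# Local demonstration of why this sidesteps the pickling problem.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("rec1\nrec2\n")
    sample_path = tmp.name

try:                                # a generator *instance* is not picklable
    pickle.dumps(read_records(sample_path))
    generator_picklable = True
except TypeError:
    generator_picklable = False

# ...but the generator *function* is, so Spark can serialize it and ship
# it to the executors, where each one iterates its own generator locally:
function_picklable = pickle.loads(pickle.dumps(read_records)) is read_records
records = list(read_records(sample_path))

# In PySpark the pattern would look like (sketch, assuming a SparkContext
# `sc` and a Python list `paths` of input file paths):
#
#   rdd = sc.parallelize(paths, numSlices=len(paths))  # one partition per file
#   records = rdd.flatMap(read_records)            # generators built on workers
```

Because `flatMap` flattens whatever iterable the function returns, the records stream through the executors one at a time, which keeps per-worker memory low.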
Best Regards,
Ayan Guha

--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/PySpark-working-with-Generators-tp28810.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org