Check these links:

https://stackoverflow.com/questions/31610971/spark-repartition-vs-coalesce

https://medium.com/@mrpowers/managing-spark-partitions-with-coalesce-and-repartition-4050c57ad5c4
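The size-based math described in the reply below can be sketched in plain Python (a minimal sketch; the 100 MB target and the 1.5 GB example size are illustrative assumptions, and the commented PySpark calls show where the computed count would be used):

```python
import math

def target_file_count(total_output_bytes, target_file_bytes=100 * 1024 * 1024):
    """Number of output files needed so each is roughly target_file_bytes (100 MB default)."""
    return max(1, math.ceil(total_output_bytes / target_file_bytes))

# Example: 1.5 GB of output at ~100 MB per file -> 16 files
n = target_file_count(int(1.5 * 1024 ** 3))

# In the Spark job itself (PySpark sketch; df is the final DataFrame):
# df.coalesce(n).write.parquet(path)      # fewer partitions than now: no shuffle
# df.repartition(n).write.parquet(path)   # more partitions than now: full shuffle
```

coalesce only merges existing partitions (cheap, no shuffle) so it can only reduce the count; repartition triggers a full shuffle but works in both directions and rebalances skewed partitions.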



On Sun, May 5, 2019 at 11:48, hemant singh (<hemant2...@gmail.com>)
wrote:

> Based on the size of the output data, you can do the math on how many
> files you will need to produce 100 MB files. Once you have the number of
> files, you can use coalesce or repartition, depending on whether your job
> currently writes more or fewer output partitions.
>
> On Sun, 5 May 2019 at 2:21 PM, rajat kumar <kumar.rajat20...@gmail.com>
> wrote:
>
>> Hi All,
>> My Spark SQL job produces output using the default partitioning and
>> creates N files.
>> I want each file in the final result to be about 100 MB.
>>
>> How can I do that?
>>
>> thanks
>> rajat
>>
>>

-- 
Alonso Isidoro Roman
about.me/alonso.isidoro.roman
