Yes, I still see a large number of part files, and it is exactly the number I
set for spark.sql.shuffle.partitions.
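For context, spark.sql.shuffle.partitions only sets the partition count that shuffles (joins, aggregations) produce; the number of output files comes from the partitioning of the final DataFrame at write time. A minimal sketch of decoupling the two, assuming a DataFrame df and a target table target_table (both hypothetical names):

```scala
import org.apache.spark.sql.functions.sum

// Keep shuffle parallelism high for the transformations.
spark.conf.set("spark.sql.shuffle.partitions", "2000")

// Any aggregation or join here shuffles into 2000 partitions.
val transformed = df.groupBy("key").agg(sum("value"))

// Repartition just before the write so the output file count is
// decoupled from the shuffle setting: 20 files instead of 2000.
transformed
  .repartition(20)
  .write
  .mode("overwrite")
  .insertInto("target_table")
```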

Sent from my iPhone

> On Oct 17, 2017, at 2:32 PM, Michael Artz <michaelea...@gmail.com> wrote:
> 
> Have you tried caching it and then using coalesce? 
> 
> 
> 
>> On Oct 17, 2017 1:47 PM, "KhajaAsmath Mohammed" <mdkhajaasm...@gmail.com> 
>> wrote:
>> I tried repartition, but spark.sql.shuffle.partitions is taking precedence 
>> over repartition or coalesce. How can I get a smaller number of files with 
>> the same performance?
>> 
>>> On Fri, Oct 13, 2017 at 3:45 AM, Tushar Adeshara 
>>> <tushar_adesh...@persistent.com> wrote:
>>> You can also try coalesce as it will avoid full shuffle.
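>>> As a sketch of that suggestion (df and target_table are hypothetical 
>>> names): coalesce merges partitions without a full shuffle, but the 
>>> narrowing can propagate upstream and reduce the parallelism of the stage 
>>> that produces the data, which is one reason it is sometimes combined with 
>>> caching:
>>> 
>>> ```scala
>>> // coalesce(20) narrows 2000 partitions to 20 without a full shuffle.
>>> // Caveat: the smaller partition count can also apply to the preceding
>>> // stage, slowing the transformations themselves.
>>> df.coalesce(20)
>>>   .write
>>>   .mode("overwrite")
>>>   .insertInto("target_table")
>>> ```
>>> 
>>> If performance drops, repartition(20) instead forces one extra full 
>>> shuffle but keeps the upstream stages fully parallel.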
>>> 
>>> 
>>> Regards,
>>> Tushar Adeshara
>>> 
>>> Technical Specialist – Analytics Practice
>>> 
>>> Cell: +91-81490 04192
>>> 
>>> Persistent Systems Ltd. | Partners in Innovation | www.persistentsys.com
>>> 
>>> 
>>> From: KhajaAsmath Mohammed <mdkhajaasm...@gmail.com>
>>> Sent: 13 October 2017 09:35
>>> To: user @spark
>>> Subject: Spark - Partitions
>>>  
>>> Hi,
>>> 
>>> I am reading data with a Hive query and writing it back into Hive after 
>>> doing some transformations.
>>> 
>>> I changed the setting spark.sql.shuffle.partitions to 2000, and since then 
>>> the job completes fast, but the main problem is that I get 2000 files for 
>>> each partition, each about 10 MB in size.
>>> 
>>> Is there a way to get the same performance but write a smaller number of 
>>> files?
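>>> 
>>> (Rough sizing from the numbers above, as a sketch: 2000 files of ~10 MB is 
>>> about 20 GB per Hive partition, so targeting ~128 MB output files would 
>>> need far fewer partitions:)
>>> 
>>> ```scala
>>> val totalBytes      = 2000L * 10 * 1024 * 1024  // ~20 GB per partition
>>> val targetFileBytes = 128L * 1024 * 1024        // aim for ~128 MB files
>>> val numFiles = math.ceil(totalBytes.toDouble / targetFileBytes).toInt
>>> // numFiles == 157, so something like df.repartition(157) before the write
>>> ```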
>>> 
>>> I am trying repartition now but would like to know if there are any other 
>>> options.
>>> 
>>> Thanks,
>>> Asmath
>> 
