By partition, do you mean 14,000 files loaded in each batch session (say,
daily)?

Have you actually tested this?
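
For what it is worth, here is a minimal sketch of one way to split the insert into batches of 100 partitions. It only builds the SQL text, so it runs without a cluster. The table names (target_table, staging_table) and the partition column (part_col) are hypothetical placeholders, and it assumes a dynamic-partition insert where the partition column is the last column of the staging table and the partition values are plain strings. I have not tested this against your data.

```python
from typing import Iterator, List, Sequence


def chunked(values: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `values`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(values), size):
        yield list(values[start:start + size])


def batched_insert_statements(
    partition_values: Sequence[str],
    batch_size: int = 100,
    target: str = "target_table",      # hypothetical table name
    source: str = "staging_table",     # hypothetical table name
    partition_col: str = "part_col",   # hypothetical partition column
) -> List[str]:
    """Build one INSERT statement per batch of partition values."""
    statements = []
    for batch in chunked(partition_values, batch_size):
        for value in batch:
            # Keep the sketch simple: refuse values that would need escaping.
            if "'" in value or "\\" in value:
                raise ValueError("unsupported partition value: " + repr(value))
        in_list = ", ".join("'" + value + "'" for value in batch)
        statements.append(
            "INSERT INTO TABLE " + target
            + " PARTITION (" + partition_col + ") "
            + "SELECT * FROM " + source
            + " WHERE " + partition_col + " IN (" + in_list + ")"
        )
    return statements


if __name__ == "__main__":
    values = ["p{:05d}".format(i) for i in range(14000)]
    statements = batched_insert_statements(values)
    print(len(statements), "statements")
    # Each statement would then be submitted in turn, for example with
    # sqlContext.sql(statement) from a Spark application (not run here).
```

Each statement touches at most 100 partitions, so only that slice of the data is written per query. Whether this actually helps with the memory issue depends on how the data is shuffled before the write.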

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 22 May 2016 at 20:24, swetha kasireddy <swethakasire...@gmail.com> wrote:

> The data is not very big: say 1 MB to 10 MB at most per partition. What is
> the best way to insert these 14k partitions with decent performance?
>
> On Sun, May 22, 2016 at 12:18 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> The acid question is: how many rows are you going to insert in a batch
>> session? By the way, if this is purely an SQL operation, then you can do all
>> that in Hive running on the Spark engine. It will be very fast as well.
>>
>> On 22 May 2016 at 20:14, Jörn Franke <jornfra...@gmail.com> wrote:
>>
>>> 14000 partitions seem to be way too many to be performant (except for
>>> large data sets). How much data does one partition contain?
>>>
>>> > On 22 May 2016, at 09:34, SRK <swethakasire...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > In my Spark SQL query to insert data, I have around 14,000 partitions of
>>> > data, which seems to be causing memory issues. How can I insert the data
>>> > for 100 partitions at a time to avoid any memory issues?
>>> >
>>> >
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: user-h...@spark.apache.org
>>> >
>>
>
