By partition, do you mean 14,000 files loaded in each batch session (say, daily)?
Have you actually tested this?

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

On 22 May 2016 at 20:24, swetha kasireddy <swethakasire...@gmail.com> wrote:

> The data is not very big. Say 1MB-10 MB at the max per partition. What is
> the best way to insert this 14k partitions with decent performance?
>
> On Sun, May 22, 2016 at 12:18 PM, Mich Talebzadeh <
> mich.talebza...@gmail.com> wrote:
>
>> the acid question is how many rows are you going to insert in a batch
>> session? btw if this is purely an sql operation then you can do all that in
>> hive running on spark engine. It will be very fast as well.
>>
>> On 22 May 2016 at 20:14, Jörn Franke <jornfra...@gmail.com> wrote:
>>
>>> 14000 partitions seem to be way too many to be performant (except for
>>> large data sets). How much data does one partition contain?
>>>
>>> > On 22 May 2016, at 09:34, SRK <swethakasire...@gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > In my Spark SQL query to insert data, I have around 14,000 partitions of
>>> > data which seems to be causing memory issues. How can I insert the data for
>>> > 100 partitions at a time to avoid any memory issues?
>>> >
>>> > --
>>> > View this message in context:
>>> > http://apache-spark-user-list.1001560.n3.nabble.com/How-to-insert-data-for-100-partitions-at-a-time-using-Spark-SQL-tp26997.html
>>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: user-h...@spark.apache.org
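
For reference, one way to approach the original question (inserting 100 partitions at a time) is to split the list of partition values into groups of 100 and issue one INSERT per group. This is only a rough sketch, not something tested against the poster's tables. The names target_table, source_table and part_col are placeholders, and it assumes a Hive-style dynamic-partition insert where the partition column is the last column returned by the SELECT (which normally also needs hive.exec.dynamic.partition.mode=nonstrict). The code only builds the SQL strings; each one would then be passed to whatever SQL entry point you are using.

```python
from typing import Iterable, Iterator, List


def batches(values: Iterable, size: int = 100) -> Iterator[List]:
    """Yield consecutive chunks holding at most `size` values each."""
    chunk: List = []
    for value in values:
        chunk.append(value)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


def insert_statements(
    partition_values: Iterable[str],
    target: str = "target_table",   # placeholder name
    source: str = "source_table",   # placeholder name
    column: str = "part_col",       # placeholder partition column
    size: int = 100,
) -> Iterator[str]:
    """Build one INSERT statement per batch of partition values."""
    for chunk in batches(partition_values, size):
        # Quote each value as a SQL string literal, doubling single quotes.
        in_list = ", ".join("'{}'".format(v.replace("'", "''")) for v in chunk)
        yield (
            f"INSERT INTO TABLE {target} PARTITION ({column}) "
            f"SELECT * FROM {source} WHERE {column} IN ({in_list})"
        )


if __name__ == "__main__":
    # 14,000 made-up partition values, just to show the batching.
    values = [f"p{i}" for i in range(14000)]
    statements = list(insert_statements(values))
    print(len(statements), "statements")
    print(statements[0][:120], "...")
```

Each statement touches at most 100 partitions, so fewer partition writers should be open at any one time. Whether that actually resolves the memory problem would need to be tested on the real data.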