What is the end database? Have you checked the performance of your query at
the target?
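
For reference, a minimal sketch of a tuned write from the Spark side
(assuming a PostgreSQL target; the URL, table name, credentials, batch size
and partition count below are all hypothetical placeholders to experiment
with, not a definitive setup):

    import java.util.Properties
    import org.apache.spark.sql.{DataFrame, SaveMode}

    // Hypothetical connection details; adjust for the actual target.
    def writeTuned(df: DataFrame): Unit = {
      val props = new Properties()
      props.setProperty("user", "etl_user")        // hypothetical credentials
      props.setProperty("password", "secret")
      props.setProperty("batchsize", "10000")      // rows per JDBC batch (Spark's default is 1000)

      // Each partition opens its own JDBC connection in JdbcUtils.savePartition,
      // so the partition count controls insert parallelism at the target.
      df.repartition(8)
        .write
        .mode(SaveMode.Append)
        .jdbc("jdbc:postgresql://dbhost:5432/target", "public.target_table", props)
    }

Raising the batch size and varying the partition count is usually the first
thing to try; too many partitions can also overwhelm the target with
concurrent connections.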

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 19 April 2016 at 23:14, Jonathan Gray <jonny.g...@gmail.com> wrote:

> Hi,
>
> I'm trying to write ~60 million rows from a DataFrame to a database over
> JDBC with Spark 1.6.1, using something similar to df.write().jdbc(...).
>
> The write does not seem to be performing well.  Profiling the application
> with a master of local[*], there appears to be little socket write activity
> and also little CPU usage.
>
> I would expect there to be an almost continuous block of socket write
> activity showing up somewhere in the profile.
>
> I can see that the top hot method involves
> org.apache.spark.unsafe.Platform.copyMemory, called entirely from within
> JdbcUtils.savePartition(...).  However, the CPU doesn't seem particularly
> stressed, so I'm guessing this isn't the cause of the problem.
>
> Are there any best practices, or has anyone come across a case like this
> before, where a write to a database performs poorly?
>
> Thanks,
> Jon
>
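
On the question at the top of this reply (checking performance at the
target), one way to isolate the database from Spark is a plain-JDBC
micro-benchmark that batches inserts directly. A minimal sketch, again
assuming a PostgreSQL target with hypothetical URL, table, and credentials;
if raw batched inserts are also slow here, the bottleneck is the database
rather than the Spark job:

    import java.sql.DriverManager

    object JdbcBaseline {
      def main(args: Array[String]): Unit = {
        // Hypothetical connection details; adjust for the actual target.
        val conn = DriverManager.getConnection(
          "jdbc:postgresql://dbhost:5432/target", "etl_user", "secret")
        conn.setAutoCommit(false)
        val stmt = conn.prepareStatement(
          "INSERT INTO public.target_table (id, payload) VALUES (?, ?)")

        val rows = 100000
        val start = System.nanoTime()
        for (i <- 1 to rows) {
          stmt.setInt(1, i)
          stmt.setString(2, s"row-$i")
          stmt.addBatch()
          if (i % 10000 == 0) stmt.executeBatch()  // flush every 10k rows
        }
        stmt.executeBatch()
        conn.commit()
        val secs = (System.nanoTime() - start) / 1e9
        println(f"$rows%d rows in $secs%.1f s (${rows / secs}%.0f rows/s)")
        stmt.close()
        conn.close()
      }
    }

Comparing the rows/s figure here against the Spark job's effective
throughput shows whether the slowness lives in Spark or at the target.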
