Hi,

How about trying to increase the 'batchsize' option?
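
For example (a minimal sketch; the JDBC URL, table name, and credentials
are placeholders), 'batchsize' can be passed through the connection
properties and, if I remember correctly, defaults to 1000 rows per batch
in Spark 1.6:

    import java.util.Properties;

    Properties props = new Properties();
    props.setProperty("user", "dbuser");       // placeholder credentials
    props.setProperty("password", "dbpass");
    // rows per JDBC batch insert; try something larger than the default
    props.setProperty("batchsize", "10000");

    // df is the DataFrame from your mail; URL and table are placeholders
    df.write().jdbc("jdbc:postgresql://dbhost:5432/mydb", "target_table", props);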

On Wed, Apr 20, 2016 at 7:14 AM, Jonathan Gray <jonny.g...@gmail.com> wrote:

> Hi,
>
> I'm trying to write ~60 million rows from a DataFrame to a database over
> JDBC with Spark 1.6.1, using something similar to df.write().jdbc(...)
>
> The write does not seem to be performing well.  Profiling the application
> with a master of local[*], it appears there is not much socket write
> activity and not much CPU usage either.
>
> I would expect there to be an almost continuous block of socket write
> activity showing up somewhere in the profile.
>
> I can see that the top hot method involves
> org.apache.spark.unsafe.Platform.copyMemory, all from calls within
> JdbcUtils.savePartition(...).  However, the CPU doesn't seem particularly
> stressed, so I'm guessing this isn't the cause of the problem.
>
> Are there any best practices, or has anyone come across a case like this
> before where a write to a database performs poorly?
>
> Thanks,
> Jon
>



-- 
---
Takeshi Yamamuro