Hi,

How about trying to increase the `batchsize` JDBC option?
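For example, something like the sketch below (the URL, table name, and credentials are placeholders; IIRC the default `batchsize` is 1000, so larger values can cut per-row round-trip overhead):

```scala
import java.util.Properties
import org.apache.spark.sql.DataFrame

// Sketch: write a DataFrame over JDBC with a larger insert batch size.
// All connection details below are placeholders.
def writeWithBatchSize(df: DataFrame): Unit = {
  val props = new Properties()
  props.setProperty("user", "dbuser")      // placeholder
  props.setProperty("password", "dbpass")  // placeholder
  props.setProperty("batchsize", "10000")  // default is 1000; try larger values

  df.write.jdbc("jdbc:postgresql://dbhost:5432/mydb", "target_table", props)
}
```

You may need to experiment to find a good value for your database and row size.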
On Wed, Apr 20, 2016 at 7:14 AM, Jonathan Gray <jonny.g...@gmail.com> wrote:
> Hi,
>
> I'm trying to write ~60 million rows from a DataFrame to a database using
> JDBC with Spark 1.6.1, something similar to df.write().jdbc(...)
>
> The write does not seem to be performing well. Profiling the application
> with a master of local[*], there appears to be very little socket write
> activity and also not much CPU usage.
>
> I would expect an almost continuous block of socket write activity to
> show up somewhere in the profile.
>
> I can see that the top hot method involves
> apache.spark.unsafe.platform.CopyMemory, all from calls within
> JdbcUtils.savePartition(...). However, the CPU doesn't seem particularly
> stressed, so I'm guessing this isn't the cause of the problem.
>
> Are there any best practices, or has anyone come across a case like this
> before, where a write to a database seems to perform poorly?
>
> Thanks,
> Jon

--
---
Takeshi Yamamuro