Well, it could also depend on the receiving database. You should also check the 
executors. Updating to the latest version of the JDBC driver, and to JDK 8 if 
the driver supports it, could help.
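
For what it's worth, here's a rough sketch of the kind of tuning I mean, written 
against the Spark 1.6 Java API. The JDBC URL, credentials, table name, partition 
count and the rewriteBatchedStatements flag are placeholders/assumptions on my 
part; that last flag is MySQL-specific, so substitute whatever batching option 
your driver actually supports.

import java.util.Properties;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class JdbcWriteSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("jdbc-write-sketch");
    JavaSparkContext sc = new JavaSparkContext(conf);
    SQLContext sqlContext = new SQLContext(sc);

    // Placeholder source; in your case this is the ~60 million row DataFrame.
    DataFrame df = sqlContext.read().parquet("/path/to/source");

    Properties props = new Properties();
    props.put("user", "app_user");       // placeholder credentials
    props.put("password", "secret");
    // Driver-specific batching flag; only meaningful if the target
    // database's JDBC driver supports it (this one is MySQL's).
    props.put("rewriteBatchedStatements", "true");

    // More partitions -> more concurrent JDBC connections doing inserts.
    // Pick a number the receiving database can actually absorb.
    df.repartition(32)
      .write()
      .jdbc("jdbc:mysql://dbhost:3306/mydb", "target_table", props);

    sc.stop();
  }
}

The gist is that more partitions mean more concurrent connections inserting in 
parallel, and any statement batching happens inside the driver, so it's worth 
confirming what your driver and database combination actually supports.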

> On 20 Apr 2016, at 00:14, Jonathan Gray <jonny.g...@gmail.com> wrote:
> 
> Hi,
> 
> I'm trying to write ~60 million rows from a DataFrame to a database over 
> JDBC with Spark 1.6.1, using something similar to df.write().jdbc(...)
> 
> The write does not seem to be performing well.  Profiling the application 
> with a master of local[*], there appears to be not much socket write activity 
> and also not much CPU usage.
> 
> I would expect there to be an almost continuous block of socket write 
> activity showing up somewhere in the profile.
> 
> I can see that the top hot method involves 
> apache.spark.unsafe.platform.CopyMemory, all from calls within 
> JdbcUtils.savePartition(...).  However, the CPU doesn't seem particularly 
> stressed, so I'm guessing this isn't the cause of the problem.
> 
> Are there any best practices, or has anyone come across a case like this 
> before where a write to a database seems to perform poorly?
> 
> Thanks,
> Jon
