Sorry for accidentally sending that message mid-way. How about trying to increase the `batchsize` JDBC option to improve performance?
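For reference, a minimal sketch of what that might look like (the URL, table name, and credentials below are placeholders; `batchsize` is documented as a JDBC writer option in later Spark versions, so check whether your Spark version honors it):

```scala
// Sketch only, not from the original thread: connection details are placeholders.
import java.util.Properties

val props = new Properties()
props.setProperty("user", "spark")        // placeholder credentials
props.setProperty("password", "secret")
// "batchsize" controls how many rows are grouped into each JDBC batch insert;
// larger batches can reduce per-round-trip overhead for bulk writes.
props.setProperty("batchsize", "10000")

df.write
  .mode("append")
  .jdbc("jdbc:postgresql://dbhost:5432/mydb", "target_table", props)
```

Whether the driver actually executes these as true multi-row batches also depends on the JDBC driver (e.g. PostgreSQL's `reWriteBatchedInserts=true`), so it is worth profiling again after changing the setting.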
// maropu

On Thu, Apr 21, 2016 at 2:15 PM, Takeshi Yamamuro <linguin....@gmail.com> wrote:
> Hi,
>
> How about trying to increate 'batchsize
>
> On Wed, Apr 20, 2016 at 7:14 AM, Jonathan Gray <jonny.g...@gmail.com> wrote:
>> Hi,
>>
>> I'm trying to write ~60 million rows from a DataFrame to a database using
>> JDBC using Spark 1.6.1, something similar to df.write().jdbc(...)
>>
>> The write seems to not be performing well. Profiling the application
>> with a master of local[*] it appears there is not much socket write
>> activity and also not much CPU.
>>
>> I would expect there to be an almost continuous block of socket write
>> activity showing up somewhere in the profile.
>>
>> I can see that the top hot method involves
>> apache.spark.unsafe.platform.CopyMemory all from calls within
>> JdbcUtils.savePartition(...). However, the CPU doesn't seem particularly
>> stressed so I'm guessing this isn't the cause of the problem.
>>
>> Are there any best practices, or has anyone come across a case like this
>> before where a write to a database seems to perform poorly?
>>
>> Thanks,
>> Jon
>
> --
> ---
> Takeshi Yamamuro

--
---
Takeshi Yamamuro