Re: Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-20 Thread fightf...@163.com
…cessfully. Do I need to increase the partitions? Or are there any other alternatives I can choose to tune this? Best, Sun. From: fightf...@163.com Date: 2016-01-20 15:06 To: 刘虓 CC: user Subject: Re: Re: spark dataframe jdbc read/write using dbcp connection pool Hi, Thanks a lot …

Re: Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-20 Thread 刘虓
…pass the numPartitions property for partitioning purposes. Is this what you recommend? Could you advise a little more on the implementation? Best, Sun. -- fightf...@163.com From: 刘虓 <ipf...@gmail.com> …

Re: Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-20 Thread fightf...@163.com
…Date: 2016-01-20 18:31 To: fightf...@163.com CC: user Subject: Re: Re: spark dataframe jdbc read/write using dbcp connection pool Hi, I think you can check the Spark job UI to find out whether the partitioning works or not; pay attention to the storage page for the partition sizes, and to which stage/task fails …

Re: Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-19 Thread fightf...@163.com
…4") The added_year column in the mysql table contains a range of (1985-2015), and I pass the numPartitions property for partitioning purposes. Is this what you recommend? Could you advise a little more on the implementation? Best, Sun. fightf...@163.com From: 刘虓 Date: 2016-01-20 11:26 …
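For context on how numPartitions interacts with a column range like 1985-2015: Spark's JDBC reader turns (partitionColumn, lowerBound, upperBound, numPartitions) into one WHERE predicate per partition, so each task reads a stride of the indexed column. The following is a minimal pure-Python sketch approximating that stride logic (it is an illustration, not Spark's actual Scala implementation; the function name is hypothetical):

```python
def column_partition(column, lower, upper, num_partitions):
    """Approximate the stride-based predicate generation Spark's JDBC source
    uses: split [lower, upper] into num_partitions WHERE clauses. The first
    partition has no lower clause and the last has no upper clause, so every
    row (even outside the stated bounds) lands in exactly one partition."""
    stride = upper // num_partitions - lower // num_partitions
    predicates = []
    current = lower
    for i in range(num_partitions):
        lower_clause = f"{column} >= {current}" if i != 0 else None
        current += stride
        upper_clause = f"{column} < {current}" if i != num_partitions - 1 else None
        if lower_clause and upper_clause:
            predicates.append(f"{lower_clause} AND {upper_clause}")
        else:
            predicates.append(lower_clause or upper_clause)
    return predicates

# e.g. the added_year column from this thread, split four ways
print(column_partition("added_year", 1985, 2015, 4))
```

Raising numPartitions here produces more, narrower predicates, i.e. more concurrent JDBC reads with fewer rows each, which is the usual first knob when a single oversized partition fails.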

Re: spark dataframe jdbc read/write using dbcp connection pool

2016-01-19 Thread 刘虓
Hi, I suggest you partition the JDBC read on an indexed column of the mysql table. 2016-01-20 10:11 GMT+08:00 fightf...@163.com: Hi, I want to load really large-volume datasets from mysql using the spark dataframe api, and then save as a parquet file or orc file to …
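On the connection-pool side of the thread subject: whether or not a pool like DBCP is used, the usual pattern is to open one connection per partition (e.g. via foreachPartition) and reuse it for every row, rather than connecting per record. A minimal pure-Python sketch of that per-partition pattern, using a stand-in connection class (FakeConnection and write_partition are illustrative names, not Spark or DBCP APIs):

```python
class FakeConnection:
    """Stand-in for a JDBC/DBCP connection; counts how often one is opened."""
    opened = 0

    def __init__(self):
        FakeConnection.opened += 1
        self.rows = []

    def execute(self, row):
        self.rows.append(row)

    def close(self):
        pass


def write_partition(rows, connect=FakeConnection):
    """The shape of a foreachPartition callback: open ONE connection for the
    whole partition, reuse it for every row, then close it."""
    conn = connect()
    try:
        for row in rows:
            conn.execute(row)
    finally:
        conn.close()


# one partition of five rows -> a single connection opened
write_partition(range(5))
```

With this shape, the number of connections scales with the number of partitions rather than the number of rows, which is why partition count is the tuning knob for both read parallelism and write-side connection pressure.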