Hi Deenar,
Thanks for your valuable inputs.
Here is a situation: what if a source table does not have any column (with
unique, numeric, and sequential values) that is suitable as the partition
column to be specified for the JdbcRDD constructor or the DataSource API?
How to proceed further in this scenario, and also
You have 2 options:
a) Don't use partitioning; if the table is small, Spark will use only one
task to load it:
val jdbcDF = sqlContext.read.format("jdbc").options(
Map("url" -> "jdbc:postgresql:dbserver",
"dbtable" -> "schema.tablename")).load()
b) Create a view that includes a hashcode column and use it as the partition
column; a sketch follows.
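A minimal sketch of option (b), assuming a PostgreSQL source. The view name,
key column, and hash expression (schema.tablename_hashed, some_key, hashtext)
are illustrative, and the partitioning options are those of Spark's JDBC data
source:
// Illustrative only: schema.tablename_hashed is a view created in the database,
// e.g. in PostgreSQL:
//   CREATE VIEW schema.tablename_hashed AS
//   SELECT t.*, abs(hashtext(t.some_key::text)) % 10 AS id_hash
//   FROM schema.tablename t;
// The view exposes a numeric id_hash column in the range 0..9 that can serve
// as the partition column.
val jdbcDF = sqlContext.read.format("jdbc").options(
  Map("url" -> "jdbc:postgresql:dbserver",
      "dbtable" -> "schema.tablename_hashed",
      "partitionColumn" -> "id_hash",
      "lowerBound" -> "0",
      "upperBound" -> "9",
      "numPartitions" -> "10")).load()
Each of the 10 tasks then reads only the rows whose id_hash falls in its
sub-range, so the load is spread without needing a naturally sequential key.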
Hi Deenar,
Please find the SQL query below:
var SQL_RDD = new JdbcRDD(
  sc,
  () => DriverManager.getConnection(url, user, pass),
  "select col1, col2, col3..col 37 from schema.Table LIMIT ? OFFSET ?",
  100,  // lowerBound
  0,    // upperBound
  1,    // numPartitions
  (r: ResultSet) => (r.getInt("col1"), r.getInt("col2")...r.getInt("col37")))
When I
On 24 September 2015 at 17:48, Deenar Toraskar <
deenar.toras...@thinkreactive.co.uk> wrote:
> you are interpreting the JdbcRDD API incorrectly. If you want to use
> partitions, then the column used for partitioning (and present in the WHERE
> clause) must be numeric, and the lower bound and upper bound
Which version of Spark are you using? I can get correct results using
JdbcRDD. In fact, there is a test suite precisely for this (JdbcRDDSuite).
I changed it according to your input and got correct results from this test
suite.
On Wed, Sep 23, 2015 at 11:00 AM, satish chandra j
I am using Spark 1.5. I always get count = 100, irrespective of the number of
partitions.
On Wed, Sep 23, 2015 at 5:00 PM, satish chandra j
wrote:
> Hi,
> I am currently using Spark 1.2.2. Could you please let me know the correct
> output count you got by using
Hi,
Could anybody provide inputs if they have come across a similar issue?
@Rishitesh, could you provide any sample code showing how to use JdbcRDDSuite?
Regards,
Satish Chandra
On Wed, Sep 23, 2015 at 5:14 PM, Rishitesh Mishra
wrote:
> I am using Spark 1.5. I always get count =
Hi,
I am currently using Spark 1.2.2. Could you please let me know the correct
output count you got by using JdbcRDDSuite?
Regards,
Satish Chandra
On Wed, Sep 23, 2015 at 4:02 PM, Rishitesh Mishra
wrote:
> Which version of Spark are you using? I can get
Satish,
Can you post the SQL query you are using?
The SQL query must have two placeholders, and both of them should form an
inclusive range (<= and >=),
e.g. select title, author from books where ? <= id and id <= ?
Are you doing this?
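For reference, a minimal sketch of a JdbcRDD call built around that kind of
query; the connection details and bounds are illustrative (url, user, and pass
stand in for your actual connection details, and sc is the SparkContext):
import java.sql.{DriverManager, ResultSet}
import org.apache.spark.rdd.JdbcRDD

// The two '?' placeholders receive the inclusive lower and upper id of each
// partition's sub-range of [1, 1000], split across 4 partitions.
val booksRDD = new JdbcRDD(
  sc,
  () => DriverManager.getConnection(url, user, pass),
  "select title, author from books where ? <= id and id <= ?",
  1,     // lowerBound
  1000,  // upperBound (illustrative)
  4,     // numPartitions
  (r: ResultSet) => (r.getString("title"), r.getString("author")))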
Deenar
On 23 September 2015 at 20:18, Deenar Toraskar <