Hello,
As is often the case, I have a big table sitting on a JDBC data source. I am
processing that data from Spark, but in order to speed up my analysis, I
first reduce the column encodings and minimize the overall size of the
data before the main processing.
Spark has been doing a great job of generating the workflows that do
that preprocessing for me, but it seems to schedule them for execution
on the Spark cluster. The issue with that is the large transfer cost
from the source is still incurred.
Is there any way to force Spark to push the preprocessing down to the
JDBC data source and return the prepared DataFrame instead?
Thanks,
Wanas
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org