I am working in an environment where the data is stored in MS SQL Server. The server has been secured so that only a specific set of machines can access the database, through a Microsoft JDBC connection using integrated security. We also have a couple of beefy Linux machines that we can use to host a Spark cluster, but those machines do not have direct access to the databases. How can I pull the data from the SQL database on the smaller development machine and then have it distributed to the Spark cluster for processing? Can the driver pull the data and then distribute execution?
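To make the question concrete, here is a rough Python sketch of what I have in mind, not something I have working. It assumes the driver runs on the development machine (the only one that can reach the database) and that the cluster machines can reach the driver over the network. The helper names (fetch_in_batches, distribute, demo_batches) and the patients table are made up for illustration, and an in-memory sqlite3 database stands in for the real SQL Server connection. The only Spark calls assumed are SparkContext.parallelize, SparkContext.union and SparkContext.emptyRDD.

```python
import sqlite3
from typing import Any, Iterator, List, Tuple

Row = Tuple[Any, ...]


def fetch_in_batches(conn, query: str, batch_size: int = 1000) -> Iterator[List[Row]]:
    """Yield rows from a DB-API 2.0 connection, batch_size rows at a time.

    Runs only on the driver, i.e. the machine allowed to reach the database.
    """
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def distribute(sc, batches):
    """Turn driver-side batches into a single RDD.

    `sc` is assumed to be a pyspark SparkContext. Every row passes through
    driver memory, so this only suits data the driver can actually hold.
    """
    rdds = [sc.parallelize(batch) for batch in batches]
    if not rdds:
        return sc.emptyRDD()
    return sc.union(rdds)


def demo_batches() -> List[List[Row]]:
    """Exercise fetch_in_batches against a stand-in in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE patients (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?)",
        [(i, f"p{i}") for i in range(5)],
    )
    query = "SELECT id, name FROM patients ORDER BY id"
    return list(fetch_in_batches(conn, query, batch_size=2))
```

With a real SparkContext, the idea would be something like distribute(sc, fetch_in_batches(conn, query)), after which the executors would work on the resulting RDD without ever touching the database themselves.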
Thanks,
Thomas Ginter
801-448-7676
thomas.gin...@utah.edu