If your use case is mainly to query the RDBMS and bring the results into
Spark for analysis, then the Spark SQL JDBC data source API
<http://www.sparkexpert.com/2015/03/28/loading-database-data-into-spark-using-data-sources-api/>
is the best fit. If your use case is to bring the entire dataset into Spark,
then you'll have to explore other options, which depend on the data source.
For example, the Spark Redshift integration
<http://spark-packages.org/package/databricks/spark-redshift>
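
A minimal sketch of the JDBC data source approach, assuming Spark 1.3.x
with the MySQL JDBC driver on the classpath; the host, database, table,
and credentials below are all placeholders:

```python
# Sketch: loading an RDBMS table into Spark via the JDBC data source.
# All connection details here are placeholder values for illustration.

def jdbc_url(host, db, user, password):
    """Build a MySQL JDBC URL with credentials embedded (placeholders)."""
    return "jdbc:mysql://%s/%s?user=%s&password=%s" % (host, db, user, password)

url = jdbc_url("dbhost:3306", "sales", "spark", "secret")
print(url)

# With a live SparkContext, the table can then be pulled into a DataFrame
# and queried with Spark SQL (requires a running Spark 1.3.x deployment):
#
#   from pyspark import SparkContext
#   from pyspark.sql import SQLContext
#
#   sc = SparkContext(appName="jdbc-demo")
#   sqlContext = SQLContext(sc)
#   df = sqlContext.load(source="jdbc", url=url, dbtable="people")
#   df.registerTempTable("people")
#   sqlContext.sql("SELECT COUNT(*) FROM people").show()
```

Note that `dbtable` can also be a parenthesized subquery, so filtering can
be pushed down to the database rather than pulling the whole table over.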

Best Regards,

Sujeevan. N

On Sat, Apr 25, 2015 at 4:24 PM, ayan guha <guha.a...@gmail.com> wrote:

> Actually, Spark SQL provides a data source. Here is from documentation -
>
> JDBC To Other Databases
>
> Spark SQL also includes a data source that can read data from other
> databases using JDBC. This functionality should be preferred over using
> JdbcRDD
> <https://spark.apache.org/docs/1.3.1/api/scala/index.html#org.apache.spark.rdd.JdbcRDD>.
> This is because the results are returned as a DataFrame and they can easily
> be processed in Spark SQL or joined with other data sources. The JDBC data
> source is also easier to use from Java or Python as it does not require the
> user to provide a ClassTag. (Note that this is different than the Spark SQL
> JDBC server, which allows other applications to run queries using Spark
> SQL).
>
> On Fri, Apr 24, 2015 at 6:27 PM, ayan guha <guha.a...@gmail.com> wrote:
>
>> What is the specific use case? I can think of a couple of ways (write to
>> HDFS and then read from Spark, or stream data to Spark). Also, I have seen
>> people using MySQL jars to bring data in. Essentially you want to simulate
>> creation of an RDD.
>> On 24 Apr 2015 18:15, "sequoiadb" <mailing-list-r...@sequoiadb.com>
>> wrote:
>>
>>> If I run Spark in standalone mode (not YARN mode), is there any tool
>>> like Sqoop that is able to transfer data from an RDBMS to Spark storage?
>>>
>>> Thanks
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>>
>
>
> --
> Best Regards,
> Ayan Guha
>
