https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/JdbcRDD.scala
On Thu, Aug 7, 2014 at 8:08 AM, 诺铁 <noty...@gmail.com> wrote:

> I haven't seen people write directly to a SQL database,
> mainly because it's difficult to deal with failure.
> What if the network breaks halfway through the process? Should we drop all
> data in the database and restart from the beginning? If the process is
> "appending" data to the database, then things become even more complex.
>
> But if this can be done, it would be a very good thing.
>
> On Wed, Aug 6, 2014 at 11:24 PM, Yana <yana.kadiy...@gmail.com> wrote:
>
>> Hi Vida,
>>
>> I am writing to a DB -- or trying to :).
>>
>> I believe the best practice for this (you can search the mailing list
>> archives) is to do a combination of mapPartitions and use a grouped
>> iterator.
>> Look at this thread, esp. the comment from A. Boisvert and Matei's comment
>> above it:
>> https://groups.google.com/forum/#!topic/spark-users/LUb7ZysYp2k
>>
>> Basically the short story is that you want to open as few connections as
>> possible but write more than 1 insert at a time.

--
Thomas Nieborowski
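
For reference, the pattern Yana describes in the quoted message (as few connections as possible, more than one insert at a time) could be sketched roughly as below. This is only an illustration under assumptions, not code from the thread: it is written in Python, the standard-library sqlite3 module stands in for the real database driver, and the names `batched`, `write_partition`, `db_path` and the `events` table are all hypothetical. Committing once per partition and using an idempotent insert is one possible answer to the failure concern raised above, not an established best practice from the list.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows from any iterator (a "grouped" iterator)."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def write_partition(rows, db_path, batch_size=500):
    """Write one partition: one connection, multi-row inserts, one transaction.

    Returns the number of rows written. `db_path` and the `events` table are
    hypothetical; sqlite3 is only a stand-in for the real database driver.
    """
    conn = sqlite3.connect(db_path)  # a single connection for the whole partition
    written = 0
    try:
        for batch in batched(rows, batch_size):
            # INSERT OR REPLACE keyed on id means a retried partition
            # overwrites its own rows instead of duplicating them.
            conn.executemany(
                "INSERT OR REPLACE INTO events (id, payload) VALUES (?, ?)",
                batch,
            )
            written += len(batch)
        conn.commit()  # all batches of this partition become visible together
    except Exception:
        conn.rollback()  # a failure part-way leaves no partial partition behind
        raise
    finally:
        conn.close()
    return written
```

With PySpark this would presumably be called as `rdd.foreachPartition(lambda rows: write_partition(rows, db_path))`, with the sqlite3 calls replaced by whatever driver the target database needs. In Scala the same shape is `rdd.foreachPartition` combined with the iterator's `grouped(n)`.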