Maybe a little off topic, but would you mind sharing your motivation for saving the RDD into an SQL DB?
If you're just trying to do further transformations/queries with SQL for convenience, you can use Spark SQL directly within your Spark application without saving anything to a DB:

    val sqlContext = new org.apache.spark.sql.SQLContext(sparkContext)
    import sqlContext._

    // First create a case class to describe your schema
    case class Record(fieldA: T1, fieldB: T2, …)

    // Transform RDD elements to Records and register it as a SQL table
    rdd.map(…).registerAsTable("myTable")

    // Torture them until they tell you the truth :)
    sql("SELECT fieldA FROM myTable WHERE fieldB > 10")

On Aug 6, 2014, at 11:29 AM, Vida Ha <vid...@gmail.com> wrote:

> Hi,
>
> I would like to save an RDD to a SQL database. It seems like this would be a
> common enough use case. Are there any built-in libraries to do it?
>
> Otherwise, I'm just planning on mapping my RDD, and having that call a method
> to write to the database. Given that a lot of records are going to be
> written, the code would need to be smart and do a batch insert after enough
> records have collected. Does that sound like a reasonable approach?
>
> -Vida

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
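As for the batched-insert approach Vida describes: the usual idiom is rdd.foreachPartition, so each partition opens one database connection and flushes records in batches (via PreparedStatement.addBatch/executeBatch in JDBC). Here is a minimal sketch of just the batching part in plain Scala — a plain Iterator stands in for one RDD partition, and insertBatch is a hypothetical stub for the real JDBC calls:

```scala
// Hypothetical sketch of per-partition batched inserts.
// In Spark this body would sit inside rdd.foreachPartition { partition => ... },
// with a JDBC connection opened once per partition.

val batchSize = 3  // tune to your database; a few hundred is common for JDBC

// Stub: a real version would call stmt.addBatch() per record,
// then stmt.executeBatch() once per group.
def insertBatch(batch: Seq[Int]): Unit =
  println(s"inserting batch of ${batch.size}: ${batch.mkString(",")}")

val partition: Iterator[Int] = (1 to 7).iterator  // stands in for one RDD partition

// Iterator.grouped collects up to batchSize records before each flush,
// so only one batch is held in memory at a time.
partition.grouped(batchSize).foreach(insertBatch)
```

The key point is doing this per partition rather than per record: a map over the RDD that opens a connection for every element would be far too slow.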