On Wed, Jul 1, 2015 at 7:19 AM, Shushant Arora <shushantaror...@gmail.com> wrote:
> JavaRDD&lt;String&gt; rdd = javaSparkContext.parallelize(tables);

You are already creating an RDD in Java here ;) However, it's not clear to
me why you'd want to make this an RDD. Is the list of tables so large that
it doesn't fit on a single machine? If not, you may be better off spinning
up one Spark job per table, dumping each one with the JDBC data source
<https://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases>.

On Wed, Jul 1, 2015 at 12:00 PM, Silvio Fiorito <silvio.fior...@granturing.com> wrote:
> Sure, you can create custom RDDs. Haven't done so in Java, but in Scala,
> absolutely.
>
> From: Shushant Arora
> Date: Wednesday, July 1, 2015 at 1:44 PM
> To: Silvio Fiorito
> Cc: user
> Subject: Re: custom RDD in java
>
> OK, will evaluate these options, but is it possible to create a custom
> RDD in Java?
>
> On Wed, Jul 1, 2015 at 8:29 PM, Silvio Fiorito
> <silvio.fior...@granturing.com> wrote:
>
>> If all you're doing is just dumping tables from SQL Server to HDFS, have
>> you looked at Sqoop?
>>
>> Otherwise, if you need to run this in Spark, could you just use the
>> existing JdbcRDD?
>>
>> From: Shushant Arora
>> Date: Wednesday, July 1, 2015 at 10:19 AM
>> To: user
>> Subject: custom RDD in java
>>
>> Hi,
>>
>> Is it possible to write a custom RDD in Java?
>>
>> The requirement is: I have a list of SQL Server tables that need to be
>> dumped to HDFS. So I have
>>
>> List&lt;String&gt; tables = {dbname.tablename, dbname.tablename2, ...};
>>
>> then
>>
>> JavaRDD&lt;String&gt; rdd = javaSparkContext.parallelize(tables);
>>
>> JavaRDD&lt;Iterable&lt;String&gt;&gt; tableContent = rdd.map(
>>     new Function&lt;String, Iterable&lt;String&gt;&gt;() { /* fetch table and populate an Iterable */ });
>>
>> tableContent.saveAsTextFile("hdfs path");
>>
>> Inside the rdd.map(new Function&lt;String, ...&gt;) I cannot keep the complete
>> table content in memory, so I want to create my own RDD to handle it.
>>
>> Thanks,
>> Shushant
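For what it's worth, the memory concern can often be solved without a custom RDD: instead of map() returning a fully populated collection, use flatMap() and return a *lazy* Iterable that produces rows only as they are consumed (this is essentially what JdbcRDD does internally with a JDBC ResultSet). A minimal sketch of the pattern in plain Java, where `fetchRows` is a hypothetical stand-in for a real JDBC cursor:

```java
import java.util.Iterator;

public class LazyRows {

    // Hypothetical stand-in for a JDBC cursor: yields rows one at a
    // time instead of loading the whole table into a collection.
    static Iterable<String> fetchRows(final String table, final int rowCount) {
        return new Iterable<String>() {
            public Iterator<String> iterator() {
                return new Iterator<String>() {
                    private int next = 0; // plays the role of ResultSet.next()
                    public boolean hasNext() { return next < rowCount; }
                    public String next() { return table + ",row-" + (next++); }
                    public void remove() { throw new UnsupportedOperationException(); }
                };
            }
        };
    }

    public static void main(String[] args) {
        // Only one row exists in memory at a time while iterating.
        for (String row : fetchRows("dbname.tablename1", 3)) {
            System.out.println(row);
        }
    }
}
```

In the Spark job, this Iterable would be the return value of a FlatMapFunction's call() method, so rows stream from the database straight into saveAsTextFile without the whole table being materialized. That said, JdbcRDD and the SQL JDBC data source already implement exactly this streaming behavior, which is why a custom RDD is usually unnecessary here.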