Re: reuse hadoop code in Spark

2014-06-05 Thread Wei Tan
(quoting Matei Zaharia, matei.zaha...@gmail.com, to user@spark.apache.org, 06/04/2014 04:28 PM, Subject: Re: reuse hadoop code in Spark) Yes, you can write some glue in Spark to call these. Some functions to look at: - SparkContext.hadoopRDD lets you …

Re: reuse hadoop code in Spark

2014-06-05 Thread Matei Zaharia
Yes, you can write some glue in Spark to call these. Some functions to look at: - SparkContext.hadoopRDD lets you create an input RDD from an existing JobConf configured by Hadoop …
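Not part of the original thread, but a minimal sketch of the hadoopRDD step described above, assuming the old-API (org.apache.hadoop.mapred) TextInputFormat and a hypothetical HDFS input path:

```scala
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{FileInputFormat, JobConf, TextInputFormat}
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("reuse-hadoop-code"))

// Reuse whatever JobConf the existing Hadoop job already builds; the input
// path "hdfs:///data/input" here is only a placeholder.
val jobConf = new JobConf()
FileInputFormat.setInputPaths(jobConf, "hdfs:///data/input")

// hadoopRDD hands back (key, value) pairs exactly as the configured
// InputFormat would have fed them to the old map() function.
val records = sc.hadoopRDD(
  jobConf,
  classOf[TextInputFormat],
  classOf[LongWritable],
  classOf[Text])
```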

reuse hadoop code in Spark

2014-06-04 Thread Wei Tan
Hello, I am trying to use Spark in the following scenario: I have code written in Hadoop and now I am trying to migrate to Spark. The mappers and reducers are fairly complex, so I wonder whether I can reuse the map() functions I already wrote in Hadoop (Java) and use Spark to chain them, mixing the Java …

Re: reuse hadoop code in Spark

2014-06-04 Thread Matei Zaharia
Yes, you can write some glue in Spark to call these. Some functions to look at: - SparkContext.hadoopRDD lets you create an input RDD from an existing JobConf configured by Hadoop (including InputFormat, paths, etc.) - RDD.mapPartitions lets you operate on all the values in one partition (block) …
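A hedged sketch of the mapPartitions glue, not code from the thread: it assumes a hypothetical MyHadoopMapper class with the old-API map(key, value, collector, reporter) signature, and reuses the `records` RDD from the hadoopRDD sketch above so each partition is pushed through the unmodified map() logic.

```scala
import scala.collection.mutable.ArrayBuffer
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{OutputCollector, Reporter}

// MyHadoopMapper stands in for the already-written Hadoop (Java) Mapper.
val mapped = records.mapPartitions { iter =>
  val mapper = new MyHadoopMapper()            // instantiated once per partition
  val out = new ArrayBuffer[(String, String)]()
  val collector = new OutputCollector[Text, Text] {
    // Copy out to plain Strings: Hadoop code often reuses the same Writable
    // objects, and Writables are not Java-serializable for later shuffles.
    override def collect(k: Text, v: Text): Unit = out += ((k.toString, v.toString))
  }
  iter.foreach { case (k, v) => mapper.map(k, v, collector, Reporter.NULL) }
  out.iterator
}
```

From here the (String, String) pairs can be chained into further Spark transformations (groupByKey, reduceByKey, and so on) in place of the old reducer.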