Github user CodingCat commented on a diff in the pull request:
https://github.com/apache/incubator-spark/pull/636#discussion_r9988672
--- Diff: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala ---
@@ -686,6 +649,47 @@ class PairRDDFunctions[K: ClassTag, V: ClassTag](self: RDD[(K, V)])
}
/**
+   * Output the RDD to any Hadoop-supported storage system with new Hadoop API, using a Hadoop
+   * Job object for that storage system. The Job should set an OutputFormat and any output paths
+   * required (e.g. a table name to write to) in the same way as it would be configured for a Hadoop
+   * MapReduce job.
+   */
+  def saveAsNewAPIHadoopDataset(job: NewAPIHadoopJob) {
--- End diff ---
Hi @mateiz, in the new API the old JobConf is replaced by mapreduce.Job (which is different from mapred.Job). I got this from here: http://www.slideshare.net/sh1mmer/upgrading-to-the-new-map-reduce-api (page 10).
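
For reference, here is a minimal sketch of how a caller might use the proposed method. It assumes NewAPIHadoopJob is the existing import alias for org.apache.hadoop.mapreduce.Job in PairRDDFunctions; the SaveDatasetExample object, the sample data, and the /tmp output path are made up for illustration:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.output.{FileOutputFormat, TextOutputFormat}
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

// Hypothetical usage sketch for the proposed saveAsNewAPIHadoopDataset.
object SaveDatasetExample {
  def main(args: Array[String]) {
    val sc = new SparkContext("local", "SaveDatasetExample")
    val pairs = sc.parallelize(Seq(("a", "1"), ("b", "2")))
      .map { case (k, v) => (new Text(k), new Text(v)) }

    // Configure a new-API mapreduce.Job (not an old-API JobConf): the Job
    // carries the OutputFormat and any output paths, exactly as it would
    // for a regular Hadoop MapReduce job.
    val job = new Job(new Configuration())
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[Text])
    job.setOutputFormatClass(classOf[TextOutputFormat[Text, Text]])
    FileOutputFormat.setOutputPath(job, new Path("/tmp/save-dataset-example"))

    pairs.saveAsNewAPIHadoopDataset(job)
    sc.stop()
  }
}
```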