[ https://issues.apache.org/jira/browse/SPARK-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14058165#comment-14058165 ]
Tathagata Das commented on SPARK-2345:
--------------------------------------

Then what I suggested earlier makes sense, doesn't it?

    DStream.foreachRDDPartition[T](function: Iterator[T] => Unit)

This will effectively run rdd.foreachPartition(function) on every RDD generated by the DStream. RDD.saveAsHadoopFile / RDD.saveAsNewAPIHadoopFile should support any arbitrary OutputFormat that works with the HDFS API.

> ForEachDStream should have an option of running the foreachfunc on Spark
> ------------------------------------------------------------------------
>
>                 Key: SPARK-2345
>                 URL: https://issues.apache.org/jira/browse/SPARK-2345
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>            Reporter: Hari Shreedharan
>
> Today the Job generated simply calls the foreachfunc, but does not run it on
> Spark itself using the sparkContext.runJob method.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
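A minimal sketch of how the proposed API could be built from existing Spark primitives. Note that `foreachRDDPartition` is the proposed name from the comment above, not an existing Spark method; `DStream.foreachRDD` and `RDD.foreachPartition` are real Spark APIs, and `foreachPartition` submits a job through the SparkContext, so the function runs on the executors rather than on the driver.

```scala
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper implementing the proposed DStream.foreachRDDPartition
// in terms of the existing DStream.foreachRDD and RDD.foreachPartition APIs.
def foreachRDDPartition[T](stream: DStream[T])(function: Iterator[T] => Unit): Unit = {
  stream.foreachRDD { rdd =>
    // foreachPartition launches a Spark job (via sparkContext.runJob under
    // the hood), so `function` executes on Spark, not locally in the driver
    // thread that generates the jobs.
    rdd.foreachPartition(function)
  }
}
```

This addresses the issue's complaint: the generated Job no longer just calls the foreach function on the driver but distributes it across partitions as a proper Spark job.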