Re: spark core api vs. google cloud dataflow

2016-02-23 Thread Reynold Xin
That's just the transform function in DataFrame:

    /**
     * Concise syntax for chaining custom transformations.
     * {{{
     *   def featurize(ds: DataFrame) = ...
     *
     *   df
     *     .transform(featurize)
     *     .transform(...)
     * }}}
     * @since 1.6.0
     */
    def transform[U](t: DataFrame
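To illustrate the chaining style in that doc comment without a Spark dependency, here is a minimal sketch on plain Scala collections. It assumes a helper that takes a function over the whole value and returns that function's result. `TransformSketch`, `TransformSyntax`, `dropNegatives`, and `doubleAll` are hypothetical names invented for this example, not part of Spark, and the helper is not presented as Spark's actual implementation.

```scala
// Dependency-free sketch of whole-collection chaining.
// All names here are hypothetical and exist only for illustration.
object TransformSketch {

  // Adds a `transform` method to any value. The supplied function receives
  // the whole value (for example an entire Seq), not its individual elements.
  implicit class TransformSyntax[A](private val self: A) extends AnyVal {
    def transform[U](t: A => U): U = t(self)
  }

  // Whole-collection transformations, playing the role of
  // `featurize(ds: DataFrame)` in the doc comment.
  def dropNegatives(xs: Seq[Int]): Seq[Int] = xs.filter(_ >= 0)
  def doubleAll(xs: Seq[Int]): Seq[Int] = xs.map(_ * 2)

  def main(args: Array[String]): Unit = {
    val data = Seq(3, -1, 4)

    // Chained style: each step takes and returns a whole collection.
    val chained = data.transform(dropNegatives).transform(doubleAll)

    // Per-element style, for contrast: `map` sees one element at a time.
    val mapped = data.map(_ * 2)

    println(chained) // List(6, 8)
    println(mapped)  // List(6, -2, 8)
  }
}
```

The difference from `map` is that each function passed to `transform` decides for itself how to process the collection (filter, map, aggregate), so a pipeline reads as a sequence of named steps.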

spark core api vs. google cloud dataflow

2016-02-23 Thread lonely Feb
Google Cloud Dataflow provides a distributed dataset called PCollection, and syntactic sugar based on PCollection is provided in the form of "apply". Note that "apply" is different from the Spark API "map", which passes each element of the source through a function func. I wonder whether Spark can support