From a performance and scalability standpoint, is it better to plug a
multi-threaded pipeliner into a Spark job, or to implement pipelining via
Spark's own transformations, such as map or filter?

I'm seeing some reference architectures where things like 'morphlines' are
plugged into Spark, but it seems Spark might yield better performance and
scalability if each stage of the pipeline were simply a function
(transformation) in the Spark job itself. Is that the case?
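
For concreteness, here's a minimal sketch of the second approach (the stage
functions and paths are made up for illustration):

import org.apache.spark.{SparkConf, SparkContext}

object PipelineSketch {
  // Hypothetical per-record stage functions.
  case class Record(fields: Array[String])
  def parse(line: String): Record = Record(line.split(','))
  def isValid(r: Record): Boolean = r.fields.nonEmpty
  def enrich(r: Record): Record   = r  // placeholder enrichment

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("pipeline-sketch"))

    val raw      = sc.textFile("hdfs:///data/input")  // stage 0: ingest
    val parsed   = raw.map(parse)                     // stage 1: parse
    val filtered = parsed.filter(isValid)             // stage 2: validate
    val enriched = filtered.map(enrich)               // stage 3: enrich

    enriched.saveAsTextFile("hdfs:///data/output")
    sc.stop()
  }
}

My understanding is that narrow transformations like these get fused into a
single Spark stage, so each record streams through all three functions
without being materialized in between, which is why I'd expect this to
pipeline well without a separate multi-threaded pipeliner.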



