What's the best practice of creating RDD from some external unix command output? I assume if the output size is large (say millions of lines), creating RDD from an array of all lines is not a good idea? Thanks!
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Create-RDD-from-output-of-unix-command-tp23723.html Sent from the Apache Spark User List mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org