These are quite different creatures. You have a distributed set of Strings, but want a local stream of bytes, which involves three conversions:
- collect data to driver
- concatenate strings in some way
- encode strings as bytes according to an encoding

Your approach is OK, but it might be faster to avoid disk if you have enough memory:

- collect() to an Array[String] locally
- use Guava utilities to turn a bunch of Strings into a Reader
- use the Apache Commons ReaderInputStream to read it as encoded bytes

I wonder whether that's really what you want to do, though.

On Fri, Mar 13, 2015 at 9:54 AM, Ayoub <benali.ayoub.i...@gmail.com> wrote:
> Hello,
>
> I need to convert an RDD[String] to a java.io.InputStream but I didn't find
> an easy way to do it.
> Currently I am saving the RDD as a temporary file and then opening an
> InputStream on the file, but that is not really optimal.
>
> Does anybody know a better way to do that?
>
> Thanks,
> Ayoub.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/RDD-to-InputStream-tp22031.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
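
For illustration, here is a rough sketch of the in-memory route once the strings have been collected to the driver. It is not the Guava / Commons IO combination suggested above; it swaps in the JDK's SequenceInputStream so that it needs no extra dependencies. The class name LinesInputStream and the method fromLines are hypothetical, and the sketch assumes that the strings should be joined with a newline and that the collected data fits in driver memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.Charset;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of Spark, Guava, or Commons IO.
final class LinesInputStream {
    private LinesInputStream() {}

    // Exposes already-collected strings as one InputStream.
    // Assumption: each string is followed by "\n" as a separator.
    static InputStream fromLines(List<String> lines, Charset charset) {
        Iterator<String> it = lines.iterator();
        Enumeration<InputStream> parts = new Enumeration<InputStream>() {
            @Override
            public boolean hasMoreElements() {
                return it.hasNext();
            }

            @Override
            public InputStream nextElement() {
                // Encode one string at a time, only when the reader gets to it,
                // so no second full copy of the data is built up front.
                return new ByteArrayInputStream((it.next() + "\n").getBytes(charset));
            }
        };
        // SequenceInputStream reads each part to its end, then moves to the next.
        return new SequenceInputStream(parts);
    }
}
```

With the Java API, the list would come from something like JavaRDD<String>.collect(), which returns a java.util.List<String>; the stream is then LinesInputStream.fromLines(lines, StandardCharsets.UTF_8). Choosing the charset explicitly matters, since the bytes the consumer sees depend on it.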