The only way I have found is to turn it into a list first, in effect holding everything in memory (see code below). Surely Spark has a better way.
Also, what about unbounded iterables such as a Fibonacci series? (An RDD from one would be useful only if it were limited in some other way.)

```java
import java.util.ArrayList;
import java.util.List;

import javax.annotation.Nonnull;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

/**
 * Make an RDD from an iterable.
 *
 * @param inp input iterable
 * @param ctx Spark context
 * @param <T> element type
 * @return RDD built from the iterable, via a list
 */
public static @Nonnull <T> JavaRDD<T> fromIterable(@Nonnull final Iterable<T> inp,
                                                   @Nonnull final JavaSparkContext ctx) {
    // Copy every element into an in-memory list before handing it to Spark
    List<T> holder = new ArrayList<T>();
    for (T k : inp) {
        holder.add(k);
    }
    return ctx.parallelize(holder);
}
```
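For concreteness, this is the kind of call I have in mind. A minimal sketch: `limitedFibonacci` is a helper I made up for illustration, capping the series so that the copy loop above can actually terminate.

```java
import java.util.Iterator;

import org.apache.spark.api.java.JavaRDD;

// Hypothetical helper: a Fibonacci Iterable capped at a fixed number of terms.
// An uncapped version would make fromIterable loop forever.
public static Iterable<Long> limitedFibonacci(final int terms) {
    return () -> new Iterator<Long>() {
        private long prev = 0L;
        private long curr = 1L;
        private int produced = 0;

        @Override
        public boolean hasNext() {
            return produced < terms;
        }

        @Override
        public Long next() {
            long value = prev;
            long sum = prev + curr;
            prev = curr;
            curr = sum;
            produced++;
            return value;
        }
    };
}

// Usage: 50 terms keeps every value well inside the range of long.
JavaRDD<Long> fib = fromIterable(limitedFibonacci(50), ctx);
```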