Was the original issue with Spark 1.1 (i.e. master branch) or an earlier release?
One possibility is that your S3 bucket is in a remote Amazon region, which would make it very slow. In my experience, though, saveAsTextFile has worked even for pretty large datasets in that situation, so maybe something else in your job is causing the problem. Have you tried other operations on the data, like count(), or saving synthetic datasets, e.g. sc.parallelize(1 to 100*1000*1000, 20).saveAsTextFile(...)?

Matei

On August 25, 2014 at 12:09:25 PM, amnonkhen (amnon...@gmail.com) wrote:

Hi jerryye,

Maybe if you voted up my question on Stack Overflow it would get some traction and we would get nearer to a solution.

Thanks,
Amnon
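For concreteness, here is a minimal sketch of the diagnostics suggested above. It assumes a Spark 1.x spark-shell session (where sc is predefined) writing through the s3n:// filesystem; the bucket and path names are hypothetical placeholders, not taken from the original report.

    // Sanity check: build a synthetic dataset and force a full pass over it.
    // If count() alone hangs, the problem is upstream of the S3 write.
    val data = sc.parallelize(1 to 100 * 1000 * 1000, 20)
    println(data.count())

    // Then write the same synthetic data to S3. If this hangs too, the S3
    // write path itself (credentials, region, connectivity) is suspect.
    data.saveAsTextFile("s3n://your-bucket/spark-write-test")  // hypothetical path

If the synthetic write completes quickly, the hang is more likely in the original job's own stages than in the S3 upload itself.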