The Hadoop docs about S3 <http://wiki.apache.org/hadoop/AmazonS3> (linked
to by the Spark docs) say that s3n files are subject to "the 5GB limit on
file size imposed by S3."  However, that limit was raised to 5TB
<http://www.computerworld.com/s/article/9200763/Amazon_s_S3_can_now_store_files_of_up_to_5TB>
about three years ago, so it wasn't clear to me whether the 5GB limit still
applies to Hadoop's s3n URLs.

Well, I tried running a Spark job on a 200GB s3n file, and it ran fine.  Has
this been other people's experience?
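
For reference, the kind of job I mean is just a plain textFile read over an
s3n:// path.  A minimal sketch (the bucket name and path are made up, and it
assumes AWS credentials are set via the usual fs.s3n.awsAccessKeyId /
fs.s3n.awsSecretAccessKey Hadoop properties):

    import org.apache.spark.{SparkConf, SparkContext}

    object S3nLineCount {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("s3n-line-count"))
        // Placeholder bucket/key; credentials come from the fs.s3n.* properties
        val lines = sc.textFile("s3n://some-bucket/path/to/200gb-file")
        println("line count: " + lines.count())
        sc.stop()
      }
    }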



