Yes, it does work with fewer gzipped files. I am reading the files with sc.textFile() and a glob pattern string.
For example:

    a = sc.textFile('s3n://bucket/2014-??-??/*.gz')
    a.count()

Nick

On Tue, May 20, 2014 at 10:09 PM, Madhu <ma...@madhu.com> wrote:

> I have read gzip files from S3 successfully.
>
> It sounds like a file is corrupt or not a valid gzip file.
>
> Does it work with fewer gzip files?
> How are you reading the files?
>
> -----
> Madhu
> https://www.linkedin.com/in/msiddalingaiah
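
Since the count succeeds on fewer files, one way to follow up on the "corrupt or not a valid gzip file" suggestion is to test the files one at a time outside Spark. The following is only a minimal sketch, assuming a sample of the .gz objects has already been copied to local disk; the function names `is_valid_gzip` and `find_bad_gzip_files` are hypothetical helpers, not part of Spark or any library. It checks the two-byte gzip magic number and then decompresses each file fully to catch truncation or corruption.

```python
import gzip
import zlib

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_valid_gzip(path):
    """Return True if `path` has a gzip header and decompresses to the end.

    Hypothetical helper for local files only; it does not read from S3.
    """
    with open(path, "rb") as f:
        if f.read(2) != GZIP_MAGIC:
            return False  # not gzip at all (e.g. plain text named *.gz)
    try:
        with gzip.open(path, "rb") as f:
            # Read in 1 MiB chunks so large files are not held in memory.
            while f.read(1024 * 1024):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers bad headers/CRC, EOFError covers truncation,
        # zlib.error covers a damaged compressed stream.
        return False
    return True


def find_bad_gzip_files(paths):
    """Return the subset of `paths` that fail the gzip check."""
    return [p for p in paths if not is_valid_gzip(p)]
```

Calling `find_bad_gzip_files(glob.glob('/tmp/sample/*.gz'))` (the directory is a placeholder) would list the files to exclude from the pattern or re-upload. Whether this reproduces the exact condition behind Spark's "incorrect header check" is an assumption, but a file that fails here is unlikely to be readable by sc.textFile() either.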