Hi,

I've been trying to track down some problems with Spark reads being very
slow with s3n:// URIs (NativeS3FileSystem).  After some digging around, I
realized that this file system implementation fetches the entire file,
which isn't really a Spark problem, but it really slows down things when
trying to just read headers from a Parquet file or just creating partitions
in the RDD.  Is this something that others have observed before, or am I
doing something wrong?

Thanks,
Akshat

Reply via email to