Github user rxin commented on the pull request:

https://github.com/apache/spark/pull/11270#issuecomment-186404419

To avoid breaking changes (e.g. we can always read Parquet with load), maybe we want to special-case Parquet beyond looking at file names. I looked at the binary format (see https://github.com/Parquet/parquet-format), and it looks like a Parquet file always starts with the magic number "PAR1". That is to say, if the first four bytes are 0x50, 0x41, 0x52, 0x31, then it is a Parquet file.
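
A minimal sketch of that check, in Python for brevity rather than Spark's own Scala. The helper name `looks_like_parquet` and the constant `PARQUET_MAGIC` are hypothetical and not part of Spark or this pull request; the sketch only illustrates the four-byte comparison described above and assumes the file is readable from the local filesystem.

```python
# Hypothetical constant: the four magic bytes 0x50, 0x41, 0x52, 0x31 ("PAR1").
PARQUET_MAGIC = bytes([0x50, 0x41, 0x52, 0x31])


def looks_like_parquet(path):
    """Return True if the file at `path` begins with the bytes "PAR1".

    Illustrative helper only (not a Spark API). Files shorter than four
    bytes, including empty files, yield False.
    """
    with open(path, "rb") as f:
        # read() returns fewer than 4 bytes for a short file, so the
        # comparison below fails cleanly instead of raising.
        header = f.read(len(PARQUET_MAGIC))
    return header == PARQUET_MAGIC
```

A check like this would run only when the file name gives no usable hint, since it costs one extra read per candidate file.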