GitHub user mengxr commented on the issue:

    https://github.com/apache/spark/pull/22328

@mhamilton723 I thought about that option too. Loading general binary files is a useful feature, but I don't feel it is necessary to pull it into the current scope. Regardless of whether the image data source has its own implementation or builds on top of a binary data source, I expect users to use

~~~scala
spark.read.format("image").load("...")
~~~

to read images instead of something like:

~~~scala
spark.read.format("binary").load("...").withColumn("image", decode($"binary"))
~~~

So we can definitely add a binary file data source later and swap the implementation without changing the public interface. But we don't need to block this PR from getting into 2.4, which will be cut soon. Sounds good?