[GitHub] spark issue #22328: [SPARK-22666][ML][SQL] Spark datasource for image format

mengxr Tue, 04 Sep 2018 22:12:54 -0700

Github user mengxr commented on the issue:

    https://github.com/apache/spark/pull/22328
  
    @mhamilton723 I thought about that option too. Loading general binary files
    is a useful feature but I don't feel it is necessary to pull it into the
    current scope. No matter whether the image data source has its own
    implementation or builds on top of the binary data source, I expect users to use
    
    ~~~scala
    spark.read.format("image").load("...")
    ~~~
    
    to read images instead of something like:
    
    ~~~scala
    spark.read.format("binary").load("...").withColumn("image", decode($"binary"))
    ~~~
    
    So we can definitely add a binary file data source later and swap the
    implementation without changing the public interface. But we don't need to
    block this PR from getting into 2.4, which will be cut soon.
    
    Sounds good?


