Cheng Lian created SPARK-14273:
----------------------------------

             Summary: Add FileFormat.isSplittable to indicate whether a format is splittable
                 Key: SPARK-14273
                 URL: https://issues.apache.org/jira/browse/SPARK-14273
             Project: Spark
          Issue Type: Sub-task
    Affects Versions: 2.0.0
            Reporter: Cheng Lian


{{FileSourceStrategy}} assumes that all data source formats are splittable and 
always splits data files by a fixed partition size. However, not all HDFS-based 
formats are splittable. We need a flag to indicate this and to ensure that 
non-splittable files are never split across multiple Spark partitions.
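
As a rough illustration of the intended behavior (not Spark's actual planner code), the sketch below models how a per-format flag could change the way a single file is carved into partitions. The names {{FileSlice}}, {{plan_slices}}, {{max_split_bytes}} and {{is_splittable}} are hypothetical and exist only for this example; the real change would live in {{FileFormat}} and {{FileSourceStrategy}}.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FileSlice:
    """A contiguous byte range of one file, read by a single partition."""
    path: str
    start: int   # byte offset where this slice begins
    length: int  # number of bytes in this slice


def plan_slices(path: str, file_size: int, max_split_bytes: int,
                is_splittable: bool) -> List[FileSlice]:
    """Hypothetical model of split planning for one file.

    A non-splittable file always becomes exactly one slice covering the
    whole file; a splittable file is cut into fixed-size slices.
    """
    if max_split_bytes <= 0:
        raise ValueError("max_split_bytes must be positive")

    if not is_splittable:
        # The whole file must be read by a single partition.
        return [FileSlice(path, 0, file_size)]

    # Fixed-size slices; the last one holds whatever bytes remain.
    return [
        FileSlice(path, offset, min(max_split_bytes, file_size - offset))
        for offset in range(0, file_size, max_split_bytes)
    ]
```

With a 250-byte file and a 100-byte split size, the splittable case yields three slices (100, 100 and 50 bytes), while the non-splittable case yields a single 250-byte slice regardless of the split size.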

(PS: Is it "splitable" or "splittable"? Probably the latter one? Hadoop uses 
the former one though...)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

