[ https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15409183#comment-15409183 ]
Aseem Bansal commented on SPARK-16893:
--------------------------------------

Reading a CSV causes an exception. The code used and the exception are below. Both are also in the GitHub issue that I have referenced here.

{code}
public static void main(String[] args) {
    SparkSession spark = SparkSession
        .builder()
        .appName("my app")
        .getOrCreate();

    Dataset<Row> df = spark.read()
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .option("nullValue", "")
        .csv("/home/aseem/data.csv"); // reading the file fails with the exception below

    df.show();
}
{code}

bq. Exception in thread "main" java.lang.RuntimeException: Multiple sources found for csv (org.apache.spark.sql.execution.datasources.csv.CSVFileFormat, com.databricks.spark.csv.DefaultSource15), please specify the fully qualified class name.

People need to use format("csv"). I think that is counterintuitive, seeing that I am already using the csv() method.

> Spark CSV Provider option is not documented
> -------------------------------------------
>
>                 Key: SPARK-16893
>                 URL: https://issues.apache.org/jira/browse/SPARK-16893
>             Project: Spark
>          Issue Type: Documentation
>    Affects Versions: 2.0.0
>            Reporter: Aseem Bansal
>            Priority: Minor
>
> I was working with the Databricks spark-csv library and came across an error. I
> have logged the issue on their GitHub, but it would be good to document it
> in Apache Spark's documentation as well.
> I faced it with CSV. Someone else faced it with JSON:
> http://stackoverflow.com/questions/38761920/spark2-0-error-multiple-sources-found-for-json-when-read-json-file
> Complete issue details are here:
> https://github.com/databricks/spark-csv/issues/367
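
For readers trying to make sense of the "Multiple sources found for csv" message quoted in the comment, the following is a simplified, self-contained model of how a short name such as "csv" can become ambiguous once two providers register it. It is only a sketch and not Spark's actual lookup code: the class SourceLookupSketch, the REGISTERED map, and the resolve method are hypothetical names introduced for illustration. Only the two provider class names and the wording of the error are taken from the exception above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SourceLookupSketch {
    // Hypothetical registry: fully qualified provider class -> short name it registers.
    // Both class names are copied from the exception message in the report.
    static final Map<String, String> REGISTERED = new LinkedHashMap<>();
    static {
        REGISTERED.put("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat", "csv");
        REGISTERED.put("com.databricks.spark.csv.DefaultSource15", "csv");
    }

    // Resolve a provider string to exactly one class name, or fail.
    static String resolve(String provider) {
        // Collect every registered class whose short name matches the request.
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : REGISTERED.entrySet()) {
            if (entry.getValue().equalsIgnoreCase(provider)) {
                matches.add(entry.getKey());
            }
        }
        if (matches.size() == 1) {
            return matches.get(0);
        }
        if (matches.size() > 1) {
            // Two providers claim the same short name, so the request is ambiguous.
            throw new RuntimeException("Multiple sources found for " + provider
                + " (" + String.join(", ", matches)
                + "), please specify the fully qualified class name.");
        }
        // No short-name match: treat the string as a fully qualified class name.
        if (REGISTERED.containsKey(provider)) {
            return provider;
        }
        throw new RuntimeException("Failed to find data source: " + provider);
    }

    public static void main(String[] args) {
        // A fully qualified name is unambiguous in this model.
        System.out.println(resolve("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat"));
        // The short name is ambiguous while both providers are registered.
        try {
            resolve("csv");
        } catch (RuntimeException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

In this model the short name stops being ambiguous as soon as only one provider registers it, which would correspond to having only one CSV implementation on the classpath. Whether that matches a given Spark deployment is an assumption to verify against the Spark version in use.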