Great, we are already discussing/working to fix the issue. Happy to help if I can 
:-)

Any workarounds that we can use for now?
Please note I am not invoking any additional packages while running spark 
submit on the thin jar.
Thanks,
Srabasti Banerjee

   On Thursday, 30 August, 2018, 9:02:11 PM GMT-7, Hyukjin Kwon 
<gurwls...@gmail.com> wrote:  
 
Yea, this is exactly what I have been worried about with the recent changes (discussed 
in https://issues.apache.org/jira/browse/SPARK-24924). See 
https://github.com/apache/spark/pull/17916. This should be fine in newer Spark 
versions.

FYI, +Wenchen and Dongjoon. I want to add Thomas Graves and Gengliang Wang too but 
can't find their email addresses.

On Fri, Aug 31, 2018 at 11:52 AM, Srabasti Banerjee <srabast...@ymail.com.invalid> 
wrote:

Hi,
I am trying to run the code below to read a file as a DataFrame onto a stream (for 
Spark Streaming), developed via the Eclipse IDE, with schemas defined appropriately. 
When I run the thin jar on the server I get the error below. I tried suggestions 
from researching "spark.read.option.schema.csv"-style errors on the internet, with 
no success.
I am thinking this could be a bug, as the changes might not have been made for the 
readStream option? Has anybody encountered a similar issue with Spark Streaming?
Looking forward to hearing your response(s)!
Thanks,
Srabasti Banerjee

Error
Exception in thread "main" java.lang.RuntimeException: Multiple sources found 
for csv (com.databricks.spark.csv.DefaultSource15, 
org.apache.spark.sql.execution.datasources.csv.CSVFileFormat), please specify 
the fully qualified class name.

Code:
val csvdf = spark.readStream.option("sep", ",").schema(userSchema).csv("server_path") // does not resolve error
val csvdf = spark.readStream.option("sep", ",").schema(userSchema).format("com.databricks.spark.csv").csv("server_path") // does not resolve error
val csvdf = spark.readStream.option("sep", ",").schema(userSchema).format("org.apache.spark.sql.execution.datasources.csv").csv("server_path") // does not resolve error
val csvdf = spark.readStream.option("sep", ",").schema(userSchema).format("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat").csv("server_path") // does not resolve error
val csvdf = spark.readStream.option("sep", ",").schema(userSchema).format("com.databricks.spark.csv.DefaultSource15").csv("server_path") // does not resolve error
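Since the error lists two CSV sources on the classpath (the built-in org.apache.spark.sql.execution.datasources.csv.CSVFileFormat and the old com.databricks.spark.csv one), one workaround worth trying is to keep the old spark-csv artifact out of the thin jar entirely; on Spark 2.x the CSV reader is built in, so that dependency is redundant. A minimal sketch, assuming an sbt build (the Spark version and Scala suffix here are placeholders, adjust to your actual build file):

```scala
// build.sbt sketch: rely only on Spark's built-in CSV source.
// Drop any explicit com.databricks:spark-csv dependency from this file,
// and mark Spark itself as Provided so spark-submit supplies it.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.1" % Provided

// If another dependency pulls spark-csv in transitively, exclude it
// (artifact name assumes a Scala 2.11 build; adjust the suffix):
excludeDependencies += ExclusionRule("com.databricks", "spark-csv_2.11")
```

After rebuilding the jar, only one CSV source should remain on the classpath, so the plain spark.readStream...csv("server_path") call should no longer be ambiguous.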