Abul,

Mina is right: until Spark 1.6, CSV parsing was available as a separate Spark
package. With Spark 2.0, CSV parsing is built in. Zeppelin 0.6.1 ships with
Spark 2.0.
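For reference, a minimal sketch of the built-in Spark 2.0 reader as it would run in a Zeppelin paragraph (this assumes the notebook's pre-bound `spark` SparkSession, and the HDFS path is a placeholder, not the actual one from the thread):

```scala
// Spark 2.0+: CSV support is built in, no com.databricks:spark-csv
// dependency needed in the interpreter settings.
val df = spark.read
  .option("header", "true")      // first line contains column names
  .option("inferSchema", "true") // sample the data to guess column types
  .csv("hdfs:///path/to/data")   // placeholder path

df.printSchema()
```

If the old spark-csv artifact is still listed in the interpreter dependencies, removing it and restarting the interpreter avoids version clashes with the univocity parser that Spark 2.0 bundles.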

Thanks,
Vinay

On Wednesday, August 17, 2016, Mina Lee <[email protected]> wrote:

> Hi Abul,
>
> spark-csv is integrated into spark itself so you don't need to load
> spark-csv dependencies anymore.
>
> Could you try below instead?
>
> val df = sqlContext.read.
>   options(Map("header" -> "true", "inferSchema" -> "true")).
>   csv("hdfs:// ... /S&P")
>
> df.printSchema
>
> Hope this solves your issue!
>
> Mina
>
> On Wed, Aug 17, 2016 at 11:43 AM Abul Basar <[email protected]> wrote:
>
>> Hello,
>>
>> It is exciting to see new release 0.6.1 in a short span after 0.6
>> release.
>>
>> I am test driving 0.6.1 with spark 2.0 (Scala 2.11). RDD, DF operations
>> are working fine. I am facing a problem while using csv package (
>> https://github.com/databricks/spark-csv).
>>
>> I added "com.databricks:spark-csv_2.11:1.4.0" to the interpreter
>> dependencies using the UI, restarted Zeppelin, and am running the
>> following code.
>>
>>
>> val df = spark.sqlContext.read.
>>   format("com.databricks.spark.csv").
>>   options(Map("header" -> "true", "inferSchema" -> "true")).
>>   load("hdfs:// ... /S&P")
>>
>> df.printSchema
>>
>>
>> The above statement errors out with the following message:
>>
>> java.lang.NoSuchMethodError: com.univocity.parsers.csv.CsvParserSettings.setUnescapedQuoteHandling(Lcom/univocity/parsers/csv/UnescapedQuoteHandling;)V
>>   at org.apache.spark.sql.execution.datasources.csv.CsvReader.parser$lzycompute(CSVParser.scala:50)
>>   at org.apache.spark.sql.execution.datasources.csv.CsvReader.parser(CSVParser.scala:35)
>>   at org.apache.spark.sql.execution.datasources.csv.LineCsvReader.parseLine(CSVParser.scala:117)
>>   at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:59)
>>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:392)
>>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:392)
>>   at scala.Option.orElse(Option.scala:289)
>>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:391)
>>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
>>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:132)
>>   ... 46 elided
>>
>>
>> I successfully tested the same code in the Spark REPL. The above error
>> seems to be a bug introduced in 0.6.1; it works fine in 0.6.0.
>>
>> Any ideas about how to resolve the issue?
>>
>> Thanks!
>> - AB
>>
>>
>>
