Abul, Mina is right: up to Spark 1.6, CSV parsing was available as a separate Spark package (spark-csv). With Spark 2.0, CSV parsing is built in, and Zeppelin 0.6.1 ships with Spark 2.0.
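
The NoSuchMethodError you pasted is most likely a classpath conflict: spark-csv 1.4.0 pulls in an older univocity-parsers jar than the one Spark 2.0's built-in CSV reader expects, so removing that dependency from the interpreter settings should clear it. As a minimal sketch of the built-in reader (assuming a Spark 2.0 SparkSession named `spark`; the HDFS path below is a placeholder, not your actual location):

// Built-in CSV reader in Spark 2.0; no com.databricks:spark-csv needed.
val df = spark.read.
  option("header", "true").        // treat the first line as column names
  option("inferSchema", "true").   // sample the data to infer column types
  csv("hdfs://namenode/path/to/data.csv")  // placeholder path

df.printSchema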
Thanks,
Vinay

On Wednesday, August 17, 2016, Mina Lee <[email protected]> wrote:

> Hi Abul,
>
> spark-csv is integrated into Spark itself, so you don't need to load the
> spark-csv dependency anymore.
>
> Could you try the following instead?
>
> val df = sqlContext.read.
>   options(Map("header" -> "true", "inferSchema" -> "true")).
>   csv("hdfs:// ... /S&P")
>
> df.printSchema
>
> Hope this solves your issue!
>
> Mina
>
> On Wed, Aug 17, 2016 at 11:43 AM Abul Basar <[email protected]> wrote:
>
>> Hello,
>>
>> It is exciting to see the new 0.6.1 release so soon after the 0.6
>> release.
>>
>> I am test-driving 0.6.1 with Spark 2.0 (Scala 2.11). RDD and DataFrame
>> operations are working fine, but I am facing a problem while using the
>> csv package (https://github.com/databricks/spark-csv).
>>
>> I added "com.databricks:spark-csv_2.11:1.4.0" to the interpreter
>> dependencies using the UI, restarted Zeppelin, and am trying the
>> following code:
>>
>> val df = spark.sqlContext.read.
>>   format("com.databricks.spark.csv").
>>   options(Map("header" -> "true", "inferSchema" -> "true")).
>>   load("hdfs:// ... /S&P")
>>
>> df.printSchema
>>
>> The above statement errors out with the following message:
>>
>> java.lang.NoSuchMethodError: com.univocity.parsers.csv.CsvParserSettings.setUnescapedQuoteHandling(Lcom/univocity/parsers/csv/UnescapedQuoteHandling;)V
>>   at org.apache.spark.sql.execution.datasources.csv.CsvReader.parser$lzycompute(CSVParser.scala:50)
>>   at org.apache.spark.sql.execution.datasources.csv.CsvReader.parser(CSVParser.scala:35)
>>   at org.apache.spark.sql.execution.datasources.csv.LineCsvReader.parseLine(CSVParser.scala:117)
>>   at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:59)
>>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:392)
>>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:392)
>>   at scala.Option.orElse(Option.scala:289)
>>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:391)
>>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
>>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:132)
>>   ... 46 elided
>>
>> I successfully tested the same code in the Spark REPL. The above error
>> seems to be a bug introduced in 0.6.1; it works fine in 0.6.0.
>>
>> Any ideas about how to resolve the issue?
>>
>> Thanks!
>> - AB
