Amer Sheikh created ZEPPELIN-3601:
-------------------------------------

             Summary: Zeppelin 0.8 error whilst trying to read a csv file.
                 Key: ZEPPELIN-3601
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3601
             Project: Zeppelin
          Issue Type: Bug
          Components: zeppelin-interpreter
    Affects Versions: 0.8.0
         Environment: Windows
            Reporter: Amer Sheikh


Whilst trying to do a simple csv read using

 

var df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").option("delimiter", "|").load("C:\\VodafoneData\\Customer\\TEMP\\TOTAL_AMDOCS_POS_MOVIL.TXT")

 

I get the following error on Windows. This is using the packaged Spark 2.2
binaries.

The same issue has been reported on Stack Overflow for Linux. I'm also unable
to get Zeppelin 0.8 to use my local Spark 2.3.1 installation.

I tried setting SPARK_HOME as a system environment variable, and that didn't
work at all.

So I tried setting it in zeppelin-env.cmd, but it was ignored. Why is Zeppelin
so complicated to set up?
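
One way to see whether the interpreter process actually picks up SPARK_HOME is to run a small check from a Zeppelin paragraph. The sketch below is only an illustration: describeSparkHome is a hypothetical helper (not part of Zeppelin or Spark), and it assumes a Spark 2.x layout in which the distribution keeps its jars in a "jars" subdirectory.

```scala
import java.io.File

// Hypothetical helper: summarise what an environment map says about SPARK_HOME.
// Assumption: a Spark 2.x distribution has a "jars" subdirectory.
def describeSparkHome(env: Map[String, String]): String =
  env.get("SPARK_HOME") match {
    case None => "SPARK_HOME is not set"
    case Some(path) =>
      val home = new File(path)
      if (!home.isDirectory) s"SPARK_HOME=$path (not a directory)"
      else if (!new File(home, "jars").isDirectory) s"SPARK_HOME=$path (no jars subdirectory)"
      else s"SPARK_HOME=$path (looks like a Spark 2.x install)"
  }

// sys.env is the environment of the JVM running this code,
// which in a Zeppelin paragraph is the interpreter process.
println(describeSparkHome(sys.env))
```

If this prints "SPARK_HOME is not set" after editing zeppelin-env.cmd, the setting is probably not reaching the interpreter process at all, which would be worth stating in the report.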

 

Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): java.lang.NoSuchMethodError: org.apache.hadoop.fs.FileSystem$Statistics.getThreadStatistics()Lorg/apache/hadoop/fs/FileSystem$Statistics$StatisticsData;
 at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
 at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1$$anonfun$apply$mcJ$sp$1.apply(SparkHadoopUtil.scala:149)
 at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
 at scala.collection.Iterator$class.foreach(Iterator.scala:893)
 at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
 at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
 at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
 at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
 at scala.collection.AbstractTraversable.map(Traversable.scala:104)
 at org.apache.spark.deploy.SparkHadoopUtil$$anonfun$1.apply$mcJ$sp(SparkHadoopUtil.scala:149)
 at org.apache.spark.deploy.SparkHadoopUtil.getFSBytesReadOnThreadCallback(SparkHadoopUtil.scala:150)
 at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.<init>(FileScanRDD.scala:78)
 at org.apache.spark.sql.execution.datasources.FileScanRDD.compute(FileScanRDD.scala:71)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
 at org.apache.spark.scheduler.Task.run(Task.scala:108)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
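
A NoSuchMethodError like the one above usually means a class was loaded from a different jar version than the calling code was compiled against. Whether that is the cause here is an assumption, not something confirmed in this report. A minimal sketch for checking it from a Zeppelin paragraph, using only JVM reflection; jarOf and declaresMethod are hypothetical helper names, and the class and method names are copied from the error message:

```scala
import scala.util.Try

// Hypothetical helper: where did the JVM load this class from?
// Returns None if the class cannot be found or has no recorded code source
// (for example, classes loaded by the JVM's bootstrap loader).
def jarOf(className: String): Option[String] =
  Try(Class.forName(className)).toOption
    .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
    .flatMap(cs => Option(cs.getLocation))
    .map(_.toString)

// Hypothetical helper: does the loaded class declare a method with this name?
// Returns false if the class itself cannot be found.
def declaresMethod(className: String, methodName: String): Boolean =
  Try(Class.forName(className)).toOption
    .exists(c => c.getDeclaredMethods.exists(m => m.getName == methodName))

// Names taken from the NoSuchMethodError in the stack trace.
val statsClass = "org.apache.hadoop.fs.FileSystem$Statistics"
val location = jarOf(statsClass).getOrElse("(not found or unknown)")
val present = declaresMethod(statsClass, "getThreadStatistics")
println("FileSystem$Statistics loaded from: " + location)
println("declares getThreadStatistics: " + present)
```

If the reported location is not the Hadoop jar that ships with the Spark build in use, or the method is reported as missing, that would point to mixed Hadoop versions on the interpreter classpath.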



