Could you check the Spark interpreter log to see whether the avro jar is
loaded? It is located at /var/log/zeppelin/zeppelin-interpreter-spark*.log
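A minimal sketch of the check, run here against a simulated log line so it is self-contained (on the sandbox you would grep the real files under /var/log/zeppelin/; the jar path is the one from your spark-defaults.conf below, and the exact log message wording may differ):

```shell
# Simulate one line of an interpreter log, then grep it the same way you
# would grep /var/log/zeppelin/zeppelin-interpreter-spark*.log on the box.
log=$(mktemp)
echo "INFO Added JAR file:/opt/spark-jars/spark-avro_2.10-2.0.1.jar" > "$log"

# A match here suggests the jar reached the interpreter's classpath;
# no match means Zeppelin never picked it up.
grep "spark-avro" "$log"

rm -f "$log"
```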

On Thu, Sep 8, 2016 at 10:31 AM, York Huang <[email protected]>
wrote:

> Hi,
>
> I am using the HDP 2.4 sandbox. I downloaded the spark-avro jar file and
> added the following to
> /etc/spark/2.4.0.0-169/0/spark-defaults.conf
> and
> /usr/hdp/2.4.0.0-169/etc/spark/conf/spark-defaults.conf:
>
> spark.driver.extraClassPath /opt/spark-jars/spark-avro_2.10-2.0.1.jar
> spark.executor.extraClassPath /opt/spark-jars/spark-avro_2.10-2.0.1.jar
> spark.files /opt/spark-jars/spark-avro_2.10-2.0.1.jar
>
> I also added
> export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-avro_2.10:2.0.1"
> to the file
> /etc/zeppelin/conf.dist/zeppelin-env.sh.
>
> If I run pyspark and run the following code, it works.
> df = sqlContext.read.format("com.databricks.spark.avro").load("/user/root/ato/RES_TRANSACTION.avro")
>
> But if I run the same code in Zeppelin:
> %pyspark
> df = sqlContext.read.format("com.databricks.spark.avro").load("/user/root/ato/RES_TRANSACTION.avro")
> it fails. The error is:
> Py4JJavaError: An error occurred while calling o66.load.
> : java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.avro. Please use Spark package http://spark-packages.org/package/databricks/spark-avro
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:72)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
> at py4j.Gateway.invoke(Gateway.java:259)
> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79)
> at py4j.GatewayConnection.run(GatewayConnection.java:209)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: com.databricks.spark.avro.DefaultSource
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at scala.util.Try$.apply(Try.scala:161)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at scala.util.Try.orElse(Try.scala:82)
> at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
> ... 14 more
> (<class 'py4j.protocol.Py4JJavaError'>, Py4JJavaError(u'An error occurred while calling o66.load.\n', JavaObject id=o67), <traceback object at 0x7f230714a170>)
>
> Can anyone please help?
>
> Thanks,
>
> York
>
>


-- 
Best Regards

Jeff Zhang
