Hi,

Not sure if this is it, but could you please try "com.databricks.spark.avro" instead of just "avro" as the data source name?
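For example, something like this (a minimal sketch, reusing the path and sqlContext from your session; the head() call is just a sanity check):

  # load the Avro file using the fully qualified data source name
  df <- read.df(sqlContext, "file:///home/matmsh/myfile.avro", "com.databricks.spark.avro")
  head(df)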
Thanks,
Burak

On Jun 13, 2015 9:55 AM, "Shing Hing Man" <mat...@yahoo.com.invalid> wrote:

> Hi,
> I am trying to read an avro file in SparkR (in Spark 1.4.0).
>
> I started R using the following.
> matmsh@gauss:~$ sparkR --packages com.databricks:spark-avro_2.10:1.0.0
>
> Inside the R shell, when I issue the following,
>
> > read.df(sqlContext, "file:///home/matmsh/myfile.avro", "avro")
>
> I get the following exception:
> Caused by: java.lang.RuntimeException: Failed to load class for data source: avro
>
> Below is the stack trace.
>
> matmsh@gauss:~$ sparkR --packages com.databricks:spark-avro_2.10:1.0.0
>
> R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
> Copyright (C) 2015 The R Foundation for Statistical Computing
> Platform: x86_64-suse-linux-gnu (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
> Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> Launching java with spark-submit command /home/matmsh/installed/spark/bin/spark-submit "--packages" "com.databricks:spark-avro_2.10:1.0.0" "sparkr-shell" /tmp/RtmpoT7FrF/backend_port464e1e2fb16a
> Ivy Default Cache set to: /home/matmsh/.ivy2/cache
> The jars for the packages stored in: /home/matmsh/.ivy2/jars
> :: loading settings :: url = jar:file:/home/matmsh/installed/sparks/spark-1.4.0-bin-hadoop2.3/lib/spark-assembly-1.4.0-hadoop2.3.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
> com.databricks#spark-avro_2.10 added as a dependency
> :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
>     confs: [default]
>     found com.databricks#spark-avro_2.10;1.0.0 in list
>     found org.apache.avro#avro;1.7.6 in local-m2-cache
>     found org.codehaus.jackson#jackson-core-asl;1.9.13 in list
>     found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in list
>     found com.thoughtworks.paranamer#paranamer;2.3 in list
>     found org.xerial.snappy#snappy-java;1.0.5 in list
>     found org.apache.commons#commons-compress;1.4.1 in list
>     found org.tukaani#xz;1.0 in list
>     found org.slf4j#slf4j-api;1.6.4 in list
> :: resolution report :: resolve 421ms :: artifacts dl 16ms
>     :: modules in use:
>     com.databricks#spark-avro_2.10;1.0.0 from list in [default]
>     com.thoughtworks.paranamer#paranamer;2.3 from list in [default]
>     org.apache.avro#avro;1.7.6 from local-m2-cache in [default]
>     org.apache.commons#commons-compress;1.4.1 from list in [default]
>     org.codehaus.jackson#jackson-core-asl;1.9.13 from list in [default]
>     org.codehaus.jackson#jackson-mapper-asl;1.9.13 from list in [default]
>     org.slf4j#slf4j-api;1.6.4 from list in [default]
>     org.tukaani#xz;1.0 from list in [default]
>     org.xerial.snappy#snappy-java;1.0.5 from list in [default]
>     ---------------------------------------------------------------------
>     |                  |            modules            ||   artifacts   |
>     |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
>     ---------------------------------------------------------------------
>     |      default     |   9   |   0   |   0   |   0   ||   9   |   0   |
>     ---------------------------------------------------------------------
> :: retrieving :: org.apache.spark#spark-submit-parent
>     confs: [default]
>     0 artifacts copied, 9 already retrieved (0kB/9ms)
> 15/06/13 17:37:42 INFO spark.SparkContext: Running Spark version 1.4.0
> 15/06/13 17:37:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 15/06/13 17:37:42 WARN util.Utils: Your hostname, gauss resolves to a loopback address: 127.0.0.1; using 192.168.0.10 instead (on interface enp3s0)
> 15/06/13 17:37:42 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
> 15/06/13 17:37:42 INFO spark.SecurityManager: Changing view acls to: matmsh
> 15/06/13 17:37:42 INFO spark.SecurityManager: Changing modify acls to: matmsh
> 15/06/13 17:37:42 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(matmsh); users with modify permissions: Set(matmsh)
> 15/06/13 17:37:43 INFO slf4j.Slf4jLogger: Slf4jLogger started
> 15/06/13 17:37:43 INFO Remoting: Starting remoting
> 15/06/13 17:37:43 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.0.10:46219]
> 15/06/13 17:37:43 INFO util.Utils: Successfully started service 'sparkDriver' on port 46219.
> 15/06/13 17:37:43 INFO spark.SparkEnv: Registering MapOutputTracker
> 15/06/13 17:37:43 INFO spark.SparkEnv: Registering BlockManagerMaster
> 15/06/13 17:37:43 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-c8661016-d922-4ad3-a171-7b0f719c40a2/blockmgr-e79853e5-e046-4b13-a3ba-0b4c46831461
> 15/06/13 17:37:43 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB
> 15/06/13 17:37:43 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-c8661016-d922-4ad3-a171-7b0f719c40a2/httpd-0f11e45e-08fe-40b1-8bf9-21de1dd472b7
> 15/06/13 17:37:43 INFO spark.HttpServer: Starting HTTP Server
> 15/06/13 17:37:43 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/06/13 17:37:43 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:48507
> 15/06/13 17:37:43 INFO util.Utils: Successfully started service 'HTTP file server' on port 48507.
> 15/06/13 17:37:43 INFO spark.SparkEnv: Registering OutputCommitCoordinator
> 15/06/13 17:37:43 INFO server.Server: jetty-8.y.z-SNAPSHOT
> 15/06/13 17:37:43 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
> 15/06/13 17:37:43 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
> 15/06/13 17:37:43 INFO ui.SparkUI: Started SparkUI at http://192.168.0.10:4040
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/com.databricks_spark-avro_2.10-1.0.0.jar at http://192.168.0.10:48507/jars/com.databricks_spark-avro_2.10-1.0.0.jar with timestamp 1434213463626
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.apache.avro_avro-1.7.6.jar at http://192.168.0.10:48507/jars/org.apache.avro_avro-1.7.6.jar with timestamp 1434213463627
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar at http://192.168.0.10:48507/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar with timestamp 1434213463627
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar at http://192.168.0.10:48507/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar with timestamp 1434213463628
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar at http://192.168.0.10:48507/jars/com.thoughtworks.paranamer_paranamer-2.3.jar with timestamp 1434213463628
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar at http://192.168.0.10:48507/jars/org.xerial.snappy_snappy-java-1.0.5.jar with timestamp 1434213463630
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar at http://192.168.0.10:48507/jars/org.apache.commons_commons-compress-1.4.1.jar with timestamp 1434213463630
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.slf4j_slf4j-api-1.6.4.jar at http://192.168.0.10:48507/jars/org.slf4j_slf4j-api-1.6.4.jar with timestamp 1434213463630
> 15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.tukaani_xz-1.0.jar at http://192.168.0.10:48507/jars/org.tukaani_xz-1.0.jar with timestamp 1434213463630
> 15/06/13 17:37:43 INFO executor.Executor: Starting executor ID driver on host localhost
> 15/06/13 17:37:43 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55381.
> 15/06/13 17:37:43 INFO netty.NettyBlockTransferService: Server created on 55381
> 15/06/13 17:37:43 INFO storage.BlockManagerMaster: Trying to register BlockManager
> 15/06/13 17:37:43 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:55381 with 265.4 MB RAM, BlockManagerId(driver, localhost, 55381)
> 15/06/13 17:37:43 INFO storage.BlockManagerMaster: Registered BlockManager
>
> Welcome to SparkR!
> Spark context is available as sc, SQL context is available as sqlContext
>
> > read.df(sqlContext, "file:///home/matmsh/myfile.avro", "avro")
> 15/06/13 17:38:53 ERROR r.RBackendHandler: load on 1 failed
> java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
>     at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
>     at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>     at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>     at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Failed to load class for data source: avro
>     at scala.sys.package$.error(package.scala:27)
>     at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:216)
>     at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:229)
>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
>     at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
>     ... 25 more
> Error: returnStatus == 0 is not TRUE
>
> Thanks in advance for any assistance!
> Shing