Hi, I am trying to read an Avro file in SparkR (Spark 1.4.0). I started sparkR as follows:

matmsh@gauss:~$ sparkR --packages com.databricks:spark-avro_2.10:1.0.0

Inside the R shell, when I issue the following,
> read.df(sqlContext, "file:///home/matmsh/myfile.avro", "avro")

I get the following exception:

Caused by: java.lang.RuntimeException: Failed to load class for data source: avro

Below is the full session, including the stack trace.

matmsh@gauss:~$ sparkR --packages com.databricks:spark-avro_2.10:1.0.0

R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-suse-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Launching java with spark-submit command /home/matmsh/installed/spark/bin/spark-submit "--packages" "com.databricks:spark-avro_2.10:1.0.0" "sparkr-shell" /tmp/RtmpoT7FrF/backend_port464e1e2fb16a
Ivy Default Cache set to: /home/matmsh/.ivy2/cache
The jars for the packages stored in: /home/matmsh/.ivy2/jars
:: loading settings :: url = jar:file:/home/matmsh/installed/sparks/spark-1.4.0-bin-hadoop2.3/lib/spark-assembly-1.4.0-hadoop2.3.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-avro_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
	confs: [default]
	found com.databricks#spark-avro_2.10;1.0.0 in list
	found org.apache.avro#avro;1.7.6 in local-m2-cache
	found org.codehaus.jackson#jackson-core-asl;1.9.13 in list
	found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in list
	found com.thoughtworks.paranamer#paranamer;2.3 in list
	found org.xerial.snappy#snappy-java;1.0.5 in list
	found org.apache.commons#commons-compress;1.4.1 in list
	found org.tukaani#xz;1.0 in list
	found org.slf4j#slf4j-api;1.6.4 in list
:: resolution report :: resolve 421ms :: artifacts dl 16ms
	:: modules in use:
	com.databricks#spark-avro_2.10;1.0.0 from list in [default]
	com.thoughtworks.paranamer#paranamer;2.3 from list in [default]
	org.apache.avro#avro;1.7.6 from local-m2-cache in [default]
	org.apache.commons#commons-compress;1.4.1 from list in [default]
	org.codehaus.jackson#jackson-core-asl;1.9.13 from list in [default]
	org.codehaus.jackson#jackson-mapper-asl;1.9.13 from list in [default]
	org.slf4j#slf4j-api;1.6.4 from list in [default]
	org.tukaani#xz;1.0 from list in [default]
	org.xerial.snappy#snappy-java;1.0.5 from list in [default]
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   9   |   0   |   0   |   0   ||   9   |   0   |
	---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
	confs: [default]
	0 artifacts copied, 9 already retrieved (0kB/9ms)
15/06/13 17:37:42 INFO spark.SparkContext: Running Spark version 1.4.0
15/06/13 17:37:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform...
using builtin-java classes where applicable
15/06/13 17:37:42 WARN util.Utils: Your hostname, gauss resolves to a loopback address: 127.0.0.1; using 192.168.0.10 instead (on interface enp3s0)
15/06/13 17:37:42 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/06/13 17:37:42 INFO spark.SecurityManager: Changing view acls to: matmsh
15/06/13 17:37:42 INFO spark.SecurityManager: Changing modify acls to: matmsh
15/06/13 17:37:42 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(matmsh); users with modify permissions: Set(matmsh)
15/06/13 17:37:43 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/06/13 17:37:43 INFO Remoting: Starting remoting
15/06/13 17:37:43 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.0.10:46219]
15/06/13 17:37:43 INFO util.Utils: Successfully started service 'sparkDriver' on port 46219.
15/06/13 17:37:43 INFO spark.SparkEnv: Registering MapOutputTracker
15/06/13 17:37:43 INFO spark.SparkEnv: Registering BlockManagerMaster
15/06/13 17:37:43 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-c8661016-d922-4ad3-a171-7b0f719c40a2/blockmgr-e79853e5-e046-4b13-a3ba-0b4c46831461
15/06/13 17:37:43 INFO storage.MemoryStore: MemoryStore started with capacity 265.4 MB
15/06/13 17:37:43 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-c8661016-d922-4ad3-a171-7b0f719c40a2/httpd-0f11e45e-08fe-40b1-8bf9-21de1dd472b7
15/06/13 17:37:43 INFO spark.HttpServer: Starting HTTP Server
15/06/13 17:37:43 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/13 17:37:43 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:48507
15/06/13 17:37:43 INFO util.Utils: Successfully started service 'HTTP file server' on port 48507.
15/06/13 17:37:43 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/06/13 17:37:43 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/13 17:37:43 INFO server.AbstractConnector: Started
SelectChannelConnector@0.0.0.0:4040
15/06/13 17:37:43 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/06/13 17:37:43 INFO ui.SparkUI: Started SparkUI at http://192.168.0.10:4040
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/com.databricks_spark-avro_2.10-1.0.0.jar at http://192.168.0.10:48507/jars/com.databricks_spark-avro_2.10-1.0.0.jar with timestamp 1434213463626
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.apache.avro_avro-1.7.6.jar at http://192.168.0.10:48507/jars/org.apache.avro_avro-1.7.6.jar with timestamp 1434213463627
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar at http://192.168.0.10:48507/jars/org.codehaus.jackson_jackson-core-asl-1.9.13.jar with timestamp 1434213463627
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar at http://192.168.0.10:48507/jars/org.codehaus.jackson_jackson-mapper-asl-1.9.13.jar with timestamp 1434213463628
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/com.thoughtworks.paranamer_paranamer-2.3.jar at http://192.168.0.10:48507/jars/com.thoughtworks.paranamer_paranamer-2.3.jar with timestamp 1434213463628
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.xerial.snappy_snappy-java-1.0.5.jar at http://192.168.0.10:48507/jars/org.xerial.snappy_snappy-java-1.0.5.jar with timestamp 1434213463630
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.apache.commons_commons-compress-1.4.1.jar at http://192.168.0.10:48507/jars/org.apache.commons_commons-compress-1.4.1.jar with timestamp 1434213463630
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.slf4j_slf4j-api-1.6.4.jar at http://192.168.0.10:48507/jars/org.slf4j_slf4j-api-1.6.4.jar with
timestamp 1434213463630
15/06/13 17:37:43 INFO spark.SparkContext: Added JAR file:/home/matmsh/.ivy2/jars/org.tukaani_xz-1.0.jar at http://192.168.0.10:48507/jars/org.tukaani_xz-1.0.jar with timestamp 1434213463630
15/06/13 17:37:43 INFO executor.Executor: Starting executor ID driver on host localhost
15/06/13 17:37:43 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55381.
15/06/13 17:37:43 INFO netty.NettyBlockTransferService: Server created on 55381
15/06/13 17:37:43 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/06/13 17:37:43 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:55381 with 265.4 MB RAM, BlockManagerId(driver, localhost, 55381)
15/06/13 17:37:43 INFO storage.BlockManagerMaster: Registered BlockManager

Welcome to SparkR!
Spark context is available as sc, SQL context is available as sqlContext

> read.df(sqlContext, "file:///home/matmsh/myfile.avro", "avro")
15/06/13 17:38:53 ERROR r.RBackendHandler: load on 1 failed
java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
	at
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Failed to load class for data source: avro
	at scala.sys.package$.error(package.scala:27)
	at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:216)
	at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:229)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
	at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
	... 25 more
Error: returnStatus == 0 is not TRUE

Thanks in advance for any assistance!
Shing
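P.S. From the stack trace, it is lookupDataSource that cannot resolve the short name "avro". My understanding from the spark-avro README (an assumption on my part, since I have not got it working yet) is that passing the fully-qualified package name should select the same data source, so the next thing I plan to try is:

```r
# Assumption: spark-avro 1.0.0 registers its data source under the package
# com.databricks.spark.avro, so the fully-qualified name may resolve where
# the short name "avro" does not.
df <- read.df(sqlContext, "file:///home/matmsh/myfile.avro",
              source = "com.databricks.spark.avro")
head(df)
```

If anyone can confirm whether the short name "avro" is expected to work with this combination of Spark 1.4.0 and spark-avro 1.0.0, that would also help.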