[ https://issues.apache.org/jira/browse/SPARK-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14591723#comment-14591723 ]
Arun edited comment on SPARK-8409 at 6/18/15 12:48 PM:
-------------------------------------------------------

Hi Shivaram, I got the error below when I did as you suggested, reading a CSV file from HDFS. Please note that the HDFS path I have given is syntactically correct. TIA

> df_1 <- read.df(sqlContext, "hdfs://ABRLMISDEV:8020/sparkR/Data_sale_quantity_Cleaned_Missing_dates.csv", "com.databricks.spark.csv", header="true")
15/06/18 17:55:53 ERROR RBackendHandler: load on 1 failed
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
        at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.csv
        at scala.sys.package$.error(package.scala:27)
        at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:216)
        at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:229)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:114)
        at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
        ... 25 more
Error: returnStatus == 0 is not TRUE

was (Author: b.arunguna...@gmail.com):
Hi Shivaram, I got the error below when I did as you suggested, reading a CSV file from HDFS. Please note that the HDFS link I have given is syntactically correct. TIA

> In windows cant able to read .csv or .json files using read.df()
> -----------------------------------------------------------------
>
>                 Key: SPARK-8409
>                 URL: https://issues.apache.org/jira/browse/SPARK-8409
>             Project: Spark
>          Issue Type: Bug
>          Components: Build
>    Affects Versions: 1.4.0
>         Environment: sparkR API
>            Reporter: Arun
>            Priority: Critical
>              Labels: build
>
> Hi,
> In the SparkR shell, I invoke:
> > mydf <- read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", header="false")
> I have tried various file types (csv, txt); all fail.
> For example, in SparkR of Spark 1.4:
> df_1 <- read.df(sqlContext, "E:/setup/spark-1.4.0-bin-hadoop2.6/spark-1.4.0-bin-hadoop2.6/examples/src/main/resources/nycflights13.csv", source = "csv")
> RESPONSE: "ERROR RBackendHandler: load on 1 failed"
> BELOW THE WHOLE RESPONSE:
> 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(177600) called with curMem=0, maxMem=278302556
> 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 173.4 KB, free 265.2 MB)
> 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(16545) called with curMem=177600, maxMem=278302556
> 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 16.2 KB, free 265.2 MB)
> 15/06/16 08:09:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:37142 (size: 16.2 KB, free: 265.4 MB)
> 15/06/16 08:09:13 INFO SparkContext: Created broadcast 0 from load at NativeMethodAccessorImpl.java:-2
> 15/06/16 08:09:16 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 15/06/16 08:09:17 ERROR RBackendHandler: load on 1 failed
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
>         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
>         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json
>         at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>         at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1069)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
>         at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1067)
>         at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
>         at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
>         at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
>         at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
>         at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
>         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>         at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
>         ... 25 more
> Error: returnStatus == 0 is not TRUE

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
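
A note on the quoted trace: the command passed the scheme-less path /home/esten/ami/usaf.json, yet the "Caused by" line reports hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json. One likely reading is that a path without a scheme is resolved against the cluster's default filesystem, so a file on the local disk would need an explicit file:// URI. The sketch below is a simplified model of that resolution in Python, not Hadoop's actual code; the function name qualify_path is hypothetical, and the default filesystem value is assumed from the error message.

```python
from urllib.parse import urlparse


def qualify_path(path: str, default_fs: str) -> str:
    """Simplified model of resolving a path against a default filesystem.

    A path that already carries a scheme (hdfs://, file://, ...) is returned
    unchanged; a scheme-less absolute path is prefixed with default_fs.
    Illustration only: this is not Hadoop's implementation.
    """
    if urlparse(path).scheme:
        return path
    return default_fs.rstrip("/") + "/" + path.lstrip("/")


# Assumed from the "Input path does not exist" message in the quoted trace.
DEFAULT_FS = "hdfs://smalldata13.hdp:8020"

# Scheme-less path: ends up on the default (HDFS) filesystem.
print(qualify_path("/home/esten/ami/usaf.json", DEFAULT_FS))

# Explicit file:// URI: left as given, so it refers to the local disk.
print(qualify_path("file:///home/esten/ami/usaf.json", DEFAULT_FS))
```

Under this reading, the path given to read.df() would be written as "file:///home/esten/ami/usaf.json" when the JSON file lives on the local filesystem rather than in HDFS.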