[ https://issues.apache.org/jira/browse/SPARK-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595568#comment-14595568 ]
Arun edited comment on SPARK-8409 at 6/22/15 10:19 AM:
-------------------------------------------------------

Hi Shivaram,

I tried putting those files in the destination we discussed earlier, but that did not work. I then tried installing the spark-csv package from my home network, a private connection with no restrictions or proxies, but I got the following errors in the SparkR shell.

E:\spark-1.4.0-bin-hadoop2.6\bin>.\sparkR --packages com.databricks:spark-csv_2.10:1.0.3

R version 3.2.1 (2015-06-18) -- "World-Famous Astronaut"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Launching java with spark-submit command E:\spark-1.4.0-bin-hadoop2.6\bin\../bin/spark-submit.cmd "--packages" "com.databricks:spark-csv_2.10:1.0.3" "sparkr-shell" C:\Users\acer1\AppData\Local\Temp\Rtmp0gENwW\backend_port198831cf7692
Ivy Default Cache set to: C:\Users\acer1\.ivy2\cache
The jars for the packages stored in: C:\Users\acer1\.ivy2\jars
:: loading settings :: url = jar:file:/E:/spark-1.4.0-bin-hadoop2.6/lib/spark-assembly-1.4.0-hadoop2.6.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-csv_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
        confs: [default]
You probably access the destination server through a proxy server that is not well configured.
You probably access the destination server through a proxy server that is not well configured.
You probably access the destination server through a proxy server that is not well configured.
You probably access the destination server through a proxy server that is not well configured.
:: resolution report :: resolve 23610ms :: artifacts dl 0ms
        :: modules in use:
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
        ---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
        Host repo1.maven.org not found. url=https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.pom
        Host repo1.maven.org not found. url=https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.jar
        Host dl.bintray.com not found. url=http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.pom
        Host dl.bintray.com not found. url=http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.jar
        module not found: com.databricks#spark-csv_2.10;1.0.3
        ==== local-m2-cache: tried
          file:/C:/Users/acer1/.m2/repository/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.pom
          -- artifact com.databricks#spark-csv_2.10;1.0.3!spark-csv_2.10.jar:
          file:/C:/Users/acer1/.m2/repository/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.jar
        ==== local-ivy-cache: tried
          -- artifact com.databricks#spark-csv_2.10;1.0.3!spark-csv_2.10.jar:
          file:/C:/Users/acer1/.ivy2/local/com.databricks\spark-csv_2.10\1.0.3\jars\spark-csv_2.10.jar
        ==== central: tried
          https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.pom
          -- artifact com.databricks#spark-csv_2.10;1.0.3!spark-csv_2.10.jar:
          https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.jar
        ==== spark-packages: tried
          http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.pom
          -- artifact com.databricks#spark-csv_2.10;1.0.3!spark-csv_2.10.jar:
          http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.0.3/spark-csv_2.10-1.0.3.jar
        ::::::::::::::::::::::::::::::::::::::::::::::
        ::          UNRESOLVED DEPENDENCIES         ::
        ::::::::::::::::::::::::::::::::::::::::::::::
        :: com.databricks#spark-csv_2.10;1.0.3: not found
        ::::::::::::::::::::::::::::::::::::::::::::::
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.databricks#spark-csv_2.10;1.0.3: not found]
        at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:978)
        at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:262)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:144)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/06/22 14:26:39 INFO Utils: Shutdown hook called
Error in SparkR::sparkR.init(Sys.getenv("MASTER", unset = "")) :
  JVM is not ready after 10 seconds
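Both repo1.maven.org and dl.bintray.com come back as "not found" above, so Ivy never gets a chance to download the package. One way to take the network out of the picture is to download spark-csv_2.10-1.0.3.jar (plus its commons-csv dependency) on a machine that does have access, copy the jars over, and hand them to SparkR directly instead of using --packages. The sketch below is untested and makes several assumptions: the jars sit in a hypothetical E:/jars folder, sparkR.init is called from a plain R session, and the sparkJars value is forwarded to spark-submit's --jars.

# Untested workaround sketch: start SparkR from a plain R session and point it
# at manually downloaded jars instead of resolving --packages over the network.
# The E:/jars/... paths are hypothetical; spark-csv 1.0.3 also needs its
# commons-csv dependency on the classpath, since nothing resolves transitive
# dependencies here.
.libPaths(c("E:/spark-1.4.0-bin-hadoop2.6/R/lib", .libPaths()))
library(SparkR)

sc <- sparkR.init(
  master    = "local[*]",
  sparkHome = "E:/spark-1.4.0-bin-hadoop2.6",
  # comma-separated list passed through to spark-submit --jars (assumption)
  sparkJars = "E:/jars/spark-csv_2.10-1.0.3.jar,E:/jars/commons-csv-1.1.jar"
)
sqlContext <- sparkRSQL.init(sc)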
was (Author: b.arunguna...@gmail.com):
Hi Shivaram,

I tried putting those files in the destination we discussed earlier, but that did not work. I then tried installing the spark-csv package from my home network, a private connection with no restrictions or proxies, but I got the errors shown above. Could you run ".\sparkR --packages com.databricks:spark-csv_2.10:1.0.3" in a Windows environment to check whether the issue is with the network or with Windows itself?
10/1.0.3/spark-csv_2.10-1.0.3.jar :::::::::::::::::::::::::::::::::::::::::::::: :: UNRESOLVED DEPENDENCIES :: :::::::::::::::::::::::::::::::::::::::::::::: :: com.databricks#spark-csv_2.10;1.0.3: not found :::::::::::::::::::::::::::::::::::::::::::::: :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: c om.databricks#spark-csv_2.10;1.0.3: not found] at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(Spa rkSubmit.scala:978) at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSu bmit.scala:262) at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:144) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 15/06/22 14:26:39 INFO Utils: Shutdown hook called Error in SparkR::sparkR.init(Sys.getenv("MASTER", unset = "")) : JVM is not ready after 10 seconds > > In windows cant able to read .csv or .json files using read.df() > ----------------------------------------------------------------- > > Key: SPARK-8409 > URL: https://issues.apache.org/jira/browse/SPARK-8409 > Project: Spark > Issue Type: Bug > Components: SparkR, Windows > Affects Versions: 1.4.0 > Environment: sparkR API > Reporter: Arun > Priority: Critical > > Hi, > In SparkR shell, I invoke: > > mydf<-read.df(sqlContext, "/home/esten/ami/usaf.json", source="json", > > header="false") > I have tried various filetypes (csv, txt), all fail. > in sparkR of spark 1.4 for eg.) df_1<- read.df(sqlContext, > "E:/setup/spark-1.4.0-bin-hadoop2.6/spark-1.4.0-bin-hadoop2.6/examples/src/main/resources/nycflights13.csv", > source = "csv") > RESPONSE: "ERROR RBackendHandler: load on 1 failed" > BELOW THE WHOLE RESPONSE: > 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(177600) called with > curMem=0, maxMem=278302556 > 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0 stored as values in > memory (estimated size 173.4 KB, free 265.2 MB) > 15/06/16 08:09:13 INFO MemoryStore: ensureFreeSpace(16545) called with > curMem=177600, maxMem=278302556 > 15/06/16 08:09:13 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes > in memory (estimated size 16.2 KB, free 265.2 MB) > 15/06/16 08:09:13 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory > on localhost:37142 (size: 16.2 KB, free: 265.4 MB) > 15/06/16 08:09:13 INFO SparkContext: Created broadcast 0 from load at > NativeMethodAccessorImpl.java:-2 > 15/06/16 08:09:16 WARN DomainSocketFactory: The short-circuit local reads > feature cannot be used because libhadoop cannot be loaded. 
> 15/06/16 08:09:17 ERROR RBackendHandler: load on 1 failed
> java.lang.reflect.InvocationTargetException
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:127)
>         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:74)
>         at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:36)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json
>         at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:285)
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>         at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
>         at org.apache.spark.rdd.RDD$$anonfun$treeAggregate$1.apply(RDD.scala:1069)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:148)
>         at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:109)
>         at org.apache.spark.rdd.RDD.withScope(RDD.scala:286)
>         at org.apache.spark.rdd.RDD.treeAggregate(RDD.scala:1067)
>         at org.apache.spark.sql.json.InferSchema$.apply(InferSchema.scala:58)
>         at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:139)
>         at org.apache.spark.sql.json.JSONRelation$$anonfun$schema$1.apply(JSONRelation.scala:138)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.sql.json.JSONRelation.schema$lzycompute(JSONRelation.scala:137)
>         at org.apache.spark.sql.json.JSONRelation.schema(JSONRelation.scala:137)
>         at org.apache.spark.sql.sources.LogicalRelation.<init>(LogicalRelation.scala:30)
>         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:120)
>         at org.apache.spark.sql.SQLContext.load(SQLContext.scala:1230)
>         ... 25 more
> Error: returnStatus == 0 is not TRUE
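The underlying cause in the trace is "Input path does not exist: hdfs://smalldata13.hdp:8020/home/esten/ami/usaf.json": the unqualified path is resolved against the cluster's default filesystem (HDFS), not the local disk, so read.df never sees the file. Below is a minimal sketch of the usual fix, assuming the files really live on the driver's local disk and spark-csv is already on the classpath; the paths are only examples.

# Hedged sketch: qualify the path with an explicit scheme so Spark does not
# resolve it against the default HDFS filesystem named in the error above.
# Paths are examples; spark-csv must already be on the classpath.

# JSON file that lives on the driver's local disk:
mydf <- read.df(sqlContext, "file:///home/esten/ami/usaf.json", source = "json")

# CSV on a local Windows drive; with Spark 1.4 the spark-csv package is
# selected by its full source name rather than the short name "csv":
df_1 <- read.df(sqlContext,
                "file:///E:/setup/spark-1.4.0-bin-hadoop2.6/spark-1.4.0-bin-hadoop2.6/examples/src/main/resources/nycflights13.csv",
                source = "com.databricks.spark.csv",
                header = "true")
head(df_1)

Alternatively, copying the files into HDFS and passing an hdfs:// path would also satisfy the default filesystem that the error message points at.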