[ https://issues.apache.org/jira/browse/SPARK-16055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346708#comment-15346708 ]

Shivaram Venkataraman commented on SPARK-16055:
-----------------------------------------------

Thanks [~KrishnaKalyan3] -- Feel free to pick this up. Once you create a PR it 
will automatically be assigned to you. You can see more details at 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-ContributingCodeChanges

> sparkR.init() can not load sparkPackages when executing an R file
> -----------------------------------------------------------------
>
>                 Key: SPARK-16055
>                 URL: https://issues.apache.org/jira/browse/SPARK-16055
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: SparkR
>    Affects Versions: 1.6.1
>            Reporter: Sun Rui
>            Priority: Minor
>
> This issue was reported on the Spark user mailing list; refer to 
> http://comments.gmane.org/gmane.comp.lang.scala.spark.user/35742
> This issue does not occur in an interactive SparkR session, but it does 
> occur when executing an R file.
> The following example code can be put into an R file to reproduce this issue:
> {code}
> # Make the SparkR package bundled with the Spark distribution visible to R
> .libPaths(c("/home/user/spark-1.6.1-bin-hadoop2.6/R/lib", .libPaths()))
> Sys.setenv(SPARK_HOME = "/home/user/spark-1.6.1-bin-hadoop2.6")
> library("SparkR")
> # sparkPackages is silently ignored when this file is run non-interactively
> sc <- sparkR.init(sparkPackages = "com.databricks:spark-csv_2.11:1.4.0")
> sqlContext <- sparkRSQL.init(sc)
> df <- read.df(sqlContext,
>               "file:///home/user/spark-1.6.1-bin-hadoop2.6/data/mllib/sample_tree_data.csv",
>               "csv")
> showDF(df)
> {code}
> The error message is as follows:
> {panel}
> 16/06/19 15:48:56 ERROR RBackendHandler: loadDF on org.apache.spark.sql.api.r.SQLUtils failed
> Error in invokeJava(isStatic = TRUE, className, methodName, ...) :
>   java.lang.ClassNotFoundException: Failed to find data source: csv. Please find packages at http://spark-packages.org
>       at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>       at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>       at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>       at org.apache.spark.sql.api.r.SQLUtils$.loadDF(SQLUtils.scala:160)
>       at org.apache.spark.sql.api.r.SQLUtils.loadDF(SQLUtils.scala)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:606)
>       at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
>       at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala
> Calls: read.df -> callJStatic -> invokeJava
> Execution halted
> {panel}
> The reason is that when an R file is executed, the R backend is launched 
> before the R interpreter runs the script, so packages specified via 
> 'sparkPackages' are never passed to the backend JVM.
> This JIRA tracks the problem; an appropriate solution is to be discussed. 
> One option is to document the limitation. A possible workaround is 
> sketched below.
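> A possible workaround, under the assumption that the SparkR launcher in 
> 1.x reads the SPARKR_SUBMIT_ARGS environment variable before starting the 
> backend JVM: put the --packages flag there, so the backend already has the 
> package by the time sparkR.init() connects. A minimal sketch:
> {code}
> # Workaround sketch (assumes SPARKR_SUBMIT_ARGS is honored by the launcher;
> # the trailing "sparkr-shell" token is required so the launcher starts the
> # R backend rather than a plain spark-submit invocation).
> Sys.setenv(SPARK_HOME = "/home/user/spark-1.6.1-bin-hadoop2.6")
> Sys.setenv("SPARKR_SUBMIT_ARGS" =
>   "--packages com.databricks:spark-csv_2.11:1.4.0 sparkr-shell")
> .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
> library("SparkR")
> # sparkPackages can be omitted; the backend JVM already resolved the package
> sc <- sparkR.init()
> sqlContext <- sparkRSQL.init(sc)
> df <- read.df(sqlContext,
>               "file:///home/user/spark-1.6.1-bin-hadoop2.6/data/mllib/sample_tree_data.csv",
>               "csv")
> showDF(df)
> {code}
> Alternatively, running the file through bin/spark-submit with a --packages 
> flag should have the same effect, since spark-submit processes the packages 
> before the backend starts.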



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
