I believe what Dean Wampler was suggesting is to use the sqlContext, not the
sparkContext (sc): the createDataFrame function resides on SQLContext, not on
SparkContext.
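For example, a minimal sketch of the call in REPL style (the case class and
RDD here are just illustrative, not code from this thread):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

case class Person(name: String, age: Int)   // illustrative

val sc = new SparkContext(new SparkConf().setAppName("example"))
val sqlContext = new SQLContext(sc)

val rdd = sc.parallelize(Seq(Person("Alice", 30)))
val df = sqlContext.createDataFrame(rdd)    // on sqlContext, not on sc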
The SQLContext scaladoc:
https://spark.apache.org/docs/1.3.1/api/scala/index.html#org.apache.spark.sql.SQLContext

HTH.
-Todd

On Wed, May 13, 2015 at 6:00 AM, SLiZn Liu <sliznmail...@gmail.com> wrote:

> Additionally, after I successfully packaged the code and submitted it via
> spark-submit webcat_2.11-1.0.jar, the following error was thrown at the
> line where toDF() was called:
>
> Exception in thread "main" java.lang.NoSuchMethodError:
> scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaUniverse$JavaMirror;
> at WebcatApp$.main(webcat.scala:49)
> at WebcatApp.main(webcat.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> Unsurprisingly, if I remove toDF, no error occurs.
>
> I have moved the case class definition outside of main but inside the
> outer object scope, and removed the "provided" specification in
> build.sbt. However, when I tried *Dean Wampler*'s suggestion of using
> sc.createDataFrame(), the compiler says this function is not a member of
> sc, and I cannot find any reference to it in the latest documentation.
> What else should I try?
>
> REGARDS,
> Todd Leo
>
>
> On Wed, May 13, 2015 at 11:27 AM SLiZn Liu <sliznmail...@gmail.com> wrote:
>
>> Thanks folks, I really appreciate all your replies! I tried each of
>> your suggestions, and in particular *Animesh*'s second suggestion of
>> *making the case class definition global* got me out of the trap.
>>
>> Also, I should have pasted my entire code in this mail to help with
>> the diagnosis.
>>
>> REGARDS,
>> Todd Leo
>>
>>
>> On Wed, May 13, 2015 at 12:10 AM Dean Wampler <deanwamp...@gmail.com>
>> wrote:
>>
>>> It's the import statement Olivier showed that makes the method
>>> available.
>>>
>>> Note that you can also use `sc.createDataFrame(myRDD)`, without the
>>> need for the import statement. I personally prefer this approach.
>>>
>>> Dean Wampler, Ph.D.
>>> Author: Programming Scala, 2nd Edition
>>> <http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
>>> Typesafe <http://typesafe.com>
>>> @deanwampler <http://twitter.com/deanwampler>
>>> http://polyglotprogramming.com
>>>
>>> On Tue, May 12, 2015 at 9:33 AM, Olivier Girardot <ssab...@gmail.com>
>>> wrote:
>>>
>>>> You need to instantiate a SQLContext:
>>>>
>>>> val sc: SparkContext = ...
>>>> val sqlContext = new SQLContext(sc)
>>>> import sqlContext.implicits._
>>>>
>>>> On Tue, May 12, 2015 at 12:29 PM, SLiZn Liu <sliznmail...@gmail.com>
>>>> wrote:
>>>>
>>>>> I added `libraryDependencies += "org.apache.spark" % "spark-sql_2.11"
>>>>> % "1.3.1"` to `build.sbt` but the error remains. Do I need to import
>>>>> modules other than `import org.apache.spark.sql.{ Row, SQLContext }`?
>>>>>
>>>>> On Tue, May 12, 2015 at 5:56 PM Olivier Girardot <ssab...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> toDF is part of Spark SQL, so you need the Spark SQL dependency +
>>>>>> import sqlContext.implicits._ to get the toDF method.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Olivier.
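Putting the pieces from this thread together, here is a minimal
self-contained sketch of the toDF route (the object name is illustrative;
it assumes spark-sql 1.3.1 is on the classpath):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// The case class must be defined at the top level, outside main,
// so the compiler can generate the TypeTag that toDF needs.
case class Person(name: String, age: Int)

object ToDFExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("toDF-example"))
    val sqlContext = new SQLContext(sc)
    // This import brings toDF into scope for RDDs of case classes.
    import sqlContext.implicits._

    val people = sc.textFile("examples/src/main/resources/people.txt")
      .map(_.split(","))
      .map(p => Person(p(0), p(1).trim.toInt))
      .toDF()
    people.printSchema()
  }
}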
>>>>>> On Tue, May 12, 2015 at 11:36 AM, SLiZn Liu <sliznmail...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi User Group,
>>>>>>>
>>>>>>> I'm trying to reproduce the example in the Spark SQL Programming
>>>>>>> Guide
>>>>>>> <https://spark.apache.org/docs/latest/sql-programming-guide.html#inferring-the-schema-using-reflection>,
>>>>>>> and got a compile error when packaging with sbt:
>>>>>>>
>>>>>>> [error] myfile.scala:30: value toDF is not a member of
>>>>>>> org.apache.spark.rdd.RDD[Person]
>>>>>>> [error] val people =
>>>>>>> sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt)).toDF()
>>>>>>> [error]                                                            ^
>>>>>>> [error] one error found
>>>>>>> [error] (compile:compileIncremental) Compilation failed
>>>>>>> [error] Total time: 3 s, completed May 12, 2015 4:11:53 PM
>>>>>>>
>>>>>>> I double-checked that my code includes import sqlContext.implicits._
>>>>>>> after reading this post
>>>>>>> <https://mail-archives.apache.org/mod_mbox/spark-user/201503.mbox/%3c1426522113299-22083.p...@n3.nabble.com%3E>
>>>>>>> on the Spark mailing list, and even tried toDF("col1", "col2"), as
>>>>>>> suggested by Xiangrui Meng in that post, but got the same error.
>>>>>>>
>>>>>>> The Spark version is specified in the build.sbt file as follows:
>>>>>>>
>>>>>>> scalaVersion := "2.11.6"
>>>>>>> libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.3.1" % "provided"
>>>>>>> libraryDependencies += "org.apache.spark" % "spark-mllib_2.11" % "1.3.1"
>>>>>>>
>>>>>>> Does anyone have an idea what causes this error?
>>>>>>>
>>>>>>> REGARDS,
>>>>>>> Todd Leo
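For reference, a build.sbt sketch consolidating the fixes mentioned in this
thread (adding the spark-sql dependency and dropping "provided"); the note
on Scala binary versions is an assumption about this setup, not something
confirmed above:

scalaVersion := "2.11.6"

// All Spark artifacts should share one Scala binary version (_2.11 here),
// and it must match the Scala version of the Spark build that runs the
// jar; a mismatch (e.g. a _2.11 jar on a 2.10-built Spark) can surface as
// a scala.reflect NoSuchMethodError like the one quoted above.
libraryDependencies += "org.apache.spark" % "spark-core_2.11"  % "1.3.1"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11"   % "1.3.1"
libraryDependencies += "org.apache.spark" % "spark-mllib_2.11" % "1.3.1"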