The toDF function in Scala uses a bit of Scala magic that lets you add 
methods to existing classes. Here is a link to an explanation: 
https://www.oreilly.com/library/view/scala-cookbook/9781449340292/ch01s11.html

In short, you can implement a class that wraps a List and adds methods to it, 
and you can implement an implicit converter that converts from List to your 
class. When the Scala compiler sees that you are calling a method on a List 
object that doesn't exist in the List class, it looks for an implicit 
converter that converts the List to another type that does have the method, 
and applies that conversion automatically.

So, if you have a class

class MyList(list: List[String]) {
  def toDF(colName: String): DataFrame = {
    ...
  }
}

and an implicit converter

implicit def convertListToMyList(list: List[String]): MyList = {
  ...
}

then when you write

List("apple","orange","cherry").toDF("fruit")

Scala will internally generate the equivalent of

convertListToMyList(List("apple","orange","cherry")).toDF("fruit")
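The mechanism can be sketched without Spark at all. The self-contained example below adds a hypothetical `describe` method to List[String] through an implicit conversion; the names RichStringList, toRichStringList, and describe are mine for illustration, not part of Spark. (In Spark itself, `import spark.implicits._` is what brings the analogous conversion into scope so that toDF becomes available on local sequences.)

```scala
import scala.language.implicitConversions

object ImplicitDemo {
  // Wrapper class that carries the extra method (names are illustrative).
  class RichStringList(val list: List[String]) {
    def describe(label: String): String =
      s"$label: ${list.mkString(", ")}"
  }

  // The implicit converter the compiler looks up when `describe`
  // is called on a plain List[String].
  implicit def toRichStringList(list: List[String]): RichStringList =
    new RichStringList(list)

  def main(args: Array[String]): Unit = {
    // The compiler rewrites this call to:
    // toRichStringList(List("apple","orange","cherry")).describe("fruit")
    println(List("apple", "orange", "cherry").describe("fruit"))
    // prints: fruit: apple, orange, cherry
  }
}
```

Modern Scala code usually packages this pattern as a single `implicit class` (or, in Scala 3, an extension method), which combines the wrapper and the converter in one definition.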


From: Bitfox <bit...@bitfox.top>
Date: Wednesday, March 16, 2022 at 12:06 AM
To: "user @spark" <user@spark.apache.org>
Subject: [EXTERNAL] Question on List to DF




I am wondering why a List in Scala Spark can be converted into a DataFrame 
directly?


scala> val df = List("apple","orange","cherry").toDF("fruit")

df: org.apache.spark.sql.DataFrame = [fruit: string]



scala> df.show
+------+
| fruit|
+------+
| apple|
|orange|
|cherry|
+------+



I don't think PySpark can do that conversion directly.



Thank you.
