The toDF function in Scala uses a bit of Scala magic that lets you add methods to existing classes. Here is a link to an explanation: https://www.oreilly.com/library/view/scala-cookbook/9781449340292/ch01s11.html
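The mechanism can be sketched in plain Scala without Spark at all. Everything below (FakeDF, RichList, listToRichList) is an illustrative stand-in, not Spark's actual API; it only demonstrates how the compiler rewrites a call to a method that the original class lacks:

```scala
import scala.language.implicitConversions

object ToDFSketch {
  // Stand-in for org.apache.spark.sql.DataFrame (illustrative only)
  case class FakeDF(colName: String, rows: List[String])

  // Wrapper class that carries the extra method
  class RichList(val list: List[String]) {
    def toDF(colName: String): FakeDF = FakeDF(colName, list)
  }

  // Implicit converter: the compiler applies it when it cannot
  // find toDF on List itself
  implicit def listToRichList(list: List[String]): RichList =
    new RichList(list)

  def main(args: Array[String]): Unit = {
    val df = List("apple", "orange", "cherry").toDF("fruit")
    // The compiler rewrote the line above to:
    // listToRichList(List("apple", "orange", "cherry")).toDF("fruit")
    println(df)
  }
}
```

In real Spark code the converter comes into scope when you `import spark.implicits._` in the shell or in your application.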
In short, you can implement a class that extends the List class and adds methods to it, and you can implement an implicit converter that converts from List to your class. When the Scala compiler sees that you are calling a method on a List object that doesn't exist in the List class, it will look for an implicit converter that converts the List object to another type that does have the method, and will apply it automatically.

So, if you have a class

    class MyList extends List {
      def toDF(colName: String): DataFrame = { ….. }
    }

and an implicit converter

    implicit def convertListToMyList(list: List): MyList = { …. }

then when you write

    List("apple", "orange", "cherry").toDF("fruit")

internally, Scala will generate the code as

    convertListToMyList(List("apple", "orange", "cherry")).toDF("fruit")

From: Bitfox <bit...@bitfox.top>
Date: Wednesday, March 16, 2022 at 12:06 AM
To: "user @spark" <user@spark.apache.org>
Subject: [EXTERNAL] Question on List to DF

I am wondering why a List in Scala Spark can be converted into a DataFrame directly?

    scala> val df = List("apple","orange","cherry").toDF("fruit")
    df: org.apache.spark.sql.DataFrame = [fruit: string]

    scala> df.show
    +------+
    | fruit|
    +------+
    | apple|
    |orange|
    |cherry|
    +------+

I don't think PySpark can convert that as well. Thank you.