Thank you, that makes sense.

On Wed, Mar 16, 2022 at 2:03 PM Lalwani, Jayesh <jlalw...@amazon.com> wrote:
> The toDF function in Scala uses a bit of Scala magic that allows you to
> add methods to existing classes. Here's a link to an explanation:
> https://www.oreilly.com/library/view/scala-cookbook/9781449340292/ch01s11.html
>
> In short, you can implement a class that wraps a List and adds methods
> to it, and you can implement an implicit converter that converts from
> List to your class. When the Scala compiler sees that you are calling a
> method on a List object that doesn't exist in the List class, it will
> look for an implicit converter that converts the List object to another
> object that has the method, and will automatically call the converter.
>
> So, if you have a class
>
> class MyList(list: List[String]) {
>   def toDF(colName: String): DataFrame = {
>     .....
>   }
> }
>
> and an implicit converter
>
> implicit def convertListToMyList(list: List[String]): MyList = {
>   .....
> }
>
> then when you write
>
> List("apple","orange","cherry").toDF("fruit")
>
> internally, Scala will generate the code as
>
> convertListToMyList(List("apple","orange","cherry")).toDF("fruit")
>
> *From:* Bitfox <bit...@bitfox.top>
> *Date:* Wednesday, March 16, 2022 at 12:06 AM
> *To:* "user @spark" <user@spark.apache.org>
> *Subject:* [EXTERNAL] Question on List to DF
>
> I am wondering why a list in Scala Spark can be converted into a
> DataFrame directly:
>
> scala> val df = List("apple","orange","cherry").toDF("fruit")
> df: org.apache.spark.sql.DataFrame = [fruit: string]
>
> scala> df.show
> +------+
> | fruit|
> +------+
> | apple|
> |orange|
> |cherry|
> +------+
>
> I don't think PySpark can do the same.
>
> Thank you.
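The mechanism Jayesh describes can be demonstrated without Spark at all. The sketch below is a minimal, self-contained illustration of the implicit-conversion ("enrich my library") pattern: `RichStringList` and its string-returning `toDF` are hypothetical stand-ins for illustration only — Spark's real `toDF` comes from its own implicits (brought in by `import spark.implicits._`) and returns an actual `DataFrame`.

```scala
// Sketch of the implicit-conversion pattern (Scala 3 top-level syntax).
import scala.language.implicitConversions

// Wrapper class that adds a toDF method to List[String].
// Here toDF just returns a descriptive string, standing in for a DataFrame.
class RichStringList(list: List[String]) {
  def toDF(colName: String): String =
    s"DataFrame[$colName: string] with ${list.size} rows"
}

// Implicit converter: when the compiler sees .toDF called on a
// List[String], it rewrites the call to enrich(...).toDF(...).
implicit def enrich(list: List[String]): RichStringList =
  new RichStringList(list)

val df = List("apple", "orange", "cherry").toDF("fruit")
println(df) // prints: DataFrame[fruit: string] with 3 rows
```

In modern Scala the wrapper and converter are usually fused into a single `implicit class RichStringList(list: List[String])`, which is exactly the shape Spark uses internally to attach `toDF` to local collections.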