Hi Jörn,

You can measure the serialization/deserialization time yourself using the web UI or SparkListeners.
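
As a minimal sketch of the SparkListener route (not a definitive recipe), the listener below sums two per-task metrics that Spark reports: `executorDeserializeTime` and `resultSerializationTime`. The class name `SerDeTimeListener` and its counter fields are hypothetical, and an existing SparkContext named `sc` (as in spark-shell) is assumed. Note that these metrics cover deserializing the task on the executor and serializing the task result, so they are only a partial view of the RDD-to-DataFrame conversion cost; comparing job durations with and without the conversion is probably still needed.

```scala
import java.util.concurrent.atomic.AtomicLong

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Hypothetical listener: accumulates (de)serialization times, in milliseconds,
// over all tasks that finish while it is registered.
class SerDeTimeListener extends SparkListener {
  val deserializeMs = new AtomicLong(0L)
  val resultSerializeMs = new AtomicLong(0L)

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskMetrics may be null (for example for some failed tasks), so guard it.
    Option(taskEnd.taskMetrics).foreach { m =>
      deserializeMs.addAndGet(m.executorDeserializeTime)
      resultSerializeMs.addAndGet(m.resultSerializationTime)
    }
  }
}

// Usage, assuming `sc` is an existing SparkContext:
//   val listener = new SerDeTimeListener
//   sc.addSparkListener(listener)
//   ... run the RDD <-> DataFrame conversions followed by an action ...
//   println(s"task deserialize: ${listener.deserializeMs.get} ms, " +
//     s"result serialize: ${listener.resultSerializeMs.get} ms")
```

Listener events are delivered asynchronously, so the counters may lag slightly behind the end of a job. The same two metrics are also shown per stage in the web UI.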

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Fri, Jun 24, 2016 at 10:14 AM, Jörn Franke <jornfra...@gmail.com> wrote:
> I would push the Spark people to provide equivalent functionality. In the
> end it is a deserialization/serialization process, which should not be done
> back and forth because it is one of the more costly aspects of
> processing. It needs to convert Java objects to a binary representation. It
> is OK to do it once, because afterwards access in binary form is much
> more efficient, but that benefit becomes irrelevant if you convert back
> and forth all the time.
>
> I have heard somewhere the figure that serialization/deserialization takes
> 80% of the time in the big data world, but I would be happy to see this
> figure confirmed empirically for different scenarios. Unfortunately I do
> not have a source for this figure, so do not take it for granted.
>
>> On 24 Jun 2016, at 08:00, pan <pranav.na...@gmail.com> wrote:
>>
>> Hello,
>> I am trying to understand the cost of converting an RDD to a DataFrame and
>> back. Would converting back and forth very frequently hurt performance?
>>
>> I do observe that some operations, like join, are implemented very differently
>> for RDD (pair) and DataFrame, so I am trying to figure out the cost of
>> converting one to the other.
>>
>> Regards,
>> Pranav
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Cost-of-converting-RDD-s-to-dataframe-and-back-tp27222.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org