Hi Jorn,

You can measure the serialization/deserialization time yourself using the web UI or a SparkListener.
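
For a rough feel of the effect outside Spark, here is a minimal sketch that times object-to-binary round trips using only the Python standard library. It is an illustration under assumptions, not a Spark measurement: pickle stands in for Spark's own serialization, and the sample records and the helper names (round_trip, time_round_trips) are hypothetical. Absolute numbers will differ from what Spark reports.

```python
import pickle
import time


def round_trip(records):
    """Serialize records to bytes, then deserialize them back to objects."""
    data = pickle.dumps(records, protocol=pickle.HIGHEST_PROTOCOL)
    return pickle.loads(data)


def time_round_trips(records, trips):
    """Run `trips` consecutive round trips; return (result, elapsed seconds)."""
    start = time.perf_counter()
    current = records
    for _ in range(trips):
        current = round_trip(current)
    return current, time.perf_counter() - start


# Hypothetical sample data: 100,000 small (id, name) records.
records = [(i, "name-%d" % i) for i in range(100_000)]

# Convert once versus converting back and forth ten times.
once, t_once = time_round_trips(records, trips=1)
many, t_many = time_round_trips(records, trips=10)

print("1 round trip:   %.3f s" % t_once)
print("10 round trips: %.3f s" % t_many)
```

The data comes back unchanged in both cases, so any extra time spent on the repeated conversions buys nothing, which is the pattern worth checking for in a real job.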

Regards,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Fri, Jun 24, 2016 at 10:14 AM, Jörn Franke <jornfra...@gmail.com> wrote:
> I would push the Spark people to provide equivalent functionality. In the 
> end it is a deserialization/serialization process which should not be done 
> back and forth because it is one of the more costly aspects during 
> processing. It needs to convert Java objects to a binary representation. It 
> is ok to do it once, because afterwards the access in binary form is much 
> more efficient, but this will be completely irrelevant if you convert back 
> and forth all the time.
>
> I have heard somewhere the figure that serialization/deserialization takes 
> 80% of the time in the big data world, but I would be happy to see this 
> figure confirmed empirically for different scenarios. Unfortunately I do 
> not have a source for this figure, so do not take it for granted.
>
>> On 24 Jun 2016, at 08:00, pan <pranav.na...@gmail.com> wrote:
>>
>> Hello,
>>   I am trying to understand the cost of converting an RDD to a DataFrame and
>> back. Would converting back and forth very frequently hurt performance?
>>
>> I do observe that some operations like join are implemented very differently
>> for RDD (pair) and DataFrame, so I am trying to figure out the cost of
>> converting one to the other.
>>
>> Regards,
>> Pranav
>>
>>
>>
>> --
>> View this message in context: 
>> http://apache-spark-user-list.1001560.n3.nabble.com/Cost-of-converting-RDD-s-to-dataframe-and-back-tp27222.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>