PM
To: Antoaneta Marinova
Cc: user
Subject: Re: Spark 2.0 - DataFrames vs Dataset performance
Hi Antoaneta,
I believe the difference is not due to Datasets being slower (a DataFrame
is now just an alias for Dataset[Row]), but rather to filtering with a
user-defined function instead of Spark built-ins. The built-in can use tricks
from Project Tungsten, such as only deserializing the "event_type" co