Below is my test code using Spark 2.0.1. DeserializeToObject does not appear in the plan for filter(), but it does for map(). Does this mean map() does not run as a Tungsten operation, i.e. it leaves the internal binary row format?

case class Event(id: Long)
val e1 = Seq(Event(1L), Event(2L)).toDS
val e2 = Seq(Event(2L), Event(3L)).toDS

e1.filter(e => e.id < 10 && e.id > 5).explain
// == Physical Plan ==
// *Filter <function1>.apply
// +- LocalTableScan [id#145L]

e1.map(e => e.id < 10 && e.id > 5).explain
// == Physical Plan ==
// *SerializeFromObject [input[0, boolean, true] AS value#155]
// +- *MapElements <function1>, obj#154: boolean
//    +- *DeserializeToObject newInstance(class $line41.$read$$iw$$iw$Event), obj#153: $line41.$read$$iw$$iw$Event
//       +- LocalTableScan [id#145L]
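
For comparison, here is the same logic written with Column expressions instead of Scala lambdas (my own variant, just for illustration); as far as I understand, these should stay on the internal row format because there is no closure that needs a deserialized Event:

e1.filter($"id" < 10 && $"id" > 5).explain
// I expect a codegen'd Filter over LocalTableScan, with no DeserializeToObject

e1.select(($"id" < 10 && $"id" > 5).as("value")).explain
// likewise, a Project instead of DeserializeToObject/MapElements/SerializeFromObject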

Another question: if I register a complex function as a UDF, in what situations will DeserializeToObject/SerializeFromObject appear in the plan?
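
To make that concrete, this is roughly what I have in mind (the name "complex" and the predicate are just placeholders I made up):

import org.apache.spark.sql.functions.udf

// UDF applied as a Column
val complex = udf((id: Long) => id < 10 && id > 5)
e1.filter(complex($"id")).explain

// or registered for use in SQL/expression strings
spark.udf.register("complex", (id: Long) => id < 10 && id > 5)
e1.filter("complex(id)").explain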

Thanks.