Please share the output of df.explain() [1] for both. That should give us some clues about where they differ.
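In case it helps, here is a minimal sketch of how the two plans could be printed side by side. The names dfA and dfB, the sample data, and the local master setting are placeholders I made up for illustration, not anything from your job; substitute your own two DataFrames. It assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object ExplainBoth {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session purely for illustration.
    val spark = SparkSession.builder()
      .appName("explain-comparison")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for the two DataFrames being compared.
    val dfA = Seq((1, "a"), (2, "b")).toDF("id", "value").filter($"id" > 1)
    val dfB = Seq((1, "a"), (2, "b")).toDF("id", "value").where("id > 1")

    // explain() prints the physical plan to stdout.
    println("=== dfA ===")
    dfA.explain()
    println("=== dfB ===")
    dfB.explain()

    // explain(true) additionally prints the parsed, analyzed and
    // optimized logical plans, which is often where differences show up.
    println("=== dfA (extended) ===")
    dfA.explain(true)
    println("=== dfB (extended) ===")
    dfB.explain(true)

    spark.stop()
  }
}
```

If the physical plans look identical, the extended output is usually the next thing worth comparing.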
[1] https://github.com/apache/spark/blob/e1c90d66bbea5b4cb97226610701b0389b734651/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L499