Re: The Dataset unit test is much slower than the RDD unit test (in Scala)

2022-11-01 Thread Cheng Pan
Which Spark version are you using? SPARK-36444 [1] and SPARK-38138 [2] may be related. Please test with a patched version, or disable dynamic partition pruning (DPP) by setting spark.sql.optimizer.dynamicPartitionPruning.enabled=false, to see if it helps. [1] https://issues.apache.org/jira/browse/SPARK-36444 [2]
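For reference, a minimal sketch of how the DPP setting might be applied in a Scala unit test. The master URL and application name are illustrative assumptions, not taken from the original test, and the sketch assumes a Spark version in which this configuration key exists.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical local session for a unit test; master and appName are placeholders.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("dataset-test-without-dpp")
  // Turn off dynamic partition pruning so its optimizer rules are skipped.
  .config("spark.sql.optimizer.dynamicPartitionPruning.enabled", "false")
  .getOrCreate()

// The same key can also be set on an already running session.
spark.conf.set("spark.sql.optimizer.dynamicPartitionPruning.enabled", "false")
```

Comparing the test runtime with and without this setting should indicate whether DPP contributes to the slowdown.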

Re: The Dataset unit test is much slower than the RDD unit test (in Scala)

2022-11-01 Thread Enrico Minack
Hi Tanin, running your test with the option "spark.sql.planChangeLog.level" set to "info" or "warn" (depending on your Spark log level) will give you insight into query planning: which rules are applied, how long each rule takes, and how many iterations are run. Hope this helps, Enrico On 25.10.22
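As a rough sketch, the plan change log level could be set when the test session is built. The master URL and application name are illustrative assumptions, and the key is assumed to be available in the Spark version under test.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical local session for a unit test; master and appName are placeholders.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("dataset-test-plan-change-log")
  // Log plan changes at WARN so they are visible even if the logger filters out INFO.
  .config("spark.sql.planChangeLog.level", "WARN")
  .getOrCreate()
```

Pick a level that your logging configuration actually prints; otherwise the planner output is dropped before it reaches the test log.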