GitHub user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20856

@HyukjinKwon good analysis! Currently Spark is a little messy about what should be serialized and sent to executors. Sometimes we send an entire query tree but only read a few properties of it. It seems to me it would be better to always do codegen on the driver side, to avoid complex expression/plan operations on the executor side (not sure if that's possible, cc @viirya @rednaxelafx @kiszk).

For this particular problem, I think we can just change these `val`s to `lazy val` or `def` in `FileSourceScanExec`, with a unit test.
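
To illustrate the `val` versus `lazy val` suggestion, here is a minimal sketch. It is not the actual `FileSourceScanExec` code. It assumes the trouble comes from eager `val`s that read driver-only state when a node is constructed on an executor. `Session`, `Relation`, `EagerScan`, `LazyScan`, and `batchEnabled` are hypothetical names made up for this example.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for driver-only state (for example, a session object).
class Session { val batchEnabled: Boolean = true }

// Hypothetical stand-in for a relation; the session is @transient, so it is
// null after the relation has been serialized and deserialized.
class Relation(@transient val session: Session) extends Serializable

// Eager: the val is computed in the constructor, so building this node from a
// deserialized relation dereferences the null session immediately.
class EagerScan(val relation: Relation) extends Serializable {
  val supportsBatch: Boolean = relation.session.batchEnabled
}

// Lazy: nothing touches the session until supportsBatch is first read.
class LazyScan(val relation: Relation) extends Serializable {
  lazy val supportsBatch: Boolean = relation.session.batchEnabled
}

object LazyValSketch {
  // Java-serialization round trip, standing in for shipping an object to an executor.
  def roundTrip[T <: AnyRef](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // True if constructing the eager node from this relation throws an NPE.
  def eagerFails(relation: Relation): Boolean =
    try { new EagerScan(relation); false }
    catch { case _: NullPointerException => true }

  // True if the lazy node can be constructed without touching the session.
  def lazySucceeds(relation: Relation): Boolean =
    try { new LazyScan(relation); true }
    catch { case _: NullPointerException => false }

  def main(args: Array[String]): Unit = {
    val shipped = roundTrip(new Relation(new Session))
    println(s"session is null after shipping: ${shipped.session == null}")
    println(s"eager constructor fails: ${eagerFails(shipped)}")
    println(s"lazy constructor succeeds: ${lazySucceeds(shipped)}")
  }
}
```

Note that `lazy val` only defers the failure: reading `supportsBatch` on the deserialized `LazyScan` would still throw. The change helps only if executor-side code never reads those properties, which is why a unit test covering the executor-side path seems worthwhile.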