Re: [SQL] When SQLConf vals gets own accessor defs?

2021-09-06 Thread Wenchen Fan
I think SQLConf doesn't need defs anymore. In the beginning, SQLConf lived in sql/core, so we had to add defs whenever code in sql/catalyst needed to access configs. Now that SQLConf is in sql/catalyst (this was done a few years ago), defs are only needed if we have some special logic that is not just

Re: [SQL] s.s.a.coalescePartitions.parallelismFirst true but recommends false

2021-09-06 Thread Wenchen Fan
This is correct. It's true by default so that AQE doesn't introduce a performance regression: if you run a benchmark, larger parallelism usually means better performance. However, setting it to false is recommended so that AQE can give better resource utilization, which is good for a busy Spark
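
For readers who want to apply the recommendation above, here is a minimal sketch of building the relevant settings. It assumes the "s.s.a." in the subject line abbreviates `spark.sql.adaptive`; the helper names `aqe_conf` and `to_submit_args` are hypothetical and not part of Spark.

```python
from typing import Dict, List

# Assumed full name of the option abbreviated in the subject line.
PARALLELISM_FIRST = "spark.sql.adaptive.coalescePartitions.parallelismFirst"


def aqe_conf(busy_cluster: bool) -> Dict[str, str]:
    """Hypothetical helper: AQE settings for a shared vs. a benchmark cluster.

    On a busy cluster, parallelismFirst is set to "false" so that AQE can
    favour resource utilization; otherwise the default "true" is kept.
    """
    return {
        "spark.sql.adaptive.enabled": "true",
        "spark.sql.adaptive.coalescePartitions.enabled": "true",
        PARALLELISM_FIRST: "false" if busy_cluster else "true",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Hypothetical helper: render settings as spark-submit --conf arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(aqe_conf(busy_cluster=True))))
```

The same key/value pairs can be placed in `spark-defaults.conf` or passed to a session builder; the sketch only shows which value goes with which goal.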

Re: Observer Namenode and Committer Algorithm V1

2021-09-06 Thread Adam Binford
Sharing some things I learned looking into the Delta Lake issue: - This was a read-after-write inconsistency _all on the driver_. Specifically, Delta Lake currently uses the FileSystem API for reading table logs for greater compatibility, but the FileContext API for writes to get atomic renames. This led to