[PySpark][Spark Dataframe][Observation] Why doesn't a join with an empty dataframe let you get metrics from an observation?
Hey folks,

I'm actively using the observe method in my Spark jobs and noticed some interesting behavior. Here is an example of working and non-working code: https://gist.github.com/Coola4kov/8aeeb05abd39794f8362a3cf1c66519c

In a few words: if I join a dataframe after applying some filter rules and it turns out to be empty, the observations configured on the first dataframe never return any results, unless an action is called on the empty dataframe specifically before the join.

This looks like a bug to me. I would appreciate any advice on how to fix this behavior.
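For readers who don't open the gist, the pattern can be sketched roughly as below. This is a minimal illustration and not the gist's code: the function name `build_observed_join`, the column name `key`, the metric name `rows`, and the choice of which side of the join ends up empty are all assumptions made for the example.

```python
def build_observed_join(spark):
    """Return (observation, joined_df) for an observed dataframe joined to an empty one.

    Hypothetical helper for illustration; `spark` is an existing SparkSession.
    """
    # Imports are local so the sketch can be loaded without a Spark installation.
    from pyspark.sql import Observation
    from pyspark.sql import functions as F

    # First dataframe: the one the observation is attached to.
    left = spark.range(10).withColumnRenamed("id", "key")

    # Second dataframe: the filter matches nothing, so it is empty at runtime.
    right = (
        spark.range(10)
        .withColumnRenamed("id", "key")
        .filter(F.col("key") < 0)
    )

    # Attach a row-count metric to the first dataframe.
    obs = Observation("left_metrics")
    observed = left.observe(obs, F.count(F.lit(1)).alias("rows"))

    # Join the observed dataframe with the empty one.
    joined = observed.join(right, on="key", how="inner")
    return obs, joined
```

To try it, run an action such as `joined.count()` and then read `obs.get`. Note that `obs.get` waits until the observed dataframe has finished an action, so if the metrics are never reported the call will not return.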
Re: [FYI] SPARK-45981: Improve Python language test coverage
Awesome!

On Sat, Dec 2, 2023 at 2:33 PM Dongjoon Hyun wrote:

> Hi, All.
>
> As a part of Apache Spark 4.0.0 (SPARK-44111), the Apache Spark community
> starts to have test coverage for all supported Python versions from Today.
>
> - https://github.com/apache/spark/actions/runs/7061665420
>
> Here is a summary.
>
> 1. Main CI: All PRs and commits on `master` branch are tested with Python
>    3.9.
> 2. Daily CI:
>    https://github.com/apache/spark/actions/workflows/build_python.yml
>    - PyPy 3.8
>    - Python 3.10
>    - Python 3.11
>    - Python 3.12
>
> This is a great addition for PySpark 4.0+ users and an extensible
> framework for all future Python versions.
>
> Thank you all for making this together!
>
> Best,
> Dongjoon.