Re: [PySpark][Spark Dataframe][Observation] Why empty dataframe join doesn't let you get metrics from observation?

2023-12-11 Thread Михаил Кулаков
", 6), ("b", 5)).toDF("col1", "col4")
>
> val o1 = Observation()
> val o2 = Observation()
>
> val df1 = df.observe(o1, count("*")).filter("col1 = 'c'")
> val df2 = df1.join(df_join, "col1", "left").observe(o2, count("*"))
>

[PySpark][Spark Dataframe][Observation] Why empty dataframe join doesn't let you get metrics from observation?

2023-12-02 Thread Михаил Кулаков
Hey folks, I actively use the observe method in my Spark jobs and noticed an interesting behavior. Here is an example of working and non-working code: https://gist.github.com/Coola4kov/8aeeb05abd39794f8362a3cf1c66519c In a few words, if I'm joining a dataframe after some filter rules and it became