[ https://issues.apache.org/jira/browse/SPARK-37954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17490639#comment-17490639 ]
L. C. Hsieh commented on SPARK-37954:
-------------------------------------

I quickly scanned what we did, and it looks intentional that we resolve the missing references for Filter. Treating the two cases

df.select(df("name")).filter(df("id") === 0).show()

and

df = df.select(df("name"))
df.filter(df("id") === 0).show()

as two separate things seems a bit of a weird idea. I don't think (or remember) that we expect them to behave differently.

> old columns should not be available after select or drop
> --------------------------------------------------------
>
>                 Key: SPARK-37954
>                 URL: https://issues.apache.org/jira/browse/SPARK-37954
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 3.0.1
>            Reporter: Jean Bon
>            Priority: Major
>
> {code:python}
> from pyspark.sql import SparkSession
> from pyspark.sql.functions import col
>
> spark = SparkSession.builder.appName('available_columns').getOrCreate()
> df = spark.range(5).select((col("id") + 10).alias("id2"))
> assert df.columns == ["id2"]  # OK
>
> try:
>     df.select("id")
>     error_raise = False
> except Exception:
>     error_raise = True
> assert error_raise  # OK
>
> df = df.drop("id")  # should raise an error
> df.filter(col("id") != 2).count()  # returns 4, should raise an error
> {code}

--
This message was sent by Atlassian Jira
(v8.20.1#820001)