[ https://issues.apache.org/jira/browse/SPARK-33057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Li Jin updated SPARK-33057:
---------------------------

Description:

Currently, trying to use filter with a window operation fails:

{code:python}
from pyspark.sql import Window
from pyspark.sql import functions as F

df = spark.range(100)
win = Window.partitionBy().orderBy('id')
df.filter(F.rank().over(win) > 10).show()
{code}

Error:

{code:python}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/icexelloss/opt/miniconda3/envs/ibis-dev-spark-3/lib/python3.8/site-packages/pyspark/sql/dataframe.py", line 1461, in filter
    jdf = self._jdf.filter(condition._jc)
  File "/Users/icexelloss/opt/miniconda3/envs/ibis-dev-spark-3/lib/python3.8/site-packages/pyspark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1304, in __call__
  File "/Users/icexelloss/opt/miniconda3/envs/ibis-dev-spark-3/lib/python3.8/site-packages/pyspark/sql/utils.py", line 134, in deco
    raise_from(converted)
  File "<string>", line 3, in raise_from
pyspark.sql.utils.AnalysisException: It is not allowed to use window functions inside WHERE clause;
{code}

This is semantically the same as the code below, which works:

{code:python}
df = spark.range(100)
win = Window.partitionBy().orderBy('id')
df = df.withColumn('rank', F.rank().over(win))
df = df[df['rank'] > 10]
df = df.drop('rank')
{code}
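Until filter supports window expressions directly, the workaround above can be wrapped in a small helper that materializes the window expression as a temporary column before filtering. This is only an illustrative sketch, not part of the PySpark API; the name filter_by_window and the temporary column name are made up here:

{code:python}
from pyspark.sql import Column, DataFrame, functions as F

def filter_by_window(df: DataFrame, predicate: Column) -> DataFrame:
    """Filter df by a predicate that may contain window expressions.

    Evaluates the predicate as a temporary column first, since window
    functions are not allowed directly inside a WHERE clause.
    """
    tmp = '__window_filter__'
    return (df.withColumn(tmp, predicate)  # evaluate the window expression
              .filter(F.col(tmp))          # filter on the materialized boolean
              .drop(tmp))                  # remove the helper column

# Usage, equivalent to the failing call above:
# filter_by_window(df, F.rank().over(win) > 10).show()
{code}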
> Cannot use filter with window operations
> ----------------------------------------
>
>                 Key: SPARK-33057
>                 URL: https://issues.apache.org/jira/browse/SPARK-33057
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.1
>            Reporter: Li Jin
>            Priority: Major
>
> --
> This message was sent by Atlassian Jira
> (v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org