[ https://issues.apache.org/jira/browse/SPARK-24210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500383#comment-16500383 ]
Li Yuanjian commented on SPARK-24210:
-------------------------------------

I think this may not be a bug.

#KO: returns r1 and r3
ex.filter(('c1 = 1') and ('c2 = 1')).show()

This is caused by how Python itself evaluates "and" between two strings (it does not go through __and__): both operands are non-empty and therefore truthy, so the expression evaluates to the second string. By the time df.filter is called, only 'c2 = 1' is left.

#KO: returns r0 and r3
ex.filter('c1 = 1 & c2 = 1').show()
#KO: returns r0 and r3
ex.filter('c1 == 1 & c2 == 1').show()

As you mentioned, [https://github.com/apache/spark/pull/6961] fixes '&' between columns, but not inside a string expression like 'c1 = 1 & c2 = 1'. For ex.filter('c1 = 1 & c2 = 1'), Spark parses the string as a valueExpression like:

'Filter (('a = (1 & 'b)) = 1)

That is, '&' is treated as a bitwise operator that binds more tightly than '='. I think this makes sense here.

> incorrect handling of boolean expressions when using column in expressions in
> pyspark.sql.DataFrame filter function
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24210
>                 URL: https://issues.apache.org/jira/browse/SPARK-24210
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.1.2
>            Reporter: Michael H
>            Priority: Major
>
> {code:python}
> ex = spark.createDataFrame([
>     ('r0', 0, 0),
>     ('r1', 0, 1),
>     ('r2', 1, 0),
>     ('r3', 1, 1)]\
>       , "row: string, c1: int, c2: int")
> #KO: returns r1 and r3
> ex.filter(('c1 = 1') and ('c2 = 1')).show()
> #OK, raises an exception
> ex.filter(('c1 == 1') & ('c2 == 1')).show()
> #KO: returns r0 and r3
> ex.filter('c1 = 1 & c2 = 1').show()
> #KO: returns r0 and r3
> ex.filter('c1 == 1 & c2 == 1').show()
> #OK: returns r3 only
> ex.filter('c1 = 1 and c2 = 1').show()
> #OK: returns r3 only
> ex.filter('c1 == 1 and c2 == 1').show()
> {code}
> building the expressions using {code}ex.c1{code} or {code}ex['c1']{code} we
> don't have this.
> Issue seems related with
> https://github.com/apache/spark/pull/6961
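
For anyone who wants to check the reasoning in the comment without a Spark session, here is a rough plain-Python sketch. It only emulates the behaviour described above and is not Spark code. The helper names (parsed_like_spark, intended, keep) are made up for illustration, and Python's bool-to-int comparison is assumed to stand in for the implicit cast Spark applies when comparing a boolean with 1.

```python
# Same four rows as in the report: (row, c1, c2).
rows = [("r0", 0, 0), ("r1", 0, 1), ("r2", 1, 0), ("r3", 1, 1)]

# 1. Python's `and` between two non-empty strings yields the second
#    operand, so filter() would only ever receive 'c2 = 1'.
passed_to_filter = ('c1 = 1') and ('c2 = 1')

# 2. `&` between two plain strings is not defined by str, which is why
#    the second call in the report raises an exception.
try:
    ('c1 == 1') & ('c2 == 1')
    amp_error = None
except TypeError as exc:
    amp_error = type(exc).__name__


def parsed_like_spark(c1, c2):
    # Hypothetical emulation of the parsed form (c1 = (1 & c2)) = 1.
    return (c1 == (1 & c2)) == 1


def intended(c1, c2):
    # What the reporter meant: c1 = 1 AND c2 = 1.
    return c1 == 1 and c2 == 1


def keep(predicate):
    # Return the names of the rows for which the predicate holds.
    return [name for name, c1, c2 in rows if predicate(c1, c2)]


kept_c2_only = keep(lambda c1, c2: c2 == 1)  # effect of filter('c2 = 1')
kept_amp = keep(parsed_like_spark)           # effect of 'c1 = 1 & c2 = 1'
kept_and = keep(intended)                    # effect of 'c1 = 1 and c2 = 1'

print(passed_to_filter, amp_error, kept_c2_only, kept_amp, kept_and)
```

Under these assumptions the three row lists line up with the results in the report: r1 and r3 for the Python-level "and", r0 and r3 for '&' inside the string, and r3 only for "and" inside the string.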