[ https://issues.apache.org/jira/browse/SPARK-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14619622#comment-14619622 ]
Davies Liu commented on SPARK-8408:
-----------------------------------

In Python, we cannot override `or`, `and`, and `not`, so `|`, `&`, and `~` should be used instead. We will throw an exception if you try to use `and` with columns; see https://github.com/apache/spark/pull/6961

> Python OR operator is not considered while creating a column of boolean type
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-8408
>                 URL: https://issues.apache.org/jira/browse/SPARK-8408
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.4.0
>         Environment: OSX Apache Spark 1.4.0
>            Reporter: Felix Maximilian Möller
>            Priority: Minor
>             Fix For: 1.4.1
>
>         Attachments: bug_report.ipynb.json
>
>
> h3. Given
> {code}
> d = [{'name': 'Alice', 'age': 1},{'name': 'Bob', 'age': 2}]
> person_df = sqlContext.createDataFrame(d)
> {code}
> h3. When
> {code}
> person_df.filter(person_df.age==1 or person_df.age==2).collect()
> {code}
> h3. Expected
> [Row(age=1, name=u'Alice'), Row(age=2, name=u'Bob')]
> h3. Actual
> [Row(age=1, name=u'Alice')]
> h3. While
> {code}
> person_df.filter("age = 1 or age = 2").collect()
> {code}
> yields the correct result:
> [Row(age=1, name=u'Alice'), Row(age=2, name=u'Bob')]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
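
To illustrate the point made in the comment, here is a minimal sketch in plain Python of why `or` cannot be redirected while `|` can. The `Expr` class is a hypothetical stand-in written for this example; it is not PySpark's `Column`, and the error message is invented. Python evaluates `a or b` by asking `a` for its truth value and returning `a` itself if it is truthy. Ordinary objects are truthy by default, which would explain why the filter in the report only saw the first condition. Raising from `__bool__` is one way to turn that silent behaviour into an error.

```python
class Expr:
    """Toy expression node (hypothetical; not PySpark's Column)."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        # Build a new expression instead of returning True/False.
        return Expr(f"({self.text} = {other!r})")

    def __or__(self, other):
        # Invoked by the `|` operator.
        return Expr(f"({self.text} OR {other.text})")

    def __and__(self, other):
        # Invoked by the `&` operator.
        return Expr(f"({self.text} AND {other.text})")

    def __invert__(self):
        # Invoked by the `~` operator.
        return Expr(f"(NOT {self.text})")

    def __bool__(self):
        # `and`, `or` and `not` all ask for a truth value first,
        # so refusing one here rejects all three keywords.
        raise ValueError(
            "cannot convert expression to bool: "
            "use '&', '|', '~' instead of 'and', 'or', 'not'"
        )


age = Expr("age")

# Operators that can be overloaded build a combined expression.
combined = (age == 1) | (age == 2)
print(combined.text)

# The keyword form triggers __bool__ and fails loudly.
try:
    (age == 1) or (age == 2)
except ValueError as exc:
    print("error:", exc)
```

Note that `|` and `&` bind more tightly than `==` in Python, so each comparison needs its own parentheses. Applied to the report above, the filter would be written as `person_df.filter((person_df.age == 1) | (person_df.age == 2))`.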