[ https://issues.apache.org/jira/browse/SPARK-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Felix Maximilian Möller updated SPARK-8408:
-------------------------------------------
    Description:
h3. Given
{code}
d = [{'name': 'Alice', 'age': 1},{'name': 'Bob', 'age': 2}]
person_df = sqlContext.createDataFrame(d)
{code}
h3. When
{code}
person_df.filter(person_df.age==1 or person_df.age==2).collect()
{code}
h3. Expected
[Row(age=1, name=u'Alice'), Row(age=2, name=u'Bob')]
h3. Actual
[Row(age=1, name=u'Alice')]
h3. While
{code}
person_df.filter("age = 1 or age = 2").collect()
{code}
yields the correct result:
[Row(age=1, name=u'Alice'), Row(age=2, name=u'Bob')]

  was:
Given
=====
.. code:: python

    d = [{'name': 'Alice', 'age': 1},{'name': 'Bob', 'age': 2}]
    person_df = sqlContext.createDataFrame(d)

When
====
.. code:: python

    person_df.filter(person_df.age==1 or person_df.age==2).collect()

Expected
========
[Row(age=1, name=u'Alice'), Row(age=2, name=u'Bob')]

Actual
======
[Row(age=1, name=u'Alice')]

While
=====
.. code:: python

    person_df.filter("age = 1 or age = 2").collect()

yields the correct result:
[Row(age=1, name=u'Alice'), Row(age=2, name=u'Bob')]

> Python OR operator is not considered while creating a column of boolean type
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-8408
>                 URL: https://issues.apache.org/jira/browse/SPARK-8408
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 1.4.0
>         Environment: OSX Apache Spark 1.4.0
>            Reporter: Felix Maximilian Möller
>            Priority: Minor
>         Attachments: bug_report.ipynb.json
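A likely explanation for the behaviour reported above: Python's `or` keyword cannot be overloaded, so `x or y` simply returns `x` whenever `x` is truthy, and the second condition never reaches the filter. The sketch below reproduces this without Spark. `Expr` is a hypothetical stand-in written for illustration, not the PySpark `Column` class; it only assumes that the column object overloads `==` and `|` and is truthy by default.

```python
class Expr(object):
    """Hypothetical stand-in for a column expression (not pyspark's Column)."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        # Build a new expression instead of returning a bool.
        return Expr("(%s = %r)" % (self.text, other))

    def __or__(self, other):
        # The bitwise operator can be overloaded, unlike the `or` keyword.
        return Expr("(%s OR %s)" % (self.text, other.text))

    def __repr__(self):
        return self.text


age = Expr("age")

# `or` evaluates the truthiness of the left operand and returns it unchanged,
# so the right-hand condition is silently dropped.
with_keyword = age == 1 or age == 2

# `|` dispatches to __or__; the parentheses are needed because `|` binds
# more tightly than `==`.
with_operator = (age == 1) | (age == 2)

print(with_keyword)   # (age = 1)
print(with_operator)  # ((age = 1) OR (age = 2))
```

In PySpark itself the corresponding spelling would be `person_df.filter((person_df.age == 1) | (person_df.age == 2))`, with each comparison wrapped in parentheses.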