RE: pyspark.sql.functions.last not working as expected

2016-08-18 Thread Alexander Peletz
So here is the test case from the commit adding the first/last methods here: https://github.com/apache/spark/pull/10957/commits/defcc02a8885e884d5140b11705b764a51753162

RE: pyspark.sql.functions.last not working as expected

2016-08-17 Thread Alexander Peletz
Seq(Row("a", 0, null, null, "x", null, null, "z"), Row("a", 1, null, null, "x", null, null, "z"), Row("a", 2, null, null, "x", null, null, "z"), Row("a", 3, n

pyspark.sql.functions.last not working as expected

2016-08-17 Thread Alexander Peletz
Hi, I am using Spark 2.0 and I am getting unexpected results from the last() function. Has anyone else experienced this? I get the sense that last() is working correctly within a given data partition but not across the entire RDD. first() seems to work as expected, so I can work around this by
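
To make the suspicion in this message concrete, here is a small plain-Python toy model of "last within a partition, then last across partitions". It is only a sketch of the hypothesis, not Spark's implementation: the row shape, the helper names (last_per_partition, naive_last, ordered_last), and the way partitions are merged are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical row shape for this sketch: (order_key, value).
Row = Tuple[int, Optional[str]]


def last_per_partition(partition: Sequence[Row],
                       ignore_nulls: bool = False) -> Optional[Row]:
    """Return the final row of one partition, in arrival order."""
    rows = [r for r in partition if not (ignore_nulls and r[1] is None)]
    return rows[-1] if rows else None


def naive_last(partitions: Sequence[Sequence[Row]],
               ignore_nulls: bool = False) -> Optional[str]:
    """Take each partition's last row, then the last of those candidates.

    The answer depends on the order in which partitions are merged.
    """
    candidates = [last_per_partition(p, ignore_nulls) for p in partitions]
    kept: List[Row] = [c for c in candidates if c is not None]
    return kept[-1][1] if kept else None


def ordered_last(partitions: Sequence[Sequence[Row]],
                 ignore_nulls: bool = False) -> Optional[str]:
    """Pick the value with the greatest order key across all partitions.

    The answer does not depend on how rows are split or merged.
    """
    rows = [r for p in partitions for r in p
            if not (ignore_nulls and r[1] is None)]
    return max(rows, key=lambda r: r[0])[1] if rows else None


p1: List[Row] = [(0, "x"), (1, None)]
p2: List[Row] = [(2, "y"), (3, "z")]

print(naive_last([p1, p2]))    # merge order p1, p2
print(naive_last([p2, p1]))    # merge order p2, p1
print(ordered_last([p2, p1]))  # explicit ordering key
```

In this model the naive version changes its answer when the partitions are merged in a different order, while the version driven by an explicit ordering key does not. If the real behaviour is similar, giving last() a well-defined ordering over the whole dataset would be the thing to check first.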