Re: coalesce on dataFrame

2015-07-01 Thread Olivier Girardot
PySpark or Spark (Scala)? When you use coalesce with anything other than a column, you must wrap the value in a literal, like this in PySpark: from pyspark.sql import functions as F; F.coalesce(df.a, F.lit(True)) On Wed, Jul 1, 2015 at 12:03, Ewan Leith ewan.le...@realitymine.com wrote: It's in spark 1.4.0, or
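The column-level coalesce described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper name coalesce_with_default, the single boolean column a, and the local session settings are assumptions, and it uses SparkSession (Spark 2.0+), whereas the thread's Spark 1.4 would go through SQLContext.

```python
def coalesce_with_default(rows, default=True):
    """Return column `a` with nulls replaced by `default` (hypothetical helper)."""
    # Imported lazily so that defining this function does not require Spark.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import BooleanType, StructField, StructType

    # Assumption: a throwaway local session is good enough for a sketch.
    spark = (
        SparkSession.builder.master("local[1]")
        .appName("coalesce-column-sketch")
        .getOrCreate()
    )

    # One nullable boolean column named `a`, mirroring `df.a` in the message.
    schema = StructType([StructField("a", BooleanType(), nullable=True)])
    df = spark.createDataFrame([(value,) for value in rows], schema)

    # F.coalesce takes Column arguments, so the plain Python value is wrapped
    # in F.lit before being passed as the fallback.
    filled = df.select(F.coalesce(df.a, F.lit(default)).alias("a"))
    return [row.a for row in filled.collect()]
```

Calling coalesce_with_default([True, None, False]) keeps the non-null values and substitutes the default where a is null.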

RE: coalesce on dataFrame

2015-07-01 Thread Ewan Leith
It's in Spark 1.4.0, or should be at least: https://issues.apache.org/jira/browse/SPARK-6972 Ewan -----Original Message----- From: Hafiz Mujadid [mailto:hafizmujadi...@gmail.com] Sent: 01 July 2015 08:23 To: user@spark.apache.org Subject: coalesce on dataFrame How can we use coalesce(1, true)

Re: coalesce on dataFrame

2015-07-01 Thread Burak Yavuz
You can use df.repartition(1) in Spark 1.4. See https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala#L1396. Best, Burak On Wed, Jul 1, 2015 at 3:05 AM, Olivier Girardot ssab...@gmail.com wrote: PySpark or Spark (scala) ? When you use
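For the partition-level question, a minimal sketch of the df.repartition(1) suggestion, next to DataFrame.coalesce(1) for comparison, might look like this. The helper name partition_counts, the row count, and the initial partition count are illustrative assumptions, and the sketch again relies on SparkSession (Spark 2.0+) rather than the Spark 1.4 SQLContext.

```python
def partition_counts(n_rows=100, initial_partitions=4):
    """Return partition counts before and after shrinking to one partition."""
    # Imported lazily so that defining this function does not require Spark.
    from pyspark.sql import SparkSession

    # Assumption: a throwaway local session is good enough for a sketch.
    spark = (
        SparkSession.builder.master("local[2]")
        .appName("repartition-sketch")
        .getOrCreate()
    )

    # Start from a DataFrame spread over several partitions.
    df = spark.range(n_rows).repartition(initial_partitions)

    # repartition(1) performs a full shuffle into a single partition,
    # comparable to the RDD call coalesce(1, shuffle=True).
    shuffled = df.repartition(1)

    # coalesce(1) merges the existing partitions without a shuffle.
    merged = df.coalesce(1)

    return (
        df.rdd.getNumPartitions(),
        shuffled.rdd.getNumPartitions(),
        merged.rdd.getNumPartitions(),
    )
```

Both calls end with a single partition; the difference is whether the data is shuffled on the way there.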