It is actually different. The coalesce expression picks the first value that is not null: https://msdn.microsoft.com/en-us/library/ms190349.aspx
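To make the semantics concrete, here is a minimal plain-Python sketch. It is not Spark code: the `coalesce` helper and the sample `rows` are hypothetical and written only for this illustration, assuming null is represented as `None`.

```python
from typing import Any, Optional


def coalesce(*values: Any) -> Optional[Any]:
    """Return the first argument that is not None, or None if all are None."""
    for value in values:
        if value is not None:
            return value
    return None


# Hypothetical rows standing in for a DataFrame with a nullable column "a".
rows = [{"a": 1.5}, {"a": None}, {"a": 0.0}]

# Same idea as filling nulls in "a" with 0.0: take "a" if present, else the default.
filled = [{"a": coalesce(row["a"], 0.0)} for row in rows]
print(filled)
```

Note that the check is against `None` rather than truthiness, so a legitimate value such as 0.0 is kept and not replaced by the default. This has nothing to do with the coalesce method that reduces the number of partitions.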
It would be great to update the documentation for it (both Scala and Java) to explain that it is different from the coalesce function on a DataFrame/RDD. Do you want to submit a pull request?

On Wed, Apr 22, 2015 at 3:05 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:

> I think I found the Coalesce you were talking about, but this is a
> catalyst class that I think is not available from pyspark.
>
> Regards,
>
> Olivier.
>
> On Wed, Apr 22, 2015 at 11:56 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>
>> Where should this *coalesce* come from? Is it related to the partition
>> manipulation coalesce method?
>> Thanks!
>>
>> On Mon, Apr 20, 2015 at 10:48 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> Ah, I see. You can do something like
>>>
>>> df.select(coalesce(df("a"), lit(0.0)))
>>>
>>> On Mon, Apr 20, 2015 at 1:44 PM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>
>>>> From PySpark it seems to me that fillna relies on Java/Scala
>>>> code, that's why I was wondering.
>>>> Thank you for answering :)
>>>>
>>>> On Mon, Apr 20, 2015 at 10:22 PM, Reynold Xin <r...@databricks.com> wrote:
>>>>
>>>>> You can just create a fillna function based on the 1.3.1 implementation
>>>>> of fillna, no?
>>>>>
>>>>> On Mon, Apr 20, 2015 at 2:48 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>
>>>>>> A UDF might be a good idea, no?
>>>>>>
>>>>>> On Mon, Apr 20, 2015 at 11:17 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>>
>>>>>> > Hi everyone,
>>>>>> > let's assume I'm stuck on 1.3.0. How can I benefit from the
>>>>>> > *fillna* API in PySpark? Is there any efficient alternative to
>>>>>> > mapping the records myself?
>>>>>> >
>>>>>> > Regards,
>>>>>> >
>>>>>> > Olivier.