It is actually different.

The coalesce expression picks the first value that is not null:
https://msdn.microsoft.com/en-us/library/ms190349.aspx

It would be great to update the documentation for it (both Scala and Java) to
explain that it is different from the coalesce method on a DataFrame/RDD. Do
you want to submit a pull request?
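
For anyone finding this thread later, here is a rough, self-contained Scala
sketch of the two different coalesces. It assumes the Spark 1.3+ Scala API;
the object name, sample data, and local master are only for illustration:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.{coalesce, lit}

object CoalesceSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("coalesce-sketch").setMaster("local[2]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Toy data: column "a" contains a null we want to fill.
    val df = sc.parallelize(Seq((Some(1.0), "x"), (None: Option[Double], "y")))
      .toDF("a", "b")

    // 1) coalesce the *expression*: per row, returns the first non-null argument.
    //    This is also the fillna-style workaround suggested further down the thread.
    df.select(coalesce(df("a"), lit(0.0))).show()

    // 2) coalesce the *method*: merges partitions of the underlying data;
    //    it has nothing to do with nulls.
    println(df.rdd.coalesce(1).partitions.length)

    sc.stop()
  }
}

As Olivier notes below, the expression version does not seem to be exposed in
PySpark at this point, which is why the sketch sticks to Scala.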



On Wed, Apr 22, 2015 at 3:05 AM, Olivier Girardot <
o.girar...@lateral-thoughts.com> wrote:

> I think I found the Coalesce you were talking about, but it is a
> Catalyst class that I don't think is available from PySpark.
>
> Regards,
>
> Olivier.
>
> On Wed, Apr 22, 2015 at 11:56 AM, Olivier Girardot <
> o.girar...@lateral-thoughts.com> wrote:
>
>> Where should this *coalesce* come from? Is it related to the partition
>> manipulation coalesce method?
>> Thanks!
>>
>> On Mon, Apr 20, 2015 at 10:48 PM, Reynold Xin <r...@databricks.com> wrote:
>>
>>> Ah, I see. You can do something like
>>>
>>>
>>> df.select(coalesce(df("a"), lit(0.0)))
>>>
>>> On Mon, Apr 20, 2015 at 1:44 PM, Olivier Girardot <
>>> o.girar...@lateral-thoughts.com> wrote:
>>>
>>>> From PySpark it seems to me that fillna relies on Java/Scala code;
>>>> that's why I was wondering.
>>>> Thank you for answering :)
>>>>
>>>> On Mon, Apr 20, 2015 at 10:22 PM, Reynold Xin <r...@databricks.com> wrote:
>>>>
>>>>> You can just create a fillna function based on the 1.3.1 implementation
>>>>> of fillna, no?
>>>>>
>>>>>
>>>>> On Mon, Apr 20, 2015 at 2:48 AM, Olivier Girardot <
>>>>> o.girar...@lateral-thoughts.com> wrote:
>>>>>
>>>>>> A UDF might be a good idea, no?
>>>>>>
>>>>>> On Mon, Apr 20, 2015 at 11:17 AM, Olivier Girardot <
>>>>>> o.girar...@lateral-thoughts.com> wrote:
>>>>>>
>>>>>> > Hi everyone,
>>>>>> > let's assume I'm stuck on 1.3.0: how can I benefit from the *fillna*
>>>>>> > API in PySpark? Is there any efficient alternative to mapping the
>>>>>> > records myself?
>>>>>> >
>>>>>> > Regards,
>>>>>> >
>>>>>> > Olivier.
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>
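
Following up on the "just create a fillna function yourself" suggestion above,
here is a rough Scala sketch of what that could look like on 1.3.x. The helper
name and the DoubleType-only restriction are mine; other column types would
need a matching literal or a cast:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{coalesce, lit}
import org.apache.spark.sql.types.DoubleType

object FillnaBackport {
  // Replace nulls in every DoubleType column with `value`, leaving other
  // columns untouched. Roughly mimics the 1.3.1 fillna for the double-column case.
  def fillDoubleNulls(df: DataFrame, value: Double): DataFrame = {
    val cols = df.schema.fields.map { f =>
      if (f.dataType == DoubleType) coalesce(df(f.name), lit(value)).as(f.name)
      else df(f.name)
    }
    df.select(cols: _*)
  }
}

Calling FillnaBackport.fillDoubleNulls(df, 0.0) then behaves like the
single-column coalesce trick above, applied to every double column at once.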
