I'll try, thanks.

On Fri, Apr 24, 2015 at 00:09, Reynold Xin <r...@databricks.com> wrote:
> You can do it similar to the way countDistinct is done, can't you?
>
> https://github.com/apache/spark/blob/master/python/pyspark/sql/functions.py#L78
>
> On Thu, Apr 23, 2015 at 1:59 PM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>
>> I found another way: setting a SPARK_HOME on a released version and launching an ipython to load the contexts.
>> I may need your insight, however. I found why it hasn't been done at the same time: this method (like some others) uses varargs in Scala, and for now, the way functions are called, only one parameter is supported.
>>
>> So at first I tried to just generalise the helper function "_" in the functions.py file to multiple arguments, but py4j's handling of varargs forces me to create an Array[Column] if the target method is expecting varargs.
>>
>> But from Python's perspective, we have no idea whether the target method will be expecting varargs or just multiple arguments (to un-tuple). I could special-case "coalesce", or any method that takes a list of columns as arguments, and assume they are varargs-based (and therefore need an Array[Column] instead of just a list of arguments).
>>
>> But this seems very specific and very prone to future mistakes. Is there any way in Py4j to know the signature of a method before calling it?
>>
>> On Thu, Apr 23, 2015 at 22:17, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>
>>> What is the way of testing/building the pyspark part of Spark?
>>>
>>> On Thu, Apr 23, 2015 at 22:06, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>
>>>> Yep :) I'll open the JIRA when I've got the time.
>>>> Thanks
>>>>
>>>> On Thu, Apr 23, 2015 at 19:31, Reynold Xin <r...@databricks.com> wrote:
>>>>
>>>>> Ah damn. We need to add it to the Python list. Would you like to give it a shot?
>>>>>
>>>>> On Thu, Apr 23, 2015 at 4:31 AM, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>
>>>>>> Yep, no problem, but I can't seem to find the coalesce function in pyspark.sql.{*, functions, types, or whatever :) }
>>>>>>
>>>>>> Olivier.
>>>>>>
>>>>>> On Mon, Apr 20, 2015 at 11:48, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>>
>>>>>>> A UDF might be a good idea, no?
>>>>>>>
>>>>>>> On Mon, Apr 20, 2015 at 11:17, Olivier Girardot <o.girar...@lateral-thoughts.com> wrote:
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>> let's assume I'm stuck on 1.3.0. How can I benefit from the *fillna* API in PySpark? Is there any efficient alternative to mapping the records myself?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Olivier.
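For the varargs problem discussed above, here is a minimal sketch of what the explicit Array[Column] call can look like through Py4J. It assumes an active SparkContext `sc` and a DataFrame `df` with columns "a" and "b" (both illustrative), and it relies on `functions.coalesce` carrying Scala's `@varargs` annotation so a `Column[]` overload exists on the JVM side; this is a sketch of the mechanism, not a confirmed recipe from the thread.

```python
from pyspark import SparkContext
from pyspark.sql import Column
from pyspark.sql.functions import col

sc = SparkContext._active_spark_context

# Py4J cannot infer that the Scala side expects varargs, so we build a
# genuine Java Array[Column] with gateway.new_array() before the call.
py_cols = [col("a"), col("b")]
arr = sc._gateway.new_array(sc._jvm.org.apache.spark.sql.Column, len(py_cols))
for i, c in enumerate(py_cols):
    arr[i] = c._jc  # unwrap each Python Column to its underlying Java Column

# The @varargs annotation on functions.coalesce generates a Column[]
# overload on the JVM side, which this Java array matches.
jc = sc._jvm.org.apache.spark.sql.functions.coalesce(arr)
result = df.select(Column(jc))  # df is assumed to exist with columns "a", "b"
```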
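On the question of knowing a method's signature before calling it: Py4J does not expose signatures directly, but plain Java reflection is reachable through the gateway. A sketch, again assuming an active SparkContext `sc`; note that a Scala `Column*` parameter compiles to `scala.collection.Seq` unless the method is annotated `@varargs`, so `isVarArgs()` only reports the annotation-generated overloads.

```python
# Inspect the JVM-side overloads of functions.coalesce via Java reflection.
jvm = sc._jvm
cls = jvm.java.lang.Class.forName("org.apache.spark.sql.functions")
for m in cls.getMethods():
    if m.getName() == "coalesce":
        params = [p.getName() for p in m.getParameterTypes()]
        print(m.getName(), "varargs:", m.isVarArgs(), "params:", params)
```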
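And for the original question at the bottom of the thread, being stuck on 1.3.0 without a Python-side fillna: the UDF route suggested above avoids mapping records by hand. A hedged sketch, assuming a DataFrame `df` with a nullable integer column "age" (the column name and fill value are illustrative):

```python
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType

# Replace nulls in "age" with 0 through a UDF, standing in for fillna.
fill_zero = udf(lambda v: v if v is not None else 0, IntegerType())
filled = df.withColumn("age", fill_zero(df["age"]))

# Alternatively, the SQL COALESCE function is reachable through selectExpr
# even when the Python-side wrapper is missing:
filled2 = df.selectExpr("coalesce(age, 0) AS age")
```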