Hi @Hyukjin Kwon <gurwls...@gmail.com>, I see you have resolved the JIRA, but I still have more to do in functions.py (only about 50% is done). Should I create a new JIRA for each new PR, or is it OK to reuse this one?
On Fri, 19 Aug 2022, 09:29 Khalid Mammadov, <khalidmammad...@gmail.com> wrote:
> Will do, thanks!
>
> On Fri, 19 Aug 2022, 09:11 Hyukjin Kwon, <gurwls...@gmail.com> wrote:
>
>> Sure, that would be great.
>>
>> I did the first 25 functions in functions.py. Please go ahead with the rest of them.
>> You can create a PR with a title such as [SPARK-40142][PYTHON][SQL][FOLLOW-UP] Make pyspark.sql.functions examples self-contained (part 2, 25 functions)
>>
>> Thanks!
>>
>> On Fri, 19 Aug 2022 at 16:50, Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>>
>>> I am picking up "functions.py" if no one has already.
>>>
>>> On Fri, 19 Aug 2022, 07:56 Khalid Mammadov, <khalidmammad...@gmail.com> wrote:
>>>
>>>> I thought it was all finished (I checked a few). Do you have a list of the remaining 50%?
>>>> Happy to contribute 😊
>>>>
>>>> On Fri, 19 Aug 2022, 05:54 Hyukjin Kwon, <gurwls...@gmail.com> wrote:
>>>>
>>>>> We're halfway, roughly 50%. More contributions would be very helpful.
>>>>> If the file is too large, feel free to split it into multiple parts (e.g., https://github.com/apache/spark/pull/37575)
>>>>>
>>>>> On Tue, 9 Aug 2022 at 12:26, Qian SUN <qian.sun2...@gmail.com> wrote:
>>>>>
>>>>>> Sure, I will do it. SPARK-40010 <https://issues.apache.org/jira/browse/SPARK-40010> was created to track progress.
>>>>>>
>>>>>> Hyukjin Kwon <gurwls...@gmail.com> wrote on Tue, 9 Aug 2022 at 10:58:
>>>>>>
>>>>>>> Please go ahead. It would be much appreciated.
>>>>>>>
>>>>>>> On Tue, 9 Aug 2022 at 11:58, Qian SUN <qian.sun2...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Hyukjin
>>>>>>>>
>>>>>>>> I would like to do some work and pick up *Window.py* if possible.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Qian
>>>>>>>>
>>>>>>>> Hyukjin Kwon <gurwls...@gmail.com> wrote on Tue, 9 Aug 2022 at 10:41:
>>>>>>>>
>>>>>>>>> Thanks Khalid for taking a look.
>>>>>>>>>
>>>>>>>>> On Tue, 9 Aug 2022 at 00:37, Khalid Mammadov <khalidmammad...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Hyukjin
>>>>>>>>>> That's a great initiative. Here is a PR that addresses one of those issues and is waiting for review:
>>>>>>>>>> https://github.com/apache/spark/pull/37408
>>>>>>>>>>
>>>>>>>>>> Perhaps it would also be good to track these pending issues somewhere to avoid duplicated effort.
>>>>>>>>>>
>>>>>>>>>> For example, I would like to pick up *union* and *union all* if no one has already.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Khalid
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Aug 8, 2022 at 1:44 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I am trying to improve the PySpark documentation, especially:
>>>>>>>>>>>
>>>>>>>>>>> - Make the examples self-contained, e.g.,
>>>>>>>>>>>   https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html
>>>>>>>>>>> - Document Parameters
>>>>>>>>>>>   https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html#pandas.DataFrame.pivot
>>>>>>>>>>>   Many PySpark APIs are missing parameter documentation, e.g., DataFrame.union
>>>>>>>>>>>
>>>>>>>>>>> Here is one example PR I am working on:
>>>>>>>>>>> https://github.com/apache/spark/pull/37437
>>>>>>>>>>> I can't do it all by myself. Any help, review, and contributions would be welcome and appreciated.
>>>>>>>>>>>
>>>>>>>>>>> Thank you all in advance.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best!
>>>>>>>> Qian SUN
>>>>>>>>
>>>>>> --
>>>>>> Best!
>>>>>> Qian SUN
>>>>>>
>>>>>