Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Community

2024-03-23 Thread Winston Lai
+1 -- Thank You & Best Regards Winston Lai From: Jay Han Date: Sunday, 24 March 2024 at 08:39 To: Kiran Kumar Dusi Cc: Farshid Ashouri , Matei Zaharia , Mich Talebzadeh , Spark dev list , user @spark Subject: Re: A proposal for creating a Knowledge Sharing Hub for Apache Spark Communit

Re: Extracting Logical Plan

2023-08-02 Thread Winston Lai
) > 10)
> val resultDF = filteredDF.select("column1", "column2")
>
> // Trigger the execution of the DF to invoke the listener
> resultDF.show()
Thank You & Best Regards Winston Lai From: Vibhatha Abeykoon Sent: Wednesday, August 2, 2023

Re: Extracting Logical Plan

2023-08-02 Thread Winston Lai
(e.g., scala/Python/R) that you use to run Spark. On Wednesday, August 2, 2023, Vibhatha Abeykoon wrote: > Hi Winston, > > I am looking for a way to access the LogicalPlan object in Scala. Not sure > if explain function would serve the purpose. > > On Wed, Aug 2, 2023 at 9:14 AM

Re: Extracting Logical Plan

2023-08-01 Thread Winston Lai
Hi Vibhatha, Have you tried pyspark.sql.DataFrame.explain (PySpark 3.4.1 documentation, apache.org) before? I am not sure what infrastructure you have, you can

Re: ChatGPT and prediction of Spark future

2023-05-31 Thread Winston Lai
are interested. Thank you a lot for your continuous help in this Spark community! I'd be glad if my reply is useful to you. Thank You & Best Regards Winston Lai From: Mich Talebzadeh Sent: Thursday, June 1, 2023 4:51:43 AM To: user @spark Subject: ChatGPT and predic

Re: Does spark read the same file twice, if two stages are using the same DataFrame?

2023-05-07 Thread Winston Lai
it in other metrics from Spark UI. That is my personal understanding based on what I have read and seen on my job runs. If there is any mistake, feel free to correct me. Thank You & Best Regards Winston Lai From: Nitin Siwach Sent: Sunday, May 7, 2023 12:22:3

Re: ***pyspark.sql.functions.monotonically_increasing_id()***

2023-04-28 Thread Winston Lai
ou may use row_number() alone to generate the so-called index, or use monotonically_increasing_id() together with rank() to assign the ids and rank them, making the result more deterministic. Thank You & Best Regards Winston Lai
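
The difference between the two approaches can be sketched in plain Python, without a Spark session. This is only a model, assuming the layout described in the Spark API documentation for monotonically_increasing_id() (partition ID in the upper 31 bits, record number within the partition in the lower 33 bits). The helper names monotonic_ids and row_numbers are hypothetical, and the partition sizes are made up for illustration. In Spark itself, row_number() requires a window specification with an ordering; here it is modeled by sorting on the id.

```python
# Pure-Python model (not Spark code) of the documented id layout:
# partition index in the upper bits, per-partition record number
# in the lower 33 bits.

def monotonic_ids(partition_sizes):
    """Return the ids a layout like this would produce, partition by partition."""
    ids = []
    for partition_index, size in enumerate(partition_sizes):
        base = partition_index << 33  # ids of each partition start at a multiple of 2**33
        ids.extend(base + offset for offset in range(size))
    return ids


def row_numbers(ids):
    """Map each id to a dense 1-based index by ordering on the id."""
    return {value: position for position, value in enumerate(sorted(ids), start=1)}


# Hypothetical layout: two partitions holding 2 and 3 rows.
ids = monotonic_ids([2, 3])
index = row_numbers(ids)

for value in ids:
    print(value, index[value])
```

The ids are increasing and unique but jump between partitions, so they depend on how the data happens to be partitioned. The dense index obtained by ordering on them is consecutive, which is why the two functions are often combined.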

Re: Slack for PySpark users

2023-03-27 Thread Winston Lai
Please let us know when the channel is created. I'd like to join :) Thank You & Best Regards Winston Lai From: Denny Lee Sent: Tuesday, March 28, 2023 9:43:08 AM To: Hyukjin Kwon Cc: keen ; user@spark.apache.org Subject: Re: Slack for PySpark users +1 I t

Re: [EXTERNAL] Re: Online classes for spark topics

2023-03-09 Thread Winston Lai
important notes for those users? For example, what are the additional factors affecting Spark performance when using the Pandas API on Spark? How should they be tuned, in addition to the conventional Spark tuning methods that apply to Spark SQL users? Thank You & Best Regards Winston

Re: Online classes for spark topics

2023-03-08 Thread Winston Lai
+1, any webinar on Spark-related topics is appreciated. Thank You & Best Regards Winston Lai From: asma zgolli Sent: Thursday, March 9, 2023 5:43:06 AM To: karan alang Cc: Mich Talebzadeh ; ashok34...@yahoo.com ; User Subject: Re: Online classes for s