Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
As I understand it, the original intent of this feature is that a user who opens the PySpark documentation and finds it is not the version they want can easily jump to the right version through the drop-down box. Additionally, in t

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
But this does not answer my question about updating the dropdown for the doc of "already released versions", right? Let's say we just released version D, and the dropdown has version A, B, C. We have another release tomorrow as version E, and it's probably easy to add A, B, C, D in the dropdown of

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
As I understand it, we should not update versions that have already been released, as in the situation you mentioned: `But what about dropdown of version D? Should we add E in the dropdown?` We only need to record the latest `version.json` file that has already been published at the ti

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
Let me be more specific. We have two active release lines, 3.4.x and 3.5.x. We just released Spark 3.5.1, whose dropdown lists 3.5.1 and 3.4.2, since the last version of 3.4.x is 3.4.2. After a month we released Spark 3.4.3. In the dropdown of Spark 3.4.3, there will be 3.5.1 and 3

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Okay, I see. Perhaps we can resolve this confusion by sharing a single `version.json` file across `all versions` in the `Spark website repo`, so that each version of the documentation displays the `same` data in the dropdown menu. From: Jungtaek Lim Sent: 2024-03-05 17:09:0
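The difference between the two approaches discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: the release order is the example given earlier in the thread, the dropdown is modelled as a plain list of version strings (the real `version.json` layout is not shown here), and the function names are hypothetical.

```python
# Release order from the example in this thread (3.4.3 is the later release).
RELEASES = ["3.4.2", "3.5.1", "3.4.3"]


def version_key(version):
    """Turn '3.5.1' into (3, 5, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_per_line(versions):
    """Keep only the newest version of each major.minor release line."""
    best = {}
    for version in versions:
        key = version_key(version)
        line = key[:2]
        if line not in best or key > version_key(best[line]):
            best[line] = version
    return sorted(best.values(), key=version_key, reverse=True)


def frozen_dropdowns(releases):
    """Assumed current behaviour: each release keeps the list known at build time."""
    return {
        version: latest_per_line(releases[: index + 1])
        for index, version in enumerate(releases)
    }


def shared_dropdown(releases):
    """Proposed behaviour: every release reads one shared, up-to-date list."""
    latest = latest_per_line(releases)
    return {version: latest for version in releases}


if __name__ == "__main__":
    for version, entries in frozen_dropdowns(RELEASES).items():
        print("frozen", version, entries)
    for version, entries in shared_dropdown(RELEASES).items():
        print("shared", version, entries)
```

With per-release snapshots, the 3.5.1 docs keep pointing at 3.4.2 after 3.4.3 ships; with one shared list, every release shows 3.5.1 and 3.4.3.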

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Jungtaek Lim
Yeah, the approach seems OK to me - please double-check that the doc generation in the Spark repo won't fail after the js file is moved. Other than that, it would probably just be a matter of updating the release process. On Tue, Mar 5, 2024 at 7:24 PM Pan,Bingkun wrote: > Okay, I see. > > Perhaps

Re: When Spark job shows FetchFailedException it creates few duplicate data and we see few data also missing, please explain why

2024-03-05 Thread Mich Talebzadeh
Hi Jason, I read your notes and the code simulating the problem, linked at https://issues.apache.org/jira/browse/SPARK-38388, along with the specific repartition issue (SPARK-38388) that this code aims to demonstrate. The code below is from the Jira linked above: import scala.sys.process._ import org.apache.spar

Re: [ANNOUNCE] Apache Spark 3.5.1 released

2024-03-05 Thread Pan,Bingkun
Okay, let me double-check it carefully. Thank you very much for your help! From: Jungtaek Lim Sent: 2024-03-05 21:56:41 To: Pan,Bingkun Cc: Dongjoon Hyun; dev; user Subject: Re: [ANNOUNCE] Apache Spark 3.5.1 released Yeah the approach seems OK to me - please double c