Thank you, Wenchen. As the author of one of the patches you mentioned, I'd like to add that this kind of issue happens from time to time. IIRC, in the worst case, the release image broke during the voting period. In other words, historically, we have been blocked in the release process even though we had created the RC successfully a few days earlier.

- https://github.com/apache/spark/pull/53083
  [SPARK-54371][INFRA] Fix spark-rm Dockefile to install pkgdown version at the end

Hence, Wenchen, please continue this discussion in a new thread because

- It's a problem across multiple releases, as you mentioned.
- The main root cause is the immature release script and process, which build images inconsistently on every trigger.
- Lastly, it looks unavoidably important for those who proposed extending the 3.5.x lifecycle.

Thank you again,
Dongjoon.

On 2026/01/23 10:11:26 Wenchen Fan wrote:
> A followup on the Spark R doc issue:
> - The docker image for creating releases had some environment issues, which
> led to unexpected Spark R doc files being generated. (unexpected but not
> wrong)
> - The trouble is that the Spark R doc files have name conflicts on
> case-insensitive file systems such as macOS.
> - The docker image was fixed later. The affected releases are from
> 3.5.4 to 4.1.0-preview3. From 4.1.0-preview4, the Spark R doc files are
> as expected.
> - I've extended the fix to all the active branches recently.
>
> The problem is, for people who work on the spark-website git repo, the
> unexpected Spark R doc files are still in the repo and cause issues for git
> operations. I have some proposals to fix it, from most conservative to most
> aggressive:
> 1. Release 3.5.9 and 4.0.2. Redirect the doc link of the affected releases
> (3.5.4 to 4.1.0-preview3) to ASF Archive Service, so that the unexpected
> Spark R doc files are not in the spark-website git repo anymore.
> 2. Regenerate Spark R docs on branch-3.5 and branch-4.0. Use them to
> replace the ones of the affected releases (3.5.4 to 4.1.0-preview3) in the
> spark-website git repo.
> 3. In addition to option 2, also replace the Spark R docs of the affected
> releases (3.5.4 to 4.1.0-preview3) in Apache dist.
>
> Thoughts?
>
> On Wed, Jan 21, 2026 at 3:50 AM Hyukjin Kwon <[email protected]> wrote:
>
> > Lgtm
> >
> > On Tue, Jan 20, 2026 at 11:38 PM Dongjoon Hyun <[email protected]> wrote:
> >
> >> Here is a PR to remove MD files first from the Apache Spark website.
> >>
> >> https://github.com/apache/spark-website/pull/665
> >>
> >> Dongjoon.
> >>
> >> On 2026/01/20 13:58:50 Dongjoon Hyun wrote:
> >> > I confirmed the MD files were generated by the automated release script.
> >> >
> >> > $ svn log https://dist.apache.org/repos/dist/release/spark/docs/3.5.8/api/R/reference/index.md
> >> > ------------------------------------------------------------------------
> >> > r81907 | dongjoon | 2026-01-15 21:01:34 +0900 (Thu, 15 Jan 2026) | 2 lines
> >> > Release Apache Spark 3.5.8 documentation
> >> > ------------------------------------------------------------------------
> >> > r81820 | dongjoon | 2026-01-12 14:36:54 +0900 (Mon, 12 Jan 2026) | 1 line
> >> > Apache Spark v3.5.8-rc1 docs
> >> > ------------------------------------------------------------------------
> >> >
> >> > We can see the revision 81820 via the web page, too.
> >> >
> >> > https://dist.apache.org/repos/dist/dev/spark/v3.5.8-rc1-docs/_site/api/R/index.md?p=81820
> >> >
> >> > For the record, here is the GitHub Action link for the run.
> >> >
> >> > https://github.com/dongjoon-hyun/spark/actions/runs/20907053840/job/60062426287
> >> >
> >> > ================
> >> > Release details:
> >> > BRANCH: branch-3.5
> >> > VERSION: 3.5.8
> >> > TAG: v3.5.8-rc1
> >> > NEXT: 3.5.9-SNAPSHOT
> >> >
> >> > ASF USER: ***
> >> > GPG KEY: ***@apache.org
> >> > FULL NAME: ***-hyun
> >> > E-MAIL: ***@apache.org
> >> > ================
> >> >
> >> > Dongjoon.
> >> >
> >> >
> >> >
> >> > On Tue, Jan 20, 2026 at 10:24 PM Wenchen Fan <[email protected]> wrote:
> >> >
> >> > > Hi Dongjoon,
> >> > >
> >> > > The file name case sensitivity issue is not related to this. There are no
> >> > > markdown files in the 3.5.7 docs:
> >> > > https://dist.apache.org/repos/dist/release/spark/docs/3.5.7/api/R/reference
> >> > > , so it seems to be a mistake for 3.5.8. I think the apache dist is in sync
> >> > > with the spark-website git repo, so we can also notice the issue in
> >> > > https://github.com/apache/spark-website/tree/asf-site/site/docs/3.5.8/api/R/reference
> >> > > .
> >> > >
> >> > > On Tue, Jan 20, 2026 at 8:48 PM Dongjoon Hyun <[email protected]> wrote:
> >> > >
> >> > >> Hi, Wenchen.
> >> > >>
> >> > >> Thank you for reporting.
> >> > >>
> >> > >> Just to be clear, if we clone the Spark Website repository cleanly on
> >> > >> macOS today, we can see the following. Do you mean the same URL links?
> >> > >>
> >> > >> $ git clone [email protected]:apache/spark-website.git
> >> > >>
> >> > >> $ git status
> >> > >> On branch asf-site
> >> > >> Your branch is up to date with 'origin/asf-site'.
> >> > >>
> >> > >> Changes not staged for commit:
> >> > >>   (use "git add <file>..." to update what will be committed)
> >> > >>   (use "git restore <file>..." to discard changes in working directory)
> >> > >>         modified:   site/docs/3.5.6/api/R/reference/GroupedData.html
> >> > >>         modified:   site/docs/3.5.6/api/R/reference/isNaN.html
> >> > >>         modified:   site/docs/3.5.7/api/R/reference/GroupedData.html
> >> > >>         modified:   site/docs/3.5.7/api/R/reference/isNaN.html
> >> > >>         modified:   site/docs/4.1.0-preview3/api/R/reference/GroupedData.html
> >> > >>         modified:   site/docs/4.1.0-preview3/api/R/reference/isNaN.html
> >> > >>
> >> > >> Dongjoon.
> >> > >>
> >> > >> On 2026/01/19 07:54:58 Wenchen Fan wrote:
> >> > >> > Hi Dongjoon,
> >> > >> >
> >> > >> > Thanks for driving this release! There seems to be an issue in the Spark R
> >> > >> > API docs:
> >> > >> > https://dist.apache.org/repos/dist/release/spark/docs/3.5.8/api/R/reference/
> >> > >> > . It should only have html files, but there are markdown files as well. Is
> >> > >> > there a bug in the new GitHub Action based release pipeline, or was it a
> >> > >> > mistake during manual processing?
> >> > >> >
> >> > >> > On Fri, Jan 16, 2026 at 10:45 AM Dongjoon Hyun <[email protected]>
> >> > >> > wrote:
> >> > >> >
> >> > >> > > We are happy to announce the availability of Apache Spark 3.5.8!
> >> > >> > >
> >> > >> > > Spark 3.5.8 is the eighth maintenance release based on the branch-3.5
> >> > >> > > branch of Spark. It contains many fixes including security and correctness
> >> > >> > > domains. We strongly recommend all 3.5 users to upgrade to this or higher
> >> > >> > > stable release.
> >> > >> > >
> >> > >> > > To download Spark 3.5.8, head over to the download page:
> >> > >> > > https://spark.apache.org/downloads.html
> >> > >> > >
> >> > >> > > To view the release notes:
> >> > >> > > https://spark.apache.org/releases/spark-release-3-5-8.html
> >> > >> > >
> >> > >> > > We would like to acknowledge all community members for contributing to
> >> > >> > > this release. This release would not have been possible without you.
> >> > >> > >
> >> > >> > > Best regards,
> >> > >> > > Dongjoon Hyun
> >> > >> > >
> >> > >> >
> >> > >>
> >> > >> ---------------------------------------------------------------------
> >> > >> To unsubscribe e-mail: [email protected]
> >> > >>
> >> >
> >>
>
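
For anyone who wants to check a checkout for the name conflicts discussed in this thread, here is a minimal sketch, not part of the Spark release tooling. It assumes that the conflicts come from tracked paths that differ only by letter case. The function names `find_case_collisions` and `tracked_files` are hypothetical; the only external command used is `git ls-files -z`.

```python
import subprocess
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that differ only by letter case.

    Returns a dict mapping each case-folded path to the sorted list of
    distinct original spellings, keeping only groups with two or more.
    """
    groups = defaultdict(set)
    for path in paths:
        groups[path.casefold()].add(path)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


def tracked_files(repo_dir):
    """List files tracked by git in repo_dir (hypothetical helper).

    Uses NUL-separated output so unusual file names are parsed safely.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "ls-files", "-z"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [name for name in result.stdout.split("\0") if name]
```

A possible use is `find_case_collisions(tracked_files("spark-website"))`, where the directory name is only an example. Any group it reports would be a candidate for the files that show up as modified right after a clean clone on a case-insensitive file system.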
