moonming opened a new pull request, #2055:
URL: https://github.com/apache/apisix-website/pull/2055

   ## Summary
   
   Removes long-standing sitemap bloat from outdated sub-project documentation. 
The EN sitemap currently carries ~800 sub-project doc URLs — including ancient 
versions (ingress-controller `0.4.0`–`2.0.0`, docker `apisix-2.10.x`) and ~80 
thin `/tags/` pages — while the main APISIX docs are clean. Two root causes:
   
   1. **`scripts/sync-docs.js`** built *every* release of each sub-project. 
apisix is curated via `config/apisix-versions.js`; the sub-projects weren't. 
This change keeps only the **newest** released version of each 
(`SUBPROJECT_VERSIONS_TO_KEEP`, default `1`). The latest version is served 
unversioned at `/docs/<project>/` and indexed; `next` remains 
robots-disallowed. Older versions stay available in each project's source repo 
(git tags).
   2. **`scripts/update-sitemap-loc.js`** — the version-exclusion regex only 
matched 2-part versions (`/docs/apisix/3.14/`). It missed 3-part semver 
(`/docs/ingress-controller/2.0.0/`) and prefixed versions 
(`/docs/docker/apisix-2.10.0/`) — exactly why sub-project versioned docs leaked 
into the sitemap. Broadened to cover all three forms.
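
   As an illustration of the three URL shapes in item 2, here is a minimal sketch of a broadened exclusion check. The pattern and the names `VERSIONED_DOC_PATH` and `isVersionedDocUrl` are assumptions made for this example; they are not the regex in `scripts/update-sitemap-loc.js`.

```javascript
// Hypothetical pattern, NOT the one from the PR. It matches a version-like
// segment directly after /docs/<project>/ in three forms:
//   2-part:   /docs/apisix/3.14/
//   3-part:   /docs/ingress-controller/2.0.0/
//   prefixed: /docs/docker/apisix-2.10.0/ (also apisix-2.10.x)
const VERSIONED_DOC_PATH =
  /\/docs\/[^/]+\/(?:[a-z][a-z0-9]*-)?\d+\.\d+(?:\.\d+|\.x)?(?:\/|$)/i;

// Returns true when the URL points at a versioned doc tree and should be
// dropped from the sitemap. Unversioned (latest) paths return false.
function isVersionedDocUrl(url) {
  return VERSIONED_DOC_PATH.test(new URL(url).pathname);
}

const samples = [
  'https://apisix.apache.org/docs/apisix/3.14/getting-started/',
  'https://apisix.apache.org/docs/ingress-controller/2.0.0/concepts/',
  'https://apisix.apache.org/docs/docker/apisix-2.10.0/build/',
  'https://apisix.apache.org/docs/ingress-controller/getting-started/',
];

for (const url of samples) {
  console.log(isVersionedDocUrl(url) ? 'exclude' : 'keep', url);
}
```

   The optional `name-` prefix requires a hyphen followed by digits, so ordinary page slugs such as `getting-started` are kept.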
   
   ## ⚠️ Behavior change — please confirm
   
   This **removes older sub-project doc versions from the published site** 
(they remain in each source repo via git tags). Intentional per discussion, but 
a content-availability change maintainers should sign off on. 
`SUBPROJECT_VERSIONS_TO_KEEP` can be raised to keep a wider window; the 
sitemap regex excludes every versioned path, so a value greater than 1 does not put older versions back into the sitemap.
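
   To make the effect of `SUBPROJECT_VERSIONS_TO_KEEP` concrete, the following is a rough sketch of keeping only the newest N release tags. Apart from the constant name, everything here (`versionKey`, `newestVersions`, the numeric sort order) is hypothetical and may differ from what `scripts/sync-docs.js` actually does.

```javascript
// Constant name taken from the PR description; the default of 1 matches it.
const SUBPROJECT_VERSIONS_TO_KEEP = 1;

// Hypothetical helper: extracts [major, minor, patch] from a tag such as
// "2.0.0" or "apisix-2.10.0". Returns null when no version is found.
function versionKey(tag) {
  const match = tag.match(/(\d+)\.(\d+)(?:\.(\d+))?/);
  if (!match) return null;
  return [Number(match[1]), Number(match[2]), Number(match[3] || 0)];
}

// Hypothetical helper: returns the newest `keep` tags, newest first.
// Comparison is numeric per component, so 1.10.0 sorts above 1.8.0.
function newestVersions(tags, keep = SUBPROJECT_VERSIONS_TO_KEEP) {
  return tags
    .map((tag) => ({ tag, key: versionKey(tag) }))
    .filter((entry) => entry.key !== null)
    .sort((a, b) => {
      for (let i = 0; i < 3; i += 1) {
        if (a.key[i] !== b.key[i]) return b.key[i] - a.key[i];
      }
      return 0;
    })
    .slice(0, keep)
    .map((entry) => entry.tag);
}

console.log(newestVersions(['0.4.0', '1.8.0', '2.0.0', '1.10.0']));
console.log(newestVersions(['0.4.0', '1.8.0', '2.0.0', '1.10.0'], 2));
```

   Raising the second argument (or the constant) widens the window without changing the rest of the pipeline.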
   
   ## Test plan
   
   - [ ] CI build passes (`yarn build`), including the doc sync step.
   - [ ] After build, each sub-project has a single versioned tree + `next`; 
`/docs/ingress-controller/`, `/docs/docker/`, etc. serve the latest version.
   - [ ] `website` / `doc` / `blog` `sitemap.xml` no longer contains 
`/docs/<project>/<old-version>/` URLs (the count in the "Filtered out N URLs" 
log line should increase).
   - [ ] Spot-check: `/docs/ingress-controller/2.0.0/...` and 
`/docs/docker/apisix-2.10.0/...` are absent from the sitemap.
   
   ## Verification done locally
   
   The regex was validated against real URL shapes (2-part, 3-part, 
docker-prefixed → excluded; latest/unversioned and non-version doc paths → 
kept). The version-slicing logic and both scripts' syntax were also checked. Full doc-sync and 
build verification is delegated to CI (it clones the sub-project repos).
   

