moulivashisth opened a new pull request, #8572:
URL: https://github.com/apache/incubator-devlake/pull/8572
### ⚠️ Pre Checklist
> Please complete _ALL_ items in this checklist, and remove before submitting
- [x] I have read through the [Contributing Documentation](https://devlake.apache.org/community/).
- [x] I have added relevant tests.
- [x] I have added relevant documentation.
- [x] I will add labels to the PR, such as `pr-type/bug-fix`, `pr-type/feature-development`, etc.
### Summary
**Fix GitLab Users collection hitting offset pagination limits by adding
keyset pagination.**
This PR updates the GitLab `CollectAccounts` subtask to avoid `max offset` errors when collecting large user sets:
- **Self-managed GitLab instances** now use **keyset pagination** on `/api/v4/users` (`pagination=keyset&order_by=id&sort=asc&per_page=N&id_after=<last_id>`) and **do not** send `page`.
- **gitlab.com / jihulab.com** keep the existing behavior on the project member endpoints (`/projects/:id/members[/all]`), where each project's member list typically stays under the offset cap.
- The existing API-version fallback is retained (`/members/all`, or `/members/` for GitLab versions earlier than 13.11).
- The response parser now tracks the last item’s `id` so the keyset cursor can advance safely.
- No breaking changes to task wiring or the raw table schema (`gitlab_api_users`).
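
As a rough illustration of the keyset request and cursor handling described in the list above, the sketch below builds the query string and reads the last `id` from a response page. The function names `keysetUsersQuery` and `lastUserID` are hypothetical and are not taken from this PR's diff. Omitting `id_after` on the first page is also an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// keysetUsersQuery builds the query for one keyset page of /api/v4/users.
// lastID is the id of the last user already collected; 0 means "first page"
// (assumption: the first request simply omits id_after).
func keysetUsersQuery(perPage, lastID int) url.Values {
	q := url.Values{}
	q.Set("pagination", "keyset")
	q.Set("order_by", "id")
	q.Set("sort", "asc")
	q.Set("per_page", strconv.Itoa(perPage))
	if lastID > 0 {
		q.Set("id_after", strconv.Itoa(lastID))
	}
	// Deliberately no "page" parameter: keyset requests carry no offset.
	return q
}

// lastUserID returns the id of the final element of a JSON array of users.
// The boolean is false when the page is empty, which signals the end of the
// collection in this sketch.
func lastUserID(body []byte) (int, bool, error) {
	var users []struct {
		ID int `json:"id"`
	}
	if err := json.Unmarshal(body, &users); err != nil {
		return 0, false, err
	}
	if len(users) == 0 {
		return 0, false, nil
	}
	return users[len(users)-1].ID, true, nil
}

func main() {
	// First page, then a follow-up page after a response ending at id 9.
	fmt.Println(keysetUsersQuery(100, 0).Encode())
	id, ok, err := lastUserID([]byte(`[{"id":7},{"id":9}]`))
	if err != nil || !ok {
		panic("unexpected empty or invalid page")
	}
	fmt.Println(keysetUsersQuery(100, id).Encode())
}
```

Because the results are ordered by ascending `id`, the last element of each page is enough to resume from, so no other cursor state needs to be kept between requests.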
**Why:** Some instances enforce strict offset caps (e.g., 50k), causing
`offset pagination is restricted` errors when fetching Users. Keyset pagination
removes the offset and enables full retrieval.
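
To make the difference concrete, here is a toy in-memory simulation, not GitLab's actual server logic. The tiny `maxOffset` value stands in for a cap such as 50k, and all function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// maxOffset is a stand-in for a server-side offset cap (e.g. 50k).
const maxOffset = 4

var errOffsetRestricted = errors.New("offset pagination is restricted")

// fetchByOffset mimics an offset-capped endpoint over ids sorted ascending.
func fetchByOffset(ids []int, page, perPage int) ([]int, error) {
	offset := (page - 1) * perPage
	if offset > maxOffset {
		return nil, errOffsetRestricted
	}
	if offset >= len(ids) {
		return nil, nil
	}
	end := offset + perPage
	if end > len(ids) {
		end = len(ids)
	}
	return ids[offset:end], nil
}

// fetchByKeyset returns up to perPage ids strictly greater than idAfter.
// No offset is involved, so the cap never applies.
func fetchByKeyset(ids []int, idAfter, perPage int) []int {
	var out []int
	for _, id := range ids {
		if id > idAfter {
			out = append(out, id)
			if len(out) == perPage {
				break
			}
		}
	}
	return out
}

// collectAllKeyset walks the whole set, using the last id of each page as
// the cursor for the next request.
func collectAllKeyset(ids []int, perPage int) []int {
	var all []int
	last := 0
	for {
		page := fetchByKeyset(ids, last, perPage)
		if len(page) == 0 {
			return all
		}
		all = append(all, page...)
		last = page[len(page)-1]
	}
}

func main() {
	ids := []int{1, 3, 5, 7, 9, 11, 13}
	_, err := fetchByOffset(ids, 4, 2) // offset 6 exceeds the cap
	fmt.Println("offset page 4:", err)
	fmt.Println("keyset total:", len(collectAllKeyset(ids, 2)))
}
```

In this sketch the offset request fails once `(page - 1) * perPage` passes the cap, while the keyset loop reaches every id because each request is expressed relative to the last id seen.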
**Risk/Compatibility:**
- Backward compatible; only changes query parameters and cursor handling.
- If a project’s members list exceeds offset caps and the endpoint lacks
keyset, users should collect site users (self-managed path) or shard by
project/group—unchanged from current guidance.
### Does this close any open issues?
Closes #8529 ([Bug][GitLab] Pagination not working Again)
### Screenshots
N/A
### Other Information