dosubot[bot] commented on issue #8450:
URL:
https://github.com/apache/incubator-devlake/issues/8450#issuecomment-2909019004
To skip GitLab repositories (projects) that take a long time to collect in
Apache DevLake (the "product warehouses" you mention), you can use the
configuration options provided in the GitLab plugin.
The key settings to look at are:
- **dataScope**: This section lets you search for and select the specific
GitLab projects or subgroups to collect data from. By selecting only the
projects you want, you avoid collecting from the "pure product warehouses"
that are slow to process. The UI provides a Miller-column browser for
navigating and filtering projects and subgroups, which makes it easier to
select or deselect them.
[See config
reference](https://github.com/apache/incubator-devlake/blob/main/config-ui/src/plugins/register/gitlab/config.tsx)
- **scopeConfig**: This section lets you define which entities to collect
(such as CODE, TICKET, CODEREVIEW) and apply transformation patterns. You
can use these settings to further refine what data is collected from each
selected project.
[See config
reference](https://github.com/apache/incubator-devlake/blob/main/config-ui/src/plugins/register/gitlab/config.tsx)
- **Incremental Collection**: If you still need to collect from some large
projects but want to reduce the time, consider enabling incremental collection,
which fetches only new or updated data instead of collecting everything from
scratch on each run.
[See more about collectors and incremental
collection](https://github.com/apache/incubator-devlake/blob/main/backend/DevelopmentManual.md)
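As a rough illustration of the idea behind incremental collection (not DevLake's actual collector code), the sketch below keeps a checkpoint timestamp and only returns records updated after it. The function name `collect_incrementally`, the record shape, and the sample data are all assumptions made for this example.

```python
from datetime import datetime, timezone


def collect_incrementally(records, last_collected_at):
    """Return records updated after the checkpoint, plus the new checkpoint.

    `records` is a hypothetical list of dicts with an `updated_at` datetime;
    a real collector would query the GitLab API instead of an in-memory list.
    """
    fresh = [r for r in records if r["updated_at"] > last_collected_at]
    # Advance the checkpoint to the newest record seen; keep it if nothing is new.
    new_checkpoint = max(
        (r["updated_at"] for r in fresh), default=last_collected_at
    )
    return fresh, new_checkpoint


# Hypothetical merge requests with their last-update times.
records = [
    {"id": 1, "updated_at": datetime(2025, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2025, 5, 20, tzinfo=timezone.utc)},
    {"id": 3, "updated_at": datetime(2025, 5, 25, tzinfo=timezone.utc)},
]

checkpoint = datetime(2025, 5, 10, tzinfo=timezone.utc)
fresh, checkpoint = collect_incrementally(records, checkpoint)
print([r["id"] for r in fresh])  # [2, 3]
```

A second run with the updated checkpoint returns nothing until a record changes again, which is where the time saving on large projects comes from.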
**Note:** The term "pure product warehouses" is not standard in DevLake or
GitLab documentation. If you mean specific types of repositories or projects,
you will need to identify them by name or pattern and exclude them during the
project selection step in the dataScope configuration.
If you need more granular filtering (for example, by project naming
patterns), you may need to use transformation patterns in the scopeConfig or
manually select/deselect projects during the onboarding/configuration process.
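If the slow projects share a naming pattern, one way to prepare the selection is to filter a list of project paths offline and then select only the remaining ones in the data scope. The sketch below is a standalone helper, not part of DevLake or the GitLab API; the function `select_projects`, the project paths, and the `*/product-*` pattern are hypothetical.

```python
from fnmatch import fnmatchcase


def select_projects(project_paths, exclude_patterns):
    """Keep only the project paths that match none of the glob patterns."""
    return [
        path
        for path in project_paths
        if not any(fnmatchcase(path, pattern) for pattern in exclude_patterns)
    ]


# Hypothetical GitLab project paths (group/project).
projects = [
    "acme/backend-api",
    "acme/web-frontend",
    "acme/product-artifacts",
    "acme/product-releases",
]

# Hypothetical pattern describing the slow "product warehouse" projects.
exclude = ["*/product-*"]

print(select_projects(projects, exclude))
# ['acme/backend-api', 'acme/web-frontend']
```

The resulting list is what you would then select by hand in the data scope picker; the filtering itself happens outside DevLake in this sketch.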
If you have a specific example of what you mean by "pure product
warehouses," I can provide more targeted configuration guidance.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.