danielemoraschi opened a new pull request, #8815:
URL: https://github.com/apache/incubator-devlake/pull/8815
### ⚠️ Pre Checklist
- [x] I have read through the [Contributing
Documentation](https://devlake.apache.org/community/).
- [x] I have added relevant tests.
- [x] I have added relevant documentation.
- [x] I will add labels to the PR, such as `pr-type/bug-fix`,
`pr-type/feature-development`, etc.
### Summary
Fixes `clearHistoryData()` in the linker plugin, which was deleting all
`pull_request_issues` records instead of only the current project's
linker-created rows.
**Root cause:** The function used a `LEFT JOIN` with the `project_name` filter
in the `ON` clause. A LEFT JOIN keeps every left-side row even when nothing
matches, so the subquery returned every PR ID in the system, effectively
wiping the entire `pull_request_issues` table on every linker run.
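The difference can be reproduced in isolation. The sketch below uses an in-memory SQLite database and a deliberately simplified, hypothetical schema (`pull_requests`, `project_repos`); it is not the actual DevLake schema or the actual query from `clearHistoryData()`, only an illustration of how a project filter in the `ON` clause of a LEFT JOIN fails to restrict the result.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pull_requests (id TEXT PRIMARY KEY, base_repo_id TEXT);
CREATE TABLE project_repos (project_name TEXT, repo_id TEXT);

INSERT INTO pull_requests VALUES ('pr1', 'repoA'), ('pr2', 'repoB');
INSERT INTO project_repos VALUES ('alpha', 'repoA'), ('beta', 'repoB');
""")

# Buggy shape: the project filter sits in the ON clause of a LEFT JOIN,
# so PRs with no matching mapping row are still returned (with NULLs).
left_join_sql = """
SELECT pr.id
FROM pull_requests pr
LEFT JOIN project_repos pm
  ON pm.repo_id = pr.base_repo_id
 AND pm.project_name = ?
ORDER BY pr.id
"""

# Fixed shape: INNER JOIN plus a WHERE filter keeps only matching PRs.
inner_join_sql = """
SELECT pr.id
FROM pull_requests pr
INNER JOIN project_repos pm
  ON pm.repo_id = pr.base_repo_id
WHERE pm.project_name = ?
ORDER BY pr.id
"""

left_ids = [row[0] for row in conn.execute(left_join_sql, ("alpha",))]
inner_ids = [row[0] for row in conn.execute(inner_join_sql, ("alpha",))]

print(left_ids)   # PRs from every project
print(inner_ids)  # only project alpha's PRs
```

With the LEFT JOIN, `pr2` is returned even though it belongs to a repo mapped only to project `beta`; with the INNER JOIN and `WHERE`, only `pr1` comes back.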
**Impact:** When two projects share a GitHub repo, running the pipeline for
one project deleted all PR-issue links created by the other project's pipeline
(and links from the GitHub converter).
**Fix:**
- `INNER JOIN` + `WHERE` for `project_name` (fixes the LEFT JOIN bug)
- Issue-side subquery scoped to current project's boards (prevents
cross-project deletion)
- `_raw_data_table` / `_raw_data_remark` filter to only delete
linker-created rows (preserves GitHub converter rows)
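A rough sketch of how the three conditions combine, again on SQLite with a hypothetical simplified schema. The table names `project_repos`, `project_boards`, and `board_issues`, the function name `clear_history`, and the marker values `linker_marker` / `linker` are all placeholders chosen for illustration; the real values and the real query live in the PR diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pull_requests (id TEXT, base_repo_id TEXT);
CREATE TABLE project_repos (project_name TEXT, repo_id TEXT);
CREATE TABLE project_boards (project_name TEXT, board_id TEXT);
CREATE TABLE board_issues (board_id TEXT, issue_id TEXT);
CREATE TABLE pull_request_issues (
    pull_request_id TEXT, issue_id TEXT,
    _raw_data_table TEXT, _raw_data_remark TEXT
);

-- Two projects share one repo but own different boards.
INSERT INTO pull_requests VALUES ('pr1','repo'), ('pr2','repo'), ('pr3','repo');
INSERT INTO project_repos VALUES ('alpha','repo'), ('beta','repo');
INSERT INTO project_boards VALUES ('alpha','boardA'), ('beta','boardB');
INSERT INTO board_issues VALUES ('boardA','issueA'), ('boardB','issueB');
INSERT INTO pull_request_issues VALUES
    ('pr1','issueA','linker_marker','linker'),  -- alpha's linker row
    ('pr2','issueB','linker_marker','linker'),  -- beta's linker row
    ('pr3','issueA','converter_marker','');     -- converter-created row
""")

def clear_history(conn, project):
    """Delete only linker-created links for the given project (sketch)."""
    conn.execute(
        """
        DELETE FROM pull_request_issues
        WHERE pull_request_id IN (
            SELECT pr.id FROM pull_requests pr
            INNER JOIN project_repos pm ON pm.repo_id = pr.base_repo_id
            WHERE pm.project_name = :project
        )
        AND issue_id IN (
            SELECT bi.issue_id FROM board_issues bi
            INNER JOIN project_boards pb ON pb.board_id = bi.board_id
            WHERE pb.project_name = :project
        )
        AND _raw_data_table = :raw_table    -- placeholder marker value
        AND _raw_data_remark = :raw_remark  -- placeholder marker value
        """,
        {"project": project, "raw_table": "linker_marker", "raw_remark": "linker"},
    )

clear_history(conn, "alpha")
remaining = conn.execute(
    "SELECT pull_request_id, issue_id FROM pull_request_issues "
    "ORDER BY pull_request_id"
).fetchall()
print(remaining)
```

Running the cleanup for `alpha` removes only the `pr1`/`issueA` row. The `beta` linker row survives because `issueB` is not on an `alpha` board, and the converter row survives because its raw-data markers do not match.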
**Tests:**
- Added `TestLinkPrToIssueWithSharedRepo` e2e test with CSV fixtures
simulating two projects sharing a repo
- Verifies that running the linker for one project correctly creates its
links, deletes its stale links, and preserves the other project's linker links
and converter links
- Existing `TestLinkPrToIssue` continues to pass unchanged
### Does this close any open issues?
Closes #8814
### Screenshots
N/A — backend-only change.
### Other Information
The bug was introduced in commit
[`a4cb023ba`](https://github.com/apache/incubator-devlake/commit/a4cb023ba)
(May 2024, "Clear history data when running linker"). The existing e2e test did
not catch it because it only covered a single project and flushed the table
before running.
**One edge case:**
If an issue is removed from a project's board between two linker runs, the
old stale link for that issue won't be cleaned up, because the issue-side
subquery no longer matches it. This is consistent with the creation logic,
which also scopes to the current board state. The old code had the same
conceptual issue; it just masked it by deleting everything.
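The edge case can be illustrated with a minimal sketch that keeps only the issue-side scoping. The schema and the helper name `clear_links_for_board` are hypothetical and far simpler than the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE board_issues (board_id TEXT, issue_id TEXT);
CREATE TABLE pull_request_issues (pull_request_id TEXT, issue_id TEXT);
INSERT INTO board_issues VALUES ('boardA','issue1'), ('boardA','issue2');
INSERT INTO pull_request_issues VALUES ('pr1','issue1'), ('pr2','issue2');
""")

def clear_links_for_board(conn, board_id):
    """Delete links whose issue is currently on the board; return the count."""
    cur = conn.execute(
        """
        DELETE FROM pull_request_issues
        WHERE issue_id IN (
            SELECT issue_id FROM board_issues WHERE board_id = ?
        )
        """,
        (board_id,),
    )
    return cur.rowcount

# issue2 leaves the board between two linker runs.
conn.execute("DELETE FROM board_issues WHERE issue_id = 'issue2'")

deleted = clear_links_for_board(conn, "boardA")
remaining = conn.execute(
    "SELECT pull_request_id, issue_id FROM pull_request_issues"
).fetchall()
print(deleted, remaining)
```

The link for `issue2` is left behind because the subquery only sees issues that are on the board at cleanup time.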
Opening this PR since this issue is a blocker for my setup.