ciaramulligan commented on issue #8834:
URL:
https://github.com/apache/incubator-devlake/issues/8834#issuecomment-4238299813
Thanks @dosubot — that's a great analysis. The LEFT JOIN + WHERE
anti-pattern explains exactly what we're seeing.
To confirm from our data: we verified that all affected issues exist in
_tool_jira_board_issues for the board, but their changelogs may have been
collected via epic-sourced or cross-referenced paths and lack the board
association at conversion time. This matches your explanation.
The impact for us is significant: depending on the project, between 4% and
99% of changelogs are missing from the domain layer, which makes any metric
that relies on issue_changelogs unreliable at scale.
We'd prefer option 2 (move board_id into the ON clause) — that preserves the
LEFT JOIN semantics and ensures all collected changelogs are converted
regardless of how the issue was collected. Option 1 (explicit INNER JOIN) would
just make the current data loss intentional rather than fixing it.
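
For anyone following along, here is a minimal sketch of the difference between
the two filter placements. It uses an in-memory SQLite database and two
simplified, hypothetical tables (changelogs, board_issues); it is not the
actual DevLake convertor query, only an illustration of the general SQL
behaviour.

```python
import sqlite3

# Hypothetical, simplified schema: NOT the real DevLake tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE changelogs (id INTEGER PRIMARY KEY, issue_id INTEGER);
CREATE TABLE board_issues (board_id INTEGER, issue_id INTEGER);
-- Changelog 1 belongs to an issue linked to board 7; changelog 2 has no link.
INSERT INTO changelogs VALUES (1, 100), (2, 200);
INSERT INTO board_issues VALUES (7, 100);
""")

# Filter in WHERE: unmatched rows carry board_id = NULL, the predicate
# rejects them, and the LEFT JOIN effectively behaves like an INNER JOIN.
where_filter = conn.execute("""
    SELECT c.id FROM changelogs c
    LEFT JOIN board_issues b ON b.issue_id = c.issue_id
    WHERE b.board_id = 7
    ORDER BY c.id
""").fetchall()

# Filter in ON: the predicate only decides which right-hand rows match,
# so every changelog row is preserved.
on_filter = conn.execute("""
    SELECT c.id FROM changelogs c
    LEFT JOIN board_issues b ON b.issue_id = c.issue_id AND b.board_id = 7
    ORDER BY c.id
""").fetchall()

print(where_filter)  # [(1,)]
print(on_filter)     # [(1,), (2,)]
```

One consequence of the ON form is that it no longer scopes the left-hand rows
to the board at all, so any scoping that is still wanted would have to come
from another condition.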
Adding a log/warning for dropped items would also be valuable for debugging.
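
As a rough idea of what such a warning could look like, the sketch below uses
an anti-join to list the rows that a board filter would drop. The table names,
the logger name, and the helper unmatched_changelog_ids are all hypothetical
and chosen only for illustration.

```python
import logging
import sqlite3

logger = logging.getLogger("changelog_convertor_sketch")  # hypothetical name


def unmatched_changelog_ids(conn, board_id):
    """Return ids of changelogs with no board_issues row for board_id."""
    rows = conn.execute(
        """
        SELECT c.id FROM changelogs c
        LEFT JOIN board_issues b
          ON b.issue_id = c.issue_id AND b.board_id = ?
        WHERE b.issue_id IS NULL  -- anti-join: keep only unmatched rows
        ORDER BY c.id
        """,
        (board_id,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical, simplified schema and data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE changelogs (id INTEGER PRIMARY KEY, issue_id INTEGER);
CREATE TABLE board_issues (board_id INTEGER, issue_id INTEGER);
INSERT INTO changelogs VALUES (1, 100), (2, 200);
INSERT INTO board_issues VALUES (7, 100);
""")

dropped = unmatched_changelog_ids(conn, 7)
if dropped:
    logger.warning(
        "%d changelog(s) have no association with board %d: %s",
        len(dropped), 7, dropped,
    )
```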
Is there a timeline for a fix?