squatboy opened a new pull request, #8820:
URL: https://github.com/apache/incubator-devlake/pull/8820

   ### ⚠️ Pre Checklist
   
   - [x] I have read through the [Contributing Documentation](https://devlake.apache.org/community/).
   - [x] I have added relevant tests.
   - [x] I have added relevant documentation.
   - [x] I will add labels to the PR, such as `pr-type/bug-fix`, `pr-type/feature-development`, etc.
   
   ### Summary
   This is a follow-up to #8637.
   
   #8637 stopped the GraphQL collector from failing immediately when GitHub returned `Could not resolve to an Issue`, but the rows for the missing issue could still remain in DevLake's local issue and raw tables.
   
   This patch keeps the GraphQL collector tolerant of that missing-issue data 
error and also removes the stale local rows that keep the missing issue in the 
GitHub GraphQL `refresh OPEN issues` path.
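
As a rough illustration of the tolerance rule described above, the check could look like the sketch below. The type `graphqlDataError`, the function `onlyIgnorableErrors`, and the sample messages are hypothetical names and values chosen for this example, not the identifiers used in the patch; only the `Could not resolve to an Issue` text comes from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// graphqlDataError is a hypothetical stand-in for one entry of the
// "errors" array in a GraphQL response. It is not a DevLake type.
type graphqlDataError struct {
	Message string
}

// missingIssueMessage is the error text this PR treats as ignorable.
const missingIssueMessage = "Could not resolve to an Issue"

// onlyIgnorableErrors reports whether errs is non-empty and every entry
// is a missing-issue error. Any other error makes the batch non-ignorable.
func onlyIgnorableErrors(errs []graphqlDataError) bool {
	if len(errs) == 0 {
		return false
	}
	for _, e := range errs {
		if !strings.Contains(e.Message, missingIssueMessage) {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative messages only.
	errs := []graphqlDataError{
		{Message: "Could not resolve to an Issue with the number of 42."},
	}
	fmt.Println(onlyIgnorableErrors(errs)) // only missing-issue errors

	errs = append(errs, graphqlDataError{Message: "some other data error"})
	fmt.Println(onlyIgnorableErrors(errs)) // mixed errors
}
```

Requiring every error to match means a batch that mixes a missing issue with an unrelated failure is still treated as a failure in this sketch.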
   
   What changed:
   
   - continue into `ResponseParser` when the only GraphQL data errors are 
ignorable `Could not resolve to an Issue` errors
   - track which issue numbers were requested in the GitHub GraphQL open-issue 
refresh batch
   - compare the requested issue numbers with the successfully resolved issues 
returned by GitHub
   - delete stale local refresh-input rows for unresolved issues from:
     - `_tool_github_issues`
     - `_tool_github_issue_comments`
     - `_tool_github_issue_events`
     - `_tool_github_issue_labels`
     - `_tool_github_issue_assignees`
     - `_tool_github_pull_request_issues`
     - the corresponding raw GraphQL issue row when available through 
`RawDataOrigin`
   
   Why:
   
   - without cleanup, deleted or transferred issues remain orphaned in the 
GitHub GraphQL collector input set and can be retried forever during later 
refreshes
   - this patch makes the fix from #8637 complete for the `refresh OPEN issues` 
collector path
   
   ### Does this close any open issues?
   Closes #8819
   
   ### Screenshots
   N/A
   
   ### Other Information
   Tests and validation:
   
   - added unit test for ignorable GraphQL missing-issue errors
   - added unit tests for missing-issue detection in the GitHub GraphQL issue 
refresh path
   - verified:
     - `go test ./plugins/github_graphql/tasks`
     - `go build ./helpers/pluginhelper/api`
   
   Scope note:
   
   - this patch is intentionally scoped to the GitHub GraphQL collector refresh 
path and the tool and raw rows that feed it
   - it does not attempt a full cross-layer hard delete of every derived 
domain-layer ticket row
   - no website or user-facing documentation repository change is required for 
this internal collector bugfix
   - related earlier fixes: #7969 and #8637
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.