narrowizard opened a new issue, #8400: URL: https://github.com/apache/incubator-devlake/issues/8400
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and found no similar feature requirement.

### Use case

When importing issue data from third-party platforms that aren't covered by existing DevLake plugins, users often leverage the `customize` plugin's ability to ingest data via `issues.csv` through the `/plugins/customize/csvfiles/issues.csv` API endpoint.

While this process successfully creates `issues` domain layer entities, it currently only focuses on the issue data itself. It does not automatically create or update corresponding entries in the `accounts` domain layer table based on fields like `creator_name` or `assignee_name` present in the `issues.csv`.

This limitation means that even if issue data is successfully imported, the associated user data required for analyses related to individuals (like work distribution, assignee performance, etc.) is missing or requires a separate, potentially manual, process to populate the `accounts` table. This creates a significant gap in the imported data's completeness for common engineering metrics and requires extra effort from users.

### Description

This feature request proposes enhancing the `customize` plugin's handling of the `/plugins/customize/csvfiles/issues.csv` import process.

Specifically, when data is ingested via this endpoint, the plugin should, in addition to creating/updating `issues` domain entities, also process the `creator_name` and `assignee_name` fields within the CSV data.

For each unique name found in either the `creator_name` or `assignee_name` columns across the entire imported CSV, the plugin should perform the following action:

- Check if an account with a matching identifier (derived from the name) already exists in the `accounts` domain layer table.
- If an account does not exist for that name, a new account entry should be automatically created in the `accounts` table, using the name from the CSV.
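The check-then-create behavior described above could be sketched roughly as follows. This is only an illustration in Python, not the plugin's actual Go code: the in-memory `accounts` dict stands in for the `accounts` table, and the `account_id_for` helper, its `csv:account:` prefix, and the `full_name` field are assumptions made for the example, since the real id scheme is still to be decided.

```python
import csv
import hashlib
import io

# Columns named in this proposal that may carry a person's name.
NAME_COLUMNS = ("creator_name", "assignee_name")

# Sample data for illustration only; the columns other than the two
# name columns are hypothetical.
SAMPLE_CSV = """id,title,creator_name,assignee_name
1,Bug A,alice,bob
2,Bug B,bob,
3,Bug C,alice,carol
"""


def account_id_for(name: str) -> str:
    """Hypothetical id scheme: a fixed prefix plus a truncated SHA-256 of the name."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:16]
    return f"csv:account:{digest}"


def unique_names(csv_text: str) -> list[str]:
    """Collect distinct non-empty names from both columns, in first-seen order."""
    seen: dict[str, None] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in NAME_COLUMNS:
            name = (row.get(column) or "").strip()
            if name:
                seen.setdefault(name, None)
    return list(seen)


def ensure_accounts(csv_text: str, accounts: dict[str, dict]) -> list[str]:
    """Create an account for every name that has none yet.

    `accounts` is an in-memory stand-in for the accounts table, keyed by id.
    Returns the names for which a new account was created.
    """
    created = []
    for name in unique_names(csv_text):
        account_id = account_id_for(name)
        if account_id not in accounts:  # step 1: does the account exist?
            accounts[account_id] = {"id": account_id, "full_name": name}  # step 2: create it
            created.append(name)
    return created


if __name__ == "__main__":
    table: dict[str, dict] = {}
    print(ensure_accounts(SAMPLE_CSV, table))  # names created on first import
    print(ensure_accounts(SAMPLE_CSV, table))  # re-import creates nothing new
```

Deriving the id deterministically from the name keeps repeated imports of the same file idempotent, which is the main reason the sketch hashes the name rather than generating a random id.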
(The exact mechanism for generating the `account_id` and mapping the name would need to be determined during implementation, potentially hashing the name or using the name itself if suitable for the schema.)

Implementing this feature would greatly streamline the process of importing issue data from external sources via the `customize` plugin, ensuring that essential related account information is automatically populated in the domain layer, making the data ready for richer analysis without manual intervention.

### Related issues

_No response_

### Are you willing to submit a PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
