prashantwason opened a new pull request #1520: [HUDI-797] Small performance improvement for rewriting records.
URL: https://github.com/apache/incubator-hudi/pull/1520

When adding the HUDI metadata fields to a schema, the positions of the incoming schema fields are retained. This allows us to look up fields in the existing record and assign them to the new record using field positions (array index lookup). This is faster than looking up fields by name (HashMap-based lookup).

## What is the purpose of the pull request

Small performance improvement for rewriting records during the ingestion phase.

[HUDI-797](https://issues.apache.org/jira/browse/HUDI-797) has the details on the use case and improvements.

## Brief change log

1. Modified HoodieAvroUtils.addMetadataFields to retain the field positions.
2. Added a new rewriting function, HoodieAvroUtils.rewriteHoodieRecord, which rewrites using field positions rather than field names.

## Verify this pull request

This change is verified automatically by all the HUDI client tests that ingest data into HUDI.

## Committer checklist

- [ ] Has a corresponding JIRA in PR title & commit
- [ ] Commit message is descriptive of the change
- [ ] CI is green
- [ ] Necessary doc changes done or have another open PR
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
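The difference between the two rewrite strategies in the change log can be illustrated with a small, library-free sketch. It is not the code from this pull request and does not use Avro: records are modelled as plain `Object[]` arrays, the class `RewriteSketch`, its methods, and the metadata field names are all hypothetical, and it assumes the metadata fields are appended after the incoming fields so that every incoming field keeps its index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a plain-Java stand-in for a schema (list of field names)
// and a record (array of values). None of these names come from Hudi or Avro.
class RewriteSketch {
    // Hypothetical metadata field names, chosen for this example.
    static final List<String> META_FIELDS =
            Arrays.asList("_meta_commit_time", "_meta_record_key");

    // Assumption: metadata fields go after the incoming fields,
    // so incoming field i is still at position i in the new schema.
    static List<String> addMetadataFields(List<String> incoming) {
        List<String> out = new ArrayList<>(incoming);
        out.addAll(META_FIELDS);
        return out;
    }

    // Builds the name -> position index that a name-based lookup relies on.
    static Map<String, Integer> indexByName(List<String> schema) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < schema.size(); i++) {
            index.put(schema.get(i), i);
        }
        return index;
    }

    // Name-based rewrite: one hash lookup per target field.
    static Object[] rewriteByName(Object[] oldValues,
                                  Map<String, Integer> oldIndex,
                                  List<String> newSchema) {
        Object[] out = new Object[newSchema.size()];
        for (int i = 0; i < newSchema.size(); i++) {
            Integer pos = oldIndex.get(newSchema.get(i));
            out[i] = (pos == null) ? null : oldValues[pos]; // metadata stays null
        }
        return out;
    }

    // Position-based rewrite: plain array copy, no hashing.
    // Only valid because incoming positions are retained in the new schema.
    static Object[] rewriteByPosition(Object[] oldValues, int newSize) {
        Object[] out = new Object[newSize];
        System.arraycopy(oldValues, 0, out, 0, oldValues.length);
        return out;
    }

    public static void main(String[] args) {
        List<String> incoming = Arrays.asList("id", "name", "ts");
        List<String> target = addMetadataFields(incoming);
        Object[] values = {42, "alice", 1586000000L};

        Object[] byName = rewriteByName(values, indexByName(incoming), target);
        Object[] byPos = rewriteByPosition(values, target.size());

        // Both strategies produce the same rewritten record.
        System.out.println(Arrays.equals(byName, byPos));
    }
}
```

The position-based variant is only correct while the incoming fields keep their indices in the extended schema; if metadata fields were inserted ahead of them, the copy would need an offset or a precomputed position mapping.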