[ https://issues.apache.org/jira/browse/HUDI-1449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454891#comment-17454891 ]
Vinoth Chandar commented on HUDI-1449:
--------------------------------------

this is already supported

> Support for _hoodie_record_key as a virtual column
> ---------------------------------------------------
>
>                 Key: HUDI-1449
>                 URL: https://issues.apache.org/jira/browse/HUDI-1449
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Common Core
>            Reporter: Nishith Agarwal
>            Assignee: Abhishek Modi
>            Priority: Major
>
> Context:
> Currently, _hoodie_record_key is written to DFS, as a column in the Parquet
> file. In our production systems at Uber however, _hoodie_record_key
> contains data that can be found in a different column (or set of columns).
> This means that we are storing duplicated data.
> Proposal:
> In the interest of improving storage efficiency, we could add confs /
> abstract classes that can construct the _hoodie_record_key given other
> columns. That way we do not have to store duplicated data on DFS.
>
> RFC ->
> https://cwiki.apache.org/confluence/display/HUDI/RFC+-+21+%3A+Allow+HoodieRecordKey+to+be+Virtual

--
This message was sent by Atlassian Jira
(v8.20.1#820001)