satish created HUDI-1443:
----------------------------

             Summary: Remove record deserialization in RDDCustomColumnsSortPartitioner
                 Key: HUDI-1443
                 URL: https://issues.apache.org/jira/browse/HUDI-1443
             Project: Apache Hudi
          Issue Type: Sub-task
          Components: Performance
            Reporter: satish

https://github.com/apache/hudi/pull/2263#discussion_r533653930 has the context.

When sorting is specified as part of clustering, we use the custom partitioner RDDCustomColumnsSortPartitioner. This deserializes each record against the schema to get the values of the sort columns. Check whether it's possible to avoid this and implement the suggestion in the PR.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)