[ https://issues.apache.org/jira/browse/HUDI-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
steven zhang updated HUDI-1680:
-------------------------------
    Description: Currently, a single partition of HoodieMergeOnReadRDD/HoodieBootstrapRDD may contain multiple PartitionedFiles. We should compute the hosts that hold the most data across those PartitionedFiles and return them as the preferred locations, for data locality.  (was: Currently HoodieMergeOnReadRDD/HoodieBootstrapRDD's partition may have multiple PartitionedFiles. we should compute the hosts with the most data of the PartitionedFiles)

> Add getPreferredLocations for RDD
> ---------------------------------
>
>                 Key: HUDI-1680
>                 URL: https://issues.apache.org/jira/browse/HUDI-1680
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: steven zhang
>            Assignee: steven zhang
>            Priority: Minor
>             Fix For: 0.8.0

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
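
The host selection described in this issue could look roughly like the sketch below. It is not the actual Hudi patch: `PreferredHosts`, `FileSplit`, and the `maxHosts` cutoff are hypothetical names introduced only for illustration, and `FileSplit` merely stands in for a PartitionedFile's length and block locations. The idea is to sum the bytes each host holds across all files in one RDD partition, then return the hosts with the largest totals.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PreferredHosts {

    /** Hypothetical stand-in for one file in a partition: its size and the hosts storing it. */
    static class FileSplit {
        final long lengthBytes;
        final List<String> hosts;

        FileSplit(long lengthBytes, List<String> hosts) {
            this.lengthBytes = lengthBytes;
            this.hosts = hosts;
        }
    }

    /**
     * Returns up to maxHosts host names, ordered by how many bytes of the
     * given files each host stores (largest first, ties broken by name).
     */
    static List<String> preferredHosts(List<FileSplit> splits, int maxHosts) {
        // Accumulate the number of bytes available locally on each host.
        Map<String, Long> bytesPerHost = new HashMap<>();
        for (FileSplit split : splits) {
            for (String host : split.hosts) {
                bytesPerHost.merge(host, split.lengthBytes, Long::sum);
            }
        }
        // Sort hosts by descending byte count and keep the top maxHosts.
        return bytesPerHost.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(maxHosts)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

In an RDD implementation, a result like this would be what `getPreferredLocations` returns for a given partition, so the scheduler can try to place the task on a host that already stores most of the partition's data.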