[ 
https://issues.apache.org/jira/browse/HUDI-1680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

steven zhang updated HUDI-1680:
-------------------------------
    Description: 
Currently, a partition of HoodieMergeOnReadRDD/HoodieBootstrapRDD may contain
multiple PartitionedFiles. We should compute the hosts that hold the most data
across those PartitionedFiles and use them as preferred locations for data locality.

  was:
Currently, a partition of HoodieMergeOnReadRDD/HoodieBootstrapRDD may contain
multiple PartitionedFiles. We should compute the hosts that hold the most data
across those PartitionedFiles.


> Add getPreferredLocations for RDD
> ---------------------------------
>
>                 Key: HUDI-1680
>                 URL: https://issues.apache.org/jira/browse/HUDI-1680
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Spark Integration
>            Reporter: steven zhang
>            Assignee: steven zhang
>            Priority: Minor
>             Fix For: 0.8.0
>
>
> Currently, a partition of HoodieMergeOnReadRDD/HoodieBootstrapRDD may contain
> multiple PartitionedFiles. We should compute the hosts that hold the most data
> across those PartitionedFiles and use them as preferred locations for data locality.
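
To make the proposal concrete, here is a minimal sketch of the host-ranking step: sum the bytes each host stores across all files in one partition, then keep the hosts with the largest totals. This is not the Hudi patch. The class PreferredLocationsSketch, the FileSlice stand-in for a PartitionedFile, and the maxHosts parameter are all hypothetical names chosen for illustration. In Spark, the result of such a computation would typically be returned from an RDD's getPreferredLocations override.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not Hudi or Spark code.
final class PreferredLocationsSketch {

    // Stand-in for one file split: bytes to read and the hosts that store them.
    static final class FileSlice {
        final long length;
        final List<String> hosts;

        FileSlice(long length, String... hosts) {
            this.length = length;
            this.hosts = Arrays.asList(hosts);
        }
    }

    // Returns up to maxHosts host names, ordered by total bytes held (descending).
    static List<String> preferredHosts(List<FileSlice> files, int maxHosts) {
        // Accumulate, per host, the size of every file that host stores.
        Map<String, Long> bytesPerHost = new HashMap<>();
        for (FileSlice file : files) {
            for (String host : file.hosts) {
                bytesPerHost.merge(host, file.length, Long::sum);
            }
        }

        List<Map.Entry<String, Long>> ranked = new ArrayList<>(bytesPerHost.entrySet());
        // Sort by bytes descending; break ties by host name for a deterministic result.
        ranked.sort((a, b) -> {
            int byBytes = Long.compare(b.getValue(), a.getValue());
            return byBytes != 0 ? byBytes : a.getKey().compareTo(b.getKey());
        });

        List<String> result = new ArrayList<>();
        for (int i = 0; i < ranked.size() && i < maxHosts; i++) {
            result.add(ranked.get(i).getKey());
        }
        return result;
    }
}
```

For example, with files of 100 bytes on hosts h1 and h2, 300 bytes on h2 and h3, and 50 bytes on h1, the per-host totals are h2 = 400, h3 = 300, h1 = 150, so asking for two hosts yields h2 followed by h3. Capping the number of returned hosts is a design choice of this sketch, not something stated in the issue.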



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
