Hi Spark developers,

I want to hint Spark to execute tasks on a particular list of hosts. I see that getBlockLocations is used to get the list of hosts from HDFS:
https://github.com/apache/spark/blob/7955b3962ac46b89564e0613db7bea98a1478bf2/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L386

Hinting Spark with a custom getBlockLocations that returns an Array of BlockLocations carrying host IP addresses doesn't help; Spark continues to schedule the tasks on other executor hosts. Is there something I am doing wrong?

Test: spark.read.csv()

Appreciate your inputs 😊

Thanks,
Nasrulla
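P.S. In case it helps, here is a rough sketch of the kind of override I am testing. The class name and host addresses are illustrative, not my actual code; it assumes Hadoop's `FileSystem.getFileBlockLocations(FileStatus, long, long)` and the `BlockLocation(names, hosts, offset, length)` constructor:

```scala
import org.apache.hadoop.fs.{BlockLocation, FileStatus, RawLocalFileSystem}

// Illustrative sketch only: a FileSystem that reports a fixed set of hosts
// for every file, so that Spark's file scan picks them up as preferred
// task locations when it builds file partitions.
class LocalityHintFileSystem extends RawLocalFileSystem {
  override def getFileBlockLocations(
      file: FileStatus, start: Long, len: Long): Array[BlockLocation] = {
    // Report the whole requested range as one block living on the hinted host.
    Array(new BlockLocation(
      Array("10.0.0.5:9866"), // names (host:port) -- hypothetical address
      Array("10.0.0.5"),      // hosts -- IP addresses, as in my test
      start, len))
  }
}
```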