yaooqinn commented on PR #36222: URL: https://github.com/apache/spark/pull/36222#issuecomment-1102315504
> @yaooqinn I am trying to understand why it is null to begin with - getPreferredLocations is expected to either return Nil or a sequence with preferred locality - not null within that Seq.

This happened on a customer's side. I don't know where the nulls within the Seq came from either, as it was hard to trace.

> If there is some codepath in spark which is resulting null getting generated, we should fix that.

I have checked all RDD implementations and suspect that HadoopRDD might be the source.

> convertSplitLocationInfo, which you had modified, should not be having null as input - at least based on my understanding of InputSplitWithLocationInfo.getLocationInfo/InputSplit.getLocationInfo

At the very least, these APIs and custom implementations of them do not guarantee non-null values within the returned array.
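To illustrate the concern, here is a minimal sketch of the kind of defensive filtering I have in mind. It is not the actual change in this PR: `LocationSanitizer` and `sanitizeLocations` are hypothetical names, and the input is modeled as a plain `Array[String]` rather than the real Hadoop split types.

```scala
// Hypothetical helper for illustration only; not Spark's or Hadoop's API.
// It models a locations array that may itself be null or contain null entries,
// which is what a custom InputSplit implementation could return.
object LocationSanitizer {
  def sanitizeLocations(raw: Array[String]): Seq[String] =
    Option(raw)                        // a null array becomes None
      .map(_.toSeq.filter(_ != null))  // drop null entries inside the array
      .getOrElse(Nil)                  // fall back to "no preference"
}
```

With this shape, callers always receive either `Nil` or a sequence of non-null hosts, which matches the expectation stated for getPreferredLocations above.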