yaooqinn commented on PR #36222:
URL: https://github.com/apache/spark/pull/36222#issuecomment-1102315504

   > @yaooqinn I am trying to understand why it is null to begin with -
   > getPreferredLocations is expected to either return Nil or a sequence with
   > preferred locality - not null within that Seq.
   
   This happened in a customer's deployment. I don't know where the nulls
   within the Seq came from either, as it was hard to trace.
   
   > If there is some codepath in Spark which results in null being
   > generated, we should fix that.
   
   I have checked all RDD implementations and suspect that HadoopRDD might be
   the source.
   
   > convertSplitLocationInfo, which you had modified, should not be having
   > null as input - at least based on my understanding of
   > InputSplitWithLocationInfo.getLocationInfo/InputSplit.getLocationInfo
   
   At least, neither these APIs nor custom implementations of them guarantee
   non-null values within the returned array.
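
To illustrate the concern, here is a minimal sketch in plain Java of a defensive conversion that tolerates a null array, null elements, and null host names. `LocationInfo` is a hypothetical stand-in for Hadoop's split location type, and `toPreferredLocations` is an illustrative helper. Neither is Spark's `convertSplitLocationInfo` nor the change made in this PR.

```java
import java.util.ArrayList;
import java.util.List;

public class NullSafeLocations {

    // Hypothetical stand-in for a split location entry (illustration only).
    public static final class LocationInfo {
        public final String host;

        public LocationInfo(String host) {
            this.host = host;
        }
    }

    // Converts location entries to host names, skipping anything null.
    // A null input array is treated the same as "no preferred locations".
    public static List<String> toPreferredLocations(LocationInfo[] infos) {
        List<String> hosts = new ArrayList<>();
        if (infos == null) {
            return hosts;
        }
        for (LocationInfo info : infos) {
            // A custom implementation may leave holes in the array
            // or return an entry without a host name.
            if (info == null || info.host == null) {
                continue;
            }
            hosts.add(info.host);
        }
        return hosts;
    }
}
```

With an input array such as `{host1, null, entry-with-null-host, host2}`, this helper keeps only the two usable host names, so no null reaches the caller's sequence of preferred locations.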


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

