JoshRosen commented on PR #36775:
URL: https://github.com/apache/spark/pull/36775#issuecomment-1171817134

   > Thanks for you guys reviews, I think @JoshRosen's way is more appropriate, 
we better change the implementation to "false negatives", though it may need 
more efforts to implement it and it's a breaking change to the users.
   > 
   > Do we need to implement it or change this to identify whether it's an 
exception from `DFSClient.checkOpen` to make the codes more robust?
   
   Maybe we can tackle the issues separately? 
   
   In the long-term I think we should reconsider the overall approach for 
detecting corrupt files and maybe move towards a "false negatives" design. This 
would be a breaking change and might require datasource changes, so I think it 
deserves a separate and broader discussion (maybe on a separate JIRA or on the 
dev mailing list).
   
   In the short-term, I think it's okay to make an incremental improvement to 
the current design and fix the specific issue targeted by this PR. Merging the 
slightly-more-robust-via-checking-exception-source version of the change you 
proposed here delivers real value to users and solves a real problem even if it 
doesn't completely solve the larger pre-existing design problem in this 
feature. Could you update this PR to include the more robust method of error 
checking and ping us for a final look? Please add a code comment near the 
modified code which references SPARK-39389 and explains the motivation for the 
change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to