rtpsw commented on PR #14347:
URL: https://github.com/apache/arrow/pull/14347#issuecomment-1277658521

   > Okay, so here is the problem: users shouldn't pass invalid data to Arrow 
APIs (except to `Validate` and `ValidateFull`, which are explicitly designed to 
handle such data). So it doesn't make sense to check for invalid data at the 
beginning of other functions; also, it can be quite costly (`ValidateFull` can 
typically be `O(nrows * columns)`).
   > 
   > (note: "invalid data" here is a badly structured array)
   
   This circles back to points we discussed earlier. I understand the 
requirement to pass valid data in a correct Arrow app, as well as in correct 
Arrow code, but less so during development, when incorrect code is frequent. 
This PR aims to make development (in particular failure analysis) easier, 
given that its runtime cost is small. As for cost, I think the calls to 
`ValidateFull` in `PrintDiff` shouldn't count, because they can be removed; I 
only included them so that the problem can be reproduced without a 
segmentation fault. 
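
To illustrate the trade-off, here is a minimal sketch. It is not the code in this PR: `ToyStringArray`, `JoinValues` and the `check_input` flag are hypothetical stand-ins and not Arrow APIs. Only the names `Validate` and `ValidateFull` mirror Arrow's cheap and full checks. It assumes a badly structured array is one whose offsets are out of order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a variable-length array (not an Arrow type):
// value i spans data[offsets[i], offsets[i + 1]).
struct ToyStringArray {
  std::vector<int32_t> offsets;
  std::string data;

  int64_t length() const {
    return offsets.empty() ? 0 : static_cast<int64_t>(offsets.size()) - 1;
  }

  // Cheap check, O(1): only inspects the first and last offsets.
  bool Validate() const {
    return !offsets.empty() && offsets.front() >= 0 &&
           static_cast<int64_t>(offsets.back()) <=
               static_cast<int64_t>(data.size());
  }

  // Full check, O(length): additionally requires non-decreasing offsets.
  bool ValidateFull() const {
    if (!Validate()) return false;
    for (std::size_t i = 1; i < offsets.size(); ++i) {
      if (offsets[i] < offsets[i - 1]) return false;
    }
    return true;
  }
};

// Hypothetical helper: joins all values with ','. With check_input set, it
// reports a badly structured array by returning false instead of reading
// out of bounds. Without it, the caller must guarantee valid input.
bool JoinValues(const ToyStringArray& arr, bool check_input, std::string* out) {
  assert(out != nullptr);
  if (check_input && !arr.ValidateFull()) return false;
  out->clear();
  for (int64_t i = 0; i < arr.length(); ++i) {
    if (i > 0) out->push_back(',');
    out->append(arr.data, arr.offsets[i], arr.offsets[i + 1] - arr.offsets[i]);
  }
  return true;
}
```

An array with offsets `{0, 4, 2, 5}` over `"abcde"` passes the cheap `Validate` but fails `ValidateFull`, so `JoinValues(bad, /*check_input=*/true, &out)` returns an error rather than crashing. Passing `false` skips the O(length) check for callers that already know their input is valid.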


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
