scovich commented on PR #9097:
URL: https://github.com/apache/arrow-rs/pull/9097#issuecomment-4261151340

   @alamb I think we were a bit stuck on the performance vs. correctness trade-off?
   * This PR just skims bytes of skipped fields with basically no validation, 
just watching for navigation aids like `{`, `[`, and `"`.
   * I had done some pathfinding that did proper validation of skipped bytes, 
and it basically erased all the performance gains once I fixed the bugs in my 
initial version.
   * related: https://github.com/apache/arrow-rs/issues/9204
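
   To make the first bullet concrete, here is a minimal sketch of what skimming without validation could look like. It is not the code from this PR: `skim_value` is a hypothetical helper, and the only assumption is that the input is a byte slice positioned at the first byte of the value to skip. It tracks nesting depth and string state and nothing else.

```rust
/// Hypothetical helper (not from the PR): returns the index just past the
/// JSON value that starts at `start`, or `None` if the input ends first.
/// It only watches `{`, `[`, `"` and their closers; it does NOT validate
/// the bytes it passes over.
fn skim_value(buf: &[u8], start: usize) -> Option<usize> {
    let mut depth: usize = 0;
    let mut in_string = false;
    let mut i = start;
    while i < buf.len() {
        let b = buf[i];
        if in_string {
            match b {
                // Skip the escaped byte so `\"` does not end the string.
                b'\\' => i += 1,
                b'"' => {
                    in_string = false;
                    if depth == 0 {
                        return Some(i + 1); // a bare string value just ended
                    }
                }
                _ => {}
            }
        } else {
            match b {
                b'"' => in_string = true,
                b'{' | b'[' => depth += 1,
                b'}' | b']' => {
                    if depth == 0 {
                        return Some(i); // closer of the enclosing container
                    }
                    depth -= 1;
                    if depth == 0 {
                        return Some(i + 1); // the skipped container just closed
                    }
                }
                // A comma at depth 0 terminates a bare scalar (number, true, ...).
                b',' if depth == 0 => return Some(i),
                _ => {}
            }
        }
        i += 1;
    }
    // A bare scalar may run to the end of the input; anything still open is truncated.
    if depth == 0 && !in_string && i > start && i == buf.len() {
        Some(i)
    } else {
        None
    }
}
```

   Note what this accepts: `[tru, 01, @]` and even `[}` are skipped without complaint, because closers are counted rather than matched and scalars are never parsed. That is the correctness being given up in exchange for speed.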
   
   Based on the above, we probably have to give up on enforcing correctness of 
the skipped bytes if we want the performance gains.
   
   My other worry is that the state machine adds overhead for all object fields, 
even though only top-level fields might be skipped. I think the latest version 
of the PR reduced that overhead quite a bit, but fundamentally there has to be 
a branch to decide whether to skip or not.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
