Re: [PR] Implement arrow-avro SchemaStore and Fingerprinting To Enable Schema Resolution [arrow-rs]

via GitHub Mon, 04 Aug 2025 15:41:52 -0700


scovich commented on PR #8006:
URL: https://github.com/apache/arrow-rs/pull/8006#issuecomment-3152638873


   > this PR has a lot of comments and quite a few refinements were needed. I 
also know it was way too big.
   
   Honestly, I think the "state machine" for avro decoding is just really 
complex. It felt like a lot of the questions and churn ultimately come from 
that underlying complexity. And my lack of familiarity with the avro spec. Not 
sure how easily those issues could be avoided merely by raising a smaller PR?
   
   > To move forward, would it be best to simply focus on getting the 
`schema.rs` changes in first (~500-600 LOC) and then follow-up with a 
`arrow-avro/src/reader/mod.rs` PR?
   
   Interesting. By "schema.rs changes" you mean adding the new `SchemaStore` 
and fingerprinting infrastructure? With unit tests but not yet integrated into 
the actual decoder? That does seem like a good idea. I don't think there's any 
outstanding controversy in that part of the code (just a couple of follow-on 
items)?
   
   > I do apologize for both the size of this PR and the quality issues. 
   
   This is _NOT_ the kind of run of the mill logic that I would normally 
associate with "quality issues." 
   For me, at least, this has been an invigorating PR to review -- not a 
frustrating one. 
   Some hard core software engineering here with depth and subtlety to work 
through.
   
   > Let me know what I should do and I'll get on it immediately.
   
   Now that we know more, the split you propose seems quite reasonable; I'm not 
sure it was obvious two weeks ago tho?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Implement arrow-avro SchemaStore and Fingerprinting To Enable Schema Resolution [arrow-rs]

Reply via email to