friendlymatthew commented on PR #7878:
URL: https://github.com/apache/arrow-rs/pull/7878#issuecomment-3058305633

> I have sort of lost track of the current state of this PR. Is it something we want to merge and then do follow ons? If so, can we file some tickets to track those follow ons? Or do we want to keep working on this PR?
>
> I would like to improve the benchmarks to more cleanly separate the validation benchmarks from JSON parsing benchmarks.

Hi, I would like to spend a bit more time on this. @scovich raised some good points about redundant checks which will remove the need to collect offsets.

Here is my checklist for this PR:
- [ ] avoid materializing offsets + remove redundant checks
- [ ] better documentation
- [ ] fix the validation benchmark and move them back to `parquet-variant`
- [ ] make the offset validation DRY by moving it to a freestanding method

The `simdutf8` stuff can be incorporated as a follow up PR.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org