jecsand838 commented on issue #9211:
URL: https://github.com/apache/arrow-rs/issues/9211#issuecomment-3775468568

   > > **Describe alternatives you've considered**
   > > 
   > > * **More interpreter specialization without native JIT**
   > >   
   > >   * Replace enum matching with per-field function pointers / vtables, or 
build a small bytecode interpreter. This could reduce some overhead, but likely 
won’t remove as much branching as native codegen.
   > 
   > Aside: I'm pretty sure this alternative wouldn't reduce overhead -- isn't 
rust enum dispatch generally _faster_ than vtable dispatch?
   
   Good catch. If the only change is replacing `match` on an enum with a 
vtable/`dyn Trait` call, I fully agree that it's unlikely to be a win and can 
easily be a regression.
   
   +1 on the general idea of separating structural decode from typed conversion.
   
   Like @scovich called out, what you're describing is very similar to how 
`arrow-json`'s tape decoder is structured. It first parses the byte stream into 
a flattened "tape" / offset representation, and only on `flush()` runs a 
schema-aware `ArrayDecoder` over that tape to build Arrow arrays.
   
   `arrow-avro` today uses a schema-specialized, direct-to-builders approach where `RecordDecoder` owns a `Vec<Decoder>` (one decoder per reader/output field), and the hot loop is effectively:
   - `for _ in 0..count { for field in &mut fields { field.decode(&mut cursor)? } }`
     (or `Projector::project_record(...)`), with each `Decoder` appending directly into typed per-column buffers/builders.

   `flush()` then mostly wraps/finalizes those buffers into a `RecordBatch`.
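
   For concreteness, a stripped-down sketch of that enum-dispatch, direct-to-builders shape (the types and the one-byte "varint" here are illustrative only, not arrow-avro's actual code, which handles full zig-zag varints, nulls, nesting, promotions, etc.):

```rust
// Each variant owns its column's typed buffer; decoding appends directly
// into it with no intermediate representation.
enum Decoder {
    Boolean(Vec<bool>),
    Long(Vec<i64>),
}

impl Decoder {
    /// Decode one value from `cursor`, appending straight into this
    /// column's typed buffer.
    fn decode(&mut self, cursor: &mut &[u8]) -> Result<(), &'static str> {
        let (&byte, rest) = cursor.split_first().ok_or("unexpected EOF")?;
        *cursor = rest;
        match self {
            Decoder::Boolean(buf) => buf.push(byte != 0),
            // Single-byte zig-zag value, standing in for a real varint.
            Decoder::Long(buf) => buf.push((byte as i64 >> 1) ^ -(byte as i64 & 1)),
        }
        Ok(())
    }
}

fn main() {
    let mut fields = vec![Decoder::Boolean(Vec::new()), Decoder::Long(Vec::new())];
    // Two rows of (boolean, long): (true, 1) then (false, -1).
    let data = [0x01, 0x02, 0x00, 0x01];
    let mut cursor = &data[..];
    for _ in 0..2 {
        for field in &mut fields {
            field.decode(&mut cursor).unwrap();
        }
    }
    match (&fields[0], &fields[1]) {
        (Decoder::Boolean(b), Decoder::Long(l)) => {
            assert_eq!(b, &[true, false]);
            assert_eq!(l, &[1, -1]);
        }
        _ => unreachable!(),
    }
}
```

   The per-value `match` on the enum is the branching the issue is about: it runs once per field per row, interleaved with the wire parsing.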
   
   Adopting a two-phase approach for Avro would likely mean introducing an explicit intermediate representation (tape-like, or a compiled decode plan) that captures the physical-wire details we need (union tags / branch-index varints, lengths/offsets for bytes/strings, array/map block counts, decoded varints, etc.), followed by a second pass that purely walks those buffers to build arrays and apply promotions. That could plausibly help vectorization and reduce per-field branching, at the cost of extra buffering and/or passes.
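
   A minimal illustration of what such a tape-like intermediate form might look like (all names here are hypothetical, nothing in arrow-avro today; a real plan would also carry nested offsets, block counts, promotion info, etc.):

```rust
// Hypothetical tape entry recording physical-wire facts from phase one,
// before any typed conversion happens.
#[allow(dead_code)]
#[derive(Debug)]
enum TapeEntry {
    Null,
    Bool(bool),
    /// A zig-zag varint already decoded in phase one (Avro `int`/`long`).
    Long(i64),
    /// Byte range into the raw input for `bytes`/`string`/`fixed`.
    Bytes { offset: usize, len: usize },
    /// Union branch index, so phase two can route to the right child.
    UnionBranch(i64),
}

/// Phase two for a single nullable `long` column: a straight-line walk
/// over the tape with no wire-format parsing left to do. A real
/// implementation would append into an Arrow builder instead of a Vec.
fn build_long_column(tape: &[TapeEntry]) -> Vec<Option<i64>> {
    tape.iter()
        .map(|e| match e {
            TapeEntry::Long(v) => Some(*v),
            _ => None, // real code would route other entries elsewhere
        })
        .collect()
}

fn main() {
    let tape = vec![TapeEntry::Long(7), TapeEntry::Null, TapeEntry::Long(-3)];
    assert_eq!(build_long_column(&tape), vec![Some(7), None, Some(-3)]);
}
```

   The point of the split is that phase two is a tight, predictable loop over already-decoded values, which is the part that could plausibly vectorize.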
   
   One caveat on the fixed-width grouping: Avro `int`/`long` are zig-zag 
varints on the wire (and many length/count markers are varint-encoded too), so 
the obvious fixed-width cases are more like `null`/`boolean`, `float`/`double`, 
and `fixed(N)`.
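
   To make that concrete, here's a sketch of zig-zag varint decoding as the Avro spec defines it; each `int`/`long` occupies a variable 1 to 10 bytes on the wire depending on magnitude, which is why these types can't be batch-copied the way `fixed(N)` or `float`/`double` can:

```rust
/// Decode one Avro zig-zag varint from `buf`, returning the value and
/// the number of bytes consumed, or `None` on truncated/malformed input.
fn decode_zigzag_long(buf: &[u8]) -> Option<(i64, usize)> {
    let mut value: u64 = 0;
    let mut shift = 0u32;
    for (i, &byte) in buf.iter().enumerate() {
        // Low 7 bits carry payload; the high bit marks continuation.
        value |= u64::from(byte & 0x7F) << shift;
        if byte & 0x80 == 0 {
            // Undo zig-zag: 0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...
            let decoded = (value >> 1) as i64 ^ -((value & 1) as i64);
            return Some((decoded, i + 1));
        }
        shift += 7;
        if shift >= 64 {
            return None; // malformed: more than 10 continuation bytes
        }
    }
    None // ran out of input mid-varint
}

fn main() {
    // 0 encodes to the single byte 0x00.
    assert_eq!(decode_zigzag_long(&[0x00]), Some((0, 1)));
    // 1 zig-zag-encodes to 2 -> byte 0x02; -1 encodes to 1 -> byte 0x01.
    assert_eq!(decode_zigzag_long(&[0x02]), Some((1, 1)));
    assert_eq!(decode_zigzag_long(&[0x01]), Some((-1, 1)));
    // 64 encodes to 128, which needs two bytes: 0x80, 0x01.
    assert_eq!(decode_zigzag_long(&[0x80, 0x01]), Some((64, 2)));
}
```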
   
   Overall, this is definitely worth looking into imo.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
