HadrienG2 opened a new issue, #5711:
URL: https://github.com/apache/arrow-rs/issues/5711

   **Which part is this question about**
   
   Library API of `arrow_array::builder::NullBuilder`.
   
   **Describe your question**
   
   As far as I can tell, `NullBuilder` is the only builder for which
`capacity == len`. The way this is currently handled is quite surprising. For
example, `NullBuilder::with_capacity(123)` is not empty but starts with a
length of 123, so the final array length is determined not only by what is
pushed but also by the initial builder capacity.
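
   To make the surprise concrete, here is a minimal sketch that models the
behavior described above. `LenOnlyNullBuilder` and its methods are hypothetical
stand-ins written for illustration, not the arrow-rs source; the only
assumption carried over is that the requested capacity becomes the initial
length.

```rust
// Hypothetical stand-in for a length-only null builder. This is NOT the
// arrow-rs implementation, only a model of the behavior described above.
struct LenOnlyNullBuilder {
    len: usize,
}

impl LenOnlyNullBuilder {
    fn new() -> Self {
        Self { len: 0 }
    }

    // Models the surprising part: the requested capacity becomes the length.
    fn with_capacity(capacity: usize) -> Self {
        Self { len: capacity }
    }

    fn append_null(&mut self) {
        self.len += 1;
    }

    // Length of the array that finishing this builder would produce.
    fn len(&self) -> usize {
        self.len
    }
}

fn main() {
    let mut builder = LenOnlyNullBuilder::with_capacity(123);
    builder.append_null();
    builder.append_null();
    // Only two nulls were pushed, yet the modeled length is 123 + 2.
    assert_eq!(builder.len(), 125);

    let mut empty = LenOnlyNullBuilder::new();
    empty.append_null();
    assert_eq!(empty.len(), 1);
}
```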
   
   I understand why this happens: the designer of `NullBuilder` wanted to
store only a length, since that is enough information to encode an "array of
nulls", and managing a separate capacity would be pointless overhead.
   
   But my question is, why was the current design favored over the following 
alternatives?
   
   1. Do not expose capacity setters or getters at all (i.e. no
`NullBuilder::with_capacity()` and no `NullBuilder::capacity()`).
   2. Make `with_capacity()` a no-op and make `capacity()` return the current 
length.
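
   As a rough sketch of what alternative 2 could look like, assuming a
builder that stores only a length: `NoOpCapacityNullBuilder` and its methods
are hypothetical names chosen for illustration, not an existing arrow-rs type.

```rust
// Hypothetical builder illustrating alternative 2; not part of arrow-rs.
struct NoOpCapacityNullBuilder {
    len: usize,
}

impl NoOpCapacityNullBuilder {
    // The capacity hint is accepted for API symmetry but ignored.
    fn with_capacity(_capacity: usize) -> Self {
        Self { len: 0 }
    }

    // Nothing is allocated, so the current length is reported as capacity.
    fn capacity(&self) -> usize {
        self.len
    }

    fn append_nulls(&mut self, n: usize) {
        self.len += n;
    }

    fn len(&self) -> usize {
        self.len
    }
}

fn main() {
    let mut builder = NoOpCapacityNullBuilder::with_capacity(123);
    // The builder starts empty regardless of the capacity hint.
    assert_eq!(builder.len(), 0);

    builder.append_nulls(5);
    // The length reflects exactly what was pushed.
    assert_eq!(builder.len(), 5);
    assert_eq!(builder.capacity(), 5);
}
```

   With this shape, the only oddity left is that `capacity()` can be smaller
than the hint passed to `with_capacity()`, which stays confined to the
capacity side of the API.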
   
   I think these alternatives are better because they move the surprising part
of the API from the length to the capacity, and people are used to capacity
behaving in a somewhat surprising way. For example, it is only mildly
surprising that `Builder::with_capacity(c).capacity()` can return a number
greater than `c`, whereas people who pushed exactly N elements into a builder
would be very upset if it ended up containing more than N elements.
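
   The same expectation can be seen with the standard library's `Vec`, used
here as an analogy rather than as an arrow-rs builder: `Vec::with_capacity` is
documented to reserve at least the requested capacity, while the length only
ever reflects what was pushed. The helper `capacity_and_len` is a hypothetical
name for this sketch.

```rust
// Returns (capacity, len) of a Vec created with `requested` capacity
// after `pushes` elements have been added to it.
fn capacity_and_len(requested: usize, pushes: usize) -> (usize, usize) {
    let mut values: Vec<usize> = Vec::with_capacity(requested);
    values.extend(0..pushes);
    (values.capacity(), values.len())
}

fn main() {
    let (capacity, len) = capacity_and_len(10, 7);
    // Capacity may exceed the request, which is only mildly surprising.
    assert!(capacity >= 10);
    // Length is exactly the number of pushed elements.
    assert_eq!(len, 7);
}
```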
   
   **Additional context**
   
   This particular behavior gets in the way of cleanly integrating 
`NullBuilder` into my prototype for 
https://github.com/apache/arrow-rs/issues/5700 .
   

