alamb opened a new issue, #7821:
URL: https://github.com/apache/arrow-rs/issues/7821

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   - part of https://github.com/apache/arrow-rs/issues/6736
   
   The 
[Variant](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#encoding-types)
 encoding uses different sizes for offsets in nested types to optimize the 
encoding size
   
   Specifically
   * Objects [use 1 to 4 bytes for each field id, depending on the number of 
fields](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#value-header-for-object-basic_type2)
   * Objects [use 1 to 4 bytes for each field offset, depending on the total 
size of the 
children](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#value-header-for-object-basic_type2)
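   For reference, here is a minimal sketch of how the two size fields could be 
read back from the first byte of an encoded object, which is what such tests 
would assert on. It assumes the bit layout described in the linked spec section 
(basic type in the low 2 bits, then `field_offset_size_minus_one`, 
`field_id_size_minus_one`, and `is_large`). The `ObjectHeader` struct and 
`decode_object_header` function are hypothetical names for illustration, not 
part of the `parquet-variant` API.

```rust
/// Size fields decoded from the first byte of a Variant object value.
/// (Hypothetical helper type, used only for illustration.)
#[derive(Debug, PartialEq)]
struct ObjectHeader {
    /// True when the element count is stored in 4 bytes instead of 1.
    is_large: bool,
    /// Width of each field id in bytes (1 to 4).
    field_id_size: u8,
    /// Width of each field offset in bytes (1 to 4).
    field_offset_size: u8,
}

/// Decode the object header, assuming the layout from the spec:
/// bits 0-1 of the byte are the basic type (2 = object) and the
/// remaining 6 bits are the value header.
fn decode_object_header(first_byte: u8) -> Option<ObjectHeader> {
    // Anything other than basic_type 2 is not an object.
    if first_byte & 0b11 != 2 {
        return None;
    }
    let value_header = first_byte >> 2;
    Some(ObjectHeader {
        // Bits 0-1 of the value header: field_offset_size_minus_one.
        field_offset_size: (value_header & 0b11) + 1,
        // Bits 2-3 of the value header: field_id_size_minus_one.
        field_id_size: ((value_header >> 2) & 0b11) + 1,
        // Bit 4 of the value header: is_large.
        is_large: (value_header >> 4) & 1 == 1,
    })
}
```

   A test could build an object, take the first byte of the value buffer, and 
compare the decoded `field_id_size` and `field_offset_size` with the expected 
widths.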
   
   
   **Describe the solution you'd like**
   
   I would like tests that use the `VariantBuilder` API and cover the following 
cases:
   1. `VariantObject` with between 2^8 and 2^16 elements ( 
`field_id_size_minus_1` = 1, 2 byte field ids)
   2. `VariantObject` with between 2^16 and 2^24 elements ( 
`field_id_size_minus_1` = 2, 3 byte field ids)
   3. `VariantObject` with between 2^24 and 2^32 elements ( 
`field_id_size_minus_1` = 3, 4 byte field ids)
   
   4. `VariantObject` with total child data length between 2^8 and 2^16 
bytes ( `field_offset_size_minus_1` = 1, 2 byte field offsets)
   5. `VariantObject` with total child data length between 2^16 and 2^24 
bytes ( `field_offset_size_minus_1` = 2, 3 byte field offsets)
   6. `VariantObject` with total child data length between 2^24 and 2^32 
bytes ( `field_offset_size_minus_1` = 3, 4 byte field offsets)
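   The boundaries in the cases above can be sketched as a small helper that 
maps a largest value (a field id or a byte offset) to the narrowest width that 
can hold it. This assumes the builder always picks the smallest sufficient 
width, which the tests are meant to confirm. `bytes_needed` and `size_minus_1` 
are hypothetical names, not functions from the crate.

```rust
/// Smallest number of bytes (1 to 4) that can represent `max_value`,
/// or `None` if it does not fit in 4 bytes.
/// (Hypothetical helper, used only to illustrate the thresholds.)
fn bytes_needed(max_value: u64) -> Option<u8> {
    match max_value {
        0..=0xFF => Some(1),                // below 2^8
        0x100..=0xFFFF => Some(2),          // 2^8 up to 2^16 - 1
        0x1_0000..=0xFF_FFFF => Some(3),    // 2^16 up to 2^24 - 1
        0x100_0000..=0xFFFF_FFFF => Some(4), // 2^24 up to 2^32 - 1
        _ => None,
    }
}

/// The value a test would expect in `field_id_size_minus_1` or
/// `field_offset_size_minus_1` for a given largest id or offset.
fn size_minus_1(max_value: u64) -> Option<u8> {
    bytes_needed(max_value).map(|n| n - 1)
}
```

   Values right at each boundary (for example 255 and 256) are the most useful 
inputs, since that is where the expected width changes.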
   
   The required "total child data length" can be reached by adding some large 
strings as children (for example, by adding 1KB - 1MB `Variant::String`s via 
`ArrayBuilder::append`)
   
   
   **Describe alternatives you've considered**
   
   **Additional context**
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
