CookiePieWw opened a new issue, #7593:
URL: https://github.com/apache/arrow-rs/issues/7593

   **Describe the bug**
   
   As found in
https://github.com/apache/arrow-rs/pull/7574#discussion_r2119243590, the row
group statistics report an empty string as the min value even though the row
group contains no empty strings.
   
   **To Reproduce**
   Check out the unit test at
https://github.com/CookiePieWw/arrow-rs/blob/aee3a6f50f657e63cdb34ee0ad19c48f49821ea3/parquet/tests/arrow_reader/statistics.rs#L2031
   
   The row groups are:
   ```rust
   make_utf8_batch(vec![Some("a"), Some("b"), Some("c"), Some("d"), Some("e")]),
   make_utf8_batch(vec![Some("f"), None, Some("g"), Some("h"), Some("i")]),
   ```
   
   The test should pass, but it fails because the reported minimum for the second row group is an empty string:
   ```
   assertion `left == right` failed: utf8: Mismatch with expected data page minimums
     left: StringArray
   [
     "a",
     "",
   ]
    right: StringArray
   [
     "a",
     "f",
   ]
   ```
   
   **Expected behavior**
   The test mentioned above should pass: the reported minimums should be `"a"` and `"f"`, with the null in the second row group ignored.
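
   For reference, the expected minimums can be derived by taking the smallest non-null value of each batch. Below is a minimal sketch using only the standard library. `expected_min` is a hypothetical helper written for illustration; it is not an arrow-rs API and says nothing about where the writer or reader goes wrong.

   ```rust
   /// Hypothetical helper (not part of arrow-rs): the minimum of a nullable
   /// string column, skipping nulls. This mirrors what the test expects the
   /// statistics to report for each batch.
   fn expected_min<'a>(values: &[Option<&'a str>]) -> Option<&'a str> {
       // `flatten` drops the `None` entries, so a null is never compared
       // as if it were the empty string "".
       values.iter().copied().flatten().min()
   }
   ```

   With the second batch from the reproduction, `expected_min(&[Some("f"), None, Some("g"), Some("h"), Some("i")])` yields `Some("f")`, not `Some("")`.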
   
   **Additional context**

