jorisvandenbossche commented on code in PR #37467: URL: https://github.com/apache/arrow/pull/37467#discussion_r1310391926
########## cpp/src/arrow/array/validate.cc: ########## @@ -713,8 +713,10 @@ struct ValidateArrayImpl { } // An empty list array can have 0 offsets - const auto required_offsets = (data.length > 0) ? data.length + data.offset + 1 : 0; const auto offsets_byte_size = data.buffers[1]->size(); + const auto required_offsets = ((data.length > 0) || (offsets_byte_size > 0)) + ? data.length + data.offset + 1 + : 0; if (offsets_byte_size / static_cast<int32_t>(sizeof(offset_type)) < required_offsets) { return Status::Invalid("Offsets buffer size (bytes): ", offsets_byte_size, Review Comment: @pitrou do you know if this "empty list array can have 0 offsets" also applies to other types that have offsets, i.e. string types? In the current version of the PR, I am relying on ValidateFull to catch the error of creating a length-0 LargeString array with an offsets buffer of size 4, while it should be 8. At the moment the offsets buffer size is not checked if the length of the array is 0, but I changed this to actually check it if the offset buffer has _some_ length (so assuming that _if_ there is an offsets buffer with a non-zero legth, it should have a correct length, also if the array itself has length 0) But if it's difficult to make guarantees about this / to change ValidateFull, I can also write a dedicated test for this specific case that doesn't rely on calling ValidateFull. ########## cpp/src/arrow/array/validate.cc: ########## @@ -713,8 +713,10 @@ struct ValidateArrayImpl { } // An empty list array can have 0 offsets - const auto required_offsets = (data.length > 0) ? data.length + data.offset + 1 : 0; const auto offsets_byte_size = data.buffers[1]->size(); + const auto required_offsets = ((data.length > 0) || (offsets_byte_size > 0)) + ? data.length + data.offset + 1 + : 0; if (offsets_byte_size / static_cast<int32_t>(sizeof(offset_type)) < required_offsets) { return Status::Invalid("Offsets buffer size (bytes): ", offsets_byte_size, Review Comment: @pitrou do you know if this "empty list array can have 0 offsets" also applies to other types that have offsets, i.e. string types? In the current version of the PR, I am relying on ValidateFull to catch the error of creating a length-0 LargeString array with an offsets buffer of size 4, while it should be 8. At the moment the offsets buffer size is not checked if the length of the array is 0, but I changed this to actually check it if the offset buffer has _some_ length (so assuming that _if_ there is an offsets buffer with a non-zero length, it should have a correct length, also if the array itself has length 0) But if it's difficult to make guarantees about this / to change ValidateFull, I can also write a dedicated test for this specific case that doesn't rely on calling ValidateFull. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org