chriscasola commented on code in PR #13381:
URL: https://github.com/apache/arrow/pull/13381#discussion_r897288469


##########
go/arrow/array/string.go:
##########
@@ -113,6 +113,12 @@ func (a *String) setData(data *Data) {
        if offsets := data.buffers[1]; offsets != nil {
                a.offsets = arrow.Int32Traits.CastFromBytes(offsets.Bytes())
        }
+
+       expNumOffsets := a.array.data.offset + a.array.data.length + 1
+       if a.array.data.length > 0 &&
+               (len(a.offsets) < expNumOffsets || 
int(a.offsets[expNumOffsets-2]) > len(a.values)) {
+               panic("arrow/array: string offsets out of bounds of data 
buffer")
+       }

Review Comment:
   The length check is needed because if length is 0 it appears that lots of 
things allow `len(a.offsets)` to also be `0`. The offsets seem to be ignored in 
that case.
   
   I think `len(a.offsets) < expNumOffsets` is correct. If there aren't at 
least `expNumOffsets` we should panic. If there are extra offsets we can't 
panic because that can happen in slice scenarios like `array.NewSliceData` 
(unit tests seem to confirm this).
   
   `a.offsets[expNumOffsets-2]` is where the last value in the array starts and 
it must be less than `len(a.values)`. I'll change the `>` to a `>=`. The final 
byte, indicated by `a.offsets[expNumOffsets-1]` seems to be allowed to overflow 
and be gracefully handled, based on [this 
test](https://github.com/apache/arrow/blob/master/go/arrow/array/concat_test.go#L293),
 which is why I am not checking it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to