Most of these issues seem to originate from creating arrays from ArrayData. While we could validate buffers in the XArray::from implementations, that would have some performance overhead and also some edge cases for nested data. Thinking about List<List<Utf8>>>, we recursively would need to validate and know which buffers are used for offsets or utf8 data. Since the root cause is always ArrayData::new (and possibly ArrayDataBuilder::finish) I would suggest marking those functions as unsafe.
Adding checks for the top-level buffer length would still be useful, I'm just a bit opposed to having to iterate over whole buffers for validation, turning these operations from O(1) to O(n). Slightly related, GenericListArray::try_new_from_array_data currently validates that the first offset is 0. I think this check could actually be relaxed and the offset only needs to be in bounds for the child array. Jörn On Wed, Sep 29, 2021 at 9:12 PM Andrew Lamb <al...@influxdata.com> wrote: > Dear Rust Developers, > > As a heads up, several pre-existing security tickets filed against arrow-rs > have been added[1][2] to the RUSTSEC database[1][2] a few hours ago. The > author says he plans to file additional RUSTSEC entries for the other > security tickets [3]. > > The criteria used for adding the arrow issues to the RUSTSEC database is > not clear to me, but given widely used tools such as `cargo audit` report > such issues, it is likely that this will become an visible issue for our > users soon. > > Given this, I will likely start looking into existing security issues [4] > reported against arrow-rs and any help would be appreciated. > > Andrew > > [1] https://github.com/rustsec/advisory-db/pull/1057 > [2] https://github.com/rustsec/advisory-db/pull/1059 > [3] > https://github.com/rustsec/advisory-db/pull/1057#issuecomment-930455127 > [4] https://github.com/apache/arrow-rs/labels/security > -- *Jörn Horstmann* | Senior Backend Engineer www.signavio.com Kurfürstenstraße 111, 10787 Berlin, Germany Work with us! <https://hubs.ly/H0wwzcr0> <https://www.linkedin.com/company/signavio/> <https://www.twitter.com/signavio> <https://www.facebook.com/signavio> <https://www.youtube.com/user/signavio> <https://www.xing.com/companies/signaviogmbh> [image: BPI Forum] <https://t-eu.xink.io/Tracking/Index/sSMAALBtAAAnu0MA0> HRB 121584 B Charlottenburg District Court, VAT ID: DE265675123 Managing Directors: Dr. Gero Decker, Rouven Morato Adam