Re: [Rust] Heads up: RUSTSEC security advisory against arrow-rs

2021-09-30 Thread Andrew Lamb
I have created a WIP PR for initial feedback on the approach of validating ArrayData upon creation[1]. If there are no objections to the approach I will complete the implementation over the next few days. The approach that Sergey describes of `get` and `unsafe get_unchecked` sounds like a good one

Re: [Rust] Heads up: RUSTSEC security advisory against arrow-rs

2021-09-30 Thread Sergey Davidoff
I believe feature flags are not the right choice here. The problem is that feature flags are toggled globally, so the behavior of your crate can be affected by the behavior of some other crate that toggles the feature. A better approach is to provide two methods, one safe and the other unsafe, see
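A minimal sketch of the two-method pattern described above, assuming a hypothetical `Int32Values` wrapper that is not part of arrow-rs. The safe `get` always performs the bounds check, while `get_unchecked` is marked `unsafe`, so skipping the check is an explicit choice at each call site rather than a globally toggled feature:

```rust
/// Hypothetical array wrapper, used only for illustration (not an arrow-rs type).
pub struct Int32Values {
    values: Vec<i32>,
}

impl Int32Values {
    pub fn new(values: Vec<i32>) -> Self {
        Self { values }
    }

    pub fn len(&self) -> usize {
        self.values.len()
    }

    /// Safe accessor: bounds-checked, returns `None` when `i` is out of range.
    pub fn get(&self, i: usize) -> Option<i32> {
        self.values.get(i).copied()
    }

    /// Unsafe accessor: skips the bounds check.
    ///
    /// # Safety
    /// The caller must guarantee that `i < self.len()`.
    pub unsafe fn get_unchecked(&self, i: usize) -> i32 {
        // Delegates to the standard library's unchecked slice access.
        unsafe { *self.values.get_unchecked(i) }
    }
}
```

With this shape, a dependency cannot change the behavior of `get` for other crates; only code that writes `unsafe` opts out of the check.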

Re: [Rust] Heads up: RUSTSEC security advisory against arrow-rs

2021-09-30 Thread Jacques Nadeau
In the past I was dealing with something similar. My experience was when data was accepted at the edge, the cost of validating that the first offset is zero, the last is within the data bounds and that all others are equal or increasing was a reasonable overhead associated with validating offsets f
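The three offset checks listed above (first offset is zero, offsets never decrease, last offset is within the data bounds) can be sketched as a single linear pass. The function name `validate_offsets` and the `OffsetError` type are hypothetical and chosen for illustration; this is not the arrow-rs implementation:

```rust
/// Hypothetical error type for illustration only.
#[derive(Debug, PartialEq)]
pub enum OffsetError {
    Empty,
    FirstNotZero(i32),
    Decreasing { index: usize },
    OutOfBounds { last: i32, data_len: usize },
}

/// Checks an offsets buffer against a values buffer of length `data_len`.
pub fn validate_offsets(offsets: &[i32], data_len: usize) -> Result<(), OffsetError> {
    // Check 1: the first offset must be zero.
    let first = *offsets.first().ok_or(OffsetError::Empty)?;
    if first != 0 {
        return Err(OffsetError::FirstNotZero(first));
    }

    // Check 2: every offset must be equal to or greater than the previous one.
    for (i, pair) in offsets.windows(2).enumerate() {
        if pair[1] < pair[0] {
            return Err(OffsetError::Decreasing { index: i + 1 });
        }
    }

    // Check 3: the last offset must not point past the end of the data.
    // It is non-negative here because the sequence starts at 0 and never decreases.
    let last = offsets[offsets.len() - 1];
    if last as usize > data_len {
        return Err(OffsetError::OutOfBounds { last, data_len });
    }

    Ok(())
}
```

The cost is one comparison per offset, which is consistent with the point that validating at the edge is a modest overhead.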

Re: [Rust] Heads up: RUSTSEC security advisory against arrow-rs

2021-09-30 Thread Andrew Lamb
I understand the need to avoid sacrificing performance as much as possible. I have begun looking into adding validation into ArrayData::new as you suggest. I am making progress, but haven't fully figured out the nested types yet. Hope to have a PR up in the next day or two. Should we come up with

Re: [DISCUSS][Rust] Biweekly sync call for arrow/datafusion again?

2021-09-30 Thread Jiayu Liu
Thanks Andrew for facilitating this meeting and very happy to "meet" everyone on the call. Hope you have a great day / evening. On Fri, Oct 1, 2021 at 12:38 AM Andrew Lamb wrote: > Notes from the 16:00 UTC Call: > > Attendees: > Andrew Lamb > Shen Yi Jie > Matt Turner > Zied BF > Remi Dettai > R

Re: [DISCUSS][Rust] Biweekly sync call for arrow/datafusion again?

2021-09-30 Thread Andrew Lamb
Notes from the 16:00 UTC Call: Attendees: Andrew Lamb Shen Yi Jie Matt Turner Zied BF Remi Dettai Rich Ruihang Jiayu Liu QP Jorn Horstmann Benson Muite Introductions (20 minutes) Discussion Items (10 minutes): * Interest in python binding, though it is lagging behind * Thoughts on boundaries bet

Re: [C++] Decimal arithmetic edge cases

2021-09-30 Thread Keith Kraus
For another point of reference, here's Microsoft's docs for SQL Server on resulting precision and scale for different operators including its overflow rules: https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-ver15 -Keith On Thu, Sep 30,

[C++] Decimal arithmetic edge cases

2021-09-30 Thread David Li
Hello all, While looking at decimal arithmetic kernels in ARROW-13130, the question of what to do about overflow came up. Currently, our rules are based on Redshift [1], except we raise an error if we exceed the maximum precision (Redshift's docs imply it saturates instead). Hence, we can al

Re: [Rust] Heads up: RUSTSEC security advisory against arrow-rs

2021-09-30 Thread Jörn Horstmann
Most of these issues seem to originate from creating arrays from ArrayData. While we could validate buffers in the XArray::from implementations, that would have some performance overhead and also some edge cases for nested data. Thinking about List>>, we recursively would need to validate and know

Re: [DISCUSS] Deprecate user@ in favor for github issues/discussions

2021-09-30 Thread Nic
I'm +1 for GH issues due to it lowering the barrier for participation. As someone who is sometimes a bit nervous about interacting with new open source projects/communities, adding a GH Issue is fairly familiar and feels inconsequential, whereas emailing everyone on a mailing list is intimidating.

Re: [DISCUSS] Deprecate user@ in favor for github issues/discussions

2021-09-30 Thread Jarek Potiuk
Just a comment on discussions: They already have answered/unanswered filters and they have most of the same properties that "stack overflow" questions have. You do not need to "track" discussions. It's great to answer and react quickly and if you have more discussions all the community might get m