Re: Need help on ArrayaSpan and writing C++ udf

2023-07-17 Thread Jin Shang
Hi Wenbo, Sorry I wasn't clear enough. There are two issues with the code snippet. As Aldrin pointed out, you are working with uint8_t*, which I don't think is right. The buffer stores bytes, not pointers. auto *out_values = out->array_span_mutable()->GetValues<uint8_t*>(1); > It should be GetValues<uint8_t>(1), n

Re: Need help on ArrayaSpan and writing C++ udf

2023-07-17 Thread Weston Pace
> I may be missing something, but why copy to *out_values++ instead of > *out_values and add 32 to out_values afterwards? Otherwise I agree this is > the way to go. I agree with Jin. You should probably be incrementing `out` by 32 each time `VisitValue` is called.
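To make the "advance by 32 per value" advice concrete, here is a minimal sketch in plain C++ with no Arrow dependency. `kDigestWidth`, `FakeDigest`, and `WriteDigests` are hypothetical names introduced only for illustration, and `FakeDigest` is a placeholder fill, not SHA-256. The point is only the pointer arithmetic: each value writes a full 32-byte slot, and the output pointer then moves forward by the slot width rather than by one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed fixed output width: 32 bytes, as for a SHA-256 digest.
constexpr std::size_t kDigestWidth = 32;

// Hypothetical stand-in for a real hash: fills the 32-byte slot with `seed`.
void FakeDigest(std::uint8_t seed, std::uint8_t* dst) {
  std::memset(dst, seed, kDigestWidth);
}

// Writes one 32-byte slot per input into a contiguous byte buffer,
// which is how a fixed-width output column is laid out.
std::vector<std::uint8_t> WriteDigests(const std::vector<std::uint8_t>& seeds) {
  std::vector<std::uint8_t> out(seeds.size() * kDigestWidth);
  std::uint8_t* out_values = out.data();
  for (std::uint8_t seed : seeds) {
    FakeDigest(seed, out_values);  // write the whole slot
    out_values += kDigestWidth;    // then step to the next slot
  }
  // The cursor should land exactly at the end of the buffer.
  assert(out_values == out.data() + out.size());
  return out;
}
```

Writing through `*out_values++` would instead advance one byte at a time, so consecutive results would overlap.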

Re: Need help on ArrayaSpan and writing C++ udf

2023-07-17 Thread Aldrin
Oh wait, I see now that you're incrementing with a uint8_t*. That could be fine for your own use, but you might want to make sure it aligns with the type of your output (Int64Array vs Int32Array).

Re: Need help on ArrayaSpan and writing C++ udf

2023-07-17 Thread Aldrin
Hi Wenbo, An ArraySpan is like an ArrayData but does not own the data, so the ColumnarFormat doc that Jon shared is relevant for both. In the case of a binary format, the output ArraySpan must have at least 2 buffers: the offsets and the contiguous binary data (values). If the output of your UDF i
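As a rough illustration of the offsets-plus-values layout mentioned above, the following sketch models a variable-length binary column in plain C++. It does not use Arrow; `BinaryColumn`, `BuildBinaryColumn`, and `BinaryValueAt` are hypothetical names, and 32-bit offsets are an assumption made for the example. Value `i` occupies the byte range from `offsets[i]` up to, but not including, `offsets[i + 1]`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a variable-length binary column (not an Arrow type).
struct BinaryColumn {
  std::vector<std::int32_t> offsets;  // length = number of values + 1
  std::string data;                   // all values concatenated
};

// Concatenates the values and records where each one starts and ends.
BinaryColumn BuildBinaryColumn(const std::vector<std::string>& values) {
  BinaryColumn col;
  col.offsets.push_back(0);
  for (const std::string& v : values) {
    col.data += v;
    col.offsets.push_back(static_cast<std::int32_t>(col.data.size()));
  }
  return col;
}

// Returns a copy of value i by slicing the data between two offsets.
std::string BinaryValueAt(const BinaryColumn& col, std::size_t i) {
  assert(i + 1 < col.offsets.size());
  const std::int32_t begin = col.offsets[i];
  const std::int32_t end = col.offsets[i + 1];
  return col.data.substr(static_cast<std::size_t>(begin),
                         static_cast<std::size_t>(end - begin));
}
```

An empty value simply repeats the previous offset, so it costs nothing in the data buffer.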

Re: Need help on ArrayaSpan and writing C++ udf

2023-07-17 Thread Wenbo Hu
Hi Jin, > but why copy to *out_values++ instead of > *out_values and add 32 to out_values afterwards? I'm implementing the sha256 function as a scalar function, but it always receives an array as input, so in the visitor pattern I write a 32-byte hash to the pointer and move to the next for next v

Re: Need help on ArrayaSpan and writing C++ udf

2023-07-17 Thread Jin Shang
Hi Wenbo, > I'd like to know what the *three* `buffers` in ArraySpan are. What does > `1` mean when `GetValues` is called? The meaning of buffers in an ArraySpan depends on the layout of its data type. FixedSizeBinary is a fixed-size primitive type, so it has two buffers, one validity buffer and on
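The two-buffer layout described above can be sketched without Arrow as follows. `FixedWidthColumn`, `IsValid`, and `ValueAt` are hypothetical names for illustration only. Buffer 0 is modelled as a validity bitmap using least-significant-bit numbering, as in the Arrow columnar format, and buffer 1 as the contiguous value bytes, which is why index `1` is the one that yields the values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a fixed-width column (not an Arrow type).
struct FixedWidthColumn {
  std::vector<std::uint8_t> validity;  // "buffer 0": one bit per value
  std::vector<std::uint8_t> data;      // "buffer 1": byte_width bytes per value
  std::size_t byte_width;
};

// Bit i lives in byte i / 8 at bit position i % 8 (LSB numbering).
bool IsValid(const FixedWidthColumn& col, std::size_t i) {
  assert(i / 8 < col.validity.size());
  return ((col.validity[i / 8] >> (i % 8)) & 1u) != 0;
}

// Value i starts i * byte_width bytes into the data buffer.
const std::uint8_t* ValueAt(const FixedWidthColumn& col, std::size_t i) {
  assert((i + 1) * col.byte_width <= col.data.size());
  return col.data.data() + i * col.byte_width;
}
```

For a 32-byte FixedSizeBinary column, `byte_width` would be 32 and each value would occupy one 32-byte slot of the data buffer.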

Need help on ArrayaSpan and writing C++ udf

2023-07-17 Thread Wenbo Hu
Hi, I'm using Acero as the stream executor to run large-scale data transformations. The core data type used in a UDF is the `ArraySpan` in `ExecSpan`, but there is not much documentation on ArraySpan. I'd like to know what the *three* `buffers` in ArraySpan are. What does `1` mean when `GetValues` is called? For in