Hi Matt,

Ah okay that explains so much. It’s very clear in hindsight.

I think the CGO usage might be a bit out of scope for my current project and
would require a lot of learning for me at the moment, so I will hold off on
trying that out for now.

Thanks a lot for your very detailed response. I’m very excited to see the
future of the Go Arrow library in the coming weeks and months.

I would potentially be interested in contributing to the project.
So far my only ideas that might make sense are a function like the one I
described before, “AppendStructs”, and potentially a function for reading
Arrow structs into Go (mostly usability and ergonomic additions that I
have half built out of necessity).

I am a fairly amateur Go developer but would be interested in learning more.
Of course, I understand if you feel the project might be unsuitable, and I
would then remain a happy consumer :)

Let me know if it might make sense to open a PR or an issue to discuss the
above, or if there are any other low-hanging issues that would make good
first items.

Cheers,
Gus

On Mon, 17 Apr 2023 at 3:35 pm, Matthew Topol via user <
[email protected]> wrote:

> Hi Gus!
>
> Unfortunately at the moment, I haven't yet implemented the full expression
> evaluation in the Go compute library. So while you can use the library at
> first glance to define expressions, you can't execute them just yet. I've
> ended up taking a different direction here and instead I've been working on
> implementing execution using Substrait[1] expressions intending to replace
> and deprecate the Expression types that currently exist in the Go compute
> library in favor of the ones implemented in the substrait-go repo[2].
>
> > Based on my limited understanding of this and reviewing the C++
> > documentation it seems I should pass this expression into a Filter node as
> > an argument of some sort.
>
> This is part of the reason why I'm going directly to substrait for compute
> definitions. Rather than trying to replicate the full execution framework
> that Acero defines in C++, my plan is to allow executing Substrait plans as
> they exist so that it isn't necessary to create nodes and pipelines for
> compute in Go. (At least not yet).
>
> In the meantime, if you are able to use CGO, you could theoretically
> serialize the existing Expressions in Go and pass them using the C-Data API
> to the C++ libacero library for execution, then use the C-Data API to bring
> the results back into Go without having to copy the data (by passing the
> pointers around). Sorry that this isn't more helpful. I promise this work
> *is* coming, and I will most likely have initial PRs of some basic
> functionality up within the next few weeks.
>
> Take care!
> --Matt
>
> [1]: https://substrait.io
> [2]: https://pkg.go.dev/github.com/substrait-io/[email protected]/expr
>
> On Fri, Apr 14, 2023 at 8:47 AM Gus Minto-Cowcher <[email protected]>
> wrote:
>
>> Hi,
>>
>> I am using the Golang implementation and am looking to do some basic data
>> processing on arrow arrays read from a parquet file. I have been looking at
>> using this package
>> <https://pkg.go.dev/github.com/apache/arrow/go/[email protected]/arrow/compute>.
>>
>> While I have figured out how to do a basic filter using CallFunction:
>>
>> fb := array.NewFloat64Builder(memory.DefaultAllocator)
>> fb.AppendValues([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)
>> arr := fb.NewFloat64Array() // named arr so it does not shadow the array package
>>
>> // SetExecCtx returns a new context that carries the ExecCtx, so keep it.
>> ctx := compute.SetExecCtx(context.TODO(), compute.DefaultExecCtx())
>>
>> // Build a boolean mask of the elements that are >= 3.
>> out, err := compute.CallFunction(ctx, "greater_equal", nil,
>>     compute.NewDatum(arr), compute.NewDatum(3))
>> if err != nil {
>>     log.Fatal(err)
>> }
>> defer out.Release()
>> filter := out.(*compute.ArrayDatum).MakeArray()
>>
>> // Keep only the elements selected by the mask.
>> result, err := compute.FilterArray(ctx, arr, filter,
>>     *compute.DefaultFilterOptions())
>> if err != nil {
>>     log.Fatal(err)
>> }
>>
>> I was looking at attempting to use the Expressions built into the
>> compute library, as this appears at first glance to be a much more idiomatic
>> way of using the compute library, i.e. something like:
>>
>> expr := compute.GreaterEqual(compute.NewRef(compute.FieldRefIndex(0)),
>>     compute.NewLiteral(3))
>>
>> However, I cannot figure out how to actually execute the expression.
>> Based on my limited understanding of this and reviewing the C++
>> documentation it seems I should pass this expression into a Filter node as
>> an argument of some sort. But basically at this stage where I am actually
>> trying to execute an expression on data I am lost.
>>
>> I would really appreciate any input/examples/pointers people might have :)
>>
>> Thanks in advance,
>> Gus
>>
>
