Hi Matt,

Ah okay, that explains so much. It’s very clear in hindsight.

I think the CGO usage might be a bit out of scope for my current project and would require a lot of learning for me at the moment, so I will hold off on trying that out for now. Thanks a lot for your very detailed response. I’m very excited to see the future of the Go Arrow library in the coming weeks and months.

I would potentially be interested in trying to contribute to the project. So far, my only ideas that might make sense are a function like the one I described before, “AppendStructs”, and potentially a function for reading Arrow structs into Go (mostly usability and ergonomic additions that I have half made out of necessity). I am a fairly amateur Go developer but would be interested in learning more. Of course, I understand if you feel the project might be unsuitable, and I would then remain a happy consumer :)

Let me know if it might make sense to open a PR or an issue to discuss the above, or if there are any other low-hanging issues that would make good first items.

Cheers,
Gus

On Mon, 17 Apr 2023 at 3:35 pm, Matthew Topol via user <[email protected]> wrote:

> Hi Gus!
>
> Unfortunately at the moment, I haven't yet implemented the full expression
> evaluation in the Go compute library. So while you can use the library at
> first glance to define expressions, you can't execute them just yet. I've
> ended up taking a different direction here and instead I've been working on
> implementing execution using Substrait[1] expressions intending to replace
> and deprecate the Expression types that currently exist in the Go compute
> library in favor of the ones implemented in the substrait-go repo[2].
>
> > Based on my limited understanding of this and reviewing the C++
> > documentation it seems I should pass this expression into a Filter node as
> > an argument of some sort.
>
> This is part of the reason why I'm going directly to substrait for compute
> definitions.
> Rather than trying to replicate the full execution framework
> that Acero defines in C++, my plan is to allow executing Substrait plans as
> they exist so that it isn't necessary to create nodes and pipelines for
> compute in Go. (At least not yet).
>
> In the meantime, if you are able to use CGO, you could theoretically
> serialize the existing Expressions in Go and pass them using the C-Data API
> to the C++ libacero library for execution, then use the C-Data API to bring
> the results back into Go without having to copy the data (by passing the
> pointers around). Sorry that this isn't more helpful. I promise this work
> *is* coming, and will most likely have initial PRs of some basic
> functionality within the next few weeks.
>
> Take care!
> --Matt
>
> [1]: https://substrait.io
> [2]: https://pkg.go.dev/github.com/substrait-io/[email protected]/expr
>
> On Fri, Apr 14, 2023 at 8:47 AM Gus Minto-Cowcher <[email protected]>
> wrote:
>
>> Hi,
>>
>> I am using the Golang implementation and am looking to do some basic data
>> processing on arrow arrays read from a parquet file. I have been looking at
>> using this package
>> <https://pkg.go.dev/github.com/apache/arrow/go/[email protected]/arrow/compute>
>> .
>>
>> While I have figured out how to do a basic filter using CallFunction:
>>
>> fb := array.NewFloat64Builder(memory.DefaultAllocator)
>> fb.AppendValues([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)
>> array := fb.NewFloat64Array()
>>
>> c := compute.DefaultExecCtx()
>> ctx := context.TODO()
>> compute.SetExecCtx(ctx, c)
>>
>> out, err := compute.CallFunction(ctx, "greater_equal", nil,
>>     compute.NewDatum(array), compute.NewDatum(3))
>> if err != nil {
>>     log.Fatal(err)
>> }
>> defer out.Release()
>> filter := out.(*compute.ArrayDatum).MakeArray()
>> result, err := compute.FilterArray(ctx, array, filter,
>>     *compute.DefaultFilterOptions())
>> if err != nil {
>>     log.Fatal(err)
>> }
>>
>> I was looking at attempting to use the Expressions built within the
>> compute library as this appears at first glance to be a much more idiomatic
>> way of using the compute library. IE. something like:
>>
>> expr := compute.GreaterEqual(compute.NewRef(compute.FieldRefIndex(0)),
>>     compute.NewLiteral(3))
>>
>> However, I cannot figure out how to actually execute the expression.
>> Based on my limited understanding of this and reviewing the C++
>> documentation it seems I should pass this expression into a Filter node as
>> an argument of some sort. But basically at this stage where I am actually
>> trying to execute an expression on data I am lost.
>>
>> I would really appreciate any input/examples/pointers people might have :)
>>
>> Thanks in advance,
>> Gus
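
For reference, here is a rough sketch of the reflection half of the “AppendStructs” idea mentioned at the top of this message. It is only an illustration using the Go standard library: the `reading` type and the `structsToColumns` helper are hypothetical names and are not part of the Arrow Go library. It pivots a slice of structs into one column per exported field, and the step of handing each column to an Arrow builder is deliberately left out.

```go
package main

import (
	"fmt"
	"reflect"
)

// reading is a hypothetical row type used only for this illustration.
type reading struct {
	Sensor string
	Value  float64
}

// structsToColumns is a hypothetical helper (not an Arrow API). It turns a
// slice of structs into one slice per exported field, keyed by field name.
// An "AppendStructs"-style function could then append each column to the
// matching field builder; that part is omitted here.
func structsToColumns(rows any) (map[string][]any, error) {
	v := reflect.ValueOf(rows)
	if v.Kind() != reflect.Slice || v.Type().Elem().Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a slice of structs, got %T", rows)
	}
	t := v.Type().Elem()
	cols := make(map[string][]any, t.NumField())
	for f := 0; f < t.NumField(); f++ {
		field := t.Field(f)
		if !field.IsExported() {
			continue // unexported fields cannot be read through Interface()
		}
		col := make([]any, v.Len())
		for r := 0; r < v.Len(); r++ {
			col[r] = v.Index(r).Field(f).Interface()
		}
		cols[field.Name] = col
	}
	return cols, nil
}

func main() {
	cols, err := structsToColumns([]reading{{"a", 1.5}, {"b", 2.5}})
	if err != nil {
		panic(err)
	}
	fmt.Println(cols["Sensor"], cols["Value"]) // prints: [a b] [1.5 2.5]
}
```

Going column by column matches how Arrow stores data, so a real implementation would probably switch on each field's type and call the typed builder for that column rather than collecting `[]any` values.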
