Hi,
I am using the Go implementation of Arrow and am looking to do some basic
data processing on Arrow arrays read from a Parquet file. I have been
looking at using this package
<https://pkg.go.dev/github.com/apache/arrow/go/[email protected]/arrow/compute>.
I have figured out how to do a basic filter using CallFunction:
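
    // Assumed imports: context, log, and the array, compute and memory
    // packages from the Arrow Go module.
    fb := array.NewFloat64Builder(memory.DefaultAllocator)
    defer fb.Release()
    fb.AppendValues([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, nil)
    arr := fb.NewFloat64Array() // named arr so it does not shadow package array
    defer arr.Release()

    // SetExecCtx returns the derived context, so keep the result.
    ctx := compute.SetExecCtx(context.TODO(), compute.DefaultExecCtx())

    // Build a boolean mask: arr >= 3.
    out, err := compute.CallFunction(ctx, "greater_equal", nil,
        compute.NewDatum(arr),
        compute.NewDatum(3))
    if err != nil {
        log.Fatal(err)
    }
    defer out.Release()

    filter := out.(*compute.ArrayDatum).MakeArray()
    defer filter.Release()

    // Keep only the values where the mask is true.
    result, err := compute.FilterArray(ctx, arr, filter,
        *compute.DefaultFilterOptions())
    if err != nil {
        log.Fatal(err)
    }
    defer result.Release()
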
I would now like to use the expressions built into the compute library, as
this appears at first glance to be a much more idiomatic way of using it,
i.e. something like:

    expr := compute.GreaterEqual(
        compute.NewRef(compute.FieldRefIndex(0)),
        compute.NewLiteral(3))

However, I cannot figure out how to actually execute the expression. Based
on my limited understanding and on reviewing the C++ documentation, it
seems I should pass this expression into a Filter node as an argument of
some sort. At the point where I actually need to execute an expression
against data, though, I am lost.
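
To make the goal concrete, the behaviour I am after can be sketched without
Arrow at all. This is only an illustration in plain Go: the Expr,
GreaterEqual and Filter names below are hypothetical stand-ins, not the
compute package's API, and I am assuming "executing" an expression means
evaluating it to a boolean mask that is then used as a filter.

```go
package main

import "fmt"

// Expr is a hypothetical stand-in for an expression node: it is evaluated
// against a batch of float64 columns and yields one boolean per row.
type Expr interface {
	Eval(cols [][]float64) []bool
}

// GreaterEqual is a hypothetical node meaning "column Field >= Value".
type GreaterEqual struct {
	Field int     // column index, playing the role of FieldRefIndex(0)
	Value float64 // literal to compare against
}

// Eval builds the boolean mask for the referenced column.
func (g GreaterEqual) Eval(cols [][]float64) []bool {
	mask := make([]bool, len(cols[g.Field]))
	for i, v := range cols[g.Field] {
		mask[i] = v >= g.Value
	}
	return mask
}

// Filter keeps the values whose mask entry is true.
func Filter(values []float64, mask []bool) []float64 {
	out := make([]float64, 0, len(values))
	for i, v := range values {
		if mask[i] {
			out = append(out, v)
		}
	}
	return out
}

func main() {
	cols := [][]float64{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}}
	var expr Expr = GreaterEqual{Field: 0, Value: 3}
	fmt.Println(Filter(cols[0], expr.Eval(cols))) // prints [3 4 5 6 7 8 9 10]
}
```

What I am missing is the Arrow equivalent of the Eval step: something that
takes the compute expression plus a record batch or array and produces the
mask (or the filtered data) directly.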
I would really appreciate any input/examples/pointers people might have :)
Thanks in advance,
Gus