This is an automated email from the ASF dual-hosted git repository.
zeroshade pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-go.git
The following commit(s) were added to refs/heads/main by this push:
new 20ebcc22 fix(arrow/scalar): implement Release()/Retain() on
*scalar.Extension (#851)
20ebcc22 is described below
commit 20ebcc22b232e07c6473997404ef6e20a8b5f154
Author: Matt Topol <[email protected]>
AuthorDate: Fri Jun 12 12:11:55 2026 -0400
fix(arrow/scalar): implement Release()/Retain() on *scalar.Extension (#851)
### Rationale for this change
Closes #827.
`*scalar.Extension` implements `value()`, `equals()`, `Validate()`,
`ValidateFull()`, `CastTo()`, and `String()`, but neither `Release()`
nor `Retain()`.
`compute.ScalarDatum.Release()` only forwards `Release()` when the
wrapped scalar satisfies the unexported `releasable` interface
(`arrow/compute/datum.go`):
```go
func (d *ScalarDatum) Release() {
if v, ok := d.Value.(releasable); ok {
v.Release()
}
}
```
Because `*scalar.Extension` doesn't satisfy that interface, the type
assertion fails and the inner storage scalar is never released — even
though the storage scalar (e.g. `*scalar.FixedSizeList` backing the
interval extension types) carries buffer-bearing arrays. Any code path
that produces an extension scalar and releases it through a
`ScalarDatum` leaks those buffers. Concretely this affects the
`*types.IntervalYearToMonthType` / `*types.IntervalDayType` paths in
`arrow/compute/exprs`.
`*scalar.Extension` was the lone outlier among buffer-holding scalar
types: `*scalar.Binary`, `*scalar.List`, `*scalar.Struct`,
`*scalar.Dictionary`, `*scalar.SparseUnion`, `*scalar.DenseUnion`, and
`*scalar.RunEndEncoded` all already delegate `Retain`/`Release` to their
inner storage.
### What changes are included in this PR?
1. **`*scalar.Extension` now implements `Retain()` and `Release()`**,
delegating to the storage scalar when it satisfies the `Releasable`
interface (`arrow/scalar/scalar.go`). This mirrors `*scalar.Binary` and
the nested scalar types:
```go
func (s *Extension) Retain() {
if r, ok := s.Value.(Releasable); ok {
r.Retain()
}
}
func (s *Extension) Release() {
if r, ok := s.Value.(Releasable); ok {
r.Release()
}
}
```
This is purely additive:
- Storage scalars that don't implement `Releasable` (e.g. primitive
numeric storage) are no-ops, matching today's behavior.
- Storage scalars that do (e.g. `*scalar.FixedSizeList` for the interval
extension types) are now properly released.
- The `Retain()` half restores symmetry so a caller taking ownership of
an extension scalar can bump the storage refcount.
2. **`TestLiteralToDatumIntervalYearToMonth` is switched to
`memory.NewCheckedAllocator` with `defer mem.AssertSize(t, 0)`**
(`arrow/compute/exprs/exec_internal_test.go`). That test previously used
`memory.DefaultAllocator` specifically to dodge this leak and carried a
comment deferring the fix to this issue; the workaround comment is
removed.
### Are these changes tested?
Yes. `TestLiteralToDatumIntervalYearToMonth` now asserts zero
outstanding allocations via a checked allocator. Verified that it
**fails without** the scalar change (`invalid memory size exp=0,
got=384`, with leak traces pointing at the extension scalar's storage)
and **passes with** it. The full `arrow/scalar/...` and
`arrow/compute/exprs/...` suites pass, and `go vet` is clean.
### Are there any user-facing changes?
No breaking changes. Adding `Retain()`/`Release()` methods to an
exported type is non-breaking in Go. The only behavioral change is the
intended one: buffers held by an extension scalar's storage are now
released when the scalar (or a `ScalarDatum` wrapping it) is released,
instead of leaking.
---
arrow/compute/exprs/exec_internal_test.go | 8 ++------
arrow/scalar/scalar.go | 12 ++++++++++++
2 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/arrow/compute/exprs/exec_internal_test.go
b/arrow/compute/exprs/exec_internal_test.go
index 347c2517..f3897a0f 100644
--- a/arrow/compute/exprs/exec_internal_test.go
+++ b/arrow/compute/exprs/exec_internal_test.go
@@ -117,12 +117,8 @@ func TestMakeExecBatch(t *testing.T) {
}
func TestLiteralToDatumIntervalYearToMonth(t *testing.T) {
- // memory.NewCheckedAllocator with AssertSize would fail here:
- // *scalar.Extension does not implement Release() (see
- // arrow/scalar/scalar.go), so an extension scalar's underlying
- // storage is never released even when the wrapping Datum is.
- // This will be fixable once
https://github.com/apache/arrow-go/issues/827 is addressed.
- mem := memory.DefaultAllocator
+ mem := memory.NewCheckedAllocator(memory.DefaultAllocator)
+ defer mem.AssertSize(t, 0)
extSet := NewExtensionSetDefault(
expr.NewEmptyExtensionRegistry(extensions.GetDefaultCollectionWithNoError()))
diff --git a/arrow/scalar/scalar.go b/arrow/scalar/scalar.go
index 05146c7c..f3dd8f99 100644
--- a/arrow/scalar/scalar.go
+++ b/arrow/scalar/scalar.go
@@ -399,6 +399,18 @@ type Extension struct {
Value Scalar
}
+func (s *Extension) Retain() {
+ if r, ok := s.Value.(Releasable); ok {
+ r.Retain()
+ }
+}
+
+func (s *Extension) Release() {
+ if r, ok := s.Value.(Releasable); ok {
+ r.Release()
+ }
+}
+
func (s *Extension) value() interface{} { return s.Value }
func (s *Extension) equals(rhs Scalar) bool {
return Equals(s.Value, rhs.(*Extension).Value)