zeroshade opened a new pull request, #771:
URL: https://github.com/apache/arrow-go/pull/771

   ### Rationale for this change
   
   This PR attempts to address https://github.com/apache/arrow-adbc/issues/4185. 
There is currently no built-in way to convert between Arrow arrays/records and 
native Go objects and types using reflection. Users must manually construct 
builders, iterate over columns, and handle type mapping for their own schemas. 
Some other Arrow implementations (e.g. pyarrow) offer higher-level APIs for 
this, so this PR closes that gap for Go.
   
   ### What changes are included in this PR?
   
   Adds a new opt-in sub-package `arrow/array/arreflect` providing 
bidirectional Go↔Arrow conversion via reflection. 
   
   **Public API**:
   - `At[T]`, `ToSlice[T]` — Arrow array → Go value/slice
   - `FromSlice[T]` — Go slice → Arrow array (variadic `Option` for 
dict/listview/ree/decimal/temporal overrides)
   - `RecordToSlice[T]`, `RecordFromSlice[T]` — `RecordBatch` ↔ Go struct slices
   - `RecordAt[T]`, `RecordAtAny` — single-row record accessors (typed and 
runtime-inferred)
   - `RecordToAnySlice` — runtime-inferred full-record conversion (no 
compile-time Go type needed)
   - `InferSchema[T]`, `InferType[T]` — infer `*arrow.Schema` / 
`arrow.DataType` from Go types
   - `InferGoType` — invert Arrow→Go type mapping at runtime via 
`reflect.StructOf`
   - `AtAny`, `ToAnySlice` — dynamic accessors when the Go type is not known at 
compile time
   - `WithDict()`, `WithListView()`, `WithREE()`, `WithDecimal(p,s)`, 
`WithTemporal(s)` — encoding options
   - Sentinel errors `ErrUnsupportedType`, `ErrTypeMismatch` (usable with 
`errors.Is`)
   
   **Supported Arrow types**: all primitives, 
Timestamp/Date32/Date64/Time32/Time64/Duration, Decimal32/64/128/256, Struct, 
List, ListView, LargeList/LargeListView (read only), FixedSizeList, Map, 
Dictionary (`dict` tag), RunEndEncoded (`ree` tag).
   
   **Struct tag control** (follows `encoding/json` conventions):
   
   ```go
   type Row struct {
       Name  string            `arrow:"name"`
       Score float64           `arrow:"score"`
       Skip  string            `arrow:"-"`
       Enc   string            `arrow:"enc,dict"`
       When  time.Time         `arrow:"when,date32"`
       Vals  []int             `arrow:"vals,listview"`
       Price decimal128.Num    `arrow:"price,decimal(18,2)"`
   }
   ```
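
A tag such as `price,decimal(18,2)` has a comma inside the parentheses, so a plain `strings.Split` on `,` would cut the option in half. The sketch below shows one way a parenthesis-aware splitter could work. It reuses the name `splitTagTokens` from this PR, but the body is an independent illustration written for this description and may differ from the actual implementation.

```go
package main

import "fmt"

// splitTagTokens is a hypothetical re-creation of a parenthesis-aware
// tag splitter: it splits on commas, but only when they are not nested
// inside parentheses, so "decimal(18,2)" stays a single token.
func splitTagTokens(tag string) []string {
	var tokens []string
	depth, start := 0, 0
	for i, r := range tag {
		switch r {
		case '(':
			depth++
		case ')':
			if depth > 0 {
				depth--
			}
		case ',':
			if depth == 0 {
				tokens = append(tokens, tag[start:i])
				start = i + 1 // ',' is one byte, so i+1 is the next token's start
			}
		}
	}
	// The remainder after the last top-level comma is the final token.
	return append(tokens, tag[start:])
}

func main() {
	for _, tag := range []string{"name", "enc,dict", "price,decimal(18,2)"} {
		fmt.Printf("%q -> %q\n", tag, splitTagTokens(tag))
	}
}
```

Tracking nesting depth in a single pass avoids splitting first and then gluing `decimal(18` and `2)` back together.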
   
   **Key implementation details**:
   - Pointer fields → nullable Arrow fields (nil = null); multi-level pointers 
fully dereferenced
   - Embedded struct fields promoted following `encoding/json` BFS rules 
(`collectFieldCandidates` + `resolveFieldCandidates`)
   - Struct metadata cached per type via `sync.Map`
   - `WithTemporal` validates input, returning `ErrUnsupportedType` for 
unrecognized values
   - `FromSlice` empty-slice path applies all encoding options consistently 
with the non-empty path (decimal, temporal, dict, listview, ree)
   - Tag parsing uses parenthesis-aware `splitTagTokens` for decimal(p,s) — no 
fragile comma reassembly
   - `InferGoType` validates all runes of exported field names, rejects 
non-identifier characters (hyphens, dots, spaces, digit prefixes), and detects 
duplicate exported names after capitalization
   - `validateDictValueType` enforced on all dict paths (struct tags, 
`FromSlice` opts, empty-slice)
   - Primitive types cached as package-level `reflect.Type` vars
   - Internal duplication minimized via helpers: `asTime`/`asDuration` 
(TypeAssert), `appendListElement` (list builder dispatch with checked type 
assertion), `listLike` interface (Elem() unification)
   - Large list variants (`LARGE_LIST`, `LARGE_LIST_VIEW`) supported for 
reading but not produced by `FromSlice`
   
   ### Are these changes tested?
   
   Yes, comprehensive test coverage along with testable examples that will show 
up in the docs.
   
   ### Are there any user-facing changes?
   
   Yes, the entirely new public API in the new `arrow/array/arreflect` 
sub-package.

