ericseguin-northstar opened a new issue, #669:
URL: https://github.com/apache/iceberg-go/issues/669
### Apache Iceberg version
iceberg-go version: v0.4.0
arrow-go version: v18.4.1
### Please describe the bug 🐞
When I call `tbl.Append()` on a partitioned table with a `map(string, string)`
column where the value is nullable, I get the following error:
```
not implemented: function 'array_take' has no kernel matching input types (map<utf8, utf8, items_nullable>, int64)
```
If the table is unpartitioned, or if the `map` column is removed, the
operation works without issue.
### To reproduce
Here's a very minimal example that reproduces the issue on my machine:
```go
package reproduce

import (
	"database/sql"
	"testing"
	"time"

	"github.com/apache/arrow-go/v18/arrow"
	"github.com/apache/arrow-go/v18/arrow/array"
	"github.com/apache/arrow-go/v18/arrow/memory"
	"github.com/apache/iceberg-go"
	"github.com/apache/iceberg-go/catalog/rest"
	"github.com/apache/iceberg-go/table"
	"github.com/stretchr/testify/require"
	"github.com/trinodb/trino-go-client/trino"
)

func TestReproduce(t *testing.T) {
	ctx := t.Context()

	// create a table with a MAP column and an arbitrary TIMESTAMP column for partitioning
	// use Trino so we can enable partitioning
	trinoCfg := trino.Config{
		Catalog:   "<catalog name in Trino>",
		ServerURI: "<trino endpoint>",
		Schema:    "public",
	}
	trinoDSN, err := trinoCfg.FormatDSN()
	require.NoError(t, err)
	db, err := sql.Open("trino", trinoDSN)
	require.NoError(t, err)
	_, err = db.ExecContext(ctx, `
		CREATE TABLE my_table (
			date TIMESTAMP,
			my_map MAP(VARCHAR, VARCHAR)
		) WITH (
			partitioning = ARRAY['day(date)']
		)`)
	require.NoError(t, err)
	defer db.ExecContext(ctx, "DROP TABLE my_table")

	// get the Iceberg table
	cat, err := rest.NewCatalog(ctx, "rest", "<catalog REST endpoint>") // I'm using nessie
	require.NoError(t, err)
	tbl, err := cat.LoadTable(ctx, []string{"public", "my_table"})
	require.NoError(t, err)

	// print obtained schema in case it's helpful for investigation
	arrowSchema, err := table.SchemaToArrowSchema(tbl.Schema(), nil, true, false)
	require.NoError(t, err)
	t.Logf("Iceberg schema:\n%s", tbl.Schema().String())
	t.Logf("Arrow schema:\n%s", arrowSchema.String())

	// append a dummy record, reusing the Arrow schema obtained above
	rb := array.NewRecordBuilder(memory.NewGoAllocator(), arrowSchema)
	defer rb.Release()
	rb.Field(0).(*array.TimestampBuilder).Append(arrow.Timestamp(time.Now().UnixMicro()))
	mb := rb.Field(1).(*array.MapBuilder)
	mb.Append(true)
	mb.KeyBuilder().(*array.StringBuilder).Append("key")
	mb.ItemBuilder().(*array.StringBuilder).Append("val")
	rec := rb.NewRecordBatch()
	defer rec.Release()
	rr, err := array.NewRecordReader(arrowSchema, []arrow.RecordBatch{rec})
	require.NoError(t, err)
	defer rr.Release()

	_, err = tbl.Append(ctx, rr, iceberg.Properties{})
	require.NoError(t, err) // fails here
}
```
This fails at the `tbl.Append()` call as marked:
```
--- FAIL: TestReproduce (0.31s)
<...>/reproduce_test.go:74: Iceberg schema:
table {
1: date: required timestamp
2: my_map: required map<string, string>
}
<...>/reproduce_test.go:75: Arrow schema:
schema:
fields: 2
- date: type=timestamp[us]
metadata: ["PARQUET:field_id": "1"]
- my_map: type=map<utf8, utf8, items_nullable>
metadata: ["PARQUET:field_id": "2"]
<...>/reproduce_test.go:94:
Error Trace: <...>/reproduce_test.go:94
Error: Received unexpected error:
not implemented: function 'array_take' has no kernel matching input types (map<utf8, utf8, items_nullable>, int64)
Test: TestReproduce
```