Kevin Yang created ARROW-18288:
----------------------------------

             Summary: [GO]: pqarrow (github.com/apache/arrow/go/v9/parquet/pqarrow) cannot handle arrow's DICTIONARY field
                 Key: ARROW-18288
                 URL: https://issues.apache.org/jira/browse/ARROW-18288
             Project: Apache Arrow
          Issue Type: Bug
          Components: Go
    Affects Versions: 10.0.0, 9.0.0
            Reporter: Kevin Yang

Hey, Arrow Go devs:

I was trying to save some Arrow tables out to Parquet files with the help of the "[github.com/apache/arrow/go/v9/parquet/pqarrow|http://github.com/apache/arrow/go/v9/parquet/pqarrow]" package. By the way, Arrow is generally a great design, and the Go implementation is great too.

However, one issue sticks out: my original Arrow Table has some DICTIONARY fields, which pqarrow does NOT currently support. I would assume supporting them is quite straightforward: just "denormalize" each DICTIONARY field into its corresponding values (string, Timestamp, etc.), and leave it up to Parquet to do the right thing (using DICTIONARY encoding, etc.).

I would have done this conversion on the fly myself, by converting each DICTIONARY field into its underlying values. However, the Arrow table schema is dynamic and outside my control, so I would need to iterate through the fields (including any nested structs) to locate the DICTIONARY ones. It would be much better if pqarrow could support this natively.

Can anyone help? Thanks!