This sounds reasonable from an Arrow perspective. You might want to CC the ORC list as well, or ask someone there to co-review your work on the adapter.
Uwe

> On 18.10.2020 at 17:24, Ying Zhou <yzhou7...@gmail.com> wrote:
>
> Hi,
>
> I’m developing the adapter that converts Arrow Arrays, ChunkedArrays,
> RecordBatches and Tables into ORC files, based on the ORC Specification and
> the Arrow Columnar Format.
>
> Here is my current type mapping:
>
> Type::type::NA -> nullptr
> Type::type::BOOL -> liborc::TypeKind::BOOLEAN
> Type::type::UINT8 -> liborc::TypeKind::BYTE
> Type::type::INT8 -> liborc::TypeKind::BYTE
> Type::type::UINT16 -> liborc::TypeKind::SHORT
> Type::type::INT16 -> liborc::TypeKind::SHORT
> Type::type::UINT32 -> liborc::TypeKind::INT
> Type::type::INT32 -> liborc::TypeKind::INT
> Type::type::INTERVAL_MONTH -> liborc::TypeKind::INT
> Type::type::UINT64 -> liborc::TypeKind::LONG
> Type::type::INT64 -> liborc::TypeKind::LONG
> Type::type::INTERVAL_DAY_TIME -> liborc::TypeKind::LONG
> Type::type::DURATION -> liborc::TypeKind::LONG
> Type::type::HALF_FLOAT -> liborc::TypeKind::FLOAT
> Type::type::FLOAT -> liborc::TypeKind::FLOAT
> Type::type::DOUBLE -> liborc::TypeKind::DOUBLE
> Type::type::STRING -> liborc::TypeKind::STRING
> Type::type::LARGE_STRING -> liborc::TypeKind::STRING
> Type::type::FIXED_SIZE_BINARY -> liborc::TypeKind::CHAR
> Type::type::BINARY -> liborc::TypeKind::BINARY
> Type::type::LARGE_BINARY -> liborc::TypeKind::BINARY
> Type::type::DATE32 -> liborc::TypeKind::DATE
> Type::type::TIMESTAMP -> liborc::TypeKind::TIMESTAMP
> Type::type::TIME32 -> liborc::TypeKind::TIMESTAMP
> Type::type::TIME64 -> liborc::TypeKind::TIMESTAMP
> Type::type::DATE64 -> liborc::TypeKind::TIMESTAMP
> Type::type::DECIMAL -> liborc::TypeKind::DECIMAL
> Type::type::LIST -> liborc::TypeKind::LIST
> Type::type::FIXED_SIZE_LIST -> liborc::TypeKind::LIST
> Type::type::LARGE_LIST -> liborc::TypeKind::LIST
> Type::type::STRUCT -> liborc::TypeKind::STRUCT
> Type::type::MAP -> liborc::TypeKind::MAP
> Type::type::DENSE_UNION -> liborc::TypeKind::UNION
> Type::type::SPARSE_UNION -> liborc::TypeKind::UNION
> Type::type::DICTIONARY -> the ORC version of its value type
>
> I have some concerns, particularly about duration types, which don’t exist
> in Apache ORC and which I have to convert to integers. Is my current
> mapping reasonable? Thanks!
>
> Best,
> Ying Zhou
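
For discussion, here is a rough sketch of how the scalar rows of the mapping above could be written as a lookup function. It is not the adapter's actual code: `ArrowType`, `OrcKind`, `ToOrcKind` and `SharesOrcKind` are hypothetical stand-ins for `arrow::Type::type`, `liborc::TypeKind` and whatever the adapter ends up using, so that the snippet compiles without the Arrow or ORC headers. Only a subset of the rows is covered.

```cpp
#include <optional>

// Hypothetical stand-in for arrow::Type::type (subset of the rows above).
enum class ArrowType {
  NA, BOOL, UINT8, INT8, UINT16, INT16, UINT32, INT32, INTERVAL_MONTH,
  UINT64, INT64, INTERVAL_DAY_TIME, DURATION, HALF_FLOAT, FLOAT, DOUBLE,
  STRING, LARGE_STRING
};

// Hypothetical stand-in for liborc::TypeKind (subset).
enum class OrcKind { BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, STRING };

// Returns the ORC kind proposed in the mapping, or std::nullopt for NA
// (the row that maps to nullptr in the proposal).
inline std::optional<OrcKind> ToOrcKind(ArrowType t) {
  switch (t) {
    case ArrowType::NA:
      return std::nullopt;
    case ArrowType::BOOL:
      return OrcKind::BOOLEAN;
    case ArrowType::UINT8:
    case ArrowType::INT8:
      return OrcKind::BYTE;
    case ArrowType::UINT16:
    case ArrowType::INT16:
      return OrcKind::SHORT;
    case ArrowType::UINT32:
    case ArrowType::INT32:
    case ArrowType::INTERVAL_MONTH:
      return OrcKind::INT;
    case ArrowType::UINT64:
    case ArrowType::INT64:
    case ArrowType::INTERVAL_DAY_TIME:
    case ArrowType::DURATION:
      return OrcKind::LONG;
    case ArrowType::HALF_FLOAT:
    case ArrowType::FLOAT:
      return OrcKind::FLOAT;
    case ArrowType::DOUBLE:
      return OrcKind::DOUBLE;
    case ArrowType::STRING:
    case ArrowType::LARGE_STRING:
      return OrcKind::STRING;
  }
  return std::nullopt;  // Unreachable for the enumerators listed above.
}

// True when two distinct Arrow types map to the same ORC kind, i.e. the
// mapping is many-to-one for that pair.
inline bool SharesOrcKind(ArrowType a, ArrowType b) {
  const auto ka = ToOrcKind(a);
  return a != b && ka.has_value() && ka == ToOrcKind(b);
}
```

Written this way, it is easy to see that the mapping is many-to-one: `SharesOrcKind(ArrowType::DURATION, ArrowType::INT64)` is true, so a reader that sees only the ORC kind cannot tell which Arrow type was written unless the adapter records that somewhere else.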