Hi,

Currently I'm working on ARROW-11297 (https://github.com/mathyingzhou/arrow/tree/ARROW-11297), which will be filed as soon as the current PR is merged.
I managed to reimplement orc::WriterOptions in Arrow (with naming conventions Arrow-ized) as arrow::adapters::orc::WriterOptions, which is necessary since we do not allow third-party headers to be included in our public headers, and finished the C++ part of the work. Now I'm trying to expose WriterOptions in Python, and I wonder how this is supposed to be done in general.

After reading the code in array.pxi, I think this may be the way to do it:

1. The end user will see individual ORC writer options (e.g. CompressionKind, that is, whether we use ZLIB, LZO, some other form of compression, or none at all) as keyword arguments.
2. These keyword arguments will first be processed in _orc.pyx as a dictionary and then converted by an adapter into an arrow::adapters::orc::WriterOptions.

Is this the right way?

Moreover, I wonder how we should convert the enums. Shall I use a series of if/elif, or a mapping dict, so that people must pass one of the accepted strings or get a ValueError? For example:

    # There are other options; this is just an example
    compression_kind_mapping = {
        'snappy': CompressionKind._CompressionKind_SNAPPY,
        'lzo': CompressionKind._CompressionKind_LZO,
    }
    if compression_kind not in compression_kind_mapping:
        raise ValueError("Unknown compression_kind")
    c_compression_kind = compression_kind_mapping[compression_kind]

Ying
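
To make the proposal concrete, here is a minimal pure-Python sketch of the flow in points 1 and 2 (keyword arguments in, a validated options object out), using the mapping-dict approach for the enum. It is only an illustration under stated assumptions: the CompressionKind enum and WriterOptions dataclass below are stand-ins for the real Cython bindings, and make_writer_options and the stripe_size option are hypothetical names chosen for the example, not existing pyarrow API.

```python
from dataclasses import dataclass
from enum import Enum, auto


class CompressionKind(Enum):
    """Stand-in for the C++ enum; the real binding would be a Cython enum."""
    NONE = auto()
    ZLIB = auto()
    SNAPPY = auto()
    LZO = auto()
    LZ4 = auto()
    ZSTD = auto()


@dataclass
class WriterOptions:
    """Hypothetical stand-in for arrow::adapters::orc::WriterOptions."""
    compression: CompressionKind = CompressionKind.NONE
    stripe_size: int = 64 * 1024 * 1024  # illustrative default only


# Build the string -> enum mapping once, so valid names live in one place.
_COMPRESSION_BY_NAME = {kind.name.lower(): kind for kind in CompressionKind}


def make_writer_options(**kwargs):
    """Convert user-facing keyword arguments into a WriterOptions object."""
    options = WriterOptions()

    compression = kwargs.pop("compression", None)
    if compression is not None:
        try:
            options.compression = _COMPRESSION_BY_NAME[compression.lower()]
        except (KeyError, AttributeError):
            # KeyError: unknown name; AttributeError: not a string at all.
            valid = ", ".join(sorted(_COMPRESSION_BY_NAME))
            raise ValueError(
                f"Unknown compression {compression!r}; expected one of: {valid}"
            ) from None

    stripe_size = kwargs.pop("stripe_size", None)
    if stripe_size is not None:
        if not isinstance(stripe_size, int) or stripe_size <= 0:
            raise ValueError("stripe_size must be a positive integer")
        options.stripe_size = stripe_size

    # Anything left over is a misspelled or unsupported option.
    if kwargs:
        raise TypeError(f"Unexpected writer options: {sorted(kwargs)}")
    return options
```

A call such as make_writer_options(compression="snappy") would return options with SNAPPY selected, while a misspelled name like "zl0" raises a ValueError that lists the accepted strings. Compared with an if/elif chain, the dict keeps the set of valid names in one place, so the error message cannot drift out of sync with what is accepted.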