There are probably no hard blockers, but someone would need to take up the work.

On Fri, Sep 23, 2022 at 8:57 AM Louis C <[email protected]> wrote:
> Hello,
>
> It seems that the ORC reader/writer support for attributes (in Arrow they
> are called metadata) is limited. The writer does not handle writing Arrow
> metadata at all (neither for the table nor for fields), and the reader
> fills the Arrow schema's metadata with the ORC file metadata, but does
> nothing for the fields' metadata, as far as I can tell from looking at the
> code.
>
> Looking at ORC, it seems that what they call "attributes" serves a similar
> purpose to Arrow metadata. See
> https://github.com/apache/orc/blob/ff6093c98bf38c06c906dde3207040e1b5b55753/c%2B%2B/include/orc/Type.hh#L50-L69
>
> As the "Type" object can represent both the table and a particular field,
> I think it could serve for passing the metadata.
>
> Is my understanding of the state of the ORC adapter correct, and is there
> anything that would prevent doing that?
>
> Regards
