G'day.

I'm happy to accept a writer id.

Kind regards,
Huw
On Wed, Aug 24, 2022 at 6:34 AM Owen O'Malley <owen.omal...@gmail.com> wrote:

> Huw,
> Generally we assign each ORC File writer implementation a unique writer
> id so that we can determine the writer of the file. Would you like a number
> assigned to your writer? We'd ask that your writer always set its id into
> the Footer.writer field.
>
> https://github.com/apache/orc/blob/75a8f5f2d938a5d13c62619024c1a2443489cce7/proto/orc_proto.proto#L364
>
> .. Owen
>
> On Tue, Aug 23, 2022 at 8:30 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
> > Thank you for sharing, Huw.
> >
> > Dongjoon.
> >
> > On Mon, Aug 22, 2022 at 10:27 PM Huw Campbell <huw.campb...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > In case you're interested in this. A while ago I wrote up a Haskell
> > > parser and writer for ORC, which one can find here
> > > <https://github.com/HuwCampbell/orc-haskell>. I use it in the day job
> > > a fair bit, and it's come in quite handy for ad-hoc data generators
> > > and parsing tasks.
> > >
> > > It's a "clean room" implementation, and was written almost entirely
> > > from the specification instead of cribbing from the Java or C++
> > > versions.
> > >
> > > It's also quite capable, being able to read any schemas for v0 and v1
> > > files with a few different compression codecs. It writes with v0
> > > style RLEs.
> > >
> > > Lastly it's pretty compact, being only ~6000 lines of sparsely
> > > formatted Haskell. I think it demonstrates how ORC works quite nicely.
> > >
> > > Kind regards,
> > > Huw