Re: Haskell Implementation of ORC

Huw Campbell Fri, 26 Aug 2022 00:45:50 -0700

G'day.
I'm happy to accept a writer id.

Kind regards,
Huw


On Wed, Aug 24, 2022 at 6:34 AM Owen O'Malley <owen.omal...@gmail.com>
wrote:

> Huw,
>    Generally we assign each ORC File writer implementation a unique writer
> id so that we can determine the writer of the file. Would you like a number
> assigned to your writer? We'd ask that your writer always set its id into
> the Footer.writer field.
>
>
> https://github.com/apache/orc/blob/75a8f5f2d938a5d13c62619024c1a2443489cce7/proto/orc_proto.proto#L364
>
> .. Owen
>
> On Tue, Aug 23, 2022 at 8:30 PM Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
> > Thank you for sharing, Huw.
> >
> > Dongjoon.
> >
> > On Mon, Aug 22, 2022 at 10:27 PM Huw Campbell <huw.campb...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > In case you're interested in this. A while ago I wrote up a Haskell
> > > parser and writer for ORC, which one can find here
> > > <https://github.com/HuwCampbell/orc-haskell>. I use it in the day job a
> > > fair bit, and it's come in quite handy for ad-hoc data generators and
> > > parsing tasks.
> > >
> > > It's a "clean room" implementation, and was written almost entirely
> > > from the specification instead of cribbing from the Java or C++
> > > versions.
> > >
> > > It's also quite capable, being able to read any schemas for v0 and v1
> > > files with a few different compression codecs. It writes with v0 style
> > > RLEs.
> > >
> > > Lastly it's pretty compact, being only ~6000 lines of sparsely
> > > formatted Haskell. I think it demonstrates how ORC works quite nicely.
> > >
> > > Kind regards,
> > > Huw
> > >
> >
>
