I hadn't considered this as a possible solution, but I think it makes a
lot of sense. For reference, this is what I currently do when storing data
in a key-value interface:

   - Write one buffer that contains the schema and no batches
   - Write the batches in separate buffers
      - these are sized to fully use the space available for each
        key-value

The schema can then be read back from the key-value that contains only the
schema.

I believe my approach can be seen in [1]. I use the StreamWriter because I
want the in-memory format to be streamable.

[1]:
https://gitlab.com/skyhookdm/skytether-singlecell/-/blob/mainline/src/cpp/processing/dataformats.cpp#L16

Aldrin Montana
Computer Science PhD Student
UC Santa Cruz


On Fri, May 6, 2022 at 12:04 PM Weston Pace <[email protected]> wrote:

> Can you serialize the schema by creating an IPC file with zero record
> batches?  I apologize, but I do not know the JS API as well.  Maybe
> you can create a table from just a schema (or a schema and a set of
> empty arrays) and then turn that into an IPC file?  This shouldn't add
> too much overhead.
>
> On Thu, May 5, 2022 at 8:23 AM Howard Engelhart
> <[email protected]> wrote:
> >
> > I'm looking to implement an Athena federated query custom connector
> > using the arrow js lib.  I'm getting stuck on figuring out how to
> > encode a Schema properly for the Athena GetTableResponse.  I have found
> > an example using python that does something like this.. (paraphrasing...)
> >
> > import base64
> > import pyarrow as pa
> > .....
> >         return {
> >             "@type": "GetTableResponse",
> >             "catalogName": self.catalogName,
> >             "tableName": {
> >                 "schemaName": self.databaseName,
> >                 "tableName": self.tableName,
> >             },
> >             "schema": {
> >                 "schema": base64.b64encode(
> >                     pa.schema(....args...).serialize().slice(4)
> >                 ).decode("utf-8")
> >             },
> >             "partitionColumns": self.partitions,
> >             "requestType": self.request_type
> >         }
> > What I'm looking for is the JS equivalent of
> > pa.schema(....args...).serialize()
> >
> > Is there one?  If not, could someone point me in the right direction
> > of how to code up something similar?
>
