I hadn't considered this as a possible solution, but I think it makes a
lot of sense. For reference, this is what I currently do when storing
data in a key-value interface:
- Write a buffer that contains only the schema (no batches)
- Write the batches in separate buffers
  - these are sized to fully utilize the space for each key-value
This makes it possible to read just the key-value that contains only the
schema.
I believe my approach can be seen in [1]. I use the StreamWriter because
I want an in-memory format that is streamable.
[1]:
https://gitlab.com/skyhookdm/skytether-singlecell/-/blob/mainline/src/cpp/processing/dataformats.cpp#L16
Aldrin Montana
Computer Science PhD Student
UC Santa Cruz
On Fri, May 6, 2022 at 12:04 PM Weston Pace <[email protected]> wrote:
> Can you serialize the schema by creating an IPC file with zero record
> batches? I apologize, but I do not know the JS API as well. Maybe
> you can create a table from just a schema (or a schema and a set of
> empty arrays) and then turn that into an IPC file? This shouldn't add
> too much overhead.
>
> On Thu, May 5, 2022 at 8:23 AM Howard Engelhart
> <[email protected]> wrote:
> >
> > I'm looking to implement an Athena federated query custom connector
> > using the arrow js lib. I'm getting stuck on figuring out how to encode a
> > Schema properly for the Athena GetTableResponse. I have found an example
> > using python that does something like this.. (paraphrasing...)
> >
> > import pyarrow as pa
> > .....
> > return {
> >     "@type": "GetTableResponse",
> >     "catalogName": self.catalogName,
> >     "tableName": {'schemaName': self.databaseName, 'tableName': self.tableName},
> >     "schema": {"schema": base64.b64encode(pa.schema(....args...).serialize().slice(4)).decode("utf-8")},
> >     "partitionColumns": self.partitions,
> >     "requestType": self.request_type
> > }
> > What I'm looking for is the js equivalent of
> > pa.schema(....args...).serialize()
> >
> > Is there one? If not, could someone point me in the right direction of
> > how to code up something similar?
>