Hey, as far as I can tell, appending to a file in the streaming format isn't
currently supported. Is that right?
RecordBatchStreamWriter always writes the schema up front, and it doesn't look
like a schema is expected mid-file (assuming I'm doing this append test
correctly). This is the error I hit when I try to read the file back into
Python:
Traceback (most recent call last):
  File "/home/ra7293/rba_arrow_mmap.py", line 9, in <module>
    table = reader.read_all()
  File "ipc.pxi", line 302, in pyarrow.lib._RecordBatchReader.read_all
  File "error.pxi", line 79, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Message not expected type: record batch, was: 1
This reader script works fine if I write once and don't append. I can work
around this by creating a new file each time I restart instead of appending;
I just wanted to confirm I'm not missing something.
Also, FYI, I opened a ticket last week reporting that append is broken with
FileOutputStream (unrelated to this email thread):
https://github.com/apache/arrow/issues/2018
Thanks
- Rob