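No, there's no API for that. You'd have to serialize the record batch to a
buffer, chop off the first 8 bytes (the continuation token and the length
prefix) yourself, and build the Protobuf message from what is left.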
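
A rough sketch of the chopping step, using only the standard library. It
assumes the layout from the client code quoted below: a 4-byte continuation
token, a 4-byte little-endian header length, then the header followed by the
body. The function names are made up for illustration and are not part of
pyarrow.

```python
import struct

CONTINUATION = b"\xff\xff\xff\xff"  # 4-byte IPC continuation token


def split_ipc_message(raw):
    """Split one prefixed IPC message into (header, body) byte strings."""
    if raw[:4] != CONTINUATION:
        raise ValueError("missing IPC continuation token")
    # Next 4 bytes: header (metadata) length, little-endian.
    (length,) = struct.unpack("<i", raw[4:8])
    header = raw[8:8 + length]
    if len(header) != length:
        raise ValueError("truncated header")
    # Everything after the header is the message body.
    body = raw[8 + length:]
    return header, body


def join_ipc_message(header, body):
    """Inverse of split_ipc_message, mirroring the quoted client code."""
    return CONTINUATION + struct.pack("<I", len(header)) + header + body
```

With pyarrow, `raw` would presumably be `batch.serialize().to_pybytes()`, and
the two pieces would go into the `data_header` and `data_body` fields of
`Flight_pb2.FlightData`. That last step is an assumption and is untested end
to end.
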
On Tue, Feb 8, 2022, at 13:59, R KB wrote:
> Thanks for that; hopefully those links should be fruitful.
>
> A question before I get started: is there an API exposed so that I could
> do the reverse of this?
>
> `import asyncio
> import pathlib
> import struct
>
> import grpc
> import pyarrow as pa
> import pyarrow.flight as pf
>
> import Flight_pb2, Flight_pb2_grpc
>
> async def main():
>     ticket = pf.Ticket("tick")
>     async with grpc.aio.insecure_channel("localhost:1234") as channel:
>         stub = Flight_pb2_grpc.FlightServiceStub(channel)
>         schema = None
>         async for data in stub.DoGet(Flight_pb2.Ticket(ticket=ticket.ticket)):
>             # 4 bytes: Need IPC continuation token
>             token = b'\xff\xff\xff\xff'
>             # 4 bytes: message length (little-endian)
>             length = struct.pack('<I', len(data.data_header))
>             buf = pa.py_buffer(token + length + data.data_header +
>                                data.data_body)
>             message = pa.ipc.read_message(buf)
>             print(message)
>             if schema is None:
>                 # This should work but is unimplemented
>                 # print(pa.ipc.read_schema(message))
>                 schema = pa.ipc.read_schema(buf)
>                 print(schema)
>             else:
>                 batch = pa.ipc.read_record_batch(message, schema)
>                 print(batch)
>                 print(batch.to_pydict())
>
> asyncio.run(main())`
>
> On Tue, 8 Feb 2022 at 18:33, David Li <[email protected]> wrote:
>> Unfortunately, pyarrow.flight wraps the C++ Flight implementation, which
>> uses gRPC/C++. That is an almost entirely separate library from grpcio and
>> does not benefit from any improvements there. (The two do share a common
>> network stack, but that's all; also, grpcio doesn't expose any of the
>> lower-level APIs that might make it possible to combine the two somehow.)
>>
>> You might ask why pyarrow.flight didn't use grpcio directly (with bindings
>> to transmute from FlightData to RecordBatch). At the time, however, the
>> thought was that we would also have non-gRPC transports (which are finally
>> being worked on), so a from-scratch grpcio/Python implementation was not
>> desirable.
>>
>> That said, there are issues filed about better documenting FlightData. See
>> ARROW-15287 [1], which links to a Stack Overflow answer that demonstrates
>> how to glue together asyncio/grpcio/PyArrow.
>>
>> There's also some previous discussion about adding async to Flight more
>> generally [2].
>>
>> [1]: https://issues.apache.org/jira/browse/ARROW-15287
>> [2]: "[C++] Async Arrow Flight" 2021/06/02
>> https://lists.apache.org/thread/jrj6yx53gyj0tr18pfdghtb8krp4gpfv
>>
>> -David
>>
>> On Tue, Feb 8, 2022, at 13:24, R KB wrote:
>>> gRPC has pretty good asyncio support at this point, and since Flight is
>>> essentially a wrapper around some gRPC types, why can't we just expose
>>> something that generates FlightData gRPC objects?
>>>
>>