The Arrow compression libraries have two modes: batch and streaming decompression. For Snappy, streaming hasn't been implemented. I don't recall off the top of my head whether Snappy supports streaming decompression in general. If it does, a contribution would be welcome.
To get just the raw Snappy-compressed bytes, passing "compression=None" to open_input_stream [1] should work.

[1] https://arrow.apache.org/docs/python/generated/pyarrow.fs.FileSystem.html#pyarrow.fs.FileSystem.open_input_stream

On Mon, Jan 31, 2022 at 4:55 PM Jialin Liu <valiant...@gmail.com> wrote:

> Hi, I'm trying to read snappy file on HDFS using inputstream, but got the
> error:
>>
>> with fs.open_input_stream(read_path, **open_stream_args) as f:
>> File "pyarrow/_fs.pyx", line 627, in
>> pyarrow._fs.FileSystem.open_input_stream
>> File "pyarrow/_fs.pyx", line 557, in
>> pyarrow._fs.FileSystem._wrap_input_stream
>> File "pyarrow/io.pxi", line 1283, in
>> pyarrow.lib.CompressedInputStream.__init__
>> File "pyarrow/error.pxi", line 143, in
>> pyarrow.lib.pyarrow_internal_check_status
>> File "pyarrow/error.pxi", line 120, in pyarrow.lib.check_status
>> pyarrow.lib.ArrowNotImplementedError: Streaming decompression unsupported
>> with Snappy
>>
>
> Can anyone plz help me with this?
>
> Thanks,
> Jialin
>