HonahX commented on code in PR #543:
URL: https://github.com/apache/iceberg-python/pull/543#discussion_r1554859960

##########
tests/io/test_fsspec.py:
##########
@@ -586,6 +597,25 @@ def test_writing_avro_file_gcs(generated_manifest_entry_file: str, fsspec_fileio
     fsspec_fileio_gcs.delete(f"gs://warehouse/{filename}")
 
 
+@pytest.mark.gcs
+def test_fsspec_pickle_roundtrip_gcs(fsspec_fileio_gcs: FsspecFileIO) -> None:
+    _test_fsspec_pickle_round_trip(fsspec_fileio_gcs, "gs://warehouse/foo.txt")
+
+
+def _test_fsspec_pickle_round_trip(fsspec_fileio: FsspecFileIO, location: str) -> None:
+    serialized_file_io = pickle.dumps(fsspec_fileio)
+    deserialized_file_io = pickle.loads(serialized_file_io)
+    output_file = deserialized_file_io.new_output(location)
+    with output_file.create() as f:
+        f.write(b"foo")
+
+    input_file = deserialized_file_io.new_input(location)
+    with input_file.open() as f:
+        data = f.read()
+        assert data == b"foo"
+        assert len(input_file) == 3
+

Review Comment:
   ```suggestion
       fsspec_fileio.delete(location)
   ```
   How about deleting the file at the end to make these tests re-runnable?

##########
tests/io/test_fsspec.py:
##########
@@ -61,7 +62,7 @@ def test_fsspec_new_input_file(fsspec_fileio: FsspecFileIO) -> None:
     assert input_file.location == f"s3://warehouse/{filename}"
 
 
-@pytest.mark.s3
+@pytest.mark.s3fsspec_file_io

Review Comment:
   This seems to be an unrelated change.

##########
tests/io/test_fsspec.py:
##########
@@ -586,6 +597,25 @@ def test_writing_avro_file_gcs(generated_manifest_entry_file: str, fsspec_fileio
     fsspec_fileio_gcs.delete(f"gs://warehouse/{filename}")
 
 
+@pytest.mark.gcs
+def test_fsspec_pickle_roundtrip_gcs(fsspec_fileio_gcs: FsspecFileIO) -> None:
+    _test_fsspec_pickle_round_trip(fsspec_fileio_gcs, "gs://warehouse/foo.txt")
+
+
+def _test_fsspec_pickle_round_trip(fsspec_fileio: FsspecFileIO, location: str) -> None:
+    serialized_file_io = pickle.dumps(fsspec_fileio)

Review Comment:
   I just realized that we use both `fileio` and `file_io` in the codebase (e.g., `fsspec_fileio`, `load_file_io`). It would be good if we could consistently use one of them.
   This may be done in a separate PR.
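
To illustrate the cleanup suggested in the first comment, here is a minimal sketch of a re-runnable round-trip check. It does not use PyIceberg: `LocalFileIO`, its `create`/`read`/`delete` methods, and `check_round_trip` are hypothetical stand-ins backed by the local filesystem, and the pickle step is left out to keep the sketch short. It also assumes that creating an output file fails when the file already exists, which is modelled here with Python's exclusive-create mode.

```python
import os
import tempfile


class LocalFileIO:
    """Hypothetical stand-in for FsspecFileIO, backed by the local filesystem."""

    def create(self, location: str, data: bytes) -> None:
        # Mode "xb" raises FileExistsError if the file already exists,
        # modelling an output file that refuses to overwrite (an assumption).
        with open(location, "xb") as f:
            f.write(data)

    def read(self, location: str) -> bytes:
        with open(location, "rb") as f:
            return f.read()

    def delete(self, location: str) -> None:
        os.remove(location)


def check_round_trip(file_io: LocalFileIO, location: str) -> None:
    try:
        file_io.create(location, b"foo")
        assert file_io.read(location) == b"foo"
    finally:
        # Runs even if an assertion above fails, so the next run starts clean.
        if os.path.exists(location):
            file_io.delete(location)


with tempfile.TemporaryDirectory() as tmp:
    location = os.path.join(tmp, "foo.txt")
    check_round_trip(LocalFileIO(), location)
    # The second run would hit FileExistsError without the cleanup above.
    check_round_trip(LocalFileIO(), location)
    assert not os.path.exists(location)
```

Putting the delete in a `finally` block goes slightly beyond the one-line suggestion: it also cleans up when an assertion fails midway, at the cost of a little more nesting in the test body.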