pitrou commented on code in PR #39825:
URL: https://github.com/apache/arrow/pull/39825#discussion_r1506238613


##########
python/pyarrow/fs.py:
##########
@@ -123,25 +121,14 @@ def _ensure_filesystem(
                 return LocalFileSystem(use_mmap=use_mmap)
             return PyFileSystem(FSSpecHandler(filesystem))
 
-    # map old filesystems to new ones
-    import pyarrow.filesystem as legacyfs
-
-    if isinstance(filesystem, legacyfs.LocalFileSystem):
-        return LocalFileSystem(use_mmap=use_mmap)
-    # TODO handle HDFS?
-    if allow_legacy_filesystem and isinstance(filesystem, legacyfs.FileSystem):
-        return filesystem
-
     raise TypeError(
         "Unrecognized filesystem: {}. `filesystem` argument must be a "
         "FileSystem instance or a valid file system URI'".format(
             type(filesystem))
     )
 
 
-def _resolve_filesystem_and_path(
-    path, filesystem=None, allow_legacy_filesystem=False, memory_map=False
-):
+def _resolve_filesystem_and_path(path, filesystem=None, memory_map=False):

Review Comment:
   Same here:
   ```suggestion
   def _resolve_filesystem_and_path(path, filesystem=None, *, memory_map=False):
   ```



##########
python/pyarrow/fs.py:
##########
@@ -98,9 +98,7 @@ def _filesystem_from_str(uri):
     return filesystem
 
 
-def _ensure_filesystem(
-    filesystem, use_mmap=False, allow_legacy_filesystem=False
-):
+def _ensure_filesystem(filesystem, use_mmap=False):

Review Comment:
   Can we make sure those optional args are not passed by position?
   ```suggestion
   def _ensure_filesystem(filesystem, *, use_mmap=False):
   ```



##########
python/pyarrow/tests/parquet/test_dataset.py:
##########
@@ -1009,15 +991,24 @@ def _test_write_to_dataset_no_partitions(base_path,
     output_table = pa.Table.from_pandas(output_df)
 
     if filesystem is None:
-        filesystem = LocalFileSystem._get_instance()
+        filesystem = LocalFileSystem()
 
     # Without partitions, append files to root_path
     n = 5
     for i in range(n):
         pq.write_to_dataset(output_table, base_path,
                             filesystem=filesystem)
-    output_files = [file for file in filesystem.ls(str(base_path))
-                    if file.endswith(".parquet")]
+
+    try:
+        output_files = [file for file in filesystem.ls(str(base_path))

Review Comment:
   Does this ever succeed on non-legacy filesystems?



##########
python/pyarrow/tests/parquet/test_dataset.py:
##########
@@ -1009,15 +991,24 @@ def _test_write_to_dataset_no_partitions(base_path,
     output_table = pa.Table.from_pandas(output_df)
 
     if filesystem is None:
-        filesystem = LocalFileSystem._get_instance()
+        filesystem = LocalFileSystem()
 
     # Without partitions, append files to root_path
     n = 5
     for i in range(n):
         pq.write_to_dataset(output_table, base_path,
                             filesystem=filesystem)
-    output_files = [file for file in filesystem.ls(str(base_path))
-                    if file.endswith(".parquet")]
+
+    try:
+        output_files = [file for file in filesystem.ls(str(base_path))
+                        if file.endswith(".parquet")]
+    except AttributeError:
+        selector = FileSelector(str(base_path), allow_not_found=False,
+                                recursive=True)
+        assert selector.base_dir == str(base_path)
+
+        infos = filesystem.get_file_info(selector)
+        output_files = [info for info in infos if 
(info.path.endswith(".parquet"))]

Review Comment:
   Nit
   ```suggestion
           output_files = [info for info in infos if 
info.path.endswith(".parquet")]
   ```



##########
python/pyarrow/io.pxi:
##########
@@ -46,6 +47,15 @@ cdef extern from "Python.h":
     bytearray PyByteArray_FromStringAndSize(char *string, Py_ssize_t len)
 
 
+def have_libhdfs():

Review Comment:
   Do you want to add a docstring here?



##########
python/pyarrow/tests/parquet/conftest.py:
##########
@@ -53,23 +53,22 @@ def s3_bucket(s3_server):
 
 @pytest.fixture
 def s3_example_s3fs(s3_server, s3_bucket):

Review Comment:
   Is it useful to keep this, since there's `s3_example_fs` below already?



##########
python/pyarrow/tests/parquet/test_dataset.py:
##########
@@ -27,11 +27,10 @@
 import pyarrow as pa
 import pyarrow.compute as pc
 from pyarrow import fs
-from pyarrow.filesystem import LocalFileSystem
 from pyarrow.tests import util
 from pyarrow.util import guid
-from pyarrow.vendored.version import Version
 
+from pyarrow.fs import (FileSelector, LocalFileSystem)

Review Comment:
   Style nit: you can move this back above



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to