raulcd commented on code in PR #46330:
URL: https://github.com/apache/arrow/pull/46330#discussion_r2081406088
##########
python/pyarrow/parquet/core.py:
##########
@@ -1825,7 +1826,14 @@ def read_table(source, *, columns=None, use_threads=True,
filesystem, path = _resolve_filesystem_and_path(source, filesystem)
if filesystem is not None:
source = filesystem.open_input_file(path)
- # TODO test that source is not a directory or a list
+ if not (
+ isinstance(source, str)
Review Comment:
The `# TODO` asked for validation that `source` is not a directory or a list.
Checking whether `source` is a string does not establish that, because a string
can name either a file or a directory.
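
To illustrate the point, here is a minimal sketch of a check that does tell the two cases apart for local paths. `check_not_directory` is a hypothetical helper, not something in the PR, and it assumes a local filesystem; for a pyarrow filesystem the analogous test would presumably compare `filesystem.get_file_info(path).type` against `pyarrow.fs.FileType.Directory`.

```python
import os
import tempfile


def check_not_directory(source):
    """Hypothetical helper: reject string sources that name a directory."""
    # isinstance(source, str) is true for file names and directory names
    # alike, so the path itself has to be inspected to tell them apart.
    if isinstance(source, str) and os.path.isdir(source):
        raise ValueError(f"source must be a file, got a directory: {source!r}")
    return source


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "data.parquet")
    with open(file_path, "wb"):
        pass  # an empty placeholder file is enough for the path check

    check_not_directory(file_path)  # passes: a regular file
    try:
        check_not_directory(tmp_dir)
        directory_rejected = False
    except ValueError:
        directory_rejected = True
```

Non-string sources pass through untouched in this sketch, so it would complement the type check in the diff rather than replace it.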
##########
python/pyarrow/parquet/core.py:
##########
@@ -1825,7 +1826,14 @@ def read_table(source, *, columns=None, use_threads=True,
filesystem, path = _resolve_filesystem_and_path(source, filesystem)
if filesystem is not None:
source = filesystem.open_input_file(path)
- # TODO test that source is not a directory or a list
+ if not (
+ isinstance(source, str)
+ or isinstance(source, pa.NativeFile)
+ or hasattr(source, "read")
+ ):
+ raise ValueError(
+ "source should be a file name, a pyarrow.NativeFile or a file-like object"
+ )
Review Comment:
Let's use the same format as the other `ValueError` messages in this code
block.
```suggestion
raise ValueError(
"source should be a file name, a pyarrow.NativeFile or a file-like object "
"when the pyarrow.dataset module is not available"
)
```
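
For reference, a rough sketch of how the suggested message behaves at runtime. `validate_source` is a hypothetical stand-in for the check in the diff; it relies on `hasattr(source, "read")` alone for file-like objects (a `pyarrow.NativeFile` exposes `read`, so it is covered) so the snippet runs without pyarrow installed. Adjacent string literals are concatenated by Python, so the trailing space after "file-like object" is what keeps the two halves from running together.

```python
def validate_source(source):
    """Hypothetical stand-in for the type check under review."""
    if not (isinstance(source, str) or hasattr(source, "read")):
        raise ValueError(
            # Adjacent literals are joined into one string; note the
            # trailing space at the end of the first one.
            "source should be a file name, a pyarrow.NativeFile or a "
            "file-like object "
            "when the pyarrow.dataset module is not available"
        )
    return source


try:
    validate_source(["a.parquet", "b.parquet"])  # a list is rejected
    message = None
except ValueError as exc:
    message = str(exc)
```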