Copilot commented on code in PR #50044:
URL: https://github.com/apache/arrow/pull/50044#discussion_r3304338871
##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -359,11 +361,34 @@ class ARROW_EXPORT FileSystem
struct FileSystemFactory {
std::function<Result<std::shared_ptr<FileSystem>>(
- const Uri& uri, const io::IOContext& io_context, std::string* out_path)>
+ const Uri& uri, const std::vector<std::pair<std::string, std::any>>&
options,
+ const io::IOContext& io_context, std::string* out_path)>
function;
std::string_view file;
int line;
+ /// Construct from an options-aware factory function.
+ FileSystemFactory(std::function<Result<std::shared_ptr<FileSystem>>(
+ const Uri&, const std::vector<std::pair<std::string,
std::any>>&,
+ const io::IOContext&, std::string*)>
+ fn,
+ std::string_view file, int line)
+ : function(std::move(fn)), file(file), line(line) {}
+
+ /// Construct from a non-options aware factory function maintaing source
compatibility
+ /// with existing factories.
Review Comment:
Typo in comment: “maintaing” should be “maintaining”.
##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -547,6 +572,26 @@ ARROW_EXPORT
Result<std::shared_ptr<FileSystem>> FileSystemFromUri(const std::string& uri,
std::string* out_path =
NULLPTR);
+/// \brief Create a new FileSystem by URI with extended backend-specific
filesystem
+/// options
+///
+/// Recognized schemes are "file", "mock", "hdfs", "viewfs", "s3",
+/// "gs" and "gcs".
+///
+/// Support for other schemes can be added using RegisterFileSystemFactory.
+///
+/// \param[in] uri the URI to give access to
+/// \param[in] options a list of backend-specific filesystem options
+/// Each option is a (name, value) pair.
+/// The expected type is specific to the backend and
+/// option name.
+/// \param[out] out_path (optional) Path inside the filesystem.
+/// \return out_fs FileSystem instance.
+ARROW_EXPORT
+Result<std::shared_ptr<FileSystem>> FileSystemFromUri(
+ const std::string& uri, const std::vector<std::pair<std::string,
std::any>>& options,
+ std::string* out_path = NULLPTR);
Review Comment:
The docstring for the new `FileSystemFromUri(uri, options, ...)` overloads
implies the listed “recognized schemes” support backend-specific `options`, but
the current implementation only forwards `options` to registered factories.
Please clarify this in the documentation (or expand implementation) so callers
don’t assume `options` will be honored for built-in schemes like
hdfs/viewfs/gs/gcs/abfs.
##########
cpp/src/arrow/filesystem/filesystem.h:
##########
@@ -547,6 +572,26 @@ ARROW_EXPORT
Result<std::shared_ptr<FileSystem>> FileSystemFromUri(const std::string& uri,
std::string* out_path =
NULLPTR);
+/// \brief Create a new FileSystem by URI with extended backend-specific
filesystem
+/// options
+///
+/// Recognized schemes are "file", "mock", "hdfs", "viewfs", "s3",
+/// "gs" and "gcs".
+///
+/// Support for other schemes can be added using RegisterFileSystemFactory.
+///
+/// \param[in] uri the URI to give access to
+/// \param[in] options a list of backend-specific filesystem options
+/// Each option is a (name, value) pair.
+/// The expected type is specific to the backend and
+/// option name.
+/// \param[out] out_path (optional) Path inside the filesystem.
+/// \return out_fs FileSystem instance.
+ARROW_EXPORT
+Result<std::shared_ptr<FileSystem>> FileSystemFromUri(
+ const std::string& uri, const std::vector<std::pair<std::string,
std::any>>& options,
+ std::string* out_path = NULLPTR);
Review Comment:
Adding the `FileSystemFromUri(const std::string&, options, std::string*)`
overload can introduce an overload-resolution ambiguity for existing call sites
that used `{}` to mean a default-constructed `io::IOContext` (since `{}` can
now also initialize the `options` vector). Consider documenting this and/or
providing an alternative overload/helper that avoids `{}` ambiguity (e.g.,
requiring explicit `io::IOContext{}` / `io::default_io_context()` at call
sites, or using a strong `FileSystemOptions` type).
##########
cpp/src/arrow/filesystem/s3fs.cc:
##########
@@ -3603,11 +3603,13 @@ Result<std::string> ResolveS3BucketRegion(const
std::string& bucket) {
auto kS3FileSystemModule = ARROW_REGISTER_FILESYSTEM(
"s3",
- [](const arrow::util::Uri& uri, const io::IOContext& io_context,
+ [](const arrow::util::Uri& uri,
+ const std::vector<std::pair<std::string, std::any>>& options,
+ const io::IOContext& io_context,
std::string* out_path) -> Result<std::shared_ptr<fs::FileSystem>> {
RETURN_NOT_OK(EnsureS3Initialized());
Review Comment:
The S3 factory now receives the new `options` argument but ignores it
completely. Given the purpose of adding key-value pairs (e.g., to carry
user-specific tokens/keys outside the URI), this should either be merged into
`S3Options` construction or validated (e.g., reject non-empty `options` until
supported) to avoid silently ignoring caller-supplied credentials/options.
##########
cpp/src/arrow/filesystem/filesystem.cc:
##########
@@ -962,14 +962,28 @@ Result<std::shared_ptr<FileSystem>>
FileSystemFromUriReal(const Uri& uri,
Result<std::shared_ptr<FileSystem>> FileSystemFromUri(const std::string&
uri_string,
std::string* out_path) {
- return FileSystemFromUri(uri_string, io::default_io_context(), out_path);
+ return FileSystemFromUri(uri_string, /*options=*/{},
io::default_io_context(),
+ out_path);
+}
+
+Result<std::shared_ptr<FileSystem>> FileSystemFromUri(
+ const std::string& uri_string,
+ const std::vector<std::pair<std::string, std::any>>& options, std::string*
out_path) {
+ return FileSystemFromUri(uri_string, options, io::default_io_context(),
out_path);
}
Result<std::shared_ptr<FileSystem>> FileSystemFromUri(const std::string&
uri_string,
const io::IOContext&
io_context,
std::string* out_path) {
+ return FileSystemFromUri(uri_string, /*options=*/{}, io_context, out_path);
+}
+
+Result<std::shared_ptr<FileSystem>> FileSystemFromUri(
+ const std::string& uri_string,
+ const std::vector<std::pair<std::string, std::any>>& options,
+ const io::IOContext& io_context, std::string* out_path) {
ARROW_ASSIGN_OR_RAISE(auto fsuri, ParseFileSystemUri(uri_string));
- return FileSystemFromUriReal(fsuri, uri_string, io_context, out_path);
+ return FileSystemFromUriReal(fsuri, uri_string, options, io_context,
out_path);
}
Review Comment:
New overloads for `FileSystemFromUri` that accept `options` were added, but
there’s no accompanying unit test verifying that options are correctly
forwarded to a registered factory (and/or that unsupported schemes handle
non-empty options as intended). Please add a focused test in the existing
FileSystemFromUri test suite to cover this new behavior.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]