timvw commented on issue #2393: URL: https://github.com/apache/arrow-datafusion/issues/2393#issuecomment-1115466140
Currently Spark does the following:

Datasource (paths) -> (when the __globPaths__ option is true) -> checkAndGlobPathIfNecessary
  - when there is no glob pattern in the path -> fs.listfiles(path)
  - when there is a glob pattern in the path -> globber.glob(fs.listfiles(path))

As @tustvold already suggested, adding a glob_files method to ObjectStore seems the appropriate way to implement this feature. This method should then:
  - when there is no glob pattern in the path -> simply list_files
  - when there is a glob pattern in the path -> glob(list_files) (this can be implemented with file_stream.filter, similar to the existing list_file_with_suffix)

Apart from my use case, list_file_with_suffix is further evidence that there is indeed a need for (simple) globbing.

I will rework my code in https://github.com/apache/arrow-datafusion/pull/2394 to conform with the above.
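To make the proposed behaviour concrete, here is a minimal sketch of the two branches. It is not the ObjectStore API: it works on an in-memory slice of paths instead of an async file stream, and `has_glob_pattern`, `glob_match`, and the `store` parameter are hypothetical names introduced only for illustration. The matcher supports only `*` and `?`, and `*` is assumed to match `/` as well.

```rust
/// Hypothetical helper: true if `path` contains a glob metacharacter.
fn has_glob_pattern(path: &str) -> bool {
    path.contains('*') || path.contains('?')
}

/// Hypothetical minimal matcher: `*` matches any run of characters
/// (including `/`), `?` matches exactly one character.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position to resume from when a literal match after a `*` fails:
    // (pattern index just after the `*`, text index the `*` has consumed up to).
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` swallow one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern matches the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Stand-in for list_files: every stored path that starts with `prefix`.
fn list_files(store: &[&str], prefix: &str) -> Vec<String> {
    store
        .iter()
        .filter(|p| p.starts_with(prefix))
        .map(|p| p.to_string())
        .collect()
}

/// Sketch of glob_files: plain listing without a pattern, listing plus filter with one.
fn glob_files(store: &[&str], path: &str) -> Vec<String> {
    if !has_glob_pattern(path) {
        return list_files(store, path);
    }
    // List from the literal prefix before the first metacharacter, then filter.
    let first_meta = path
        .find(|c: char| c == '*' || c == '?')
        .unwrap_or(path.len());
    list_files(store, &path[..first_meta])
        .into_iter()
        .filter(|f| glob_match(path, f))
        .collect()
}
```

With a store holding `data/a.parquet`, `data/b.csv`, and `other/d.parquet`, calling `glob_files(&store, "data/*.parquet")` lists everything under `data/` and keeps only `data/a.parquet`, while `glob_files(&store, "data/")` returns both `data/` entries unfiltered. In the real implementation the filter step would presumably be applied to the file stream, as the comment suggests.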