timvw commented on issue #2393:
URL: 
https://github.com/apache/arrow-datafusion/issues/2393#issuecomment-1115466140

   Currently Spark does the following:
   Datasource (paths) -> (when the __globPaths__ option is true) -> checkAndGlobPathIfNecessary
   when there is no glob pattern in the path -> fs.listfiles(path)
   when there is a glob pattern in the path -> globber.glob(fs.listfiles(path))
   
   As @tustvold already suggested, adding a glob_files method to ObjectStore seems the appropriate way to implement this feature. This method should then:
   when there is no glob pattern in the path -> simply call list_files
   when there is a glob pattern in the path -> glob(list_files), which can be implemented with file_stream.filter, similar to the existing list_file_with_suffix
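
   To make the two branches above concrete, here is a rough, std-only sketch. It is not the DataFusion ObjectStore API: the store is modelled as a plain slice of paths, the list_files and glob_files signatures are hypothetical, and glob_match is a hand-rolled matcher that only understands `*` and `?` (with `*` also matching `/`, which a real implementation may well not want). A real version would filter an async file stream rather than a Vec.

```rust
/// Minimal wildcard matcher (illustrative only): `*` matches any run of
/// characters, including '/', and `?` matches exactly one character.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Position of the last `*` seen and the text index it currently covers up to.
    let mut backtrack: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            // Tentatively let `*` match nothing; remember where to retry from.
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && (p[pi] == '?' || p[pi] == t[ti]) {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Mismatch: let the last `*` swallow one more character and retry.
            pi = star_pi + 1;
            ti = star_ti + 1;
            backtrack = Some((star_pi, star_ti + 1));
        } else {
            return false;
        }
    }
    // Text is exhausted; only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Hypothetical stand-in for ObjectStore::list_files: prefix listing
/// over an in-memory slice of paths.
fn list_files<'a>(store: &[&'a str], prefix: &str) -> Vec<&'a str> {
    store
        .iter()
        .copied()
        .filter(|f| f.starts_with(prefix))
        .collect()
}

/// Hypothetical glob_files: plain listing when `path` has no glob
/// characters, otherwise list by the literal prefix and filter by pattern.
fn glob_files<'a>(store: &[&'a str], path: &str) -> Vec<&'a str> {
    match path.find(|c: char| c == '*' || c == '?') {
        None => list_files(store, path),
        Some(first_glob) => list_files(store, &path[..first_glob])
            .into_iter()
            .filter(|f| glob_match(path, f))
            .collect(),
    }
}
```

   Listing by the literal prefix before the first glob character keeps the listing call narrow; the filter step then plays the role that file_stream.filter would play in the real store.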
   
   Apart from my use case, list_file_with_suffix is further evidence that there is indeed a need for (simple) globbing.
   
   I will rework my code in https://github.com/apache/arrow-datafusion/pull/2394 to conform with the above.
   


