[ https://issues.apache.org/jira/browse/ARROW-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou resolved ARROW-17306.
------------------------------------
    Fix Version/s: 10.0.0
       Resolution: Fixed

Issue resolved by pull request 13796
[https://github.com/apache/arrow/pull/13796]

> [C++] Provide an optimized `GetFileInfoGenerator` specialization for `LocalFileSystem`
> --------------------------------------------------------------------------------------
>
>                 Key: ARROW-17306
>                 URL: https://issues.apache.org/jira/browse/ARROW-17306
>             Project: Apache Arrow
>          Issue Type: Sub-task
>          Components: C++
>            Reporter: Pavel Solodovnikov
>            Assignee: Pavel Solodovnikov
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 10.0.0
>
>          Time Spent: 10h 50m
>  Remaining Estimate: 0h
>
> At the moment, `LocalFileSystem` does not have its own optimized
> implementation of `GetFileInfoGenerator` and falls back to the generic
> `FileSystem::GetFileInfoGenerator`, which simply queues the synchronous
> `GetFileInfo(FileSelector)` call on a background thread and waits for it
> to complete before yielding anything.
> This largely defeats the purpose of `GetFileInfoGenerator`: the `FileInfo`
> items cannot be pushed down to a consumer "on the fly" (e.g. to
> `FileSystemDatasetFactory` and, in turn, `FileSystemDataset`).
> Provide a proper implementation that yields more than once and returns the
> data in chunks, so that the resulting `FileInfoGenerator` is usable for
> streaming processing of data.
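
As a rough illustration of the distinction drawn in the description, the standalone C++17 sketch below contrasts an eager listing with a chunked one. It is not the code from pull request 13796 and does not use Arrow's API: `Entry`, `ListAll` and `ChunkedLister` are hypothetical names, and `std::filesystem` stands in for Arrow's filesystem layer. The real generator is future-based, which this synchronous sketch does not attempt to model.

```cpp
#include <cstddef>
#include <filesystem>
#include <optional>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical stand-in for Arrow's FileInfo; only two fields are kept.
struct Entry {
  std::string path;
  bool is_directory;
};

// Eager listing: walks the whole tree before returning anything, which is
// analogous to wrapping one synchronous GetFileInfo(FileSelector) call.
std::vector<Entry> ListAll(const fs::path& root) {
  std::vector<Entry> out;
  for (const auto& e : fs::recursive_directory_iterator(root)) {
    out.push_back({e.path().string(), e.is_directory()});
  }
  return out;
}

// Chunked listing: each Next() call advances the directory walk by at most
// `chunk_size` entries, so a consumer can start work before the walk ends.
class ChunkedLister {
 public:
  ChunkedLister(const fs::path& root, std::size_t chunk_size)
      : it_(root), chunk_size_(chunk_size == 0 ? 1 : chunk_size) {}

  // Returns the next chunk, or std::nullopt once the walk is exhausted.
  std::optional<std::vector<Entry>> Next() {
    std::vector<Entry> chunk;
    const fs::recursive_directory_iterator end;
    while (it_ != end && chunk.size() < chunk_size_) {
      chunk.push_back({it_->path().string(), it_->is_directory()});
      ++it_;
    }
    if (chunk.empty()) return std::nullopt;
    return chunk;
  }

 private:
  fs::recursive_directory_iterator it_;  // current position in the walk
  std::size_t chunk_size_;               // upper bound on entries per chunk
};
```

A consumer would call `Next()` in a loop and process each chunk as it arrives, stopping when it returns `std::nullopt`. With `ListAll`, nothing can be processed until the entire tree has been walked.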