sergiimk commented on issue #1384:
URL: 
https://github.com/apache/arrow-datafusion/issues/1384#issuecomment-982937741


   Thanks @houqp, I ended up trying both ways and agree that a custom 
`TableProvider` is the better option. 
   
   I implemented my own version of `ListingTable`, called `PreListedTable`, 
that takes a `Vec` of paths. The problem is the amount of code duplication.
   
   `ListingTable` does many more things in addition to scanning files:
   - splitting them up into partitions
   - collecting file statistics
   - building the physical plan
   
   All of this has to be duplicated, amounting to more than 200 lines of code, 
including inlined copies of helper functions that were inaccessible because 
they live in private modules (e.g. `split_files`). This will be a pain to 
maintain.
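   
   To give a sense of the kind of helper that ends up copied, here is a minimal 
sketch of a partition-splitting function. It is an illustration only: the name 
`split_into_partitions`, the use of plain `String` paths, and the handling of 
empty input are my assumptions, not DataFusion's actual `split_files` code.

```rust
/// Split `files` into at most `n` groups of roughly equal size.
///
/// Hypothetical stand-in for a private partitioning helper; it works on
/// plain path strings rather than any DataFusion file type.
fn split_into_partitions(files: Vec<String>, n: usize) -> Vec<Vec<String>> {
    // Nothing to split, or no partitions requested: return no groups.
    if files.is_empty() || n == 0 {
        return Vec::new();
    }
    // Ceiling division, so that no more than `n` groups are produced.
    let chunk_size = (files.len() + n - 1) / n;
    files
        .chunks(chunk_size)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

   Each helper like this is small, but a pre-listed table needs several of 
them plus the statistics and plan-building code, which is where the duplication 
adds up.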
   
   I'm curious whether you think my use case makes sense overall. Is it 
something DataFusion would like to support out of the box? 
   
   If yes - we can discuss the API and I'd be happy to work on adding it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

