BlakeOrth commented on code in PR #17050:
URL: https://github.com/apache/datafusion/pull/17050#discussion_r2267448996
##########
datafusion/core/src/datasource/listing_table_factory.rs:
##########
@@ -63,16 +63,29 @@ impl TableProviderFactory for ListingTableFactory {
))?
.create(session_state, &cmd.options)?;
- let file_extension = get_extension(cmd.location.as_str());
+ let mut table_path = ListingTableUrl::parse(&cmd.location)?;
+ let file_extension = match table_path.is_collection() {
+ true => "",
+ false => &get_extension(cmd.location.as_str()),
+ };
Review Comment:
I'm happy to leave a comment or see if things work with a bit more of an in
depth fix that sets the extension in a "more correct" way. Which would you
prefer?
The test assertion that pushed me towards this specific solution is here:
https://github.com/apache/datafusion/blob/ff9db020d782f25310b77f247af08beda1a80459/datafusion/core/src/datasource/listing_table_factory.rs#L324-L331
On the surface it seems like actually setting the expected full file
extension should be equivalent. Unless I'm mistaken the current behavior of the
code is to list all of the files (empty extension) and filter by the set glob
pattern `*.<file_type>.<compression_type>`. It seems like the results from
setting the file extension correctly should be equivalent based on the
extension/glob filtering code here:
https://github.com/apache/datafusion/blob/ff9db020d782f25310b77f247af08beda1a80459/datafusion/datasource/src/url.rs#L264-L272
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]