adriangb commented on PR #18386:
URL: https://github.com/apache/datafusion/pull/18386#issuecomment-3478145328

   > > Most of these file source implementations cannot operate without schema,
   > > they all have `.expect("schema must be set")`s that violate using the language
   > > to enforce correctness.
   > 
   > Can we set the schema for filesource at the FileScanConfigBuilder level?
   > 
   > ```diff
   > diff --git a/datafusion/datasource/src/file_scan_config.rs b/datafusion/datasource/src/file_scan_config.rs
   > index 5847a8cf5..660ba4615 100644
   > --- a/datafusion/datasource/src/file_scan_config.rs
   > +++ b/datafusion/datasource/src/file_scan_config.rs
   > @@ -290,10 +290,11 @@ impl FileScanConfigBuilder {
   >          file_schema: SchemaRef,
   >          file_source: Arc<dyn FileSource>,
   >      ) -> Self {
   > +        let table_schema = TableSchema::from_file_schema(file_schema.clone());
   >          Self {
   >              object_store_url,
   > -            table_schema: TableSchema::from_file_schema(file_schema),
   > -            file_source,
   > +            table_schema: table_schema.clone(),
   > +            file_source: file_source.with_schema(table_schema),
   >              file_groups: vec![],
   >              statistics: None,
   >              output_ordering: vec![],
   > ```
   
   That's exactly how it was done before! The issue is that the `FileSource`
then has to hold `schema: Option<TableSchema>`, which is problematic: when we
push down projections, filters, etc. we *need* the schema, so we end up writing
`self.schema.expect("you have to call with_schema() first")`. That is
non-idiomatic because the requirement is only checked at runtime, not by the
type system.
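
To make the trade-off concrete, here is a minimal sketch of the two shapes. It does not use DataFusion's real types: `TableSchema` below is a simplified stand-in, and `LateSchemaSource`, `EagerSchemaSource`, and `project` are hypothetical names chosen only for illustration.

```rust
use std::sync::Arc;

// Simplified stand-in for a table schema (NOT DataFusion's `TableSchema`);
// it only tracks column names.
#[derive(Debug, Clone, PartialEq)]
pub struct TableSchema {
    pub columns: Vec<String>,
}

// Shape 1 (hypothetical): the schema is attached after construction,
// so the field has to be an `Option`.
#[derive(Debug, Clone, Default)]
pub struct LateSchemaSource {
    schema: Option<Arc<TableSchema>>,
}

impl LateSchemaSource {
    pub fn with_schema(mut self, schema: Arc<TableSchema>) -> Self {
        self.schema = Some(schema);
        self
    }

    // Compiles even if `with_schema` was never called; the mistake
    // only shows up as a panic at runtime.
    pub fn project(&self, indices: &[usize]) -> Vec<String> {
        let schema = self
            .schema
            .as_ref()
            .expect("you have to call with_schema() first");
        indices.iter().map(|&i| schema.columns[i].clone()).collect()
    }
}

// Shape 2 (hypothetical): the constructor requires the schema, so a
// source without one cannot be built in the first place.
#[derive(Debug, Clone)]
pub struct EagerSchemaSource {
    schema: Arc<TableSchema>,
}

impl EagerSchemaSource {
    pub fn new(schema: Arc<TableSchema>) -> Self {
        Self { schema }
    }

    // No `Option`, no `expect`: the schema is always present.
    pub fn project(&self, indices: &[usize]) -> Vec<String> {
        indices
            .iter()
            .map(|&i| self.schema.columns[i].clone())
            .collect()
    }
}
```

With the second shape, forgetting the schema is a compile error at the construction site rather than a panic somewhere inside pushdown.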


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

