adriangb commented on issue #10546:
URL: https://github.com/apache/datafusion/issues/10546#issuecomment-2153462238

   I do think that example would be nice; it's basically what I was trying to
build 😄
   
   My approach was going to be something like:
   
   ```rust
   async fn scan(
       &self,
       state: &SessionState,
       projection: Option<&Vec<usize>>,
       filters: &[Expr],
       limit: Option<usize>,
   ) -> Result<Arc<dyn ExecutionPlan>> {
       let object_store_url = ObjectStoreUrl::parse("file://")?;
       let mut file_scan_config = FileScanConfig::new(object_store_url, self.schema())
           .with_projection(projection.cloned())
           .with_limit(limit);
   
       // Use the index to get row groups to be scanned
       // The index makes a best-effort attempt to parse the filters and push
       // them down into the metadata store.
       let partitioned_files_with_row_group_selection =
           self.index.get_files(filters).await?;
   
       for file in partitioned_files_with_row_group_selection {
           file_scan_config = file_scan_config.with_file(
               PartitionedFile::new(
                   file.canonical_path.display().to_string(),
                   file.file_size,
               )
               .with_extensions(Arc::new(file.access_plan())),
           );
       }
   
       let df_schema = DFSchema::try_from(self.schema())?;
       // Convert filters like [`a = 1`, `b = 2`] into a single filter like
       // `a = 1 AND b = 2`.
       let predicate = conjunction(filters.to_vec());
       let predicate = predicate
           .map(|predicate| state.create_physical_expr(predicate, &df_schema))
           .transpose()?
           .unwrap_or_else(|| datafusion_physical_expr::expressions::lit(true));
   
       let exec = ParquetExec::builder(file_scan_config)
           .with_predicate(predicate)
           .build_arc();
   
       Ok(exec)
   }
   ```
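
To make the filter-combining step concrete, here is a minimal std-only sketch of what I understand `conjunction` plus the `lit(true)` fallback to do. `ToyExpr`, `toy_conjunction` and `predicate_or_true` are hypothetical stand-ins for illustration, not DataFusion types, and the real helper may differ in detail:

```rust
// Toy stand-in for DataFusion's `Expr`; hypothetical, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    // An opaque predicate such as `a = 1`, kept as text here.
    Pred(String),
    // Logical AND of two sub-expressions.
    And(Box<ToyExpr>, Box<ToyExpr>),
    // A boolean literal.
    Literal(bool),
}

// Fold a list of filters into one left-nested AND chain.
// Returns `None` when the list is empty.
fn toy_conjunction(filters: Vec<ToyExpr>) -> Option<ToyExpr> {
    filters
        .into_iter()
        .reduce(|acc, next| ToyExpr::And(Box::new(acc), Box::new(next)))
}

// Mirrors the fallback in `scan`: with no filters, use a predicate that
// keeps every row.
fn predicate_or_true(filters: Vec<ToyExpr>) -> ToyExpr {
    toy_conjunction(filters).unwrap_or(ToyExpr::Literal(true))
}

fn main() {
    let filters = vec![
        ToyExpr::Pred("a = 1".to_string()),
        ToyExpr::Pred("b = 2".to_string()),
    ];
    println!("{:?}", predicate_or_true(filters));
    println!("{:?}", predicate_or_true(Vec::new()));
}
```

The point of the fallback is that `with_predicate` always receives an expression, even when the query has no filters to push down.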
   
   (Several of the functions and types above are made up.)
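
For instance, the made-up index could look roughly like the sketch below. Everything in it is an assumption for illustration: `Index`, `IndexedFile`, `RowGroupStats` and `FileSelection` are hypothetical names, the lookup is synchronous and in-memory rather than async against a metadata store, and it takes a single `i64` equality value instead of parsing `&[Expr]`. It only shows the min/max pruning idea:

```rust
use std::path::PathBuf;

// Hypothetical min/max statistics for one column of one row group.
#[derive(Debug, Clone)]
struct RowGroupStats {
    min: i64,
    max: i64,
}

// Hypothetical index entry: a Parquet file plus stats for each row group.
#[derive(Debug, Clone)]
struct IndexedFile {
    canonical_path: PathBuf,
    file_size: u64,
    row_groups: Vec<RowGroupStats>,
}

// What the lookup returns: a file and the row groups worth scanning.
#[derive(Debug, PartialEq)]
struct FileSelection {
    canonical_path: PathBuf,
    file_size: u64,
    row_groups_to_scan: Vec<usize>,
}

struct Index {
    files: Vec<IndexedFile>,
}

impl Index {
    // Keep row groups whose [min, max] range could contain `value`, and
    // drop files in which no row group can match.
    fn get_files(&self, value: i64) -> Vec<FileSelection> {
        self.files
            .iter()
            .filter_map(|file| {
                let keep: Vec<usize> = file
                    .row_groups
                    .iter()
                    .enumerate()
                    .filter(|(_, rg)| rg.min <= value && value <= rg.max)
                    .map(|(i, _)| i)
                    .collect();
                if keep.is_empty() {
                    None
                } else {
                    Some(FileSelection {
                        canonical_path: file.canonical_path.clone(),
                        file_size: file.file_size,
                        row_groups_to_scan: keep,
                    })
                }
            })
            .collect()
    }
}

// Small fixture: two files, three row groups in total.
fn sample_index() -> Index {
    Index {
        files: vec![
            IndexedFile {
                canonical_path: PathBuf::from("/data/a.parquet"),
                file_size: 1024,
                row_groups: vec![
                    RowGroupStats { min: 0, max: 9 },
                    RowGroupStats { min: 10, max: 19 },
                ],
            },
            IndexedFile {
                canonical_path: PathBuf::from("/data/b.parquet"),
                file_size: 2048,
                row_groups: vec![RowGroupStats { min: 100, max: 199 }],
            },
        ],
    }
}

fn main() {
    // Only the second row group of a.parquet can contain the value 12.
    for hit in sample_index().get_files(12) {
        println!("{:?}", hit);
    }
}
```

In the `scan` sketch, something like `file.access_plan()` would then turn `row_groups_to_scan` into whatever the Parquet reader expects; that conversion is left out here.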
   
   Does this sound roughly in line with what you had in mind for an example? I
think backing the async store with a familiar RDBMS (SQLite via SQLx?) would
make for a good one.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

