alamb commented on code in PR #7141:
URL: https://github.com/apache/arrow-datafusion/pull/7141#discussion_r1281147777


##########
datafusion/core/src/datasource/listing/table.rs:
##########
@@ -804,21 +804,25 @@ impl TableProvider for ListingTable {
         .await?;
 
         let file_groups = file_list_stream.try_collect::<Vec<_>>().await?;
-
-        if file_groups.len() > 1 {
-            return Err(DataFusionError::Plan(
-                "Datafusion currently supports tables from single partition and/or file."
-                    .to_owned(),
-            ));
+        let writer_mode;
+        //if we are writing a single output_partition to a table backed by a single file
+        //we can append to that file. Otherwise, we can write new files into the directory
+        //adding new files to the listing table in order to insert to the table.
+        let input_partitions = input.output_partitioning().partition_count();
+        if file_groups.len() == 1 && input_partitions == 1 {

Review Comment:
   > Are you envisioning ListingTableWriteOptions as part of the ListingOptions struct (i.e. a property of the registered table itself)? So the user would do something like:
   
   I guess I was thinking there could be a way to register the initial table with WriteOptions, which would then apply whenever it was written to via `INSERT ...` type queries.
   
   However, I think the more interesting use case in my mind is passing the options at write time, as you show:
   
   ```rust
   df.write_table("table", WriteOptions::new()...)
   ```
   
   I am not quite sure how the code would look, or how to make a nice API that separates the two concerns: the actual act of writing (and passing the write options) and the registered table.
   
   It seems like there should also be a way to write directly to a target output without having to register a table provider. I am sure there is a way; we just need to come up with a clever API to do so.
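   
   To make the two concerns concrete, here is a rough sketch of one way it could look. Everything in it is hypothetical and not existing DataFusion API: the `InsertMode` enum, the `mode` field on `WriteOptions`, and the `default_mode` / `resolve_mode` helpers are names made up for illustration. The idea is that options passed at write time win, options registered with the table are the fallback, and if neither is set we fall back to the rule in this diff (append only when there is a single file and a single input partition).
   
   ```rust
   // Hypothetical sketch only: none of these types or functions exist in DataFusion.
   
   /// How an insert lands in the table's storage (illustrative names).
   #[derive(Debug, Clone, Copy, PartialEq, Eq)]
   pub enum InsertMode {
       /// Append to the single existing file.
       AppendToFile,
       /// Write new files into the table's directory.
       NewFiles,
   }
   
   /// Options that could be registered with a table or passed at write time.
   #[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
   pub struct WriteOptions {
       /// `None` means "not specified, let the next layer decide".
       mode: Option<InsertMode>,
   }
   
   impl WriteOptions {
       pub fn new() -> Self {
           Self::default()
       }
   
       pub fn with_mode(mut self, mode: InsertMode) -> Self {
           self.mode = Some(mode);
           self
       }
   }
   
   /// The rule from the diff: append only for one file and one input partition.
   pub fn default_mode(file_groups: usize, input_partitions: usize) -> InsertMode {
       if file_groups == 1 && input_partitions == 1 {
           InsertMode::AppendToFile
       } else {
           InsertMode::NewFiles
       }
   }
   
   /// Write-time options override table-level options, which override the default rule.
   pub fn resolve_mode(
       table_options: &WriteOptions,
       write_options: &WriteOptions,
       file_groups: usize,
       input_partitions: usize,
   ) -> InsertMode {
       write_options
           .mode
           .or(table_options.mode)
           .unwrap_or_else(|| default_mode(file_groups, input_partitions))
   }
   ```
   
   With something like this, an `INSERT ...` query would call `resolve_mode` with an empty write-time `WriteOptions`, while a `df.write_table(...)` style call would pass whatever the user supplied. It does not address writing to a target without a registered table provider.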





