Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-14 Thread via GitHub


alamb commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993507486


##
datafusion/sqllogictest/test_files/explain_tree.slt:
##
@@ -1725,6 +1725,22 @@ physical_plan
 09)│  rows: 1  │
 10)└───┘
 
+query TT
+explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), 
(3, 'baz', 3, '2023-01-03'))
+TO 'test_files/scratch/explain_tree/2.csv';
+
+physical_plan
+01)┌───┐
+02)│DataSinkExec   │

Review Comment:
   It is interesting that the format is not shown in this plan 🤔 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-13 Thread via GitHub


irenjj commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993528636


##
datafusion/sqllogictest/test_files/explain_tree.slt:
##
@@ -1725,6 +1725,22 @@ physical_plan
 09)│  rows: 1  │
 10)└───┘
 
+query TT
+explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), 
(3, 'baz', 3, '2023-01-03'))
+TO 'test_files/scratch/explain_tree/2.csv';
+
+physical_plan
+01)┌───┐
+02)│DataSinkExec   │

Review Comment:
   It's triggered by `DataSinkExec`:
   `DisplayFormatType::TreeRender => self.sink().fmt_as(t, f)`
   `CsvSink` is only a part of the `DataSinkExec` format.
   A more reasonable output format might be:
   
   ```
DataSinkExec
   ---
  sink=csv
   
  filegroup...
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-13 Thread via GitHub


irenjj commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993708636


##
datafusion/sqllogictest/test_files/explain_tree.slt:
##
@@ -1725,6 +1725,22 @@ physical_plan
 09)│  rows: 1  │
 10)└───┘
 
+query TT
+explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), 
(3, 'baz', 3, '2023-01-03'))
+TO 'test_files/scratch/explain_tree/2.csv';
+
+physical_plan
+01)┌───┐
+02)│DataSinkExec   │

Review Comment:
   I created a new PR #15206, and I found that the actual type of the sink can 
be determined from the file extension.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-13 Thread via GitHub


irenjj commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993557152


##
datafusion/datasource-csv/src/file_format.rs:
##
@@ -666,8 +666,10 @@ impl DisplayAs for CsvSink {
 write!(f, ")")
 }
 DisplayFormatType::TreeRender => {
-// TODO: collect info
-write!(f, "")
+if !self.config.file_groups.is_empty() {

Review Comment:
   I will fix this issue together in #15112.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-13 Thread via GitHub


alamb commented on PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#issuecomment-2721286142

   This PR also has new test coverage so I am merging it in and we can improve 
the display in a future follow on


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-13 Thread via GitHub


alamb commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993539350


##
datafusion/sqllogictest/test_files/explain_tree.slt:
##
@@ -1725,6 +1725,22 @@ physical_plan
 09)│  rows: 1  │
 10)└───┘
 
+query TT
+explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), 
(3, 'baz', 3, '2023-01-03'))
+TO 'test_files/scratch/explain_tree/2.csv';
+
+physical_plan
+01)┌───┐
+02)│DataSinkExec   │

Review Comment:
   I agree -- can you make a follow on PR for this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-13 Thread via GitHub


alamb merged PR #15204:
URL: https://github.com/apache/datafusion/pull/15204


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]

2025-03-13 Thread via GitHub


alamb commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993536559


##
datafusion/datasource-csv/src/file_format.rs:
##
@@ -666,8 +666,10 @@ impl DisplayAs for CsvSink {
 write!(f, ")")
 }
 DisplayFormatType::TreeRender => {
-// TODO: collect info
-write!(f, "")
+if !self.config.file_groups.is_empty() {

Review Comment:
   I debugged a bit more and it seems the issue is that the actual filename 
target is in `self.config.table_path` rather than self.config_file_groups
   
   However, this way is consisent with the other data sinks
   
   @irenjj  is there any chance you can make a ticket / PR to fix this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]