Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
alamb commented on code in PR #15204: URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993507486 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -1725,6 +1725,22 @@ physical_plan 09)│ rows: 1 │ 10)└───┘ +query TT +explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), (3, 'baz', 3, '2023-01-03')) +TO 'test_files/scratch/explain_tree/2.csv'; + +physical_plan +01)┌───┐ +02)│DataSinkExec │ Review Comment: It is interesting that the format is not shown in this plan 🤔 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
irenjj commented on code in PR #15204: URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993528636 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -1725,6 +1725,22 @@ physical_plan 09)│ rows: 1 │ 10)└───┘ +query TT +explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), (3, 'baz', 3, '2023-01-03')) +TO 'test_files/scratch/explain_tree/2.csv'; + +physical_plan +01)┌───┐ +02)│DataSinkExec │ Review Comment: It's triggered by `DataSinkExec`: `DisplayFormatType::TreeRender => self.sink().fmt_as(t, f)` `CsvSink` is only a part of the `DataSinkExec` format. A more reasonable output format might be: ``` DataSinkExec --- sink=csv filegroup... ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
irenjj commented on code in PR #15204: URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993708636 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -1725,6 +1725,22 @@ physical_plan 09)│ rows: 1 │ 10)└───┘ +query TT +explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), (3, 'baz', 3, '2023-01-03')) +TO 'test_files/scratch/explain_tree/2.csv'; + +physical_plan +01)┌───┐ +02)│DataSinkExec │ Review Comment: I created a new PR #15206, and I found that the actual type of the sink can be determined from the file extension. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
irenjj commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993557152
##
datafusion/datasource-csv/src/file_format.rs:
##
@@ -666,8 +666,10 @@ impl DisplayAs for CsvSink {
write!(f, ")")
}
DisplayFormatType::TreeRender => {
-// TODO: collect info
-write!(f, "")
+if !self.config.file_groups.is_empty() {
Review Comment:
I will fix this issue together in #15112.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
alamb commented on PR #15204: URL: https://github.com/apache/datafusion/pull/15204#issuecomment-2721286142 This PR also has new test coverage so I am merging it in and we can improve the display in a future follow on -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
alamb commented on code in PR #15204: URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993539350 ## datafusion/sqllogictest/test_files/explain_tree.slt: ## @@ -1725,6 +1725,22 @@ physical_plan 09)│ rows: 1 │ 10)└───┘ +query TT +explain COPY (VALUES (1, 'foo', 1, '2023-01-01'), (2, 'bar', 2, '2023-01-02'), (3, 'baz', 3, '2023-01-03')) +TO 'test_files/scratch/explain_tree/2.csv'; + +physical_plan +01)┌───┐ +02)│DataSinkExec │ Review Comment: I agree -- can you make a follow on PR for this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
alamb merged PR #15204: URL: https://github.com/apache/datafusion/pull/15204 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Implement `tree` explain for `CsvSink` [datafusion]
alamb commented on code in PR #15204:
URL: https://github.com/apache/datafusion/pull/15204#discussion_r1993536559
##
datafusion/datasource-csv/src/file_format.rs:
##
@@ -666,8 +666,10 @@ impl DisplayAs for CsvSink {
write!(f, ")")
}
DisplayFormatType::TreeRender => {
-// TODO: collect info
-write!(f, "")
+if !self.config.file_groups.is_empty() {
Review Comment:
I debugged a bit more and it seems the issue is that the actual filename
target is in `self.config.table_path` rather than self.config_file_groups
However, this way is consisent with the other data sinks
@irenjj is there any chance you can make a ticket / PR to fix this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
