Re: [PR] Migrate core test to insta, part1 [datafusion]
alamb commented on PR #16324: URL: https://github.com/apache/datafusion/pull/16324#issuecomment-2980678961 🚀 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
blaginin merged PR #16324: URL: https://github.com/apache/datafusion/pull/16324 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2148615171
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -797,14 +796,16 @@ async fn explain_physical_plan_only() {
let sql = "EXPLAIN select count(*) from (values ('a', 1, 100), ('a', 2,
150)) as t (c1,c2,c3)";
let actual = execute(&ctx, sql).await;
let actual = normalize_vec_for_explain(actual);
Review Comment:
After searching `ExplainNormalizer` in the codebase, I found that only four
test cases used it:
1. `test_physical_plan_display_indent` (initialize `ExplainNormalizer`)
2. `test_physical_plan_display_indent_multi_children` (initialize
`ExplainNormalizer`)
3. `explain_logical_plan_only` (call `normalize_vec_for_explain`)
4. `explain_physical_plan_only` (call `normalize_vec_for_explain`)
1.2. created the physical plan with fixed core numbers (9), and 3. 4.
doesn't have `partitioning=RoundRobinBatch()` string in the snapshot, so I
think this change may not be necessary?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2146791719
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
Review Comment:
Sure, I have revert the change in this PR
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
alamb commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2145201108
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
Review Comment:
maybe we can just keep using `assert_metrics` for this style test (or
perhaps revert the changes from this PR and work on migrating the metrics tests
to a nicer style in a separate PR so we don't delay merging the other parts of
this PR)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2143240105
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
Review Comment:
I have tried to assert the raw table, but I found that it caused
inconsistent trailing whitespace in each snapshot testing. That is why I
trimmed the table to only keep the plan context.
Do you have any idea for keeping the consistent table format in each
execution? thx
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2143152053
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
+insta::with_settings!({filters => vec![
+(r"\d+\.?\d*[µmn]?s", "[TIME]"),
Review Comment:
> you can probably use lookahead
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion
Unfortunately, Rust's regex crate does not support lookahead or lookbehind
assertions 😢 ( [ref](https://github.com/rust-lang/regex/issues/127) )
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2143152053
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
+insta::with_settings!({filters => vec![
+(r"\d+\.?\d*[µmn]?s", "[TIME]"),
Review Comment:
> you can probably use lookahead
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion
Unfortunately, Rust's regex crate does not support lookahead or lookbehind
assertions 😢 ([ref](https://github.com/rust-lang/regex/issues/127))
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2143152053
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
+insta::with_settings!({filters => vec![
+(r"\d+\.?\d*[µmn]?s", "[TIME]"),
Review Comment:
> you can probably use lookahead
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion
Unfortunately, Rust's regex crate explicitly does not support lookahead or
lookbehind assertions 😢 ([ref](https://github.com/rust-lang/regex/issues/127))
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
blaginin commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2141000796
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
+insta::with_settings!({filters => vec![
+(r"\d+\.?\d*[µmn]?s", "[TIME]"),
Review Comment:
Happy with keeping as is, but on this
> might match irrelevant strings
you can probably use lookahead
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2140683717
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
+insta::with_settings!({filters => vec![
+(r"\d+\.?\d*[µmn]?s", "[TIME]"),
Review Comment:
Agree, current pattern can't match the longer time (ex. min, h, d). However,
a broader pattern might match irrelevant strings. For example, match on `[ ]`:
metrics=[output_rows=1, elapsed_compute=110.947µs] -> metrics=[TIME]
I'm currently considering to add more time units in the regex pattern:
```regex
r"\d+\.?\d*(?:µs|ms|ns|s|min|h)\b"
```
Or, do you have any suggestions for a more precise way to match time-related
strings? many thanks.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
alamb commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2140697002
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
+insta::with_settings!({filters => vec![
+(r"\d+\.?\d*[µmn]?s", "[TIME]"),
Review Comment:
Maybe just for this teset case we could avoid using `insta` and use the
previous approach of substring matches
While that approach is also non ideal, it has seemed to work this far
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2138296977
##
datafusion/core/tests/optimizer/mod.rs:
##
@@ -107,16 +126,15 @@ fn concat_ws_literals() -> Result<()> {
let sql = "SELECT concat_ws('-', true, col_int32, false, null, 'hello',
col_utf8, 12, '', 3.4) \
AS col
FROM test";
-let expected =
-"Projection: concat_ws(Utf8(\"-\"), Utf8(\"true\"),
CAST(test.col_int32 AS Utf8), Utf8(\"false-hello\"), test.col_utf8,
Utf8(\"12--3.4\")) AS col\
-\n TableScan: test projection=[col_int32, col_utf8]";
-quick_test(sql, expected);
-Ok(())
-}
-
-fn quick_test(sql: &str, expected_plan: &str) {
let plan = test_sql(sql).unwrap();
-assert_eq!(expected_plan, format!("{plan}"));
+assert_snapshot!(
+format!("{plan}"),
Review Comment:
Thanks for the comment, the `format!` is exactly unnecessary in these cases
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2138296977
##
datafusion/core/tests/optimizer/mod.rs:
##
@@ -107,16 +126,15 @@ fn concat_ws_literals() -> Result<()> {
let sql = "SELECT concat_ws('-', true, col_int32, false, null, 'hello',
col_utf8, 12, '', 3.4) \
AS col
FROM test";
-let expected =
-"Projection: concat_ws(Utf8(\"-\"), Utf8(\"true\"),
CAST(test.col_int32 AS Utf8), Utf8(\"false-hello\"), test.col_utf8,
Utf8(\"12--3.4\")) AS col\
-\n TableScan: test projection=[col_int32, col_utf8]";
-quick_test(sql, expected);
-Ok(())
-}
-
-fn quick_test(sql: &str, expected_plan: &str) {
let plan = test_sql(sql).unwrap();
-assert_eq!(expected_plan, format!("{plan}"));
+assert_snapshot!(
+format!("{plan}"),
Review Comment:
Thanks for the comment, the `format!` is exactly unnecessary
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
Chen-Yuan-Lai commented on PR #16324: URL: https://github.com/apache/datafusion/pull/16324#issuecomment-2957481921 > Just FYI, if you expect more work on this issue, you may want to change "closes" to "part of" - because otherwise github will actually close the issue once this PR is merged Oh! Sorry, I have corrected the PR body -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
blaginin commented on code in PR #16324:
URL: https://github.com/apache/datafusion/pull/16324#discussion_r2136447543
##
datafusion/core/tests/optimizer/mod.rs:
##
@@ -107,16 +126,15 @@ fn concat_ws_literals() -> Result<()> {
let sql = "SELECT concat_ws('-', true, col_int32, false, null, 'hello',
col_utf8, 12, '', 3.4) \
AS col
FROM test";
-let expected =
-"Projection: concat_ws(Utf8(\"-\"), Utf8(\"true\"),
CAST(test.col_int32 AS Utf8), Utf8(\"false-hello\"), test.col_utf8,
Utf8(\"12--3.4\")) AS col\
-\n TableScan: test projection=[col_int32, col_utf8]";
-quick_test(sql, expected);
-Ok(())
-}
-
-fn quick_test(sql: &str, expected_plan: &str) {
let plan = test_sql(sql).unwrap();
-assert_eq!(expected_plan, format!("{plan}"));
+assert_snapshot!(
+format!("{plan}"),
Review Comment:
assert_snapshot triggers format internally so you can just do
```suggestion
plan,
```
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
Review Comment:
what do you think about just asserting the table itself without
modifications? I fear that this code will:
a - make test a bit more complicated
b - needs to be manually maintained when we change the formatting
##
datafusion/core/tests/sql/explain_analyze.rs:
##
@@ -52,43 +54,45 @@ async fn explain_analyze_baseline_metrics() {
let formatted = arrow::util::pretty::pretty_format_batches(&results)
.unwrap()
.to_string();
+
println!("Query Output:\n\n{formatted}");
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=Partial, gby=[]",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"FilterExec: c13@1 != C2GT5KVyOPZpgKVl110TyZO0NcJ434",
-"metrics=[output_rows=99, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"ProjectionExec: expr=[]",
-"metrics=[output_rows=5, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"CoalesceBatchesExec: target_batch_size=4096",
-"metrics=[output_rows=5, elapsed_compute"
-);
-assert_metrics!(
-&formatted,
-"UnionExec",
-"metrics=[output_rows=3, elapsed_compute="
-);
-assert_metrics!(
-&formatted,
-"WindowAggExec",
-"metrics=[output_rows=1, elapsed_compute="
-);
+let re = Regex::new(r"\|[^|]*\|([^|]*)\|").unwrap();
+let actual = formatted
+.lines()
+.map(|line| re.replace_all(line, "$1").trim_end().to_string())
+.filter(|line| !line.is_empty() && !line.starts_with('+'))
+.collect::>()
+.join("\n");
+insta::with_settings!({filters => vec![
+(r"\d+\.?\d*[µmn]?s", "[TIME]"),
Review Comment:
`[µmn]` - I think this can be somewhat flaky it the test takes more time -
maybe match on `[]`?
##
datafusion/core/tests/optimizer/mod.rs:
##
@@ -107,16 +126,15 @@ fn concat_ws_literals() -> Result<()> {
let sql = "SELECT concat_ws('-', true, col_int32, false, null, 'hello',
col_utf8, 12, '', 3.4) \
AS col
FROM test";
-let expected =
-"Project
Re: [PR] Migrate core test to insta, part1 [datafusion]
blaginin commented on PR #16324: URL: https://github.com/apache/datafusion/pull/16324#issuecomment-2956983950 Also could you please resolve the conflicts -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
blaginin commented on PR #16324: URL: https://github.com/apache/datafusion/pull/16324#issuecomment-2956908455 Hey @Chen-Yuan-Lai! In the PR title you have > part1 but in the PR body you write > Closes https://github.com/apache/datafusion/issues/15791 Just FYI, if you expect more work on this issue, you may want to change "closes" to "part of" - because otherwise github will actually close the issue once this PR is merged -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Migrate core test to insta, part1 [datafusion]
alamb commented on PR #16324: URL: https://github.com/apache/datafusion/pull/16324#issuecomment-2955861986 Thank you @Chen-Yuan-Lai -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
