This looks similar to [1].

Do you by any chance have the ARROW_TEST_DATA environment variable set? If
so, I think it needs to end with a `/` or be unset to run the script.
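
For example, something like this in a shell before re-running the
verification script (just a sketch; the path in the last line is a
hypothetical placeholder):

    # See whether the variable is set at all
    echo "${ARROW_TEST_DATA:-<unset>}"

    # Either unset it before running the script, as suggested above...
    unset ARROW_TEST_DATA

    # ...or keep it, but with a trailing slash
    export ARROW_TEST_DATA="/path/to/arrow-testing/data/"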

The difference is that something is wrong with the path normalization:

Expected:
> files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,

Actual:
> files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
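
To illustrate (a hypothetical sketch, not the actual test code): if the
normalization is a plain string replacement of the test-data directory with
the ARROW_TEST_DATA placeholder, a macOS path that canonicalizes under
/private would leave exactly this kind of residue:

    # /tmp is a symlink to /private/tmp on macOS, so a canonicalized plan
    # path no longer starts with the raw test-data directory value
    plan='files=[/private/tmp/arrow-testing/data/csv/aggregate_test_100.csv]'
    data_dir='/tmp/arrow-testing/data'
    echo "${plan/$data_dir/ARROW_TEST_DATA}"
    # prints: files=[/privateARROW_TEST_DATA/csv/aggregate_test_100.csv]

The exact prefix depends on how the path string is produced, but the
leftover "private" matches the failure above.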

Andrew

[1] https://github.com/apache/arrow-datafusion/issues/2719

On Tue, Aug 16, 2022 at 9:18 PM Remzi Yang <1371656737...@gmail.com> wrote:

> Some tests failed. Verified on M1 Mac.
>
> failures:
>
>
> ---- sql::explain_analyze::csv_explain stdout ----
>
> thread 'sql::explain_analyze::csv_explain' panicked at 'assertion failed:
> `(left == right)`
>
>   left: `[["logical_plan", "Projection: #aggregate_test_100.c1\n  Filter:
> #aggregate_test_100.c2 > Int64(10)\n    TableScan: aggregate_test_100
> projection=[c1, c2], partial_filters=[#aggregate_test_100.c2 >
> Int64(10)]"], ["physical_plan", "ProjectionExec: expr=[c1@0 as c1]\n
> CoalesceBatchesExec:
> target_batch_size=4096\n    FilterExec: CAST(c2@1 AS Int64) > 10\n
>  RepartitionExec:
> partitioning=RoundRobinBatch(NUM_CORES)\n        CsvExec:
> files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1, c2]\n"]]`,
>
>  right: `[["logical_plan", "Projection: #aggregate_test_100.c1\n  Filter:
> #aggregate_test_100.c2 > Int64(10)\n    TableScan: aggregate_test_100
> projection=[c1, c2], partial_filters=[#aggregate_test_100.c2 >
> Int64(10)]"], ["physical_plan", "ProjectionExec: expr=[c1@0 as c1]\n
> CoalesceBatchesExec:
> target_batch_size=4096\n    FilterExec: CAST(c2@1 AS Int64) > 10\n
>  RepartitionExec:
> partitioning=RoundRobinBatch(NUM_CORES)\n        CsvExec:
> files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1, c2]\n"]]`',
> datafusion/core/tests/sql/explain_analyze.rs:769:5
>
>
> ---- sql::explain_analyze::test_physical_plan_display_indent_multi_children
> stdout ----
>
> thread
> 'sql::explain_analyze::test_physical_plan_display_indent_multi_children'
> panicked at 'assertion failed: `(left == right)`
>
>   left: `["ProjectionExec: expr=[c1@0 as c1]", "  CoalesceBatchesExec:
> target_batch_size=4096", "    HashJoinExec: mode=Partitioned,
> join_type=Inner, on=[(Column { name: \"c1\", index: 0 }, Column { name:
> \"c2\", index: 0 })]", "      CoalesceBatchesExec: target_batch_size=4096",
> "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index:
> 0 }], 9000)", "          ProjectionExec: expr=[c1@0 as c1]", "
>    ProjectionExec:
> expr=[c1@0 as c1]", "              RepartitionExec:
> partitioning=RoundRobinBatch(9000)", "                CsvExec:
> files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1]", "      CoalesceBatchesExec:
> target_batch_size=4096", "        RepartitionExec:
> partitioning=Hash([Column { name: \"c2\", index: 0 }], 9000)", "
>    ProjectionExec:
> expr=[c2@0 as c2]", "            ProjectionExec: expr=[c1@0 as c2]", "
>         RepartitionExec: partitioning=RoundRobinBatch(9000)", "
>     CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv],
> has_header=true, limit=None, projection=[c1]"]`,
>
>  right: `["ProjectionExec: expr=[c1@0 as c1]", "  CoalesceBatchesExec:
> target_batch_size=4096", "    HashJoinExec: mode=Partitioned,
> join_type=Inner, on=[(Column { name: \"c1\", index: 0 }, Column { name:
> \"c2\", index: 0 })]", "      CoalesceBatchesExec: target_batch_size=4096",
> "        RepartitionExec: partitioning=Hash([Column { name: \"c1\", index:
> 0 }], 9000)", "          ProjectionExec: expr=[c1@0 as c1]", "
>    ProjectionExec:
> expr=[c1@0 as c1]", "              RepartitionExec:
> partitioning=RoundRobinBatch(9000)", "                CsvExec:
> files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1]", "      CoalesceBatchesExec:
> target_batch_size=4096", "        RepartitionExec:
> partitioning=Hash([Column { name: \"c2\", index: 0 }], 9000)", "
>    ProjectionExec:
> expr=[c2@0 as c2]", "            ProjectionExec: expr=[c1@0 as c2]", "
>         RepartitionExec: partitioning=RoundRobinBatch(9000)", "
>     CsvExec: files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv],
> has_header=true, limit=None, projection=[c1]"]`: expected:
>
> [
>
>     "ProjectionExec: expr=[c1@0 as c1]",
>
>     "  CoalesceBatchesExec: target_batch_size=4096",
>
>     "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column {
> name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]",
>
>     "      CoalesceBatchesExec: target_batch_size=4096",
>
>     "        RepartitionExec: partitioning=Hash([Column { name: \"c1\",
> index: 0 }], 9000)",
>
>     "          ProjectionExec: expr=[c1@0 as c1]",
>
>     "            ProjectionExec: expr=[c1@0 as c1]",
>
>     "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
>
>     "                CsvExec:
> files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1]",
>
>     "      CoalesceBatchesExec: target_batch_size=4096",
>
>     "        RepartitionExec: partitioning=Hash([Column { name: \"c2\",
> index: 0 }], 9000)",
>
>     "          ProjectionExec: expr=[c2@0 as c2]",
>
>     "            ProjectionExec: expr=[c1@0 as c2]",
>
>     "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
>
>     "                CsvExec:
> files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1]",
>
> ]
>
> actual:
>
>
> [
>
>     "ProjectionExec: expr=[c1@0 as c1]",
>
>     "  CoalesceBatchesExec: target_batch_size=4096",
>
>     "    HashJoinExec: mode=Partitioned, join_type=Inner, on=[(Column {
> name: \"c1\", index: 0 }, Column { name: \"c2\", index: 0 })]",
>
>     "      CoalesceBatchesExec: target_batch_size=4096",
>
>     "        RepartitionExec: partitioning=Hash([Column { name: \"c1\",
> index: 0 }], 9000)",
>
>     "          ProjectionExec: expr=[c1@0 as c1]",
>
>     "            ProjectionExec: expr=[c1@0 as c1]",
>
>     "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
>
>     "                CsvExec:
> files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1]",
>
>     "      CoalesceBatchesExec: target_batch_size=4096",
>
>     "        RepartitionExec: partitioning=Hash([Column { name: \"c2\",
> index: 0 }], 9000)",
>
>     "          ProjectionExec: expr=[c2@0 as c2]",
>
>     "            ProjectionExec: expr=[c1@0 as c2]",
>
>     "              RepartitionExec: partitioning=RoundRobinBatch(9000)",
>
>     "                CsvExec:
> files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1]",
>
> ]
>
> ', datafusion/core/tests/sql/explain_analyze.rs:734:5
>
>
> ---- sql::explain_analyze::test_physical_plan_display_indent stdout ----
>
> thread 'sql::explain_analyze::test_physical_plan_display_indent' panicked
> at 'assertion failed: `(left == right)`
>
>   left: `["GlobalLimitExec: skip=None, fetch=10", "  SortExec: [the_min@2
> DESC]", "    CoalescePartitionsExec", "      ProjectionExec: expr=[c1@0 as
> c1, MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12),
> MIN(aggregate_test_100.c12)@2 as the_min]", "        AggregateExec:
> mode=FinalPartitioned, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12),
> MIN(aggregate_test_100.c12)]", "          CoalesceBatchesExec:
> target_batch_size=4096", "            RepartitionExec:
> partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "
>   AggregateExec: mode=Partial, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "
>         CoalesceBatchesExec: target_batch_size=4096", "
>   FilterExec:
> c12@1 < CAST(10 AS Float64)", "                    RepartitionExec:
> partitioning=RoundRobinBatch(9000)", "                      CsvExec:
> files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1, c12]"]`,
>
>  right: `["GlobalLimitExec: skip=None, fetch=10", "  SortExec: [the_min@2
> DESC]", "    CoalescePartitionsExec", "      ProjectionExec: expr=[c1@0 as
> c1, MAX(aggregate_test_100.c12)@1 as MAX(aggregate_test_100.c12),
> MIN(aggregate_test_100.c12)@2 as the_min]", "        AggregateExec:
> mode=FinalPartitioned, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12),
> MIN(aggregate_test_100.c12)]", "          CoalesceBatchesExec:
> target_batch_size=4096", "            RepartitionExec:
> partitioning=Hash([Column { name: \"c1\", index: 0 }], 9000)", "
>   AggregateExec: mode=Partial, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]", "
>         CoalesceBatchesExec: target_batch_size=4096", "
>   FilterExec:
> c12@1 < CAST(10 AS Float64)", "                    RepartitionExec:
> partitioning=RoundRobinBatch(9000)", "                      CsvExec:
> files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1, c12]"]`: expected:
>
> [
>
>     "GlobalLimitExec: skip=None, fetch=10",
>
>     "  SortExec: [the_min@2 DESC]",
>
>     "    CoalescePartitionsExec",
>
>     "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1
> as MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]",
>
>     "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
>
>     "          CoalesceBatchesExec: target_batch_size=4096",
>
>     "            RepartitionExec: partitioning=Hash([Column { name: \"c1\",
> index: 0 }], 9000)",
>
>     "              AggregateExec: mode=Partial, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
>
>     "                CoalesceBatchesExec: target_batch_size=4096",
>
>     "                  FilterExec: c12@1 < CAST(10 AS Float64)",
>
>     "                    RepartitionExec:
> partitioning=RoundRobinBatch(9000)",
>
>     "                      CsvExec:
> files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1, c12]",
>
> ]
>
> actual:
>
>
> [
>
>     "GlobalLimitExec: skip=None, fetch=10",
>
>     "  SortExec: [the_min@2 DESC]",
>
>     "    CoalescePartitionsExec",
>
>     "      ProjectionExec: expr=[c1@0 as c1, MAX(aggregate_test_100.c12)@1
> as MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)@2 as the_min]",
>
>     "        AggregateExec: mode=FinalPartitioned, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
>
>     "          CoalesceBatchesExec: target_batch_size=4096",
>
>     "            RepartitionExec: partitioning=Hash([Column { name: \"c1\",
> index: 0 }], 9000)",
>
>     "              AggregateExec: mode=Partial, gby=[c1@0 as c1],
> aggr=[MAX(aggregate_test_100.c12), MIN(aggregate_test_100.c12)]",
>
>     "                CoalesceBatchesExec: target_batch_size=4096",
>
>     "                  FilterExec: c12@1 < CAST(10 AS Float64)",
>
>     "                    RepartitionExec:
> partitioning=RoundRobinBatch(9000)",
>
>     "                      CsvExec:
> files=[privateARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true,
> limit=None, projection=[c1, c12]",
>
> ]
>
> ', datafusion/core/tests/sql/explain_analyze.rs:683:5
>
>
>
> failures:
>
>     sql::explain_analyze::csv_explain
>
>     sql::explain_analyze::test_physical_plan_display_indent
>
>     sql::explain_analyze::test_physical_plan_display_indent_multi_children
>
>
> test result: FAILED. 459 passed; 3 failed; 2 ignored; 0 measured; 0
> filtered out; finished in 2.76s
>
>
> error: test failed, to rerun pass '-p datafusion --test sql_integration'
>
> + cleanup
>
> + '[' no = yes ']'
>
> On Wed, 17 Aug 2022 at 04:36, Ian Joiner <iajoiner...@gmail.com> wrote:
>
> > Never mind the PS in the last email haha.
> >
> > On Tue, Aug 16, 2022 at 1:16 PM Ian Joiner <iajoiner...@gmail.com> wrote:
> >
> > > +1 (Non-binding)
> > >
> > > Verified on macOS 12.2.1 / Apple M1 Pro
> > >
> > > P.S. If verifying with zsh instead of bash, we get a "command not found"
> > > for `shasum -a 256 -c` at verify_dir_artifact_signatures:10 unless
> > > shasums are disabled; this has been happening for a while. Not sure
> > > whether we want to fix this. If so I will file a PR for that.
> > >
> > > On Tue, Aug 16, 2022 at 12:15 PM Andy Grove <andygrov...@gmail.com> wrote:
> > >
> > >> Hi,
> > >>
> > >> I would like to propose a release of Apache Arrow DataFusion
> > >> Implementation,
> > >> version 11.0.0.
> > >>
> > >> This release candidate is based on commit:
> > >> 8ee31cc69f43a4de0c0678d18a57f27cb4d0ead1 [1]
> > >> The proposed release tarball and signatures are hosted at [2].
> > >> The changelog is located at [3].
> > >>
> > >> Please download, verify checksums and signatures, run the unit tests,
> > >> and vote on the release. The vote will be open for at least 72 hours.
> > >>
> > >> Only votes from PMC members are binding, but all members of the community
> > >> are encouraged to test the release and vote with "(non-binding)".
> > >>
> > >> The standard verification procedure is documented at
> > >> https://github.com/apache/arrow-datafusion/blob/master/dev/release/README.md#verifying-release-candidates
> > >>
> > >> [ ] +1 Release this as Apache Arrow DataFusion 11.0.0
> > >> [ ] +0
> > >> [ ] -1 Do not release this as Apache Arrow DataFusion 11.0.0 because...
> > >>
> > >> Here is my vote:
> > >>
> > >> +1
> > >>
> > >> [1]: https://github.com/apache/arrow-datafusion/tree/8ee31cc69f43a4de0c0678d18a57f27cb4d0ead1
> > >> [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-datafusion-11.0.0-rc1
> > >> [3]: https://github.com/apache/arrow-datafusion/blob/8ee31cc69f43a4de0c0678d18a57f27cb4d0ead1/CHANGELOG.md
> > >>
> > >
> >
>
