[arrow-datafusion] branch main updated: Minor: Update the testing section of contributor guide (#6357)

alamb Tue, 16 May 2023 14:35:59 -0700

This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git



The following commit(s) were added to refs/heads/main by this push:
     new 33b15c1e8a Minor: Update the testing section of contributor guide (#6357)
33b15c1e8a is described below

commit 33b15c1e8a670bee7ceb11f5f02e445e0e16bff0
Author: Andrew Lamb <[email protected]>
AuthorDate: Tue May 16 17:35:47 2023 -0400

    Minor: Update the testing section of contributor guide (#6357)
---
 docs/source/contributor-guide/index.md | 45 +++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/docs/source/contributor-guide/index.md b/docs/source/contributor-guide/index.md
index 7c19ff2e89..f8457b8854 100644
--- a/docs/source/contributor-guide/index.md
+++ b/docs/source/contributor-guide/index.md
@@ -33,7 +33,7 @@ list to help you get started.
 
 # Developer's guide
 
-## Pull Requests
+## Pull Request Overview
 
 We welcome pull requests (PRs) from anyone from the community.
 
@@ -115,42 +115,41 @@ or run them all at once:
 
 - [dev/rust_lint.sh](../../../dev/rust_lint.sh)
 
-### Test Organization
+## Testing
 
-Tests are very important to ensure that improvemens or fixes are not accidentally broken during subsequent refactorings.
+Tests are critical to ensure that DataFusion is working properly and
+is not accidentally broken during refactorings. All new features
+should have test coverage.
 
 DataFusion has several levels of tests in its [Test
 Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html)
-and tries to follow rust standard [Testing Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in the The Book.
+and tries to follow the Rust standard [Testing Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in the The Book.
 
-This section highlights the most important test modules that exist
+### Unit tests
 
-#### Unit tests
+Tests for code in an individual module are defined in the same source file with a `test` module, following Rust convention.
 
-Tests for the code in an individual module are defined in the same source file with a `test` module, following Rust convention.
+### sqllogictests Tests
 
-#### Rust Integration Tests
+DataFusion's SQL implementation is tested using [sqllogictest](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests/sqllogictests) which are run like any other Rust test using `cargo test --test sqllogictests`.
 
-There are several tests of the public interface of the DataFusion library in the [tests](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests) directory.
-
-You can run these tests individually using a command such as
+`sqllogictests` tests may be less convenient for new contributors who are familiar with writing `.rs` tests as they require learning another tool. However, `sqllogictest` based tests are much easier to develop and maintain as they 1) do not require a slow recompile/link cycle and 2) can be automatically updated via `cargo test --test sqllogictests -- --complete`.
 
-```shell
-cargo test -p datafusion --test sql_integration
-```
+Like similar systems such as [DuckDB](https://duckdb.org/dev/testing), DataFusion has chosen to trade off a slightly higher barrier to contribution for longer term maintainability. While we are still in the process of [migrating some old sql_integration tests](https://github.com/apache/arrow-datafusion/issues/6195), all new tests should be written using sqllogictests if possible.
 
-One very important test is the [sql_integration](https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sql_integration.rs) test which validates DataFusion's ability to run a large assortment of SQL queries against an assortment of data setups.
+### Rust Integration Tests
 
-#### sqllogictests Tests
+There are several tests of the public interface of the DataFusion library in the [tests](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests) directory.
 
-The [sqllogictests](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests/sqllogictests) also validate DataFusion SQL against an assortment of data setups.
+You can run these tests individually using `cargo` as normal command such as
 
-Data Driven tests have many benefits including being easier to write and maintain. We are in the process of [migrating sql_integration tests](https://github.com/apache/arrow-datafusion/issues/4460) and encourage
-you to add new tests using sqllogictests if possible.
+```shell
+cargo test -p datafusion --test dataframe
+```
 
-### Benchmarks
+## Benchmarks
 
-#### Criterion Benchmarks
+### Criterion Benchmarks
 
 [Criterion](https://docs.rs/criterion/latest/criterion/index.html) is a statistics-driven micro-benchmarking framework used by DataFusion for evaluating the performance of specific code-paths. In particular, the criterion benchmarks help to both guide optimisation efforts, and prevent performance regressions within DataFusion.
 
@@ -164,7 +163,7 @@ A full list of benchmarks can be found [here](https://github.com/apache/arrow-da
 
 _[cargo-criterion](https://github.com/bheisler/cargo-criterion) may also be used for more advanced reporting._
 
-#### Parquet SQL Benchmarks
+### Parquet SQL Benchmarks
 
 The parquet SQL benchmarks can be run with
 
@@ -178,7 +177,7 @@ If the environment variable `PARQUET_FILE` is set, the benchmark will run querie
 
 The benchmark will automatically remove any generated parquet file on exit, however, if interrupted (e.g. by CTRL+C) it will not. This can be useful for analysing the particular file after the fact, or preserving it to use with `PARQUET_FILE` in subsequent runs.
 
-#### Upstream Benchmark Suites
+### Upstream Benchmark Suites
 
 Instructions and tooling for running upstream benchmark suites against DataFusion can be found in [benchmarks](https://github.com/apache/arrow-datafusion/tree/main/benchmarks).
