samueleresca opened a new issue, #17550: URL: https://github.com/apache/datafusion/issues/17550
### Is your feature request related to a problem or challenge?

_(Raising this as a follow-up to [this discussion](https://github.com/apache/datafusion/issues/17522#issuecomment-3283520158))_

It appears that multiple crates (included as `dependencies` or `dev-dependencies`) are not being used across datafusion. I ran `cargo machete` to list them.

A few notes:

- The majority of the unused detections are correct (marked as `CORRECT`).
- A minority of the unused detections are false positives (marked as `FALSE POSITIVE`).

I linked a test PR that results from running `cargo machete`: https://github.com/apache/datafusion/pull/17545

The extraction is below:

```
datafusion-cli -- ./datafusion-cli/Cargo.toml:
        assert_cmd -- CORRECT
        predicates -- CORRECT
datafusion-datasource -- ./datafusion/datasource/Cargo.toml:
        parquet -- CORRECT, referred by parquet feature
datafusion-physical-plan -- ./datafusion/physical-plan/Cargo.toml:
        tempfile -- CORRECT
datafusion-datasource-json -- ./datafusion/datasource-json/Cargo.toml:
        datafusion-catalog -- CORRECT
        datafusion-physical-expr -- CORRECT
        serde_json -- CORRECT
datafusion-datasource-avro -- ./datafusion/datasource-avro/Cargo.toml:
        chrono -- CORRECT
        datafusion-catalog -- CORRECT
        datafusion-execution -- CORRECT
        datafusion-physical-expr -- CORRECT
        rstest -- CORRECT
        tokio -- CORRECT
datafusion -- ./datafusion/core/Cargo.toml:
        dashmap -- CORRECT
        datafusion-doc -- CORRECT
        datafusion-macros -- CORRECT
        hex -- CORRECT, referred by parquet_encryption feature
datafusion-physical-expr -- ./datafusion/physical-expr/Cargo.toml:
        log -- CORRECT
datafusion-proto-common -- ./datafusion/proto-common/Cargo.toml:
        serde_json -- CORRECT, referred by "json" feature
datafusion-catalog -- ./datafusion/catalog/Cargo.toml:
        datafusion-sql -- CORRECT
datafusion-physical-expr-adapter -- ./datafusion/physical-expr-adapter/Cargo.toml:
        insta -- CORRECT
        rstest -- CORRECT
datafusion-pruning -- ./datafusion/pruning/Cargo.toml:
        arrow-schema -- CORRECT
datafusion-functions-aggregate -- ./datafusion/functions-aggregate/Cargo.toml:
        datafusion-doc -- FALSE POSITIVE
datafusion-wasmtest -- ./datafusion/wasmtest/Cargo.toml:
        chrono -- FALSE POSITIVE, chrono must be compiled with wasmbind feature
        getrandom -- FALSE POSITIVE, "The \"wasm_js\" backend requires the `wasm_js` feature for `getrandom`
        insta -- CORRECT
datafusion-physical-optimizer -- ./datafusion/physical-optimizer/Cargo.toml:
        datafusion-functions-nested -- CORRECT
        log -- CORRECT
datafusion-catalog-listing -- ./datafusion/catalog-listing/Cargo.toml:
        datafusion-session
datafusion-datasource-parquet -- ./datafusion/datasource-parquet/Cargo.toml:
        datafusion-catalog -- CORRECT
        datafusion-physical-optimizer -- CORRECT
        hex -- CORRECT, referred by parquet_encryption feature
        rand -- CORRECT
datafusion-functions-window -- ./datafusion/functions-window/Cargo.toml:
        datafusion-doc -- FALSE POSITIVE
datafusion-spark -- ./datafusion/spark/Cargo.toml:
        datafusion-macros -- CORRECT
        xxhash-rust -- CORRECT
datafusion-datasource-csv -- ./datafusion/datasource-csv/Cargo.toml:
        datafusion-catalog -- CORRECT
        datafusion-physical-expr -- CORRECT
datafusion-session -- ./datafusion/session/Cargo.toml:
        arrow -- CORRECT
        dashmap -- CORRECT
        datafusion-common-runtime -- CORRECT
        datafusion-physical-expr -- CORRECT
        datafusion-sql -- CORRECT
        futures -- CORRECT
        itertools -- CORRECT
        log -- CORRECT
        object_store -- CORRECT
        tokio -- CORRECT
datafusion-benchmarks -- ./benchmarks/Cargo.toml:
        test-utils -- CORRECT
```

I believe this topic has been discussed previously (e.g., https://github.com/apache/arrow-rs/issues/6796), and I likely don't have the full picture. I'm keen to gather feedback from the maintainers on:

- whether this would be useful
- whether it would cause issues I'm not seeing, caused by enabling non-default features.

### Describe the solution you'd like

Include the `cargo machete` command as part of the CI. We will need to add some ignores at the workspace level.
These include:

- `datafusion-doc`
- `getrandom` in `datafusion-wasmtest`
- `chrono` in `datafusion-wasmtest`

### Describe alternatives you've considered

RustRover does a good job of spotting unused crates. An alternative would be to keep the process manual: check for unused crates locally and avoid blocking the CI (as it is now).

### Additional context

_No response_
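
For reference, the ignores listed under the proposed solution could be sketched roughly as follows. This assumes cargo-machete's `ignored` list under the `[workspace.metadata.cargo-machete]` and `[package.metadata.cargo-machete]` tables, as described in its README. Which manifest each entry belongs in is an assumption made for illustration and has not been tested against the repository.

```toml
# Root Cargo.toml (assumed placement): ignore a false positive across the workspace.
[workspace.metadata.cargo-machete]
ignored = ["datafusion-doc"]

# datafusion/wasmtest/Cargo.toml (assumed placement, a separate file):
# ignore the false positives that are specific to datafusion-wasmtest.
[package.metadata.cargo-machete]
ignored = ["chrono", "getrandom"]
```

Keeping the `datafusion-wasmtest` entries in that crate's own manifest would stop them from hiding a genuinely unused `chrono` or `getrandom` in another crate, at the cost of one more place to maintain.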
