samueleresca opened a new issue, #17550:
URL: https://github.com/apache/datafusion/issues/17550

   ### Is your feature request related to a problem or challenge?
   
   _(Raising this as followup [this 
discussion](https://github.com/apache/datafusion/issues/17522#issuecomment-3283520158))_
   
   It appears that there are multiple crates (included as `dependencies` or 
`dev-dependencies`) that are not being used across datafusion. I have run 
`cargo machete` to get a list out
   
   A few notes:
   - The majority of the unused detections are correct (marked as `CORRECT`).
   - There is a minority of the unused detections that are false positive 
(marked as `FALSE POSITIVE`).
   
   I linked a test PR that results from running `cargo machete`: 
https://github.com/apache/datafusion/pull/17545
   
   Below the extraction:
   
   ```
   datafusion-cli -- ./datafusion-cli/Cargo.toml:
           assert_cmd -- CORRECT
           predicates -- CORRECT
   datafusion-datasource -- ./datafusion/datasource/Cargo.toml:
           parquet  -- CORRECT, referred by parquet feature
   datafusion-physical-plan -- ./datafusion/physical-plan/Cargo.toml:
           tempfile -- CORRECT
   datafusion-datasource-json -- ./datafusion/datasource-json/Cargo.toml:
           datafusion-catalog -- CORRECT
           datafusion-physical-expr -- CORRECT
           serde_json -- CORRECT
   datafusion-datasource-avro -- ./datafusion/datasource-avro/Cargo.toml:
           chrono -- CORRECT
           datafusion-catalog -- CORRECT
           datafusion-execution -- CORRECT
           datafusion-physical-expr -- CORRECT
           rstest -- CORRECT
           tokio -- CORRECT
   datafusion -- ./datafusion/core/Cargo.toml:
           dashmap -- CORRECT
           datafusion-doc -- CORRECT
           datafusion-macros -- CORRECT
           hex -- CORRECT, referred by parquet_encryption feature
   datafusion-physical-expr -- ./datafusion/physical-expr/Cargo.toml:
           log -- CORRECT
   datafusion-proto-common -- ./datafusion/proto-common/Cargo.toml:
           serde_json -- CORRECT, referred by "json" feature
   datafusion-catalog -- ./datafusion/catalog/Cargo.toml:
           datafusion-sql -- CORRECT
   datafusion-physical-expr-adapter -- 
./datafusion/physical-expr-adapter/Cargo.toml:
           insta -- CORRECT
           rstest -- CORRECT
   datafusion-pruning -- ./datafusion/pruning/Cargo.toml:
           arrow-schema -- CORRECT
   datafusion-functions-aggregate -- 
./datafusion/functions-aggregate/Cargo.toml:
           datafusion-doc -- FALSE POSITIVE
   datafusion-wasmtest -- ./datafusion/wasmtest/Cargo.toml:
           chrono -- FALSE POSITIVE, chrono must be compiled with wasmbind 
feature
           getrandom -- FALSE POSITIVE,  "The \"wasm_js\" backend requires the 
`wasm_js` feature for `getrandom`
           insta -- CORRECT
   datafusion-physical-optimizer -- ./datafusion/physical-optimizer/Cargo.toml:
           datafusion-functions-nested -- CORRECT
           log -- CORRECT
   datafusion-catalog-listing -- ./datafusion/catalog-listing/Cargo.toml:
           datafusion-session
   datafusion-datasource-parquet -- ./datafusion/datasource-parquet/Cargo.toml:
           datafusion-catalog -- CORRECT
           datafusion-physical-optimizer -- CORRECT
           hex -- CORRECT, referred by parquet_encryption feature
           rand -- CORRECT
   datafusion-functions-window -- ./datafusion/functions-window/Cargo.toml:
           datafusion-doc -- FALSE POSITIVE
   datafusion-spark -- ./datafusion/spark/Cargo.toml:
           datafusion-macros -- CORRECT
           xxhash-rust -- CORRECT
   datafusion-datasource-csv -- ./datafusion/datasource-csv/Cargo.toml:
           datafusion-catalog -- CORRECT
           datafusion-physical-expr --CORRECT
   datafusion-session -- ./datafusion/session/Cargo.toml:
           arrow -- CORRECT
           dashmap -- CORRECT
           datafusion-common-runtime -- CORRECT
           datafusion-physical-expr -- CORRECT
           datafusion-sql -- CORRECT
           futures -- CORRECT
           itertools -- CORRECT
           log -- CORRECT
           object_store -- CORRECT
           tokio -- CORRECT
   datafusion-benchmarks -- ./benchmarks/Cargo.toml:
           test-utils -- CORRECT
   ```
   
   I believe this topic has been discussed previously (e.g., 
https://github.com/apache/arrow-rs/issues/6796), and likely I don't have the 
full picture.
   
    I'm keen to gather feedback from the maintainers on:
   - whether this would be something useful
   - If it would cause issues, I'm not seeing caused by enabling non-default 
features.
   
   ### Describe the solution you'd like
   
   Include the `cargo machete` command as part of the CI. We will need to add 
some ignores at the workspace level. This includes:
   - `datafusion-doc`
   - `getrandom` in datafusion-wasmtest
   - `chrono` in datafusion-wasmtest
   
   ### Describe alternatives you've considered
   
   RustRover does a good job in spotting unused crates. An alternative would be 
to keep the process manual and to check the unused crate locally and avoid 
blocking the CI (as is now).
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to