alamb commented on code in PR #6554:
URL: https://github.com/apache/arrow-rs/pull/6554#discussion_r1802840595

##########
arrow/CONTRIBUTING.md:
##########
@@ -109,6 +109,36 @@ specific JIRA issues and reference them in these code comments. For example:
 // This is not sound because .... see https://issues.apache.org/jira/browse/ARROW-nnnnn
 ```

+### Usage if SIMD / Auto vectorization
+
+This create does not use SIMD intrinsics (e.g. [`std::simd`] directly, but
+instead relies on LLVM's auto-vectorization.
+
+SIMD intrinsics are difficult to maintain and can be difficult to reason about.
+The auto-vectorizer in LLVM is quite good and often produces better code than
+hand-written manual uses of SIMD. In fact, this crate used to to have a fair

Review Comment:
   fixed

##########
arrow/CONTRIBUTING.md:
##########
@@ -109,6 +109,36 @@ specific JIRA issues and reference them in these code comments. For example:
 // This is not sound because .... see https://issues.apache.org/jira/browse/ARROW-nnnnn
 ```

+### Usage if SIMD / Auto vectorization
+
+This create does not use SIMD intrinsics (e.g. [`std::simd`] directly, but
+instead relies on LLVM's auto-vectorization.

Review Comment:
   I changed the docs to say "the Rust compiler's auto-vectorization", as I think that is the high-level description of what is going on.

   In this context, I think the use of `llvm` is an "implementation detail" (albeit an important one) about how that auto-vectorization is accomplished.

##########
arrow/CONTRIBUTING.md:
##########
@@ -109,6 +109,36 @@ specific JIRA issues and reference them in these code comments. For example:
 // This is not sound because .... see https://issues.apache.org/jira/browse/ARROW-nnnnn
 ```

+### Usage if SIMD / Auto vectorization
+
+This create does not use SIMD intrinsics (e.g. [`std::simd`] directly, but
+instead relies on LLVM's auto-vectorization.
+
+SIMD intrinsics are difficult to maintain and can be difficult to reason about.
+The auto-vectorizer in LLVM is quite good and often produces better code than
+hand-written manual uses of SIMD. In fact, this crate used to to have a fair
+amount of manual SIMD, and over time we've removed it as the auto-vectorized
+code was faster.
+
+[`std::simd`]: https://doc.rust-lang.org/std/simd/index.html
+
+LLVM is relatively good at vectorizing vertical operations provided:
+
+1. No conditionals within the loop body
+2. Not too much inlining , as the vectorizer gives up if the code is too complex
+3. No bitwise horizontal reductions or masking

Review Comment:
   > Perhaps we could link to https://rust-lang.github.io/packed_simd/perf-guide/vert-hor-ops.html

   TIL: That is a nice description.

   I reworded this item to

   > 3. No [horizontal reductions] or data dependencies

##########
arrow/CONTRIBUTING.md:
##########
@@ -109,6 +109,36 @@ specific JIRA issues and reference them in these code comments. For example:
 // This is not sound because .... see https://issues.apache.org/jira/browse/ARROW-nnnnn
 ```

+### Usage if SIMD / Auto vectorization
+
+This create does not use SIMD intrinsics (e.g. [`std::simd`] directly, but
+instead relies on LLVM's auto-vectorization.
+
+SIMD intrinsics are difficult to maintain and can be difficult to reason about.
+The auto-vectorizer in LLVM is quite good and often produces better code than
+hand-written manual uses of SIMD. In fact, this crate used to to have a fair
+amount of manual SIMD, and over time we've removed it as the auto-vectorized
+code was faster.
+
+[`std::simd`]: https://doc.rust-lang.org/std/simd/index.html
+
+LLVM is relatively good at vectorizing vertical operations provided:
+
+1. No conditionals within the loop body
+2. Not too much inlining , as the vectorizer gives up if the code is too complex
+3. No bitwise horizontal reductions or masking
+4. You've enabled SIMD instructions in the target ISA (e.g. `target-cpu` `RUSTFLAGS` flag)
+
+The last point is especially important as the default `target-cpu` doesn't
+support many SIMD instructions. See the Performance Tips section at the
+end of <https://crates.io/crates/arrow>
+
+To ensure your code is fully vectorized, we recommend getting familiar with
+tools like <https://rust.godbolt.org/> (again being sure to set `RUSTFLAGS`) and

Review Comment:
   done

##########
arrow/CONTRIBUTING.md:
##########
@@ -109,6 +109,36 @@ specific JIRA issues and reference them in these code comments. For example:
 // This is not sound because .... see https://issues.apache.org/jira/browse/ARROW-nnnnn
 ```

+### Usage if SIMD / Auto vectorization
+
+This create does not use SIMD intrinsics (e.g. [`std::simd`] directly, but
+instead relies on LLVM's auto-vectorization.
+
+SIMD intrinsics are difficult to maintain and can be difficult to reason about.
+The auto-vectorizer in LLVM is quite good and often produces better code than
+hand-written manual uses of SIMD. In fact, this crate used to to have a fair
+amount of manual SIMD, and over time we've removed it as the auto-vectorized
+code was faster.

Review Comment:
   I rephrased the sentence to hopefully be clearer now:

   "In fact, this crate used to contain several manual SIMD implementations, which were removed after discovering the auto-vectorized code was faster."

##########
arrow/CONTRIBUTING.md:
##########
@@ -109,6 +109,36 @@ specific JIRA issues and reference them in these code comments. For example:
 // This is not sound because .... see https://issues.apache.org/jira/browse/ARROW-nnnnn
 ```

+### Usage if SIMD / Auto vectorization
+
+This create does not use SIMD intrinsics (e.g. [`std::simd`] directly, but
+instead relies on LLVM's auto-vectorization.
+
+SIMD intrinsics are difficult to maintain and can be difficult to reason about.
+The auto-vectorizer in LLVM is quite good and often produces better code than
+hand-written manual uses of SIMD. In fact, this crate used to to have a fair
+amount of manual SIMD, and over time we've removed it as the auto-vectorized
+code was faster.
+
+[`std::simd`]: https://doc.rust-lang.org/std/simd/index.html
+
+LLVM is relatively good at vectorizing vertical operations provided:
+
+1. No conditionals within the loop body
+2. Not too much inlining , as the vectorizer gives up if the code is too complex
+3. No bitwise horizontal reductions or masking
+4. You've enabled SIMD instructions in the target ISA (e.g. `target-cpu` `RUSTFLAGS` flag)

Review Comment:
   Changed to "Suitable SIMD instructions available in the target ISA (e.g. `target-cpu` `RUSTFLAGS` flag)"

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
