[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613912#comment-17613912 ] Kouhei Sutou edited comment on ARROW-15678 at 10/7/22 7:54 AM: --- I propose one more approach for the proposed solution 1.: How about always enabling inline optimization for SIMD optimized compile units ({{level_conversion_bmi2.cc}}) even when an user specifies {{-DCMAKE_BUILD_TYPE=MinSizeRel}}? It may increases binary size but it may be better that SIMD related code prioritizes performance than binary size. We don't need to write manual {{template}}/{{\_\_attribute\_\_((always\_inline))}} s with this approach. was (Author: kou): I propose one more approach for the proposed solution 1.: How about always enabling inline optimization for SIMD optimized compile units ({{level_conversion_bmi2.cc}}) even when an user specifies {{-DCMAKE_BUILD_TYPE=MinSizeRel}}? It may increases binary size but it may be better that SIMD related code prioritizes performance than binary size. We don't need to write manual {{template}}/{{\_\_attribute\_\_((always\_inline))}}s with this approach. > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Assignee: Kouhei Sutou >Priority: Blocker > Labels: pull-request-available > Fix For: 10.0.0 > > Time Spent: 13.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570163#comment-17570163 ] Jonathan Keane edited comment on ARROW-15678 at 7/22/22 7:28 PM: - Homebrew only accepted that as a temporary workaround and has threatened to turn off optimizations if we don't resolve this. They haven't yet followed through yet, though. https://github.com/Homebrew/homebrew-core/issues/94724#issuecomment-1063031123 was (Author: jonkeane): Homebrew only accepted that as a temporary workaround and has threatened to turn off optimizations if we don't resolve this. They haven't yet followed through yet, though. > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Priority: Blocker > Labels: pull-request-available > Fix For: 9.0.0 > > Time Spent: 13h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542244#comment-17542244 ] Antoine Pitrou edited comment on ARROW-15678 at 6/10/22 3:04 PM: - On further investigation, we can include immintrin.h with or without -mavx2 and clang at least will not complain unless the intrinsics are referenced, so {code} #include [[gnu::target("avx2")]] void use_simd() { __m256i arg; _mm256_abs_epi16 (arg); } int main() { use_simd(); } {code} compiles and runs happily without any special compilation flags. Using an attribute like this seems viable provided we can be certain that the modified target isn't transitively applied to functions which might be invoked for the first time inside a SIMD enabled function was (Author: bkietz): On further investigation, we can include immintrin.h with or without -mavx2 and clang at least will not complain unless the intrinsics are referenced, so {{code}} #include [[gnu::target("avx2")]] void use_simd() { __m256i arg; _mm256_abs_epi16 (arg); } int main() { use_simd(); } {{code}} compiles and runs happily without any special compilation flags. Using an attribute like this seems viable provided we can be certain that the modified target isn't transitively applied to functions which might be invoked for the first time inside a SIMD enabled function > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Priority: Blocker > Labels: pull-request-available > Fix For: 9.0.0 > > Time Spent: 13h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542235#comment-17542235 ] Ben Kietzman edited comment on ARROW-15678 at 5/25/22 8:56 PM: --- IIUC, we'll still need to pass {{-mavx2}} so that we can include immintrin.h so the attribute described in the {{ARROW_SPECIALIZED_SIMD_TARGET}} approach would need to be attached to the {*}non{*}-SIMD functions to ensure that they're compiled with no special instructions ... or I suppose we could try to declare all the intrinsics manually at function scope {code:java} ARROW_SIMD_FUNCTION(avx2) void SimdThing() { // inlined from immintrin.h: typedef unsigned short __mmask16; extern __inline __mmask16 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm512_int2mask (int __M); }{code} was (Author: bkietz): IIUC, we'll still need to pass {{-mavx2}} so that we can include immintrin.h so the attribute described in the {{ARROW_SPECIALIZED_SIMD_TARGET}} approach would need to be attached to the *non*-SIMD functions to ensure that they're compiled with no special instructions > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Priority: Blocker > Labels: pull-request-available > Fix For: 9.0.0 > > Time Spent: 13h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539107#comment-17539107 ] Jonathan Keane edited comment on ARROW-15678 at 5/18/22 10:03 PM: -- [~kou] Do you think you might be able to take a look at this? The comment at https://github.com/apache/arrow/pull/12928#issuecomment-1105955726 has a good explanation of what's going on and following that there are a few possible fixes (though none of them were fully implemented or decided was (Author: jonkeane): @kou Do you think you might be able to take a look at this? The comment at https://github.com/apache/arrow/pull/12928#issuecomment-1105955726 has a good explanation of what's going on and following that there are a few possible fixes (though none of them were fully implemented or decided > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Priority: Blocker > Labels: pull-request-available > Fix For: 9.0.0 > > Time Spent: 13h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526053#comment-17526053 ] Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:51 PM: - So, currently we are doing something such as: {code} clang -c something_avx2.cc -mavx2 {code} An alternative would be not to pass the optimization flag on the command line but enable it selectively inside the source code, e.g.: {code} clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2 {code} {code:c++} namespace parquet { namespace internal { namespace PARQUET_IMPL_NAMESPACE { #ifdef ARROW_SPECIALIZED_SIMD_TARGET #define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a) #pragma clang attribute push (__attribute__((target( STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function) #endif ... #ifdef ARROW_SPECIALIZED_SIMD_TARGET #pragma clang attribute pop #endif } // namespace PARQUET_IMPL_NAMESPACE } // namespace internal } // namespace parquet {code} This way we would avoid enabling optimization on code inlined from other headers. Of course, perhaps that's not actually desirable... was (Author: pitrou): So, currently we are doing something such as: {code} clang -c something_avx2.cc -mavx2 {code} An alternative would be not to pass the optimization flag on the command line but enable it selectively inside the source code, e.g.: {code} clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2 {code} {code:c++} namespace parquet { namespace internal { namespace PARQUET_IMPL_NAMESPACE { #ifdef ARROW_SPECIALIZED_SIMD_TARGET #define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a) #pragma clang attribute push (__attribute__((target( STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function) #endif ... #ifdef ARROW_SPECIALIZED_SIMD_TARGET #pragma clang attribute pop #endif } // namespace PARQUET_IMPL_NAMESPACE } // namespace internal } // namespace parquet {code} > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Assignee: Jonathan Keane >Priority: Blocker > Labels: pull-request-available > Fix For: 8.0.0 > > Time Spent: 11h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526053#comment-17526053 ] Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:51 PM: - So, currently we are doing something such as: {code} clang -c something_avx2.cc -mavx2 {code} An alternative would be not to pass the optimization flag on the command line but enable it selectively inside the source code, e.g.: {code} clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2 {code} {code:c++} namespace parquet { namespace internal { namespace PARQUET_IMPL_NAMESPACE { #ifdef ARROW_SPECIALIZED_SIMD_TARGET #define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a) #pragma clang attribute push (__attribute__((target( STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function) #endif ... #ifdef ARROW_SPECIALIZED_SIMD_TARGET #pragma clang attribute pop #endif } // namespace PARQUET_IMPL_NAMESPACE } // namespace internal } // namespace parquet {code} This way we would avoid enabling the particular instruction set on code inlined from other headers. Of course, perhaps that's not actually desirable... was (Author: pitrou): So, currently we are doing something such as: {code} clang -c something_avx2.cc -mavx2 {code} An alternative would be not to pass the optimization flag on the command line but enable it selectively inside the source code, e.g.: {code} clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2 {code} {code:c++} namespace parquet { namespace internal { namespace PARQUET_IMPL_NAMESPACE { #ifdef ARROW_SPECIALIZED_SIMD_TARGET #define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a) #pragma clang attribute push (__attribute__((target( STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function) #endif ... #ifdef ARROW_SPECIALIZED_SIMD_TARGET #pragma clang attribute pop #endif } // namespace PARQUET_IMPL_NAMESPACE } // namespace internal } // namespace parquet {code} This way we would avoid enabling optimization on code inlined from other headers. Of course, perhaps that's not actually desirable... > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Assignee: Jonathan Keane >Priority: Blocker > Labels: pull-request-available > Fix For: 8.0.0 > > Time Spent: 11h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled
[ https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525995#comment-17525995 ] Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:08 PM: - Wow, thanks for the diagnosis [~westonpace]. So, it turns out that our method for compiling multiple versions of code is violating the one-definition-rule for any inline function or method used in the caller code. was (Author: pitrou): Wow, thanks for the diagnosis [~westonpace]. So, it turns out that our method for compiling multiple versions of code is violating the one-definition-rule for any inline function or method. > [C++][CI] a crossbow job with MinRelSize enabled > > > Key: ARROW-15678 > URL: https://issues.apache.org/jira/browse/ARROW-15678 > Project: Apache Arrow > Issue Type: Improvement > Components: C++, Continuous Integration >Reporter: Jonathan Keane >Assignee: Jonathan Keane >Priority: Blocker > Labels: pull-request-available > Fix For: 8.0.0 > > Time Spent: 11h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007)