[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-10-07 Thread Kouhei Sutou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17613912#comment-17613912
 ] 

Kouhei Sutou edited comment on ARROW-15678 at 10/7/22 7:54 AM:
---

I propose one more approach for the proposed solution 1.:

How about always enabling inline optimization for SIMD optimized compile units 
({{level_conversion_bmi2.cc}}) even when an user specifies 
{{-DCMAKE_BUILD_TYPE=MinSizeRel}}?

It may increases binary size but it may be better that SIMD related code 
prioritizes performance than binary size.

We don't need to write manual 
{{template}}/{{\_\_attribute\_\_((always\_inline))}} s with this approach.


was (Author: kou):
I propose one more approach for the proposed solution 1.:

How about always enabling inline optimization for SIMD optimized compile units 
({{level_conversion_bmi2.cc}}) even when an user specifies 
{{-DCMAKE_BUILD_TYPE=MinSizeRel}}?

It may increases binary size but it may be better that SIMD related code 
prioritizes performance than binary size.

We don't need to write manual 
{{template}}/{{\_\_attribute\_\_((always\_inline))}}s with this approach.

> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Assignee: Kouhei Sutou
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 10.0.0
>
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-07-22 Thread Jonathan Keane (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17570163#comment-17570163
 ] 

Jonathan Keane edited comment on ARROW-15678 at 7/22/22 7:28 PM:
-

Homebrew only accepted that as a temporary workaround and has threatened to 
turn off optimizations if we don't resolve this. They haven't yet followed 
through yet, though. 
https://github.com/Homebrew/homebrew-core/issues/94724#issuecomment-1063031123 


was (Author: jonkeane):
Homebrew only accepted that as a temporary workaround and has threatened to 
turn off optimizations if we don't resolve this. They haven't yet followed 
through yet, though.

> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.0.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-06-10 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542244#comment-17542244
 ] 

Antoine Pitrou edited comment on ARROW-15678 at 6/10/22 3:04 PM:
-

On further investigation, we can include immintrin.h with or without -mavx2 and 
clang at least will not complain unless the intrinsics are referenced, so

{code}
#include 

[[gnu::target("avx2")]]
void use_simd() {
  __m256i arg;
  _mm256_abs_epi16 (arg);
}

int main() { use_simd(); }
{code}

compiles and runs happily without any special compilation flags. Using an 
attribute like this seems viable provided we can be certain that the modified 
target isn't transitively applied to functions which might be invoked for the 
first time inside a SIMD enabled function


was (Author: bkietz):
On further investigation, we can include immintrin.h with or without -mavx2 and 
clang at least will not complain unless the intrinsics are referenced, so

{{code}}
#include 

[[gnu::target("avx2")]]
void use_simd() {
  __m256i arg;
  _mm256_abs_epi16 (arg);
}

int main() { use_simd(); }
{{code}}

compiles and runs happily without any special compilation flags. Using an 
attribute like this seems viable provided we can be certain that the modified 
target isn't transitively applied to functions which might be invoked for the 
first time inside a SIMD enabled function

> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.0.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-05-25 Thread Ben Kietzman (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542235#comment-17542235
 ] 

Ben Kietzman edited comment on ARROW-15678 at 5/25/22 8:56 PM:
---

IIUC, we'll still need to pass {{-mavx2}} so that we can include immintrin.h so 
the attribute described in the {{ARROW_SPECIALIZED_SIMD_TARGET}} approach would 
need to be attached to the {*}non{*}-SIMD functions to ensure that they're 
compiled with no special instructions

... or I suppose we could try to declare all the intrinsics manually at 
function scope
{code:java}
ARROW_SIMD_FUNCTION(avx2) void SimdThing() {
  // inlined from immintrin.h:
  typedef unsigned short __mmask16;
  extern
__inline
__mmask16
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
_mm512_int2mask (int __M);
}{code}


was (Author: bkietz):
IIUC, we'll still need to pass {{-mavx2}} so that we can include immintrin.h so 
the attribute described in the {{ARROW_SPECIALIZED_SIMD_TARGET}} approach would 
need to be attached to the *non*-SIMD functions to ensure that they're compiled 
with no special instructions

> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.0.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-05-18 Thread Jonathan Keane (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17539107#comment-17539107
 ] 

Jonathan Keane edited comment on ARROW-15678 at 5/18/22 10:03 PM:
--

[~kou] Do you think you might be able to take a look at this?

The comment at 
https://github.com/apache/arrow/pull/12928#issuecomment-1105955726 has a good 
explanation of what's going on and following that there are a few possible 
fixes (though none of them were fully implemented or decided


was (Author: jonkeane):
@kou Do you think you might be able to take a look at this?

The comment at 
https://github.com/apache/arrow/pull/12928#issuecomment-1105955726 has a good 
explanation of what's going on and following that there are a few possible 
fixes (though none of them were fully implemented or decided

> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 9.0.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526053#comment-17526053
 ] 

Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:51 PM:
-

So, currently we are doing something such as:
{code}
clang -c something_avx2.cc -mavx2 
{code}

An alternative would be not to pass the optimization flag on the command line 
but enable it selectively inside the source code, e.g.:
{code}
clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2
{code}

{code:c++}
namespace parquet {
namespace internal {
namespace PARQUET_IMPL_NAMESPACE {

#ifdef ARROW_SPECIALIZED_SIMD_TARGET

#define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a)
#pragma clang attribute push (__attribute__((target( 
STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function)

#endif

...

#ifdef ARROW_SPECIALIZED_SIMD_TARGET
#pragma clang attribute pop
#endif

}  // namespace PARQUET_IMPL_NAMESPACE
}  // namespace internal
}  // namespace parquet
{code}

This way we would avoid enabling optimization on code inlined from other 
headers. Of course, perhaps that's not actually desirable...


was (Author: pitrou):
So, currently we are doing something such as:
{code}
clang -c something_avx2.cc -mavx2 
{code}

An alternative would be not to pass the optimization flag on the command line 
but enable it selectively inside the source code, e.g.:
{code}
clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2
{code}

{code:c++}
namespace parquet {
namespace internal {
namespace PARQUET_IMPL_NAMESPACE {

#ifdef ARROW_SPECIALIZED_SIMD_TARGET

#define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a)
#pragma clang attribute push (__attribute__((target( 
STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function)

#endif

...

#ifdef ARROW_SPECIALIZED_SIMD_TARGET
#pragma clang attribute pop
#endif

}  // namespace PARQUET_IMPL_NAMESPACE
}  // namespace internal
}  // namespace parquet
{code}


> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Assignee: Jonathan Keane
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 8.0.0
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17526053#comment-17526053
 ] 

Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:51 PM:
-

So, currently we are doing something such as:
{code}
clang -c something_avx2.cc -mavx2 
{code}

An alternative would be not to pass the optimization flag on the command line 
but enable it selectively inside the source code, e.g.:
{code}
clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2
{code}

{code:c++}
namespace parquet {
namespace internal {
namespace PARQUET_IMPL_NAMESPACE {

#ifdef ARROW_SPECIALIZED_SIMD_TARGET

#define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a)
#pragma clang attribute push (__attribute__((target( 
STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function)

#endif

...

#ifdef ARROW_SPECIALIZED_SIMD_TARGET
#pragma clang attribute pop
#endif

}  // namespace PARQUET_IMPL_NAMESPACE
}  // namespace internal
}  // namespace parquet
{code}

This way we would avoid enabling the particular instruction set on code inlined 
from other headers. Of course, perhaps that's not actually desirable...


was (Author: pitrou):
So, currently we are doing something such as:
{code}
clang -c something_avx2.cc -mavx2 
{code}

An alternative would be not to pass the optimization flag on the command line 
but enable it selectively inside the source code, e.g.:
{code}
clang -c something_avx2.cc -DARROW_SPECIALIZED_SIMD_TARGET=avx2
{code}

{code:c++}
namespace parquet {
namespace internal {
namespace PARQUET_IMPL_NAMESPACE {

#ifdef ARROW_SPECIALIZED_SIMD_TARGET

#define STRINGIFY_EXPANDED(a) ARROW_STRINGIFY(a)
#pragma clang attribute push (__attribute__((target( 
STRINGIFY_EXPANDED(ARROW_SPECIALIZED_SIMD_TARGET)) )), apply_to=function)

#endif

...

#ifdef ARROW_SPECIALIZED_SIMD_TARGET
#pragma clang attribute pop
#endif

}  // namespace PARQUET_IMPL_NAMESPACE
}  // namespace internal
}  // namespace parquet
{code}

This way we would avoid enabling optimization on code inlined from other 
headers. Of course, perhaps that's not actually desirable...

> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Assignee: Jonathan Keane
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 8.0.0
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (ARROW-15678) [C++][CI] a crossbow job with MinRelSize enabled

2022-04-21 Thread Antoine Pitrou (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-15678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17525995#comment-17525995
 ] 

Antoine Pitrou edited comment on ARROW-15678 at 4/21/22 7:08 PM:
-

Wow, thanks for the diagnosis [~westonpace].
So, it turns out that our method for compiling multiple versions of code is 
violating the one-definition-rule for any inline function or method used in the 
caller code.


was (Author: pitrou):
Wow, thanks for the diagnosis [~westonpace].
So, it turns out that our method for compiling multiple versions of code is 
violating the one-definition-rule for any inline function or method.

> [C++][CI] a crossbow job with MinRelSize enabled
> 
>
> Key: ARROW-15678
> URL: https://issues.apache.org/jira/browse/ARROW-15678
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Continuous Integration
>Reporter: Jonathan Keane
>Assignee: Jonathan Keane
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 8.0.0
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)