Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2605782470
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
I've pushed a commit containing more fixes that I see necessary. I don't
have a big-endian hardware so I'm not able to test it in local. Please pull the
code and see if the tests pass and let me know the result. Thanks
@Vishwanatha-HD .
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2603002807
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Hi @Vishwanatha-HD , I guess that test
KeyCompare.CompareColumnsToRowsCuriousFSB would still fail even w/o the change
I proposed. The fact that it is failing means it is calling
`SafeLoadUpTo8Bytes`, which is supposed to be a `DCHECK` failure. Your change
of removing that `DCHECK`, which is against its by-design intention, makes it
passing false-positively.
Meanwhile, do you see other tests failing with the change I proposed?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2601835445
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Thanks for the update. I'll look into it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2601769494
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
@zanmato1984.. Please get my latest code changes to util.cc file..
```
inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
return util::SafeLoad(reinterpret_cast(bytes));
} else {
uint64_t word = 0;
#if ARROW_LITTLE_ENDIAN
for (int i = 0; i < num_bytes; ++i) {
word |= static_cast(bytes[i]) << (8 * i);
}
#else
// Big-endian: most significant byte first
for (int i = 0; i < num_bytes; ++i) {
word |= static_cast(bytes[i]) << (8 * (num_bytes - 1 - i));
}
#endif
return word;
}
}
```
In the bits_to_indexes_internal() function >>
```
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
#if !ARROW_LITTLE_ENDIAN
word = ::arrow::bit_util::ByteSwap(word);
#endif
if (bit_to_search == 0) {
word = ~word;
}
word &= ~0ULL >> (64 - tail);
if (filter_input_indexes) {
bits_filter_indexes_helper(word, input_indexes + num_bits - tail,
num_indexes,
indexes);
} else {
bits_to_indexes_helper(word, num_bits - tail + base_index,
num_indexes, indexes);
}
}
}
```
In the bytes_to_bits() function
```
if (tail) {
uint64_t bytes_next;
bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
#if !ARROW_LITTLE_ENDIAN
bytes_next = ::arrow::bit_util::ByteSwap(bytes_next);
#endif
bytes_next &= 0x0101010101010101ULL;
bytes_next |= (bytes_next >> 7); // Pairs of adjacent output bits in
individual bytes
bytes_next |= (bytes_next >> 14); // 4 adjacent output bits in
individual bytes
bytes_next |= (bytes_next >> 28); // All 8 output bits in the lowest
byte
bits[num_bits / 8] = static_cast(bytes_next & 0xff);
}
```
And, yes.. This looks to be a new testcase failure with the above mentioned
changes..
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2600993495
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Also, please update the code so I can help on any further failures. The
current change isn't sufficient and I'm worrying about we may have false
positives.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2600970788
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Are you saying it is a new test that is failing and the asof join test
passed?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2599504808
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Hi @zanmato1984..
I made the code changes as per your suggestion above.. but unfortunately,
the testcase doesnt pass.. The thing is that its not just the "byteswap" that
is required on the BE systems..
./debug/arrow-compute-row-test
--gtest_filter=KeyCompare.CompareColumnsToRowsCuriousFSB
Note: Google Test filter = KeyCompare.CompareColumnsToRowsCuriousFSB
[==] Running 1 test from 1 test suite.
[--] Global test environment set-up.
[--] 1 test from KeyCompare
[ RUN ] KeyCompare.CompareColumnsToRowsCuriousFSB
arrow/cpp/src/arrow/compute/row/compare_test.cc:103: Failure
Expected equality of these values:
num_rows_no_match
Which is: 7
1
[ FAILED ] KeyCompare.CompareColumnsToRowsCuriousFSB (2 ms)
[--] 1 test from KeyCompare (2 ms total)
[--] Global test environment tear-down
[==] 1 test from 1 test suite ran. (18 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] KeyCompare.CompareColumnsToRowsCuriousFSB
1 FAILED TEST
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2592121965
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
IIUC, here you want to load these bytes in little-endian to be further
processed by `CountTrailingZeros`. What you are doing is not leveraging
`SafeLoadUpTo8Bytes()`, which is supposed to load bytes in big-endian (and
currently not implemented), but write your own little-endian loading.
This should work. But I think we'd better do it the other way:
1. Implement the big-endian loading in `SafeLoadUpTo8Bytes()` (you already
did it in your previous commit), keep the call to it here, for both little- and
big-endian.
2. For big-endian, issue an explicit byte swapping for big-endian: ```#if
!ARROW_LITTLE_ENDIAN word = bit_util::ByteSwap(word); #endif```
This way, the code can be more compact and semantic clear. The cost is an
extra byte-swapping, which is trivial imho. cc @kou
##
cpp/src/arrow/compute/util.cc:
##
@@ -30,34 +30,40 @@ namespace util {
namespace bit_util {
inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
return util::SafeLoad(reinterpret_cast(bytes));
Review Comment:
So we are not going to update this function for big-endian because it won't
be called? If so, why don't we keep the above `DCHECK(false)`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2591499936
##
cpp/src/arrow/compute/util.cc:
##
@@ -299,7 +313,17 @@ void bytes_to_bits(int64_t hardware_flags, const int
num_bits, const uint8_t* by
}
int tail = num_bits % unroll;
if (tail) {
-uint64_t bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+uint64_t bytes_next;
+#if ARROW_LITTLE_ENDIAN
+bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+#else
+// On Big-endian systems, for bytes_to_bits, load all tail bytes in
little-endian
+// order to ensure compatibility with subsequent bit operations
+bytes_next = 0;
+for (int i = 0; i < tail; ++i) {
+ bytes_next |= static_cast((bytes + num_bits - tail)[i]) << (8
* i);
+}
+#endif
Review Comment:
@kou.. The SafeLoadUpTo8Bytes() function remains unchanged..
```
inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
return util::SafeLoad(reinterpret_cast(bytes));
} else {
uint64_t word = 0;
for (int i = 0; i < num_bytes; ++i) {
word |= static_cast(bytes[i]) << (8 * i);
}
return word;
}
}
```
The bytes_to_bits() function is handling the endianness fix independently..
If I do the endianness conversion inside SafeLoadUpTo8Bytes() function, rather
than here, then the testcase is not working..
```
if (tail) {
uint64_t bytes_next;
#if ARROW_LITTLE_ENDIAN
bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
#else
// On Big-endian systems, for bytes_to_bits, load all tail bytes in
little-endian
// order to ensure compatibility with subsequent bit operations
bytes_next = 0;
for (int i = 0; i < tail; ++i) {
bytes_next |= static_cast((bytes + num_bits - tail)[i]) <<
(8 * i);
}
#endif
bytes_next &= 0x0101010101010101ULL;
bytes_next |= (bytes_next >> 7); // Pairs of adjacent output bits in
individual bytes
bytes_next |= (bytes_next >> 14); // 4 adjacent output bits in
individual bytes
bytes_next |= (bytes_next >> 28); // All 8 output bits in the lowest
byte
bits[num_bits / 8] = static_cast(bytes_next & 0xff);
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on PR #48180: URL: https://github.com/apache/arrow/pull/48180#issuecomment-3611785936 > I thought that we need to convert to little endian. But is my assumption wrong...? If so, my suggested code was wrong. Sorry. Thanks for explaining. I'm not sure either. My assumption is that by `SafeLoad/Store` we need to preserve the machine endian. That is for example: ``` uint64_t value = SafeLoad(bytes); uint8_t *p_value = reinterprete_cast(&value); assert(p_value[0] == bytes[0]); assert(p_value[1] == bytes[1]); ... assert(p_value[7] == bytes[7]); ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
kou commented on PR #48180: URL: https://github.com/apache/arrow/pull/48180#issuecomment-3611737569 I thought that we need to convert to little endian. But is my assumption wrong...? If so, my suggested code was wrong. Sorry. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
zanmato1984 commented on PR #48180:
URL: https://github.com/apache/arrow/pull/48180#issuecomment-3611660521
> Does this work?
>
> ```diff
> diff --git a/cpp/src/arrow/compute/util.cc b/cpp/src/arrow/compute/util.cc
> index b90b3a6405..163a80d9d4 100644
> --- a/cpp/src/arrow/compute/util.cc
> +++ b/cpp/src/arrow/compute/util.cc
> @@ -30,33 +30,41 @@ namespace util {
> namespace bit_util {
>
> inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
> - // This will not be correct on big-endian architectures.
> -#if !ARROW_LITTLE_ENDIAN
> - ARROW_DCHECK(false);
> -#endif
>ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
>if (num_bytes == 8) {
> -return util::SafeLoad(reinterpret_cast(bytes));
> +auto word = util::SafeLoad(reinterpret_cast(bytes));
> +#if !ARROW_LITTLE_ENDIAN
> +word = bit_util::ByteSwap(word);
> +#endif
> +return word;
>} else {
> uint64_t word = 0;
> for (int i = 0; i < num_bytes; ++i) {
> +#if ARROW_LITTLE_ENDIAN
>word |= static_cast(bytes[i]) << (8 * i);
> +#else
> + word |= static_cast(bytes[num_bytes - 1 - i]) << (8 * i);
> +#endif
> }
> return word;
>}
> }
>
> inline void SafeStoreUpTo8Bytes(uint8_t* bytes, int num_bytes, uint64_t
value) {
> - // This will not be correct on big-endian architectures.
> -#if !ARROW_LITTLE_ENDIAN
> - ARROW_DCHECK(false);
> -#endif
>ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
>if (num_bytes == 8) {
> +#if ARROW_LITTLE_ENDIAN
> util::SafeStore(reinterpret_cast(bytes), value);
> +#else
> +util::SafeStore(reinterpret_cast(bytes),
bit_util::ByteSwap(value));
> +#endif
>} else {
> for (int i = 0; i < num_bytes; ++i) {
> +#if ARROW_LITTLE_ENDIAN
>bytes[i] = static_cast(value >> (8 * i));
> +#else
> + bytes[i] = static_cast(value >> (8 * (num_bytes - 1 - i)));
> +#endif
> }
>}
> }
> ```
Hi @kou, I have a question: why do we need to swap the bytes for big-endian
when `num_bytes == 8`? The underlying `util::SafeLoad/Store` are just `memcpy`
so the byte orders between `value` and the `bytes` should be the same right?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
kou commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2562038802
##
cpp/src/arrow/compute/util.cc:
##
@@ -47,17 +43,20 @@ inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes,
int num_bytes) {
}
inline void SafeStoreUpTo8Bytes(uint8_t* bytes, int num_bytes, uint64_t value)
{
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
util::SafeStore(reinterpret_cast(bytes), value);
Review Comment:
Hmm... I haven't seen the changes in
https://github.com/apache/arrow/pull/48180/files#diff-ecbf847c1b02ae3e2118d9d73fdb2a279e5b3107d1cdc4a7779a31b0fef9ed4cR35
.
I'm mentioning this part in
https://github.com/apache/arrow/pull/48180#pullrequestreview-3496008897 here:
```diff
+#if ARROW_LITTLE_ENDIAN
util::SafeStore(reinterpret_cast(bytes), value);
+#else
+util::SafeStore(reinterpret_cast(bytes),
bit_util::ByteSwap(value));
+#endif
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
kou commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2558343983
##
cpp/src/arrow/compute/util.cc:
##
@@ -47,17 +43,20 @@ inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes,
int num_bytes) {
}
inline void SafeStoreUpTo8Bytes(uint8_t* bytes, int num_bytes, uint64_t value)
{
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
util::SafeStore(reinterpret_cast(bytes), value);
Review Comment:
If we apply a change in
https://github.com/apache/arrow/pull/48180#pullrequestreview-3496008897 for
this part, `SafeStoreUpTo8Bytes()` works with `num_bytes = 8` on big endian?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
kou commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2562026695
##
cpp/src/arrow/compute/util.cc:
##
@@ -299,7 +313,17 @@ void bytes_to_bits(int64_t hardware_flags, const int
num_bits, const uint8_t* by
}
int tail = num_bits % unroll;
if (tail) {
-uint64_t bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+uint64_t bytes_next;
+#if ARROW_LITTLE_ENDIAN
+bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+#else
+// On Big-endian systems, for bytes_to_bits, load all tail bytes in
little-endian
+// order to ensure compatibility with subsequent bit operations
+bytes_next = 0;
+for (int i = 0; i < tail; ++i) {
+ bytes_next |= static_cast((bytes + num_bits - tail)[i]) << (8
* i);
+}
+#endif
Review Comment:
Could you share code you tried?
https://github.com/apache/arrow/pull/48180#pullrequestreview-3496008897
includes `word |= static_cast(bytes[num_bytes - 1 - i]) << (8 * i);`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2559828298
##
cpp/src/arrow/compute/util.cc:
##
@@ -47,17 +43,20 @@ inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes,
int num_bytes) {
}
inline void SafeStoreUpTo8Bytes(uint8_t* bytes, int num_bytes, uint64_t value)
{
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
util::SafeStore(reinterpret_cast(bytes), value);
Review Comment:
@kou, my recent code changes have been modified in this area..
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2559822759
##
cpp/src/arrow/compute/util.cc:
##
@@ -299,7 +313,17 @@ void bytes_to_bits(int64_t hardware_flags, const int
num_bits, const uint8_t* by
}
int tail = num_bits % unroll;
if (tail) {
-uint64_t bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+uint64_t bytes_next;
+#if ARROW_LITTLE_ENDIAN
+bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+#else
+// On Big-endian systems, for bytes_to_bits, load all tail bytes in
little-endian
+// order to ensure compatibility with subsequent bit operations
+bytes_next = 0;
+for (int i = 0; i < tail; ++i) {
+ bytes_next |= static_cast((bytes + num_bits - tail)[i]) << (8
* i);
+}
+#endif
Review Comment:
@kou.. I tried doing that but the testcase failed on s390x.. We need the
"bytes_next |= static_cast((bytes + num_bits - tail)[i]) << (8 * i);"
on big-endian, which we dont get it when we directly call the
SafeLoadUpTo8Bytes()..
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
kou commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2558343983
##
cpp/src/arrow/compute/util.cc:
##
@@ -47,17 +43,20 @@ inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes,
int num_bytes) {
}
inline void SafeStoreUpTo8Bytes(uint8_t* bytes, int num_bytes, uint64_t value)
{
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
util::SafeStore(reinterpret_cast(bytes), value);
Review Comment:
If we have apply a change in
https://github.com/apache/arrow/pull/48180#pullrequestreview-3496008897 for
this part, `SafeStoreUpTo8Bytes()` works with `num_bytes = 8` on big endian?
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +117,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Can we revert this change with the latest `SafeLoadUpTo8Bytes()` (that has
big endian support)?
##
cpp/src/arrow/compute/util.cc:
##
@@ -30,10 +30,6 @@ namespace util {
namespace bit_util {
inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
Review Comment:
I think that we need a change in
https://github.com/apache/arrow/pull/48180#pullrequestreview-3496008897 , to
remove this check.
##
cpp/src/arrow/compute/util.cc:
##
@@ -299,7 +313,17 @@ void bytes_to_bits(int64_t hardware_flags, const int
num_bits, const uint8_t* by
}
int tail = num_bits % unroll;
if (tail) {
-uint64_t bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+uint64_t bytes_next;
+#if ARROW_LITTLE_ENDIAN
+bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+#else
+// On Big-endian systems, for bytes_to_bits, load all tail bytes in
little-endian
+// order to ensure compatibility with subsequent bit operations
+bytes_next = 0;
+for (int i = 0; i < tail; ++i) {
+ bytes_next |= static_cast((bytes + num_bits - tail)[i]) << (8
* i);
+}
+#endif
Review Comment:
Can we revert this change with the latest `SafeLoadUpTo8Bytes()` (that has
big endian support)?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on PR #48180: URL: https://github.com/apache/arrow/pull/48180#issuecomment-3572197297 > Mostly looks good to me — just one thought after reading through the recent back-and-forth... > > Given the updated handling of tail bytes and the SafeLoadUpTo8Bytes discussion, I think this PR’s direction still makes sense. I’d just double-check that the tail==8 path really can’t happen with the current unroll logic, since @kou kou raised that question. Otherwise the fixes seem aligned with the latest comments. > > Willing to help test once the approach is finalized. @k8ika0s.. Thanks very much for your review on this.. Yeah sure.. You please go ahead and cherry-pick my PR patches and run the tests from your end.. Please let me know the final status.. Thanks.. !! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on PR #48180: URL: https://github.com/apache/arrow/pull/48180#issuecomment-3569820645 > Mostly looks good to me — just one thought after reading through the recent back-and-forth... > > Given the updated handling of tail bytes and the SafeLoadUpTo8Bytes discussion, I think this PR’s direction still makes sense. I’d just double-check that the tail==8 path really can’t happen with the current unroll logic, since @kou kou raised that question. Otherwise the fixes seem aligned with the latest comments. > > Willing to help test once the approach is finalized. Thanks @k8ika0s as well for your review comments.. I have checked the tail==8 code path, and its not required anymore. I have reverted the changes and pushed the code changes again.. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2555467301
##
cpp/src/arrow/compute/util.cc:
##
@@ -299,7 +313,22 @@ void bytes_to_bits(int64_t hardware_flags, const int
num_bits, const uint8_t* by
}
int tail = num_bits % unroll;
if (tail) {
-uint64_t bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+uint64_t bytes_next;
+#if ARROW_LITTLE_ENDIAN
+bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+#else
+if (tail == 8) {
Review Comment:
Thanks for pointing this out @kou..
I have tested by putting different debug prints at various different code
paths and I see that "tail" is never 8..
I have removed the "if loop" and pushed the code again.. Please take a
look.. Thanks..
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on PR #48180:
URL: https://github.com/apache/arrow/pull/48180#issuecomment-3569766657
> Does this work?
>
> ```diff
> diff --git a/cpp/src/arrow/compute/util.cc b/cpp/src/arrow/compute/util.cc
> index b90b3a6405..163a80d9d4 100644
> --- a/cpp/src/arrow/compute/util.cc
> +++ b/cpp/src/arrow/compute/util.cc
> @@ -30,33 +30,41 @@ namespace util {
> namespace bit_util {
>
> inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
> - // This will not be correct on big-endian architectures.
> -#if !ARROW_LITTLE_ENDIAN
> - ARROW_DCHECK(false);
> -#endif
>ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
>if (num_bytes == 8) {
> -return util::SafeLoad(reinterpret_cast(bytes));
> +auto word = util::SafeLoad(reinterpret_cast(bytes));
> +#if !ARROW_LITTLE_ENDIAN
> +word = bit_util::ByteSwap(word);
> +#endif
> +return word;
>} else {
> uint64_t word = 0;
> for (int i = 0; i < num_bytes; ++i) {
> +#if ARROW_LITTLE_ENDIAN
>word |= static_cast(bytes[i]) << (8 * i);
> +#else
> + word |= static_cast(bytes[num_bytes - 1 - i]) << (8 * i);
> +#endif
> }
> return word;
>}
> }
>
> inline void SafeStoreUpTo8Bytes(uint8_t* bytes, int num_bytes, uint64_t
value) {
> - // This will not be correct on big-endian architectures.
> -#if !ARROW_LITTLE_ENDIAN
> - ARROW_DCHECK(false);
> -#endif
>ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
>if (num_bytes == 8) {
> +#if ARROW_LITTLE_ENDIAN
> util::SafeStore(reinterpret_cast(bytes), value);
> +#else
> +util::SafeStore(reinterpret_cast(bytes),
bit_util::ByteSwap(value));
> +#endif
>} else {
> for (int i = 0; i < num_bytes; ++i) {
> +#if ARROW_LITTLE_ENDIAN
>bytes[i] = static_cast(value >> (8 * i));
> +#else
> + bytes[i] = static_cast(value >> (8 * (num_bytes - 1 - i)));
> +#endif
> }
>}
> }
> ```
Hi @kou ,
I have now reverted the changes done to "SafeLoadUpTo8Bytes()" function on
s390x.. Its totally not required.. Thanks..
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
k8ika0s commented on PR #48180: URL: https://github.com/apache/arrow/pull/48180#issuecomment-3568429612 Mostly looks good to me — just one thought after reading through the recent back-and-forth... Given the updated handling of tail bytes and the SafeLoadUpTo8Bytes discussion, I think this PR’s direction still makes sense. I’d just double-check that the tail==8 path really can’t happen with the current unroll logic, since @kou kou raised that question. Otherwise the fixes seem aligned with the latest comments. Willing to help test once the approach is finalized. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
kou commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2552360230
##
cpp/src/arrow/compute/util.cc:
##
@@ -299,7 +313,22 @@ void bytes_to_bits(int64_t hardware_flags, const int
num_bits, const uint8_t* by
}
int tail = num_bits % unroll;
if (tail) {
-uint64_t bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+uint64_t bytes_next;
+#if ARROW_LITTLE_ENDIAN
+bytes_next = SafeLoadUpTo8Bytes(bytes + num_bits - tail, tail);
+#else
+if (tail == 8) {
Review Comment:
Does this happen?
I think that `tail` must not be `8` because `unroll` is always `8` and `tail
= num_bits % unroll`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
Vishwanatha-HD commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2545344185
##
cpp/src/arrow/compute/util.cc:
##
@@ -30,34 +30,40 @@ namespace util {
namespace bit_util {
inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
return util::SafeLoad(reinterpret_cast(bytes));
Review Comment:
Thanks for pointing out this.. Now with the way we are handling the
tail_bytes and loading the word data, we dont actually need to change
"SafeLoadUpTo8Bytes()" function.. With the conditional compilation, this
function will never be called on Big-endian architecture.
I have reverted this change.. Tested completely on s390x to see if all the
test work. I have pushed a new commit. Please give your review comments. Thanks.
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Now that I have removed the big-endian support to SafeLoadUpTo8Bytes()
function, these changes are required as these handle the way we handle the
tail_bytes on big-endian systems. If the tail_bytes are equal to 8, then we
call directly the SafeLoad to load the data onto "word" variable. And for rest
other cases, we need to take care of loading least significant bits to ensure
compatibility with "CountTrailingZeros". This is the reason why we wont be able
to make a direct call "SafeLoadUpTo8Bytes()" for every tail_bytes.
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
I have fixed the lint errors and pushed my changes. Thanks..
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
kou commented on code in PR #48180:
URL: https://github.com/apache/arrow/pull/48180#discussion_r2543957465
##
cpp/src/arrow/compute/util.cc:
##
@@ -30,34 +30,40 @@ namespace util {
namespace bit_util {
inline uint64_t SafeLoadUpTo8Bytes(const uint8_t* bytes, int num_bytes) {
- // This will not be correct on big-endian architectures.
-#if !ARROW_LITTLE_ENDIAN
- ARROW_DCHECK(false);
-#endif
ARROW_DCHECK(num_bytes >= 0 && num_bytes <= 8);
if (num_bytes == 8) {
return util::SafeLoad(reinterpret_cast(bytes));
Review Comment:
Does this work on big-endian system?
##
cpp/src/arrow/compute/util.cc:
##
@@ -118,7 +124,22 @@ void bits_to_indexes_internal(int64_t hardware_flags,
const int num_bits,
// Optionally process the last partial word with masking out bits outside
range
if (tail) {
const uint8_t* bits_tail = bits + (num_bits - tail) / 8;
+#if ARROW_LITTLE_ENDIAN
uint64_t word = SafeLoadUpTo8Bytes(bits_tail, (tail + 7) / 8);
+#else
+int tail_bytes = (tail + 7) / 8;
+uint64_t word;
+if (tail_bytes == 8) {
+ word = util::SafeLoad(reinterpret_cast(bits_tail));
+} else {
+ // For bit manipulation, always load into least significant bits
+ // to ensure compatibility with CountTrailingZeros on Big-endian systems
+ word = 0;
+ for (int i = 0; i < tail_bytes; ++i) {
+word |= static_cast(bits_tail[i]) << (8 * i);
+ }
+}
+#endif
Review Comment:
Why do we need this?
The `SafeLoadUpTo8Bytes()` change adds support for big-endian, right?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
Re: [PR] GH-48177: [C++][Parquet] Fix arrow-acero-asof-join-node-test failures on s390x [arrow]
github-actions[bot] commented on PR #48180: URL: https://github.com/apache/arrow/pull/48180#issuecomment-3554765984 :warning: GitHub issue #48177 **has been automatically assigned in GitHub** to PR creator. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
