pitrou commented on code in PR #48203:
URL: https://github.com/apache/arrow/pull/48203#discussion_r2556419213
##########
cpp/src/parquet/decoder.cc:
##########
@@ -2306,6 +2345,17 @@ class ByteStreamSplitDecoder<FLBAType> : public
ByteStreamSplitDecoderBase<FLBAT
const int num_decoded = this->DecodeRaw(decode_out, max_values);
DCHECK_EQ(num_decoded, max_values);
+#if !ARROW_LITTLE_ENDIAN
+ // On big-endian, ByteStreamSplitDecode (DoMergeStreams) reverses stream
positions
+ // to produce numeric values in native byte order. For FLBA (opaque byte
arrays),
+ // we need to undo this reversal to preserve the original byte sequence.
Review Comment:
Hmm, doing a second byteswap to undo the first one sounds entirely wasteful.
Perhaps the ByteStreamSplit routines should instead take an optional flag to do
the byte-swapping?
For example have something like:
```c++
// If fix_endianness is `true`, output will be converted from little endian
to native endian
inline void ByteStreamSplitDecode(const uint8_t* data, int width, int64_t
num_values,
int64_t stride, bool fix_endianness,
uint8_t* out) {
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]