pitrou commented on code in PR #36972:
URL: https://github.com/apache/arrow/pull/36972#discussion_r1288794380
##########
cpp/src/parquet/encoding.cc:
##########
@@ -340,32 +340,28 @@ class PlainEncoder<BooleanType> : public EncoderImpl, virtual public BooleanEnco
       throw ParquetException("direct put to boolean from " + values.type()->ToString() +
                              " not supported");
     }
-
     const auto& data = checked_cast<const ::arrow::BooleanArray&>(values);
+
     if (data.null_count() == 0) {
-      PARQUET_THROW_NOT_OK(sink_.Reserve(bit_util::BytesForBits(data.length())));
-      // no nulls, just dump the data
-      ::arrow::internal::CopyBitmap(data.data()->GetValues<uint8_t>(1), data.offset(),
-                                    data.length(), sink_.mutable_data(), sink_.length());
+      ArrowPoolVector<bool> boolean_data(data.length(), this->memory_pool());
+      for (int i = 0; i < data.length(); ++i) {
+        boolean_data[i] = data.Value(i);
+      }
+      PutImpl(boolean_data, static_cast<int>(data.length()));
Review Comment:
You're right, it seems unused (by us!) for now. But ideally we _should_ use
it in `TypedColumnWriterImpl<BooleanType>::WriteArrowDense` (see
`column_writer.cc`), instead of wasting CPU time unpacking then repacking the
boolean bitmap.