paleolimbot commented on code in PR #555:
URL: https://github.com/apache/arrow-nanoarrow/pull/555#discussion_r1683116808
##########
src/nanoarrow/common/inline_array.h:
##########
@@ -740,6 +740,31 @@ static inline int8_t ArrowArrayViewIsNull(const struct
ArrowArrayView* array_vie
}
}
+static inline int64_t ArrowArrayViewGetNullCount(
+ const struct ArrowArrayView* array_view) {
+ if (array_view->null_count != -1) {
+ return array_view->null_count;
+ }
+
+ switch (array_view->storage_type) {
+ case NANOARROW_TYPE_NA:
+ return array_view->length;
+ case NANOARROW_TYPE_DENSE_UNION:
+ case NANOARROW_TYPE_SPARSE_UNION:
+ // Unions are "never null" in Arrow land
+ return 0;
+ default:
+ break;
+ }
+
+ const uint8_t* validity_buffer = array_view->buffer_views[0].data.as_uint8;
+ if (validity_buffer) {
Review Comment:
For better or worse, all of our existing null checks in nanoarrow are
explicit (i.e., `if (validity_buffer == NULL) {`)
##########
src/nanoarrow/common/inline_array.h:
##########
@@ -740,6 +740,31 @@ static inline int8_t ArrowArrayViewIsNull(const struct
ArrowArrayView* array_vie
}
}
+static inline int64_t ArrowArrayViewGetNullCount(
+ const struct ArrowArrayView* array_view) {
+ if (array_view->null_count != -1) {
+ return array_view->null_count;
+ }
+
+ switch (array_view->storage_type) {
+ case NANOARROW_TYPE_NA:
+ return array_view->length;
+ case NANOARROW_TYPE_DENSE_UNION:
+ case NANOARROW_TYPE_SPARSE_UNION:
+ // Unions are "never null" in Arrow land
+ return 0;
Review Comment:
Are these cases tested?
##########
src/nanoarrow/common/inline_array.h:
##########
@@ -740,6 +740,31 @@ static inline int8_t ArrowArrayViewIsNull(const struct
ArrowArrayView* array_vie
}
}
+static inline int64_t ArrowArrayViewGetNullCount(
Review Comment:
This would also need an entry in `nanoarrow.h` with documentation. I have a
slight preference towards having this be `ArrowArrayViewComputeNullCount()`
(with the user choosing if they want to trust the existing null count or not).
A dedicated PR for this would probably make sense (it could also use this to
replace some Python-specific functionality:
https://github.com/apache/arrow-nanoarrow/blob/fcf3a809f8b8f8facfb8f29284c006429cc91d49/python/src/nanoarrow/_lib.pyx#L355-L373
)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]