cyb70289 commented on a change in pull request #9758:
URL: https://github.com/apache/arrow/pull/9758#discussion_r629076290



##########
File path: cpp/src/arrow/compute/kernels/aggregate_basic_internal.h
##########
@@ -256,8 +262,8 @@ struct MinMaxImpl : public ScalarAggregator {
     using ScalarType = typename TypeTraits<ArrowType>::ScalarType;
 
     std::vector<std::shared_ptr<Scalar>> values;
-    if (!state.has_values ||
-        (state.has_nulls && options.null_handling == 
MinMaxOptions::EMIT_NULL)) {
+    if (options.min_count > 0 &&

Review comment:
       Current code returns maximal numerical limit for `min([], min_count=0)`, 
this is strange.
   I think `min_count = 0` is not a valid case for min_max, we can verify the 
options and fail if it's 0.
   Further more, `min_count` here is only checked for 0 or non 0, so basically 
it's useless if 0 is invalid. We can simply ignore it. Or improve min_max 
kernel (count valid values) as sum kernel does.

##########
File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc
##########
@@ -75,48 +75,55 @@ struct CountImpl : public ScalarAggregator {
 
   Status Finalize(KernelContext* ctx, Datum* out) override {
     const auto& state = checked_cast<const CountImpl&>(*ctx->state());
-    switch (state.options.count_mode) {
-      case CountOptions::COUNT_NON_NULL:
-        *out = Datum(state.non_nulls);
-        break;
-      case CountOptions::COUNT_NULL:
-        *out = Datum(state.nulls);
-        break;
-      default:
-        return Status::Invalid("Unknown CountOptions encountered");
+    if (state.options.skip_nulls) {
+      *out = Datum(state.non_nulls);
+    } else {
+      *out = Datum(state.nulls);

Review comment:
       `skip_nulls = true` matches `COUNT_NON_NULL`, the meaning is clear.
   
   `skip_nulls = false` is a bit confusing. E.g., `count(list, 
skip_nulls=False)` should return the total length including nulls, but it 
actually returns only #nulls.
   I'm okay to accept current code, just wondering if there's better way.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to