EnricoMi commented on code in PR #48724:
URL: https://github.com/apache/arrow/pull/48724#discussion_r2664633686


##########
cpp/src/arrow/compute/kernels/vector_selection_test.cc:
##########
@@ -513,11 +552,13 @@ TYPED_TEST(TestFilterKernelWithNumeric, CompareScalarAndFilterRandomNumeric) {
   using CType = typename TypeTraits<TypeParam>::CType;
 
   auto rand = random::RandomArrayGenerator(kRandomSeed);
+  std::default_random_engine gen(kRandomSeed);
+  ::arrow::random::uniform_real_distribution<double> null_dist(0.0, 1.0);
   for (size_t i = 3; i < 10; i++) {
     const int64_t length = static_cast<int64_t>(1ULL << i);
-    // TODO(bkietz) rewrite with some nulls
-    auto array =
-        checked_pointer_cast<ArrayType>(rand.Numeric<TypeParam>(length, 0, 100, 0));
+    double null_probability = null_dist(gen);

Review Comment:
   Why do we need `null_dist` and `gen` when all we use is a `double` representing a null probability?
   
   Why not simply use a fixed value:
   ```suggestion
       double null_probability = 0.1;
   ```
   Can you elaborate on this, please?
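
For context on what the random draw contributes, here is a minimal sketch of the pattern in the diff. It assumes `std::uniform_real_distribution` as a stand-in for `::arrow::random::uniform_real_distribution`, and `DrawNullProbabilities` is a hypothetical helper, not code from the PR. With a fixed seed, the drawn probabilities are reproducible within one build, but their actual values are not visible in the test source.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of the PR): draws one null probability per
// loop iteration from a seeded engine, mirroring `null_dist(gen)` in the diff.
// Assumption: std::uniform_real_distribution stands in for Arrow's own
// uniform_real_distribution.
inline std::vector<double> DrawNullProbabilities(unsigned seed, std::size_t count) {
  std::default_random_engine gen(seed);
  std::uniform_real_distribution<double> null_dist(0.0, 1.0);
  std::vector<double> probabilities;
  probabilities.reserve(count);
  for (std::size_t k = 0; k < count; ++k) {
    // Each call advances the engine, so every iteration gets a new value.
    probabilities.push_back(null_dist(gen));
  }
  return probabilities;
}
```

Calling `DrawNullProbabilities(seed, 7)` twice with the same seed gives the same seven values, which is the property the test relies on. A literal such as `0.1` gives up the variation across iterations in exchange for a value a reader can see directly.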



##########
cpp/src/arrow/compute/kernels/vector_selection_test.cc:
##########
@@ -513,11 +552,13 @@ TYPED_TEST(TestFilterKernelWithNumeric, CompareScalarAndFilterRandomNumeric) {
   using CType = typename TypeTraits<TypeParam>::CType;
 
   auto rand = random::RandomArrayGenerator(kRandomSeed);
+  std::default_random_engine gen(kRandomSeed);
+  ::arrow::random::uniform_real_distribution<double> null_dist(0.0, 1.0);
   for (size_t i = 3; i < 10; i++) {
     const int64_t length = static_cast<int64_t>(1ULL << i);
-    // TODO(bkietz) rewrite with some nulls
-    auto array =
-        checked_pointer_cast<ArrayType>(rand.Numeric<TypeParam>(length, 0, 100, 0));
+    double null_probability = null_dist(gen);

Review Comment:
   I guess you want different null probabilities across iterations of `i`.
   
   We could be specific about the actual null probabilities used:
   
   ```suggestion
       double null_probability = 1.0 / i;
   ```
   
   or
   
   ```suggestion
       double null_probability = (i - 3.0) / i;
   ```
   if `null_probability == 0` has to be among the test cases.
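
One detail about the arithmetic: `i` is a `size_t` in this loop, so a quotient of two integer operands such as `(i - 3) / i` truncates to 0 for every `i` from 3 to 9. The sketch below converts to `double` before dividing. Both function names are hypothetical and exist only to illustrate the difference.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the PR): deterministic null probability
// for loop index i, where the test loop runs i = 3..9.
inline double NullProbabilityForIteration(std::size_t i) {
  assert(i >= 3);  // i - 3 would wrap around for smaller unsigned values
  // Convert before dividing so the quotient is computed in floating point.
  return static_cast<double>(i - 3) / static_cast<double>(i);
}

// The same expression evaluated on integer operands, shown for contrast.
inline double TruncatedNullProbability(std::size_t i) {
  assert(i >= 3);
  // Integer division happens first; the result is 0 whenever i - 3 < i.
  return static_cast<double>((i - 3) / i);
}
```

With the floating-point version the loop covers a null probability of 0 at `i == 3` and rises to 6/9 at `i == 9`, so the test cases are explicit in the source.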


