Copilot commented on code in PR #12395:
URL: https://github.com/apache/gluten/pull/12395#discussion_r3493481722


##########
cpp/velox/compute/delta/DeltaDeletionVectorReader.cpp:
##########
@@ -183,15 +183,29 @@ void 
DeltaDeletionVectorReader::applyDeletionFilter(uint64_t baseReadOffset, uin
   auto* rawBitmap = deleteBitmap->asMutable<uint64_t>();
   std::memset(rawBitmap, 0, bits::nbytes(size));
 
+  // Use an iterator-based approach instead of per-row contains() lookups.
+  // This is O(deletions_in_range) rather than O(batch_size), which is
+  // significantly faster when deletions are sparse relative to batch size.
+  const uint64_t rangeEnd = baseReadOffset + size;

Review Comment:
   `rangeEnd` is computed as `baseReadOffset + size` without overflow 
protection. If `baseReadOffset` is near `UINT64_MAX`, this wraps and can cause 
the loop to stop early (missing deletions) or compute an out-of-range 
`relativeIndex` for `bits::setBit`.



##########
cpp/velox/benchmarks/DeltaBitmapBenchmark.cc:
##########
@@ -276,4 +280,69 @@ BENCHMARK_CAPTURE(
     ->Args({1 << 18, 64})
     ->Unit(benchmark::kMillisecond);
 
+// Benchmark for applyDeletionFilter: measures the hot path where a batch of
+// rows is checked against the deletion vector bitmap.
+// deletionPercent: fraction of rows in the total file that are deleted.
+// batchSize: number of rows per batch (typical Velox batch size).
+void BM_ApplyDeletionFilter(benchmark::State& state, double deletionPercent) {
+  const auto batchSize = static_cast<uint64_t>(state.range(0));
+  const uint64_t totalFileRows = 1000000; // 1M row file
+  const auto numDeleted = static_cast<uint64_t>(totalFileRows * 
deletionPercent / 100.0);
+
+  // Build a DV with deletions spread across the file.
+  RoaringBitmapArray bitmap;
+  const uint64_t stride = numDeleted > 0 ? totalFileRows / numDeleted : 0;
+  for (uint64_t i = 0; i < numDeleted; ++i) {
+    bitmap.addSafe(i * stride);
+  }
+  const auto payload = bitmap.serializeToString(true);
+
+  // Load the DV reader.
+  DeltaDeletionVectorReader reader;
+  reader.loadSerializedDeletionVector(std::string_view(payload.data(), 
payload.size()));
+
+  // Allocate the output bitmap buffer.
+  memory::MemoryManager::testingSetInstance(memory::MemoryManager::Options{});
+  auto pool = memory::memoryManager()->addLeafPool();

Review Comment:
   `memory::MemoryManager::testingSetInstance(...)` is called inside 
`BM_ApplyDeletionFilter`, so it will run once per benchmark capture (and 
potentially multiple times per process). In this codebase it’s typically 
initialized once (e.g., in test suite setup / benchmark main), and 
re-initializing a global singleton can be unsafe or leak resources.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to