Re: [PR] Implement method to add all stream elements into a PriorityQueue. [lucene]

via GitHub Mon, 30 Mar 2026 09:48:27 -0700


uschindler commented on code in PR #15823:
URL: https://github.com/apache/lucene/pull/15823#discussion_r3010979109



##########
lucene/core/src/java/org/apache/lucene/util/PriorityQueue.java:
##########
@@ -174,6 +177,38 @@ public void addAll(Collection<T> elements) {
     }
   }
 
+  /**
+   * Adds all elements of the stream into the queue. This method should be 
preferred over calling
+   * {@link #add(Object)} in loop if all elements are known in advance as it 
builds queue faster.
+   *
+   * <p>If one needs to map or filter element in the iteration of elements in 
this method, call this
+   * method with elements wrapped by {@link Stream#map(Function)} or {@link
+   * Stream#filter(Predicate)}, etc. In these cases, this method should be 
preferred over calling
+   * {@link #addAll(Collection)}.
+   *
+   * <p>If one tries to add more objects than the maxSize passed in the 
constructor, an {@link
+   * ArrayIndexOutOfBoundsException} is thrown. Which may result in parts of 
elements added into the
+   * queue, but the heap is still stay in correct state. In this case, if 
caller wants to readd or
+   * {@link #updateTop(Object)} with remaining elements, it should use a new 
stream, and use {@link
+   * Stream#skip(long)} to skip consumed elements with the delta size of queue.
+   */
+  public void addAll(Stream<T> elements) {
+    // Heap with size S always takes first S elements of the array,
+    // and thus it's safe to fill array further - no actual non-sentinel value 
will be overwritten.
+    try {
+      elements.forEachOrdered(
+          element -> {
+            this.heap[size + 1] = element;
+            this.size++;

Review Comment:
   > No doubt I likely have a misunderstanding here :). I don't often leverage 
streams to be honest.
   > 
   > If I'm understanding, your point here is that `forEachOrdered` is going to 
guarantee sequential invocation. Is that correct? I took a look at the 
[documentation](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/stream/Stream.html#forEachOrdered(java.util.function.Consumer))[1]
 and it seems to highlight this point as well.
   
   Yes, that's what I am after. The difference between `forEach()` and `forEachOrdered()` is just the order. Because the order is defined (the iteration order of the stream), you can repeat the operation after an exception and step over the elements that were already processed (using `skip()`, as mentioned in the docs here).
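   
   A minimal sketch of that skip-based resume, not the PR's code: `Bounded` is a hypothetical stand-in for the queue's fixed-size backing array, and `demo()` is an illustrative helper. It relies only on `forEachOrdered()` performing the action in encounter order, so the number of elements stored before the exception is also the number to skip on a fresh stream.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.stream.Stream;
   
   public class ResumeWithSkip {
     /** Hypothetical fixed-capacity buffer; stands in for the queue's heap array. */
     static final class Bounded {
       final Object[] slots;
       int size;
   
       Bounded(int capacity) {
         slots = new Object[capacity];
       }
   
       void addAll(Stream<String> elements) {
         // Elements arrive one at a time, in encounter order.
         elements.forEachOrdered(
             element -> {
               slots[size] = element; // throws ArrayIndexOutOfBoundsException when full
               size++;
             });
       }
     }
   
     /** Fills a buffer of capacity 3 from five elements and returns what did not fit. */
     static List<String> demo() {
       List<String> source = List.of("a", "b", "c", "d", "e");
       Bounded buffer = new Bounded(3);
       List<String> leftover = new ArrayList<>();
       int sizeBefore = buffer.size;
       try {
         buffer.addAll(source.stream());
       } catch (ArrayIndexOutOfBoundsException e) {
         // The size delta tells us how many elements were consumed before the failure.
         int consumed = buffer.size - sizeBefore;
         // A stream cannot be reused, so open a new one and skip the consumed prefix.
         source.stream().skip(consumed).forEachOrdered(leftover::add);
       }
       return leftover;
     }
   
     public static void main(String[] args) {
       System.out.println(demo());
     }
   }
   ```
   
   With a capacity of 3 and five source elements, `demo()` returns `[d, e]`.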
   
   > Apologies for the confusion. I should have looked at this documentation 
first.
   
   No problem. Parallel streams often make people think they need to guard against this, but it is only an issue if a non-terminal operation interferes with the stream's source, e.g., if the lambda passed to `map()` modifies the underlying collection. `forEachOrdered()` is a terminal operation, so you can be sure it runs in order, with happens-before between elements.
    
   > [1] "Performing the action for one element 
[**happens-before**](https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/concurrent/package-summary.html#MemoryVisibility)
 performing the action for subsequent elements"
   
   Thanks for that documentation hint; it confirms that the for-each action runs sequentially and is not itself parallelized. In fact, `forEachOrdered()` and `iterator()` give up parallelism in the consuming step anyway (plain `forEach()` makes no such guarantee on a parallel stream), which is why operations like `collect()` or `reduce()` are a much better fit when you have a parallel stream.
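   
   To illustrate the difference, here is a small sketch (the class and method names are made up for this example). Both methods start from a parallel stream; the first funnels every element through a single ordered action, the second lets `collect()` build partial results and merge them.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.stream.Collectors;
   import java.util.stream.IntStream;
   
   public class OrderedVsCollect {
     /** Consumes a parallel stream through one ordered, non-concurrent action. */
     static List<Integer> viaForEachOrdered(int n) {
       List<Integer> out = new ArrayList<>();
       // Safe with a plain ArrayList: the action for one element happens-before
       // the action for the next, so there are no concurrent add() calls.
       IntStream.range(0, n).parallel().boxed().forEachOrdered(out::add);
       return out;
     }
   
     /** Lets the stream accumulate partial lists and merge them in encounter order. */
     static List<Integer> viaCollect(int n) {
       return IntStream.range(0, n).parallel().boxed().collect(Collectors.toList());
     }
   
     public static void main(String[] args) {
       System.out.println(viaForEachOrdered(1000).equals(viaCollect(1000)));
     }
   }
   ```
   
   Both variants produce the same ordered list; only the second leaves the accumulation step free to run in parallel.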
   
   `forEachOrdered()` effectively turns the stream into a for-each loop. Alternatively, we could also call `iterator()` on the stream and consume the iterator like this:
   
   ```java
   // `elements` is the Stream<T> parameter of addAll(Stream<T>)
   Iterable<T> it = elements::iterator;
   for (T element : it) {
     this.heap[size + 1] = element;
     this.size++;
   }
   ```
   
   Of course, that's not really more readable.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

