Hello!

Currently the Pattern.splitAsStream JavaDoc says [1]:

 * <p> If the input sequence is mutable, it must remain constant during the
 * execution of the terminal stream operation.  Otherwise, the result of the
 * terminal stream operation is undefined.

However, in reality the sequence must remain constant from stream
creation until the end of the terminal operation. Let's check:

import java.util.regex.Pattern;
import java.util.stream.Stream;

public class SplitAsStreamCheck {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("a,b,c,d,e");
        Stream<String> stream = Pattern.compile(",").splitAsStream(sb);
        // Modify the CharSequence after stream creation
        sb.setLength(3);
        // During the terminal operation it remains constant
        stream.forEach(System.out::println);
    }
}

The result is:
a
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 3
    at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:210)
    at java.lang.StringBuilder.charAt(StringBuilder.java:76)
...

So I feel either the JavaDoc or the implementation should be changed.
Changing the implementation to fit the JavaDoc is quite simple. See
the attached pattern-patch.txt and pattern-patch2.txt for two possible
alternatives.
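
To illustrate the idea behind the Supplier-based alternative outside of
Pattern itself, here is a minimal sketch. The spliterator is created
through a Supplier, so the input is not read until the terminal operation
starts. The class name LateBindingSplit and the helper lazySplit are
hypothetical and exist only for this illustration, and the sketch uses
Pattern.split internally rather than the actual MatcherIterator.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class LateBindingSplit {
    // Hypothetical helper: defers reading the input until the terminal
    // operation asks the Supplier for a spliterator.
    static Stream<String> lazySplit(Pattern pattern, CharSequence input) {
        int characteristics = Spliterator.ORDERED | Spliterator.NONNULL;
        return StreamSupport.stream(
                () -> {
                    // Runs only when the terminal operation starts
                    Iterator<String> it =
                            Arrays.asList(pattern.split(input)).iterator();
                    return Spliterators.spliteratorUnknownSize(it, characteristics);
                },
                characteristics, false);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("a,b,c,d,e");
        Stream<String> stream = lazySplit(Pattern.compile(","), sb);
        // Modify the CharSequence after stream creation
        sb.setLength(3);
        List<String> parts = stream.collect(Collectors.toList());
        System.out.println(parts); // prints [a, b]
    }
}
```

With the Supplier in place, the truncation done before the terminal
operation is simply seen as the new input, so no exception is thrown.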

What do you think?

With best regards,
Tagir Valeev.

[1] http://hg.openjdk.java.net/jdk9/dev/jdk/file/3c3a5343044c/src/java.base/share/classes/java/util/regex/Pattern.java#l5803
diff --git a/src/java.base/share/classes/java/util/regex/Pattern.java b/src/java.base/share/classes/java/util/regex/Pattern.java
--- a/src/java.base/share/classes/java/util/regex/Pattern.java
+++ b/src/java.base/share/classes/java/util/regex/Pattern.java
@@ -5878,7 +5878,8 @@
                 }
             }
         }
-        return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
-                new MatcherIterator(), Spliterator.ORDERED | Spliterator.NONNULL), false);
+        int characteristics = Spliterator.ORDERED | Spliterator.NONNULL;
+        return StreamSupport.stream(() -> Spliterators.spliteratorUnknownSize(
+                new MatcherIterator(), characteristics), characteristics, false);
     }
 }
diff --git a/src/java.base/share/classes/java/util/regex/Pattern.java b/src/java.base/share/classes/java/util/regex/Pattern.java
--- a/src/java.base/share/classes/java/util/regex/Pattern.java
+++ b/src/java.base/share/classes/java/util/regex/Pattern.java
@@ -5814,7 +5814,7 @@
      */
     public Stream<String> splitAsStream(final CharSequence input) {
         class MatcherIterator implements Iterator<String> {
-            private final Matcher matcher;
+            private Matcher matcher;
             // The start position of the next sub-sequence of input
             // when current == input.length there are no more elements
             private int current;
@@ -5823,14 +5823,6 @@
             // > 0 if there are N next empty elements
             private int emptyElementCount;
 
-            MatcherIterator() {
-                this.matcher = matcher(input);
-                // If the input is an empty string then the result can only be a
-                // stream of the input.  Induce that by setting the empty
-                // element count to 1
-                this.emptyElementCount = input.length() == 0 ? 1 : 0;
-            }
-
             public String next() {
                 if (!hasNext())
                     throw new NoSuchElementException();
@@ -5846,6 +5838,13 @@
             }
 
             public boolean hasNext() {
+                if (matcher == null) {
+                    matcher = matcher(input);
+                    // If the input is an empty string then the result can only be a
+                    // stream of the input.  Induce that by setting the empty
+                    // element count to 1
+                    emptyElementCount = input.length() == 0 ? 1 : 0;
+                }
                 if (nextElement != null || emptyElementCount > 0)
                     return true;
