Hello!
Currently Pattern.splitAsStream JavaDoc says [1]:
* <p> If the input sequence is mutable, it must remain constant during the
* execution of the terminal stream operation. Otherwise, the result of the
* terminal stream operation is undefined.
However, in reality the sequence must remain constant from the moment the
stream is created until the terminal operation completes. Let's check:
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("a,b,c,d,e");
        Stream<String> stream = Pattern.compile(",").splitAsStream(sb);
        // Modify the CharSequence after stream creation
        sb.setLength(3);
        // During the terminal operation it remains constant
        stream.forEach(System.out::println);
    }
The result is:
a
Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 3
    at java.lang.AbstractStringBuilder.charAt(AbstractStringBuilder.java:210)
    at java.lang.StringBuilder.charAt(StringBuilder.java:76)
    ...
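The trace suggests that the Matcher, which splitAsStream creates eagerly
inside MatcherIterator, keeps using the input length it saw at creation
time. Here is a minimal sketch of that assumption outside of the stream
machinery. The class and method names are only illustrative, and the
behaviour is what I would expect from the reference implementation, not
something the specification promises:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative class name; only Pattern, Matcher and StringBuilder are real JDK types.
final class EagerMatcherDemo {

    // Returns true if searching fails with an index error after the input shrinks.
    static boolean failsAfterShrink() {
        StringBuilder sb = new StringBuilder("a,b,c,d,e");
        Matcher matcher = Pattern.compile(",").matcher(sb); // created eagerly
        sb.setLength(3); // shrink the sequence to "a,b"
        try {
            while (matcher.find()) {
                // keep searching past the new end of the sequence
            }
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAfterShrink());
    }
}
```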
So I feel either the JavaDoc or the implementation should be changed.
Changing the implementation to fit the JavaDoc is quite simple. See
the attached pattern-patch.txt and pattern-patch2.txt for two possible
alternatives.
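To illustrate the late-binding idea without touching Pattern itself, here
is a rough user-level sketch. It assumes a hypothetical helper named
lazySplit, which is not part of the JDK and not taken from either patch.
It wraps splitAsStream in a Supplier, so nothing reads the input until the
terminal operation starts:

```java
import java.util.List;
import java.util.Spliterator;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper class, not part of the JDK and not taken from either patch.
final class LateBindingSplit {

    // Returns a stream that does not touch the input until a terminal operation starts.
    static Stream<String> lazySplit(Pattern pattern, CharSequence input) {
        // Same characteristics that splitAsStream passes to spliteratorUnknownSize.
        int characteristics = Spliterator.ORDERED | Spliterator.NONNULL;
        return StreamSupport.stream(
                () -> pattern.splitAsStream(input).spliterator(),
                characteristics, false);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("a,b,c,d,e");
        Stream<String> stream = lazySplit(Pattern.compile(","), sb);
        sb.setLength(3); // the sequence is now "a,b"
        List<String> parts = stream.collect(Collectors.toList());
        System.out.println(parts); // prints [a, b]
    }
}
```

With this wrapper the modification made before the terminal operation is
simply observed, which is what the current JavaDoc wording seems to allow.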
What do you think?
With best regards,
Tagir Valeev.
[1]
http://hg.openjdk.java.net/jdk9/dev/jdk/file/3c3a5343044c/src/java.base/share/classes/java/util/regex/Pattern.java#l5803
diff --git a/src/java.base/share/classes/java/util/regex/Pattern.java b/src/java.base/share/classes/java/util/regex/Pattern.java
--- a/src/java.base/share/classes/java/util/regex/Pattern.java
+++ b/src/java.base/share/classes/java/util/regex/Pattern.java
@@ -5878,7 +5878,8 @@
}
}
}
- return StreamSupport.stream(Spliterators.spliteratorUnknownSize(
- new MatcherIterator(), Spliterator.ORDERED | Spliterator.NONNULL), false);
+ int characteristics = Spliterator.ORDERED | Spliterator.NONNULL;
+ return StreamSupport.stream(() -> Spliterators.spliteratorUnknownSize(
+ new MatcherIterator(), characteristics), characteristics, false);
}
}
diff --git a/src/java.base/share/classes/java/util/regex/Pattern.java b/src/java.base/share/classes/java/util/regex/Pattern.java
--- a/src/java.base/share/classes/java/util/regex/Pattern.java
+++ b/src/java.base/share/classes/java/util/regex/Pattern.java
@@ -5814,7 +5814,7 @@
*/
public Stream<String> splitAsStream(final CharSequence input) {
class MatcherIterator implements Iterator<String> {
- private final Matcher matcher;
+ private Matcher matcher;
// The start position of the next sub-sequence of input
// when current == input.length there are no more elements
private int current;
@@ -5823,14 +5823,6 @@
// > 0 if there are N next empty elements
private int emptyElementCount;
- MatcherIterator() {
- this.matcher = matcher(input);
- // If the input is an empty string then the result can only be a
- // stream of the input. Induce that by setting the empty
- // element count to 1
- this.emptyElementCount = input.length() == 0 ? 1 : 0;
- }
-
public String next() {
if (!hasNext())
throw new NoSuchElementException();
@@ -5846,6 +5838,13 @@
}
public boolean hasNext() {
+ if (matcher == null) {
+ matcher = matcher(input);
+ // If the input is an empty string then the result can only be a
+ // stream of the input. Induce that by setting the empty
+ // element count to 1
+ emptyElementCount = input.length() == 0 ? 1 : 0;
+ }
if (nextElement != null || emptyElementCount > 0)
return true;