Re: RFR(m) 2: 8072722: add stream support to Scanner

Stuart Marks Tue, 15 Sep 2015 21:49:31 -0700


On 9/10/15 2:12 PM, Xueming Shen wrote:

I think it might be a "nice to have" for a "fail-fast" effort after the
consumer consumed/accepted the result (the second check), but isn't it a bug
for the consumer to accept any result if there is CME condition occurred
already?

I'm not sure which spliterator we're talking about at this point, but the issue is similar between them. Prior to calling the consumer's accept() method, in FindSpliterator, the modCount has previously been asserted to be equal to expectedCount. In TokenSpliterator, the expectedCount is refreshed from the modCount immediately prior to calling accept(). (This is done because advancing the spliterator in this case increments the modCount.)

In both spliterators, then, the expectedCount should be equal to the modCount immediately prior to the call to accept(). Also in both spliterators, the modCount and expectedCount are compared immediately after accept(), and if they aren't equal, CME is thrown.

What this guards against is the accept() method -- really, one of the application's lambdas that's been passed to a pipeline operation -- modifying the state of the scanner. This only really works in a sequential stream, but it's all we've got. (In a parallel stream, I think the element is buffered somewhere and is handed to another thread. If that other thread attempts to modify the scanner's state, all bets are off because of memory visibility issues.)

Anyway, at least for sequential streams, this check does properly guard against the case where somebody modifies the scanner's state from within a pipeline operation. There are tests for this too; see ScanTest.streamComodTest().
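
To make the two checks concrete, here is a minimal sketch of the pattern described above. It is not the code under review: ComodSketch, Source, run() and throwsCme() are hypothetical names invented for illustration, and Source is only a stand-in for Scanner whose next() bumps a modCount. The nested TokenSpliterator borrows the name from the discussion but is a simplified approximation.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

class ComodSketch {
    // Hypothetical stand-in for Scanner: handing out a token bumps modCount.
    static class Source {
        private final String[] tokens;
        private int pos = 0;
        int modCount = 0;

        Source(String... tokens) { this.tokens = tokens; }

        boolean hasNext() { return pos < tokens.length; }

        String next() { modCount++; return tokens[pos++]; }
    }

    // Simplified approximation of the comodification checks discussed above.
    static class TokenSpliterator extends Spliterators.AbstractSpliterator<String> {
        private final Source src;
        private int expectedCount = -1;  // negative: not yet bound to the source

        TokenSpliterator(Source src) {
            super(Long.MAX_VALUE, Spliterator.ORDERED | Spliterator.NONNULL);
            this.src = src;
        }

        @Override
        public boolean tryAdvance(Consumer<? super String> action) {
            // Check 1: nothing touched the source since the previous element.
            if (expectedCount >= 0 && expectedCount != src.modCount) {
                throw new ConcurrentModificationException();
            }
            if (!src.hasNext()) {
                expectedCount = src.modCount;
                return false;
            }
            String token = src.next();     // advancing bumps modCount ...
            expectedCount = src.modCount;  // ... so refresh just before accept()
            action.accept(token);
            // Check 2: the consumer itself must not have touched the source.
            if (expectedCount != src.modCount) {
                throw new ConcurrentModificationException();
            }
            return true;
        }
    }

    // Runs a sequential stream over src, applying action to each token,
    // and returns the tokens that reached the terminal operation.
    static List<String> run(Source src, Consumer<String> action) {
        List<String> seen = new ArrayList<>();
        StreamSupport.stream(new TokenSpliterator(src), false)
                     .forEach(t -> { seen.add(t); action.accept(t); });
        return seen;
    }

    // Reports whether r fails with ConcurrentModificationException.
    static boolean throwsCme(Runnable r) {
        try {
            r.run();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Source ok = new Source("a", "b", "c");
        System.out.println(run(ok, t -> { }));   // well-behaved lambda: [a, b, c]

        Source bad = new Source("a", "b", "c");
        // The lambda advances the source from inside the pipeline: prints true.
        System.out.println(throwsCme(() -> run(bad, t -> bad.next())));
    }
}
```

The second check is the one that catches the misbehaving lambda: expectedCount was refreshed just before accept(), so any difference afterwards can only have come from the consumer.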

It'd be better to initialize expectedCount to modCount in the constructor?


That's how I had it initially, but at Paul Sandoz' suggestion I delayed the
initialization to the first call to tryAdvance(). This allows the Scanner's
state to be modified after stream creation but before stream pipeline
execution. This is the way that Paul's stream code in Matcher works. I'm not
sure how important this is. Having Scanner be gratuitously different from
Matcher seems like it would be irritating though.


I noticed the spec says "Scanning starts upon initiation of the terminal
stream operation, using the current state of this scanner..." guess it means
the "CME" enforcement starts with the "stream operation" starts (a kinda of
later-initialization). But personally feel it may create a unnecessary
inconsistent situation, depends on whether or not there is state change
between the creation of the Stream object and the starting of the stream
operation. But I'm not a stream expert :-)

Well, one of my earlier revisions basically said that you can't touch the Scanner at all after tokens() or findAll() has been called. This works, but is unnecessarily restrictive, and it's inconsistent with Paul's approach with Matcher.results().

This is pretty easy to see because the constructors for the new spliterators simply initialize themselves, but they don't hang onto any state from the scanner. The only actual dependence on the state of the scanner starts at the first call to tryAdvance(), which is when the first element is actually introduced to the stream. It's safe for the application to change the state of the scanner any time up until that point. It does introduce a little bit of complexity in that there's an additional state in the expectedCount checking (as we've seen) :-). But it does allow a bit more flexibility with the caller's handling of the scanner and a stream derived from it.
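
From the caller's side, the late-binding behavior looks roughly like the sketch below. It assumes the Scanner.tokens() API as it later shipped in JDK 9, so it needs Java 9 or newer; LateBindingDemo and its two helper methods are hypothetical names used only for illustration.

```java
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.Scanner;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LateBindingDemo {
    // Creates the stream first, then advances the scanner before the
    // terminal operation. Scanning starts from the scanner's state at the
    // time the terminal operation begins.
    static List<String> advanceBeforeTerminalOp(String input) {
        Scanner sc = new Scanner(input);
        Stream<String> tokens = sc.tokens();  // nothing has been scanned yet
        sc.next();                            // fine: the pipeline has not started
        return tokens.collect(Collectors.toList());
    }

    // Advances the scanner from inside a pipeline operation and reports
    // whether that was detected as a concurrent modification.
    static boolean advanceInsidePipeline(String input) {
        Scanner sc = new Scanner(input);
        try {
            sc.tokens().forEach(t -> sc.next());
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(advanceBeforeTerminalOp("a b c"));  // [b, c]
        System.out.println(advanceInsidePipeline("a b c"));    // true
    }
}
```

The first helper is the case the delayed initialization permits; the second is the case the check after accept() is meant to catch in a sequential stream.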


s'marks
