RFR: JDK-8297299 SequenceInputStream should not use Vector
There is no need to use a temporary Vector within the constructor of SynchronizedInputStream, as more efficient (non-synchronized) alternative code (like List.of) will do the same in possibly less time. While the optimization is not dramatic, it still makes sense to replace Vector unless synchronization is really needed. - Commit messages: - Replacing Vector by List.of Changes: https://git.openjdk.org/jdk/pull/11249/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8297299 Stats: 8 lines in 1 file changed: 2 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11249.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11249/head:pull/11249 PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector
On Sat, 19 Nov 2022 21:36:56 GMT, Markus KARG wrote: > There is no need to use a temporary Vector within the constructor of > SynchronizedInputStream, as more efficient (non-synchronized) alternative > code (like List.of) will do the same in possibly less time. While the > optimization is not dramatic, it still makes sense to replace Vector unless > synchronization is really needed. src/java.base/share/classes/java/io/SequenceInputStream.java line 43: > 41: * on the last of the contained input streams. > 42: * > 43: * @author Arthur van Hoff Good catch. Looking at some other files in the JDK, your change here looks correct. src/java.base/share/classes/java/io/SequenceInputStream.java line 83: > 81: */ > 82: public SequenceInputStream(InputStream s1, InputStream s2) { > 83: e = Collections.enumeration(List.of(s1, s2)); Hello Markus, I think removing the `Vector` usage looks fine. The use of `List.of` in this constructor introduces a small change in behaviour of this constructor if the constructor is being passed a `null` `Inputstream`. Previously, before this change, if the first `InputStream` parameter was null this constructor would throw a `NullPointerException` (because the call to `peekNextStream()` in this constructor throws that). However, if the second `InputStream` was `null` then the constructor would return back fine. Only later when the `SequenceInputStream` was used and it was time to move to use the second `InputStream`, it would throw the `NullPointerException` (from the same `peekNextStream()`). Now, with the use of `List.of(s1, s2))`, the `NullPointerException` will get thrown early if either of these two parameters is `null`. Whether or not this change in behaviour related to `NullPointerException` is acceptable (backed by a CSR) is something that others more familiar with this area would be able to tell. But I think it may not be desirable because depending on how the application might have been using the `SequenceInputStream` instance they may not have been encountering the `NullPointerException` (for the second param) if they never read the `SequenceInputStream` completely. You could still remove the `Vector` usage from this code though, by constructing the collection differently instead of `List.of`. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector
On Sat, 19 Nov 2022 21:36:56 GMT, Markus KARG wrote: > There is no need to use a temporary Vector within the constructor of > SynchronizedInputStream, as more efficient (non-synchronized) alternative > code (like List.of) will do the same in possibly less time. While the > optimization is not dramatic, it still makes sense to replace Vector unless > synchronization is really needed. Jai is correct, this constructor appears to have tolerated null in the second parameter since JDK 1.0 so we need to proceed with care. The SIS constructor that takes an Enumeration is similar in that it only throws NPE if the first element is null, it probably should have taken a defensive copy but that can't be changed now as there may be lazy implementations in the wild that depend on existing behavior. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector
On Sun, 20 Nov 2022 07:56:59 GMT, Alan Bateman wrote: > Jai is correct, this constructor appears to have tolerated null in the second > parameter since JDK 1.0 so we need to proceed with care. I have fixe this in https://github.com/openjdk/jdk/pull/11249/commits/b6ccec209f3989ca37ac15585d27e28c2dddfb14, so it still uses `List.of` but keeps the historic behavior. I wonder if we shouldn't change the Javadocs so they clearly tell the reader that `s2` MAY be `null`? - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
> There is no need to use a temporary Vector within the constructor of > SynchronizedInputStream, as more efficient (non-synchronized) alternative > code (like List.of) will do the same in possibly less time. While the > optimization is not dramatic, it still makes sense to replace Vector unless > synchronization is really needed. Markus KARG has updated the pull request incrementally with one additional commit since the last revision: allowing s2 to be null - Changes: - all: https://git.openjdk.org/jdk/pull/11249/files - new: https://git.openjdk.org/jdk/pull/11249/files/3495a8b2..b6ccec20 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/11249.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11249/head:pull/11249 PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 02:16:02 GMT, Jaikiran Pai wrote: >> Markus KARG has updated the pull request incrementally with one additional >> commit since the last revision: >> >> allowing s2 to be null > > src/java.base/share/classes/java/io/SequenceInputStream.java line 83: > >> 81: */ >> 82: public SequenceInputStream(InputStream s1, InputStream s2) { >> 83: e = Collections.enumeration(List.of(s1, s2)); > > Hello Markus, > I think removing the `Vector` usage looks fine. The use of `List.of` in this > constructor introduces a small change in behaviour of this constructor if the > constructor is being passed a `null` `Inputstream`. > > Previously, before this change, if the first `InputStream` parameter was null > this constructor would throw a `NullPointerException` (because the call to > `peekNextStream()` in this constructor throws that). However, if the second > `InputStream` was `null` then the constructor would return back fine. Only > later when the `SequenceInputStream` was used and it was time to move to use > the second `InputStream`, it would throw the `NullPointerException` (from the > same `peekNextStream()`). > > Now, with the use of `List.of(s1, s2))`, the `NullPointerException` will get > thrown early if either of these two parameters is `null`. > > Whether or not this change in behaviour related to `NullPointerException` is > acceptable (backed by a CSR) is something that others more familiar with this > area would be able to tell. But I think it may not be desirable because > depending on how the application might have been using the > `SequenceInputStream` instance they may not have been encountering the > `NullPointerException` (for the second param) if they never read the > `SequenceInputStream` completely. > > > You could still remove the `Vector` usage from this code though, by > constructing the collection differently instead of `List.of`. Thank you for this quick review and bringing up this topic! I will follow-up on this in the below opened by @AlanBateman. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 09:07:21 GMT, Markus KARG wrote: >> There is no need to use a temporary Vector within the constructor of >> SynchronizedInputStream, as more efficient (non-synchronized) alternative >> code (like List.of) will do the same in possibly less time. While the >> optimization is not dramatic, it still makes sense to replace Vector unless >> synchronization is really needed. > > Markus KARG has updated the pull request incrementally with one additional > commit since the last revision: > > allowing s2 to be null src/java.base/share/classes/java/io/SequenceInputStream.java line 82: > 80: * @param s2 the second input stream to read. > 81: */ > 82: public SequenceInputStream(InputStream s1, InputStream s2) { BTW, what is your opinion @jaikiran and @AlanBateman: We could simplify the 2-arg constructor by calling `this(...)` instead of repeating the 1-arg constructor's implementation here. Is that a *preferred* or a *disliked* pattern in OpenJDK? - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 09:14:32 GMT, Markus KARG wrote: >> Markus KARG has updated the pull request incrementally with one additional >> commit since the last revision: >> >> allowing s2 to be null > > src/java.base/share/classes/java/io/SequenceInputStream.java line 82: > >> 80: * @param s2 the second input stream to read. >> 81: */ >> 82: public SequenceInputStream(InputStream s1, InputStream s2) { > > BTW, what is your opinion @jaikiran and @AlanBateman: We could simplify the > 2-arg constructor by calling `this(...)` instead of repeating the 1-arg > constructor's implementation here. Is that a *preferred* or a *disliked* > pattern in OpenJDK? The updated code now changes the behaviour in the other direction: Previously, if `s2` was null a NPE was thrown in `peekNextStream` when `s1` was exhausted. In the current code, `s2` is silently ignored if it is null. A safer alternative that preserves the behaviour of nulls seems to be the replace `List.of` with `Arrays.asList`. These subtle changes in behaviour demonstrates the problem with even trivial updates to legacy code... - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 09:47:35 GMT, Jens Lidestrom wrote: >> src/java.base/share/classes/java/io/SequenceInputStream.java line 82: >> >>> 80: * @param s2 the second input stream to read. >>> 81: */ >>> 82: public SequenceInputStream(InputStream s1, InputStream s2) { >> >> BTW, what is your opinion @jaikiran and @AlanBateman: We could simplify the >> 2-arg constructor by calling `this(...)` instead of repeating the 1-arg >> constructor's implementation here. Is that a *preferred* or a *disliked* >> pattern in OpenJDK? > > The updated code now changes the behaviour in the other direction: > > In the original code, if `s2` was null a NPE was thrown in `peekNextStream` > when `s1` was exhausted. > > In the current code, `s2` is silently ignored if it is null. > > A safer alternative that preserves the behaviour of nulls seems to be the > replace `List.of` with `Arrays.asList`. > > These subtle changes in behaviour demonstrates the problem with even trivial > updates to legacy code... It depends on *how far* we want to align the behavior. I do see a benefit in accepting `s2` being `null`. I do not see a benefit in throwing NPE at a later time. Why should an application want to expect that? - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 12:06:55 GMT, Markus KARG wrote: >> The updated code now changes the behaviour in the other direction: >> >> In the original code, if `s2` was null a NPE was thrown in `peekNextStream` >> when `s1` was exhausted. >> >> In the current code, `s2` is silently ignored if it is null. >> >> A safer alternative that preserves the behaviour of nulls seems to be the >> replace `List.of` with `Arrays.asList`. >> >> These subtle changes in behaviour demonstrates the problem with even trivial >> updates to legacy code... > > It depends on *how far* we want to align the behavior. I do see a benefit in > accepting `s2` being `null`. I do not see a benefit in throwing NPE at a > *later* time. Why should an application want to expect that? I do understand > your opinion but I think that backwards compatibility also should have > limits, and in this particular case I would say such a limit is clearly > reached. Otherwise we must not replace Vector at all, because someone could > actually rely on the synchronization, also. @AlanBateman WDYT? - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 13:06:40 GMT, Markus KARG wrote: > @AlanBateman WDYT? The long standing and undocumented behavior is unfortunate. I don't think the 1-arg constructor is fixable while still allowing for lazy behavior. So I think the only thing we can do is document the existing behavior that a null element in the enumeration will lead to read or close throwing NPE at a later time. For the 2-arg constructor then we need to be cautious. This is very old API and we have to assume the phenomenon that is Hyrum's Law. You may be right that nobody can reasonably depend on the current behavior but I think it would need a survey to inform the discussion. I can't tell if this PR was intended to rehabilitate SequenceInputStream or just a reaction to seeing that it uses Vector. If the latter then you could just change it to: var list = new ArrayList(); list.add(s1); list.add(s2); e = Collections.enumeration(list); peekNextStream(); and preserve the existing behavior. While SequenceInputStream may be legacy I guess it would be interesting to get some idea of usage or usages that use something better in libraries. Maybe there is a case for InputStream to have a concat method for example. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v3]
> There is no need to use a temporary Vector within the constructor of > SynchronizedInputStream, as more efficient (non-synchronized) alternative > code (like List.of) will do the same in possibly less time. While the > optimization is not dramatic, it still makes sense to replace Vector unless > synchronization is really needed. Markus KARG has updated the pull request incrementally with one additional commit since the last revision: Alan's Proposal - Changes: - all: https://git.openjdk.org/jdk/pull/11249/files - new: https://git.openjdk.org/jdk/pull/11249/files/b6ccec20..9c902512 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=01-02 Stats: 6 lines in 1 file changed: 4 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/11249.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11249/head:pull/11249 PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 15:21:31 GMT, Alan Bateman wrote: >> @AlanBateman WDYT? > >> @AlanBateman WDYT? > > The long standing and undocumented behavior is unfortunate. I don't think the > 1-arg constructor is fixable while still allowing for lazy behavior. So I > think the only thing we can do is document the existing behavior that a null > element in the enumeration will lead to read or close throwing NPE at a later > time. For the 2-arg constructor then we need to be cautious. This is very old > API and we have to assume the phenomenon that is Hyrum's Law. You may be > right that nobody can reasonably depend on the current behavior but I think > it would need a survey to inform the discussion. I can't tell if this PR was > intended to rehabilitate SequenceInputStream or just a reaction to seeing > that it uses Vector. If the latter then you could just change it to: > > var list = new ArrayList(); > list.add(s1); > list.add(s2); > e = Collections.enumeration(list); > peekNextStream(); > > and preserve the existing behavior. > > While SequenceInputStream may be legacy I guess it would be interesting to > get some idea of usage or usages that use something better in libraries. > Maybe there is a case for InputStream to have a concat method for example. Indeed my intention solely is to get rid of `Vector`, so I am fine with keeping a low profile and being full backwards compatible. I assume SIS is seldomly used, so opening a can of warms is not worth it (I think). I am perfectly happy with your proposal, but for the sake of brevity I'd rather go with a shorter way of doing the pretty same: ```this(Collections.enumeration(Arrays.asList(s1, s2))``` Are we all fine with that? Otherwise I'd switch this PR back to Alan`s proposal. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v4]
> There is no need to use a temporary Vector within the constructor of > SynchronizedInputStream, as more efficient (non-synchronized) alternative > code (like List.of) will do the same in possibly less time. While the > optimization is not dramatic, it still makes sense to replace Vector unless > synchronization is really needed. Markus KARG has updated the pull request incrementally with one additional commit since the last revision: Jens' Proposal - Changes: - all: https://git.openjdk.org/jdk/pull/11249/files - new: https://git.openjdk.org/jdk/pull/11249/files/9c902512..9b3fe93d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=02-03 Stats: 6 lines in 1 file changed: 0 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11249.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11249/head:pull/11249 PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 16:49:16 GMT, Markus KARG wrote: >>> @AlanBateman WDYT? >> >> The long standing and undocumented behavior is unfortunate. I don't think >> the 1-arg constructor is fixable while still allowing for lazy behavior. So >> I think the only thing we can do is document the existing behavior that a >> null element in the enumeration will lead to read or close throwing NPE at a >> later time. For the 2-arg constructor then we need to be cautious. This is >> very old API and we have to assume the phenomenon that is Hyrum's Law. You >> may be right that nobody can reasonably depend on the current behavior but I >> think it would need a survey to inform the discussion. I can't tell if this >> PR was intended to rehabilitate SequenceInputStream or just a reaction to >> seeing that it uses Vector. If the latter then you could just change it to: >> >> var list = new ArrayList(); >> list.add(s1); >> list.add(s2); >> e = Collections.enumeration(list); >> peekNextStream(); >> >> and preserve the existing behavior. >> >> While SequenceInputStream may be legacy I guess it would be interesting to >> get some idea of usage or usages that use something better in libraries. >> Maybe there is a case for InputStream to have a concat method for example. > > Indeed my intention solely is to get rid of `Vector`, so I am fine with > keeping a low profile and being full backwards compatible. I assume SIS is > seldomly used, so opening a can of warms is not worth it (I think). I am > perfectly happy with your proposal, but for the sake of brevity I'd rather go > with a shorter way of doing the pretty same: > ```this(Collections.enumeration(Arrays.asList(s1, s2))``` > Are we all fine with that? Otherwise I'd switch this PR back to Alan`s > proposal. N.B.: Regarding usage numbers, I will do a quick poll on Twitter. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v5]
> There is no need to use a temporary Vector within the constructor of > SynchronizedInputStream, as more efficient (non-synchronized) alternative > code (like List.of) will do the same in possibly less time. While the > optimization is not dramatic, it still makes sense to replace Vector unless > synchronization is really needed. Markus KARG has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: Jens' Proposal - Changes: - all: https://git.openjdk.org/jdk/pull/11249/files - new: https://git.openjdk.org/jdk/pull/11249/files/9b3fe93d..2e957354 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/11249.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11249/head:pull/11249 PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v5]
On Sun, 20 Nov 2022 17:41:28 GMT, Markus KARG wrote: >> There is no need to use a temporary Vector within the constructor of >> SynchronizedInputStream, as more efficient (non-synchronized) alternative >> code (like List.of) will do the same in possibly less time. While the >> optimization is not dramatic, it still makes sense to replace Vector unless >> synchronization is really needed. > > Markus KARG has refreshed the contents of this pull request, and previous > commits have been removed. The incremental views will show differences > compared to the previous content of the PR. The pull request contains one new > commit since the last revision: > > Jens' Proposal Thank you Markus for the changes. The latest version in `2e957354` looks fine to me. While we are at it, would you be willing to change the member variables `e` to `private final` and the `in` to `private`? From what I can see, I don't see any other class accessing these package private fields. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 20 Nov 2022 16:58:56 GMT, Markus KARG wrote: >> Indeed my intention solely is to get rid of `Vector`, so I am fine with >> keeping a low profile and being full backwards compatible. I assume SIS is >> seldomly used, so opening a can of warms is not worth it (I think). I am >> perfectly happy with your proposal, but for the sake of brevity I'd rather >> go with a shorter way of doing the pretty same: >> ```this(Collections.enumeration(Arrays.asList(s1, s2)))``` >> Are we all fine with that? Otherwise I'd switch this PR back to Alan's >> proposal. > > N.B.: Regarding usage numbers, I will do a quick poll on Twitter. > Indeed my intention solely is to get rid of `Vector`, so I am fine with > keeping a low profile and being full backwards compatible. I assume SIS is > seldomly used, so opening a can of warms is not worth it (I think). If you have the cycles then collecting data on usages and finding examples where the constructor is called with a second parameter of null would be useful. It's a lot of work to do that of course. The changes in update 2e957354 look okay, I can't immediately see any behavior change with that version. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v5]
On Mon, 21 Nov 2022 06:31:47 GMT, Jaikiran Pai wrote: > While we are at it, would you be willing to change the member variables `e` > to `private final` and the `in` to `private`? From what I can see, I don't > see any other class accessing these package private fields. Good catch, will do later today. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v6]
> There is no need to use a temporary Vector within the constructor of > SynchronizedInputStream, as more efficient (non-synchronized) alternative > code (like List.of) will do the same in possibly less time. While the > optimization is not dramatic, it still makes sense to replace Vector unless > synchronization is really needed. Markus KARG has updated the pull request incrementally with one additional commit since the last revision: private member variables - Changes: - all: https://git.openjdk.org/jdk/pull/11249/files - new: https://git.openjdk.org/jdk/pull/11249/files/2e957354..8d298318 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11249&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11249.diff Fetch: git fetch https://git.openjdk.org/jdk pull/11249/head:pull/11249 PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v6]
On Mon, 21 Nov 2022 17:39:24 GMT, Markus KARG wrote: >> There is no need to use a temporary Vector within the constructor of >> SynchronizedInputStream, as more efficient (non-synchronized) alternative >> code (like List.of) will do the same in possibly less time. While the >> optimization is not dramatic, it still makes sense to replace Vector unless >> synchronization is really needed. > > Markus KARG has updated the pull request incrementally with one additional > commit since the last revision: > > private member variables Marked as reviewed by alanb (Reviewer). - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v5]
On Mon, 21 Nov 2022 07:48:01 GMT, Markus KARG wrote: > While we are at it, would you be willing to change the member variables `e` > to `private final` and the `in` to `private`? From what I can see, I don't > see any other class accessing these package private fields. Fixed in 8d29831 :-) - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v5]
On Mon, 21 Nov 2022 06:31:47 GMT, Jaikiran Pai wrote: >> Markus KARG has refreshed the contents of this pull request, and previous >> commits have been removed. The incremental views will show differences >> compared to the previous content of the PR. The pull request contains one >> new commit since the last revision: >> >> Jens' Proposal > > Thank you Markus for the changes. The latest version in `2e957354` looks fine > to me. While we are at it, would you be willing to change the member > variables `e` to `private final` and the `in` to `private`? From what I can > see, I don't see any other class accessing these package private fields. @jaikiran Kindly requesting your review approval. :-) - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Mon, 21 Nov 2022 07:42:35 GMT, Alan Bateman wrote: >> N.B.: Regarding usage numbers, I will do a quick poll on Twitter. > >> Indeed my intention solely is to get rid of `Vector`, so I am fine with >> keeping a low profile and being full backwards compatible. I assume SIS is >> seldomly used, so opening a can of warms is not worth it (I think). > > If you have the cycles then collecting data on usages and finding examples > where the constructor is called with a second parameter of null would be > useful. It's a lot of work to do that of course. The changes in update > 2e957354 look okay, I can't immediately see any behavior change with that > version. I think the public visibility of my Twitter account is not wide enough to get *really robust* answers, unfortunately. So far, 92% answered that they not even used SIS in their whole life. So the users of two-args constructor must be really neglectable. Nevertheless, they are definitively not zero, so all we could do is marking it deprecated some day. Anyways, thanks for the review. At least we get rid of `Vector` here. Would be happy if you mark the PR as reviewed. :-) - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Mon, 21 Nov 2022 17:43:58 GMT, Markus KARG wrote: > I think the public visibility of my Twitter account is not wide enough to get > really robust answers, unfortunately. One alternative is to search GitHub. It's amazing how fast they can search such a huge code corpus. Example: https://github.com/search?l=&q=%22new+SequenceInputStream%22+-filename%3ASequenceInputStreamTest.java&type=code One problem with GitHub search is that often you get a lot of duplicate matches. In the example above I have filtered out `SequenceInputStreamTest.java` which showed up a lot. - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v6]
On Mon, 21 Nov 2022 17:43:23 GMT, Markus KARG wrote: >> There is no need to use a temporary Vector within the constructor of >> SynchronizedInputStream, as more efficient (non-synchronized) alternative >> code (like List.of) will do the same in possibly less time. While the >> optimization is not dramatic, it still makes sense to replace Vector unless >> synchronization is really needed. > > Markus KARG has updated the pull request incrementally with one additional > commit since the last revision: > > private member variables Thank you Markus for these changes. The latest state in `8d298318` looks good to me. I'll run some tests today before sponsoring this. - Marked as reviewed by jpai (Reviewer). PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Mon, 21 Nov 2022 18:22:26 GMT, Jens Lidestrom wrote: >> I think the public visibility of my Twitter account is not wide enough to >> get *really robust* answers, unfortunately. So far, 92% answered that they >> not even used SIS in their whole life. So the users of two-args constructor >> must be really neglectable. Nevertheless, they are definitively not zero, so >> all we could do is marking it deprecated some day. >> >> Anyways, thanks for the review. At least we get rid of `Vector` here. Would >> be happy if you mark the PR as reviewed. :-) > >> I think the public visibility of my Twitter account is not wide enough to >> get really robust answers, unfortunately. > > One alternative is to search GitHub. It's amazing how fast they can search > such a huge code corpus. > > Example: > https://github.com/search?l=&q=%22new+SequenceInputStream%22+-filename%3ASequenceInputStreamTest.java&type=code > > One problem with GitHub search is that often you get a lot of duplicate > matches. In the example above I have filtered out > `SequenceInputStreamTest.java` which showed up a lot. Twitter survey results after one week: 5.4% of all voters have actually used SIS before! This number is big enough to consider SIS as "not irrelevant". OTOH this number is so small that any further poll whether this 5.4% actually used the one-arg or two-arg constructor would not provide stable results. Looking at the results of @jensli's GitHub search tells us that apparently *all* of those findings are using 2-args. So we actually need to be careful, not to break (up to) 5.4% of all existing applications. So what could be our plan to improve the live of that 5.4%? Some thoughts: 1. We could safely add a `public InputStream concat(InputStream)` *default method* to `InputStream` as a factory for *anonymous* sequences. This allows any IS implementation to provide an optimized variant if needed/wanted, falling back to the existing SIS *by default* as a starting point (until we come up with a better fallback implementation). 2. Having done that, we could deprecate-for-removal SIS with the clear note that from now on `concat` should be used instead. This triggers people to overhaul their existing code little by little, while we gain time to work on a replacement for SIS. 3. We then could provide a **non-public** `ConcatenatedInputStream` which holds **exactly two** IS (`first` and `last` MUST be given), and make that one the default implementation of `concat`, gaining a modern, easy-to-maintain, fully tested, high-performance, sequence of streams. 4. Some fine day we do remove SIS, effectively getting rid of the discussed quirky behavior. @AlanBateman How does that sound? Shall I go on with a PR for step 1.? - PR: https://git.openjdk.org/jdk/pull/11249
Re: RFR: JDK-8297299 SequenceInputStream should not use Vector [v2]
On Sun, 27 Nov 2022 17:49:25 GMT, Markus KARG wrote: >>> I think the public visibility of my Twitter account is not wide enough to >>> get really robust answers, unfortunately. >> >> One alternative is to search GitHub. It's amazing how fast they can search >> such a huge code corpus. >> >> Example: >> https://github.com/search?l=&q=%22new+SequenceInputStream%22+-filename%3ASequenceInputStreamTest.java&type=code >> >> One problem with GitHub search is that often you get a lot of duplicate >> matches. In the example above I have filtered out >> `SequenceInputStreamTest.java` which showed up a lot. > > Twitter survey results after one week: 5.4% of all voters have actually used > SIS before! This number is big enough to consider SIS as "not irrelevant". > OTOH this number is so small that any further poll whether this 5.4% actually > used the one-arg or two-arg constructor would not provide stable results. > Looking at the results of @jensli's GitHub search tells us that apparently > *all* of those findings are using 2-args. So we actually need to be careful, > not to break (up to) 5.4% of all existing applications. > > So what could be our plan to improve the live of that 5.4%? Some thoughts: > 1. We could safely add a `public InputStream concat(InputStream)` *default > method* to `InputStream` as a factory for *anonymous* sequences. This allows > any IS implementation to provide an optimized variant if needed/wanted, > falling back to the existing SIS *by default* as a starting point (until we > come up with a better fallback implementation). > 2. Having done that, we could deprecate-for-removal SIS with the clear note > that from now on `concat` should be used instead. This triggers people to > overhaul their existing code little by little, while we gain time to work on > a replacement for SIS. > 3. We then could provide a **non-public** `ConcatenatedInputStream` which > holds **exactly two** IS (`first` and `last` MUST be given), and make that > one the default implementation of `concat`, gaining a modern, > easy-to-maintain, fully tested, high-performance, sequence of streams. > 4. Some fine day we do remove SIS, effectively getting rid of the discussed > quirky behavior. > > @AlanBateman How does that sound? Shall I go on with a PR for step 1.? I did a search of 30M classes in 130k artifacts from Maven Central and it found usages in 1670 classes in 852 artifacts. I haven't dug into it too closely but it looks like the usages are mostly (if not all) the 2-arg constructor. If you have the cycles, then prototyping a _concat_ method (you could try both instance and static) would help inform on issues and whether this is worth doing. Adding methods to older, and widely used, classes such as InputStream will often encounter compatibility issues. For the instance method you'll need to think about security/snooping issues. I think it would also be useful to survey libraries that provide an equivalent and if there is much usage. SIS may be legacy but I don't think there is a case to deprecate it at this time. - PR: https://git.openjdk.org/jdk/pull/11249