> On 15 Apr 2021, at 08:33, Сергей Цыпанов <[email protected]> wrote:
>
> Hello,
>
> buffering with j.i.BufferedInputStream is a handy way to improve performance
> of IO operations.
> However in many cases buffering is redundant. Consider this code snippet from
> Spring Framework:
>
> static ClassReader getClassReader(Resource rsc) throws Exception {
> try (var is = new BufferedInputStream(rsc.getInputStream())) {
> return new ClassReader(is);
> }
> }
>
> Interface Resource has lots of implementations some of them return buffered
> InputStream,
> others don't, but all of them get wrapped with BufferedInputStream.
>
> Apart from obvious misuses like BufferedInputStream cascade such wrapping is
> not necessary,
> e.g. when we read a huge file using FileInputStream.readAllBytes(),
> in others cases it's harmful e.g. when we read a small (comparing to the
> default
> size of buffer in BufferedInputStream) file with readAllBytes() or
> when we read from ByteArrayInputStream which is kind of buffered one by its
> nature.
>
> I think an instance of InputStream should decide itself whether it requires
> buffering,
> so I suggest to add a couple of methods into j.i.InputStream:
>
> // subclasses can override this
> protected boolean needsBuffer() {
> return true;
> }
>
> public InputStream buffered() {
> return needsBuffer() ? new BufferedInputStream(this) : this;
> }
>
> this allows subclasses of InputStream to override needsBuffer() to declare
> buffering redundancy.
> Let's imagine we've overridden needsBuffer() in BufferedInputStream:
>
> public class BufferedInputStream {
> @Override
> public boolean needsBuffer() {
> return true;
> }
> }
>
> then the code we've mentioned above should be rewritten as
>
> static ClassReader getClassReader(Resource rsc) throws Exception {
> try (var is = rsc.getInputStream().buffered()) {
> return new ClassReader(is);
> }
> }
>
> preventing cascade of BufferedInputStreams.
>
When I read this part
> There are also cases when the need for buffering depends on the way how we
> read from InputStream:
>
> new FileInputStream(file).buffered().readAllBytes() // buffering is redundant
my knee-jerk reaction was that a better solution likely lies with introducing a
marker interface and selectively implementing it as opposed to adding two new
methods to the existing class and selectively overriding them. Let's call this
interface java.io.Buffered: Bufferred is to InputStream as RandomAccess is to
List.
Just to be clear: I'm not proposing to actually do this. It's just a thought.
-Pavel
> var data = new DataInputStream(new FileInputStream(file).buffered())
> for (int i = 0; i < 1000; i++) {
> data.readInt(); // readInt() does 4 calls to
> InputStream.read() so buffering is needed
> }
>
> here if FileInputStream.needsBuffer() is overridden and returns false
> (assuming most of reads from it are bulk)
> then we won't have buffering for DataInputStream. To avoid this we can also
> add InputStream.buffered(boolean enforceBuffering) to have manual control.
>
> To sum up, my proposal is to add those methods to InputStream:
>
> protected static boolean needsBuffer() {
> return true;
> }
>
> public InputStream buffered() {
> return buffered(needsBuffer());
> }
>
> public InputStream buffered(boolean enforceBuffering) {
> return enforceBuffering ? new BufferedInputStream(this) : this;
> }
>
> What do you think?
>
> With best regards,
> Sergey Tsypanov