Re: Discussion: easier Stream closing

2021-10-03 Thread Tagir Valeev
> At that point, the real question is why not call close() at the end of all 
> terminal operations by wrapping each implementation in a try-with-resources ?

Now, it will be incompatible change. Also, this won't work with
iterator() and spliterator() terminal operations. I believe it's best
described by Brian in this SO answer:
https://stackoverflow.com/a/34073306/4856258

Quoted here, for convenience:

> Yes, this was a deliberate decision. We considered both alternatives.
>
> The operating design principle here is "whoever acquires the resource should 
> release the resource". Files don't auto-close when you read to EOF; we expect 
> files to be closed explicitly by whoever opened them. Streams that are backed 
> by IO resources are the same.
>
> Fortunately, the language provides a mechanism for automating this for you: 
> try-with-resources. Because Stream implements AutoCloseable, you can do:
>
> try (Stream s = Files.lines(...)) {
>s.forEach(...);
> }
>
> The argument that "it would be really convenient to auto-close so I could 
> write it as a one-liner" is nice, but would mostly be the tail wagging the 
> dog. If you opened a file or other resource, you should also be prepared to 
> close it. Effective and consistent resource management trumps "I want to 
> write this in one line", and we chose not to distort the design just to 
> preserve the one-line-ness.

With best regards,
Tagir Valeev.

>
> Rémi
>
> - Original Message -
> > From: "Brian Goetz" 
> > To: "Tagir Valeev" , "core-libs-dev" 
> > 
> > Sent: Lundi 27 Septembre 2021 22:41:00
> > Subject: Re: Discussion: easier Stream closing
>
> > In Java 8, I think we were reluctant to lean on the idiom of "pass me a
> > lambda and I'll pass it the confined data"), because Java developers
> > were already struggling to understand lambdas.  But now that we're
> > mostly over that hurdle, API points that accept Consumer are
> > a powerful way to gain confinement (as we've seen in StackWalker, and
> > also in the explorations of ScopeLocal in Loom.)  So I have no objection
> > to this idiom.
> >
> > I have some concern that putting these side-by-side with existing
> > Files.walk(...) methods will be a source of confusion, creating two
> > different ways to do the same thing.
> >
> > I would rather not add anything new to BaseStream; it was a mistake to
> > expose at all, and I'd rather see it deprecated than add more to it.
> > However, adding it individually to the Stream implementations is
> > equivalent, and doesn't have this problem.
> >
> > The consumeAndClose approach is clever, in that it adds one API point
> > that works for all streams, rather than having to add a new API point
> > for every factory of a closeable stream; on the other hand, it is
> > dramatically less discoverable, and so requires more education to get
> > people to use it (and, as you say, has the exception problem.)
> >
> > On 9/26/2021 5:27 AM, Tagir Valeev wrote:
> >> Hello!
> >>
> >> With current NIO file API, a very simple problem to get the list of
> >> all files in the directory requires some ceremony:
> >>
> >> List paths;
> >> try (Stream stream = Files.list(Path.of("/etc"))) {
> >>  paths = stream.toList();
> >> }
> >>
> >> If we skip try-with-resources, we may experience OS file handles leak,
> >> so it's desired to keep it. Yet, it requires doing boring stuff. In
> >> this sense, classic new File("/etc").list() was somewhat more
> >> convenient (despite its awful error handling).
> >>
> >> I like how this problem is solved in StackWalker API [1]: it limits
> >> the lifetime of the Stream by requiring a user-specified function to
> >> consume it. After the function is applied, the stream is closed
> >> automatically. We could add a similar overload to the Files API. As an
> >> additional feature, we could also translate all UncheckedIOException's
> >> back to IOException for more uniform exception processing:
> >>
> >> /**
> >>   * @param dir  The path to the directory
> >>   * @param function  function to apply to the stream of directory files
> >>   * @param  type of the result
> >>   * @return result of the function
> >>   * @throws IOException if an I/O error occurs when opening the directory, 
> >> or
> >>   * UncheckedIOException is thrown by the supplied function.
> >>   */
> >> public static  T list(Path dir, F

Re: Discussion: easier Stream closing

2021-10-02 Thread Remi Forax
At that point, the real question is why not call close() at the end of all 
terminal operations by wrapping each implementation in a try-with-resources ?

Rémi

- Original Message -
> From: "Brian Goetz" 
> To: "Tagir Valeev" , "core-libs-dev" 
> 
> Sent: Lundi 27 Septembre 2021 22:41:00
> Subject: Re: Discussion: easier Stream closing

> In Java 8, I think we were reluctant to lean on the idiom of "pass me a
> lambda and I'll pass it the confined data"), because Java developers
> were already struggling to understand lambdas.  But now that we're
> mostly over that hurdle, API points that accept Consumer are
> a powerful way to gain confinement (as we've seen in StackWalker, and
> also in the explorations of ScopeLocal in Loom.)  So I have no objection
> to this idiom.
> 
> I have some concern that putting these side-by-side with existing
> Files.walk(...) methods will be a source of confusion, creating two
> different ways to do the same thing.
> 
> I would rather not add anything new to BaseStream; it was a mistake to
> expose at all, and I'd rather see it deprecated than add more to it.
> However, adding it individually to the Stream implementations is
> equivalent, and doesn't have this problem.
> 
> The consumeAndClose approach is clever, in that it adds one API point
> that works for all streams, rather than having to add a new API point
> for every factory of a closeable stream; on the other hand, it is
> dramatically less discoverable, and so requires more education to get
> people to use it (and, as you say, has the exception problem.)
> 
> On 9/26/2021 5:27 AM, Tagir Valeev wrote:
>> Hello!
>>
>> With current NIO file API, a very simple problem to get the list of
>> all files in the directory requires some ceremony:
>>
>> List paths;
>> try (Stream stream = Files.list(Path.of("/etc"))) {
>>  paths = stream.toList();
>> }
>>
>> If we skip try-with-resources, we may experience OS file handles leak,
>> so it's desired to keep it. Yet, it requires doing boring stuff. In
>> this sense, classic new File("/etc").list() was somewhat more
>> convenient (despite its awful error handling).
>>
>> I like how this problem is solved in StackWalker API [1]: it limits
>> the lifetime of the Stream by requiring a user-specified function to
>> consume it. After the function is applied, the stream is closed
>> automatically. We could add a similar overload to the Files API. As an
>> additional feature, we could also translate all UncheckedIOException's
>> back to IOException for more uniform exception processing:
>>
>> /**
>>   * @param dir  The path to the directory
>>   * @param function  function to apply to the stream of directory files
>>   * @param  type of the result
>>   * @return result of the function
>>   * @throws IOException if an I/O error occurs when opening the directory, or
>>   * UncheckedIOException is thrown by the supplied function.
>>   */
>> public static  T list(Path dir, Function, ?
>> extends T> function) throws IOException {
>>  try (Stream stream = Files.list(dir)) {
>>  return function.apply(stream);
>>  }
>>  catch (UncheckedIOException exception) {
>>  throw exception.getCause();
>>  }
>> }
>>
>> In this case, getting the List of all files in the directory will be
>> as simple as
>>
>> List paths = Files.list(Path.of("/etc"), Stream::toList);
>> This doesn't limit the flexibility. For example, if we need only file
>> names instead of full paths, we can do this:
>> List paths = Files.list(Path.of("/etc"), s ->
>> s.map(Path::getFileName).toList());
>>
>> Alternatively, we could enhance the BaseStream interface in a similar
>> way. It won't allow us to translate exceptions, but could be useful
>> for every stream that must be closed after consumption:
>>
>> // in BaseStream:
>>
>> /**
>>   * Apply a given function to this stream, then close the stream.
>>   * No further operation on the stream will be possible after that.
>>   *
>>   * @param function function to apply
>>   * @param  type of the function result
>>   * @return result of the function
>>   * @see #close()
>>   */
>> default  R consumeAndClose(Function function) {
>>  try(@SuppressWarnings("unchecked") S s = (S) this) {
>>  return function.apply(s);
>>  }
>> }
>>
>> The usage would be a little bit longer but still more pleasant than
>> explicit try-with-resour

Re: Discussion: easier Stream closing

2021-09-27 Thread Tagir Valeev
Hello, Brian! Thanks for your thoughts.

> The consumeAndClose approach is clever, in that it adds one API point that 
> works for all streams, rather than having to add a new API point for every 
> factory of a closeable stream; on the other hand, it is dramatically less 
> discoverable, and so requires more education to get people to use it (and, as 
> you say, has the exception problem.)

I think this could be solved via IDE/static analysis features. It
looks quite easy to statically rewrite the try-catch statement to
consumeAndClose. And, of course, static analyzers may recommend
consumeAndClose when no resource management is used at all on the
stream that is known to hold the resource.

I filed https://bugs.openjdk.java.net/browse/JDK-8274412 to move this forward.


Re: Discussion: easier Stream closing

2021-09-27 Thread Brian Goetz
In Java 8, I think we were reluctant to lean on the idiom of "pass me a 
lambda and I'll pass it the confined data"), because Java developers 
were already struggling to understand lambdas.  But now that we're 
mostly over that hurdle, API points that accept Consumer are 
a powerful way to gain confinement (as we've seen in StackWalker, and 
also in the explorations of ScopeLocal in Loom.)  So I have no objection 
to this idiom.


I have some concern that putting these side-by-side with existing 
Files.walk(...) methods will be a source of confusion, creating two 
different ways to do the same thing.


I would rather not add anything new to BaseStream; it was a mistake to 
expose at all, and I'd rather see it deprecated than add more to it.  
However, adding it individually to the Stream implementations is 
equivalent, and doesn't have this problem.


The consumeAndClose approach is clever, in that it adds one API point 
that works for all streams, rather than having to add a new API point 
for every factory of a closeable stream; on the other hand, it is 
dramatically less discoverable, and so requires more education to get 
people to use it (and, as you say, has the exception problem.)


On 9/26/2021 5:27 AM, Tagir Valeev wrote:

Hello!

With current NIO file API, a very simple problem to get the list of
all files in the directory requires some ceremony:

List paths;
try (Stream stream = Files.list(Path.of("/etc"))) {
 paths = stream.toList();
}

If we skip try-with-resources, we may experience OS file handles leak,
so it's desired to keep it. Yet, it requires doing boring stuff. In
this sense, classic new File("/etc").list() was somewhat more
convenient (despite its awful error handling).

I like how this problem is solved in StackWalker API [1]: it limits
the lifetime of the Stream by requiring a user-specified function to
consume it. After the function is applied, the stream is closed
automatically. We could add a similar overload to the Files API. As an
additional feature, we could also translate all UncheckedIOException's
back to IOException for more uniform exception processing:

/**
  * @param dir  The path to the directory
  * @param function  function to apply to the stream of directory files
  * @param  type of the result
  * @return result of the function
  * @throws IOException if an I/O error occurs when opening the directory, or
  * UncheckedIOException is thrown by the supplied function.
  */
public static  T list(Path dir, Function, ?
extends T> function) throws IOException {
 try (Stream stream = Files.list(dir)) {
 return function.apply(stream);
 }
 catch (UncheckedIOException exception) {
 throw exception.getCause();
 }
}

In this case, getting the List of all files in the directory will be
as simple as

List paths = Files.list(Path.of("/etc"), Stream::toList);
This doesn't limit the flexibility. For example, if we need only file
names instead of full paths, we can do this:
List paths = Files.list(Path.of("/etc"), s ->
s.map(Path::getFileName).toList());

Alternatively, we could enhance the BaseStream interface in a similar
way. It won't allow us to translate exceptions, but could be useful
for every stream that must be closed after consumption:

// in BaseStream:

/**
  * Apply a given function to this stream, then close the stream.
  * No further operation on the stream will be possible after that.
  *
  * @param function function to apply
  * @param  type of the function result
  * @return result of the function
  * @see #close()
  */
default  R consumeAndClose(Function function) {
 try(@SuppressWarnings("unchecked") S s = (S) this) {
 return function.apply(s);
 }
}

The usage would be a little bit longer but still more pleasant than
explicit try-with-resources:

List list = Files.list(Path.of("/etc")).consumeAndClose(Stream::toList);

On the other hand, in this case, we are free to put intermediate
operations outside of consumeAndClose, so we won't need to nest
everything inside the function. Only terminal operation should be
placed inside the consumeAndClose. E.g., if we need file names only,
like above, we can do this:

List list =
Files.list(Path.of("/etc")).map(Path::getFileName).consumeAndClose(Stream::toList);

What do you think?


[1] 
https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#walk(java.util.function.Function)




Re: Discussion: easier Stream closing

2021-09-26 Thread Tagir Valeev
On Sun, Sep 26, 2021 at 6:13 PM Remi Forax  wrote:
>
> > List list =
> > Files.list(Path.of("/etc")).map(Path::getFileName).consumeAndClose(Stream::toList);
> >
> > What do you think?
>
> This one does not work because if Path::getFileName fails with an exception, 
> close() will not be called.

Well, as Path::getFileName is not actually executed here, only linked,
there are only a few chances
when map(Path::getFileName) may fail. Probably StackOverflowError,
OutOfMemoryError, or LinkageError.
Probably not so important for properly functioning application. I
mean, if something of these happens, leaking
file descriptor would be a lesser problem.

With best regards,
Tagir Valeev.

> What you are proposing here is equivalent to
>   try(var stream = Files.list(Path.of("/etc")).map(Path::getFileName)) {
> stream.toList()
>   }
>
> instead of
>   try(var stream = Files.list(Path.of("/etc"))) {
> stream..map(Path::getFileName).toList()
>   }
>
> regards,
> Rémi
>
> >
> >
> > [1]
> > https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#walk(java.util.function.Function)


Re: Discussion: easier Stream closing

2021-09-26 Thread Stephen Colebourne
On Sun, 26 Sept 2021 at 10:29, Tagir Valeev  wrote:
> List list =
> Files.list(Path.of("/etc")).map(Path::getFileName).consumeAndClose(Stream::toList);
>
> What do you think?

I fully support this. We have our own utility to do this kind of
thing, as bugs with `Files` are common:
https://github.com/OpenGamma/Strata/blob/main/modules/collect/src/main/java/com/opengamma/strata/collect/io/SafeFiles.java

Stephen


Re: Discussion: easier Stream closing

2021-09-26 Thread Remi Forax
- Original Message -
> From: "Tagir Valeev" 
> To: "core-libs-dev" 
> Sent: Dimanche 26 Septembre 2021 11:27:58
> Subject: Discussion: easier Stream closing

> Hello!
> 
> With current NIO file API, a very simple problem to get the list of
> all files in the directory requires some ceremony:
> 
> List paths;
> try (Stream stream = Files.list(Path.of("/etc"))) {
>paths = stream.toList();
> }
> 
> If we skip try-with-resources, we may experience OS file handles leak,
> so it's desired to keep it. Yet, it requires doing boring stuff. In
> this sense, classic new File("/etc").list() was somewhat more
> convenient (despite its awful error handling).
> 
> I like how this problem is solved in StackWalker API [1]: it limits
> the lifetime of the Stream by requiring a user-specified function to
> consume it. After the function is applied, the stream is closed
> automatically. We could add a similar overload to the Files API. As an
> additional feature, we could also translate all UncheckedIOException's
> back to IOException for more uniform exception processing:
> 
> /**
> * @param dir  The path to the directory
> * @param function  function to apply to the stream of directory files
> * @param  type of the result
> * @return result of the function
> * @throws IOException if an I/O error occurs when opening the directory, or
> * UncheckedIOException is thrown by the supplied function.
> */
> public static  T list(Path dir, Function, ?
> extends T> function) throws IOException {
>try (Stream stream = Files.list(dir)) {
>return function.apply(stream);
>}
>catch (UncheckedIOException exception) {
>throw exception.getCause();
>}
> }
> 
> In this case, getting the List of all files in the directory will be
> as simple as
> 
> List paths = Files.list(Path.of("/etc"), Stream::toList);
> This doesn't limit the flexibility. For example, if we need only file
> names instead of full paths, we can do this:
> List paths = Files.list(Path.of("/etc"), s ->
> s.map(Path::getFileName).toList());


It seems a nice addition, forgetting the try-with-resources is a very common 
source of bugs in my students code.
Part of the issue is that there is no way to easily test if there are still 
open descriptors during tests.
I wonder if the VM can not offer such test using a special flag 
(-XX:+CheckPendingFileDescriptor) or something like that.

> 
> Alternatively, we could enhance the BaseStream interface in a similar
> way. It won't allow us to translate exceptions, but could be useful
> for every stream that must be closed after consumption:
> 
> // in BaseStream:
> 
> /**
> * Apply a given function to this stream, then close the stream.
> * No further operation on the stream will be possible after that.
> *
> * @param function function to apply
> * @param  type of the function result
> * @return result of the function
> * @see #close()
> */
> default  R consumeAndClose(Function function) {
>try(@SuppressWarnings("unchecked") S s = (S) this) {
>return function.apply(s);
>}
> }
> 
> The usage would be a little bit longer but still more pleasant than
> explicit try-with-resources:
> 
> List list = Files.list(Path.of("/etc")).consumeAndClose(Stream::toList);
> 
> On the other hand, in this case, we are free to put intermediate
> operations outside of consumeAndClose, so we won't need to nest
> everything inside the function. Only terminal operation should be
> placed inside the consumeAndClose. E.g., if we need file names only,
> like above, we can do this:
> 
> List list =
> Files.list(Path.of("/etc")).map(Path::getFileName).consumeAndClose(Stream::toList);
> 
> What do you think?

This one does not work because if Path::getFileName fails with an exception, 
close() will not be called.
What you are proposing here is equivalent to
  try(var stream = Files.list(Path.of("/etc")).map(Path::getFileName)) {
stream.toList()
  }

instead of
  try(var stream = Files.list(Path.of("/etc"))) {
stream..map(Path::getFileName).toList()
  }

regards,
Rémi

> 
> 
> [1]
> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#walk(java.util.function.Function)


Discussion: easier Stream closing

2021-09-26 Thread Tagir Valeev
Hello!

With current NIO file API, a very simple problem to get the list of
all files in the directory requires some ceremony:

List paths;
try (Stream stream = Files.list(Path.of("/etc"))) {
paths = stream.toList();
}

If we skip try-with-resources, we may experience OS file handles leak,
so it's desired to keep it. Yet, it requires doing boring stuff. In
this sense, classic new File("/etc").list() was somewhat more
convenient (despite its awful error handling).

I like how this problem is solved in StackWalker API [1]: it limits
the lifetime of the Stream by requiring a user-specified function to
consume it. After the function is applied, the stream is closed
automatically. We could add a similar overload to the Files API. As an
additional feature, we could also translate all UncheckedIOException's
back to IOException for more uniform exception processing:

/**
 * @param dir  The path to the directory
 * @param function  function to apply to the stream of directory files
 * @param  type of the result
 * @return result of the function
 * @throws IOException if an I/O error occurs when opening the directory, or
 * UncheckedIOException is thrown by the supplied function.
 */
public static  T list(Path dir, Function, ?
extends T> function) throws IOException {
try (Stream stream = Files.list(dir)) {
return function.apply(stream);
}
catch (UncheckedIOException exception) {
throw exception.getCause();
}
}

In this case, getting the List of all files in the directory will be
as simple as

List paths = Files.list(Path.of("/etc"), Stream::toList);
This doesn't limit the flexibility. For example, if we need only file
names instead of full paths, we can do this:
List paths = Files.list(Path.of("/etc"), s ->
s.map(Path::getFileName).toList());

Alternatively, we could enhance the BaseStream interface in a similar
way. It won't allow us to translate exceptions, but could be useful
for every stream that must be closed after consumption:

// in BaseStream:

/**
 * Apply a given function to this stream, then close the stream.
 * No further operation on the stream will be possible after that.
 *
 * @param function function to apply
 * @param  type of the function result
 * @return result of the function
 * @see #close()
 */
default  R consumeAndClose(Function function) {
try(@SuppressWarnings("unchecked") S s = (S) this) {
return function.apply(s);
}
}

The usage would be a little bit longer but still more pleasant than
explicit try-with-resources:

List list = Files.list(Path.of("/etc")).consumeAndClose(Stream::toList);

On the other hand, in this case, we are free to put intermediate
operations outside of consumeAndClose, so we won't need to nest
everything inside the function. Only terminal operation should be
placed inside the consumeAndClose. E.g., if we need file names only,
like above, we can do this:

List list =
Files.list(Path.of("/etc")).map(Path::getFileName).consumeAndClose(Stream::toList);

What do you think?


[1] 
https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/StackWalker.html#walk(java.util.function.Function)