Hi Joe,
Looks good to me.
Typo: StringCoding.java:1026 "unmappble" (no new webrev needed)
Regards, Roger
On 6/12/2018 12:52 PM, Joe Wang wrote:
Hi all,
It's been a while since the 1st round of reviews. Thank you all for
the valuable comments and suggestions! Note that Roger's last
response went to nio-dev, but not core-libs-dev, I've therefore
attached it below.
CSR [2]: the CSR is now approved. Note the write method has been
changed to writeString.
Impl [3]: for performance reason and the different requirement from
byte-string conversion for malformed/unmappable character handing, the
implementation uses a specific method separate from the existing ones.
Both Sherman and Alan preferred specific method than adding parameters
to the APIs that might be error prone. Thanks Sherman for the code!
JMH benchmark tests showed the new read method performs better than an
operation that reads the bytes and convert to string. The gains start
to be significant even at quite reasonable sizes (< 10K).
[1] JBS: https://bugs.openjdk.java.net/browse/JDK-8201276
[2] CSR: https://bugs.openjdk.java.net/browse/JDK-8202055
[3] webrevs: http://cr.openjdk.java.net/~joehw/jdk11/8201276/webrev/
Please review.
Thanks,
Joe
On 5/15/18, 7:51 AM, Roger Riggs wrote:
Hi Joe, et. al.
My $.02 on line separators:
- We should avoid clever tricks trying to solve problems that are
infrequent such as cross OS file line operations.
Most files will be read/written on a single system so the line
separators will be as expected.
- Have/use APIs that split lines consistently accepting both line
separators so developer code can be agnostic to line separators.
aka BufferedReader.readLine for developers that are processing
the contents *as lines*.
Those other methods already exist; if there are any gaps in line
oriented processing that's a separate task.
- These new file methods are defined to handle Charset
encoding/decoding and buffering.
Since there are other methods to deal with files as lines these
methods should not look for or break to lines.
- Performance: adding code to look for line characters will slow it
down and in most cases would have no impact
since the line endings will match the system.
- The strings passed to writeString (CharSequence) should have been
constructed using the proper line separators.
There are any number of methods that insert the os specific line
terminator and its easy to write File.separator as needed.
As for the method naming:
I'd prefer "writeString" as a familiar term that isn't stretched too
far by making the argument be a CharSequence.
its fine to call a CharSequence a string (with a lower case s).
$.02, Roger
On 4/27/18 6:33 PM, Joe Wang wrote:
On 4/27/2018 12:50 PM, fo...@univ-mlv.fr wrote:
----- Mail original -----
De: "Joe Wang" <huizhe.w...@oracle.com>
À: "Remi Forax" <fo...@univ-mlv.fr>, "Alan Bateman"
<alan.bate...@oracle.com>
Cc: "nio-dev" <nio-...@openjdk.java.net>, "core-libs-dev"
<core-libs-dev@openjdk.java.net>
Envoyé: Vendredi 27 Avril 2018 21:21:01
Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for
reading/writing a string from/to a file
Hi Rémi, Alan,
Hi Joe,
I'm not sure we'd want to replace system dependent line separators
with
'\n'. There are cases as you described where the replacement makes
sense. However, there are other cases too. For example, the
purpose is
to read, edit a text file and then write it back. If this is on
Windows,
and if line separators are replaced with '\n', that would cause the
resulting file not formatted properly in for example Notepad.
There are
cases where users write text files on Windows using Java, and only
found
the lines were not separated in Notepad.
I agree that why the counterpart of readString() that write the
string should inserts the platform's line separator.
BTW, i just found that those methods are not named writeString, or
writeCharSequence, i think they should.
While readString() does not modify the original content (e.g. by
replacing the platform's line separator with '\n'), write(String)
won't either, by adding extra characters such as the line separator.
I would think interfaces shall read along with the parameters.
readString(Path) == read as a String from the specified Path
(one could argue for readToString, readAsString, but we generally
omit the preps)
write(Path, CharSequence) == write the CharSequence to the file,
since CharSequence is already in the method signature as a
parameter, we probably don't want to add that to the name, otherwise
it would read like repeating the word CharSequence.
It is in a similar situation as write(Path, Iterable) where it was
defined as writeLines(Path, Iterable).
Files.write(Path, Iterable) is also specified to terminate each line
with the platform's line separator. If readString does the
replacement,
it would be inconsistent.
Anyway, if we look for consistency the methods writeCharSequence
should transform the \n in the CharSequence to the platform's line
separator.
Files.write(Path, Iterable) is is not a counterpart of
readString(), it's consistent with Files.lines() or
Files.readLines() (or BufferedReader.readLine()) that all suppress
the line separators. Anyway, Files.write(path,
readString(path)::line) will be consistent if you replace the line
separators or not because String.line() suppresses the line
separators.
readString pairs with write(String), therefore it's more like
Files.write(path, readString(path)) than readString(path)::line. The
use case:
String s = Files.read(path);
Files.write(path, s.replace("/config/location", "/new/location"));
would then work as expected.
These two methods are one-off (open/read/write/close file)
operation. write(String) therefore is not intended for
adding/appending multiple Strings from other operations such as
String.line(). If an app needs to put the result of String.line() or
any other processes into a file using this method, it needs to take
care of adding the line separator itself when necessary. "when
necessary" would be a judgement call by the developer.
That said, if there's a consensus on the idea of terminating the
string with a line separator, then readString method would need to
strip it off to go with the write method.
I would think readString behaves like readAllBytes, that would
simply read all content faithfully.
readString does an interpolation due to the Charset so it's not the
real content, again, the idea is that developers should not have to
care about the platform's line separators (they have more important
things to think).
The other solution is to just not introduce those new methods,
after all Files.lines().collect(Collectors.joining("\n")) already
does the job, no ?
While there are many ways to do it, none is as straight-forward.
"Read (entire) file to String"/"Write String to file" are popular
requests from users. Read to string -> do some String manipulation
-> write it back is such a simple use case, it really needs to be
very easy to do as illustrated in the above code snippet.
Cheers,
Joe
Best,
Joe
regards,
Rémi
On 4/27/2018 4:43 AM, fo...@univ-mlv.fr wrote:
Hi Alan,
People do not read the documentation.
So adding something in the documentation about when a method
should be used or
not is never a solution.
Here the user want a String and provides a charset so you have no
way but to
decode the content to substitute the line separator.
cheers,
Rémi
----- Mail original -----
De: "Alan Bateman" <alan.bate...@oracle.com>
À: "Remi Forax" <fo...@univ-mlv.fr>, "Joe Wang"
<huizhe.w...@oracle.com>
Cc: "nio-dev" <nio-...@openjdk.java.net>, "core-libs-dev"
<core-libs-dev@openjdk.java.net>
Envoyé: Vendredi 27 Avril 2018 13:34:12
Objet: Re: RFR (JDK11/NIO) 8201276: (fs) Add methods to Files for
reading/writing a string from/to a file
On 27/04/2018 12:29, Remi Forax wrote:
I think that having a readString that includes OS dependent
line separators is a
mistake,
Java does a great job to try to shield the developer from the
kind of things
that makes programs behave differently on different platforms.
readString should subtitute (\r)?\n to \n otherwise either
people will do a call
replace() which is less efficient or will learn the lesson the
hard way.
raw string literal does the same substitution for the same reason.
Yes, there are several discussion points around this and
somewhat timely
with multi-string support.
One thing that I think Joe will need in this API is some note to
make it
clearer what the intended usage is (as I think the intention is
simple
cases with mostly single lines of text).
-Alan.