Re: [cp-patches] Minor refactoring of PrintStre am

Ivan Maidanski Thu, 17 Jun 2010 23:14:54 -0700

Wed, 16 Jun 2010 20:44:27 +0100 Andrew John Hughes <ahug...@redhat.com>:


> 2010/6/16 Ivan Maidanski <iv...@mail.ru>:
> > Hi!
> >
> > Tue, 15 Jun 2010 21:58:02 +0200 Mario Torre <neug...@limasoftware.net>:
> >
> >> Il giorno mar, 15/06/2010 alle 22.59 +0400, Ivan Maidanski ha scritto:
> >> > Hi!
> >> >
> >> > How fast!
> >> >
> >> > Tue, 15 Jun 2010 19:07:18 +0100 Andrew John Hughes <ahug...@redhat.com>:
> >> >
> >> > > 2010/6/15 Ivan Maidanski <iv...@mail.ru>:
> >> > > > Hi!
> >> > > >
> >> > > > Here, the proposed changes of PrintStream are:
> >> > > > 1. to be closer to the RI behavior (see changes for line_separator, 
> >> > > > out.flush(), append());
> >> > >
> >> > > While some of these are sensible changes, how are you discovering the
> >> > > 'RI behaviour' in the cases where it wouldn't be visible from simple
> >> > > runtime testing?
> >> >
> >> > There are a lot of places in the Classpath saying that while the JDK
> >> > docs say nothing (or, even, the opposite) about some feature, the RI
> >> > behavior is followed. (For the samples, search for "RI" word in the
> >> > Classpath source files.)
> >> >
> >> > Which one can't be visible, you think, from a test?
> >>
> >> Hi Ivan,
> >>
> >> We can't look at the OpenJDK code and then easily make the Classpath one
> >> match unless where obvious fixes are in place (if RI is the closed one,
> >> we can *never* make this even of obvious fixes), can you please provide
> >> us the tests you wrote to detect the faulty functionality?
> >
> > Let me explain... I ran some application on a JDK and ran it on a VM 
> > equipped with the Classpath, and observed some difference in the behavior 
> > in that app. I don't think exploiting obscure implementation details of JDK 
> > is good but JDK is somewhat the industry standard, so I chose the way of 
> > reflecting JDK behavior. I don't look into the proprietary sources, I'll 
> > supply you with the tests (I assume comparing the two products using tests 
> > is legal).
> 
> That's how we usually do it as explained below.
> 
> What do you mean by 'JDK'?  Oracle's proprietary JDK? OpenJDK?
> Something else?  If it's just being used as a binary, it doesn't
> matter as much but it would be good to know.

This one: http://java.sun.com/javase/downloads/

> >> > > > 2. some code optimizations (like removal of writeChars() pos/len 
> >> > > > parameters).
> >> > > >
> >> > > > ChangeLog entries:
> >> > > >        * java/io/PrintStream.java (line_separator): Convert static 
> >> > > > field to
> >> > > >        an instance one (to match the RI functionality).
> >> > >
> >> > > I don't see why you would want to make this an instance field.
> >> >
> >> > only to match the RI functionality ;) - test how BufferedWriter is 
> >> > working in the RI.
> >>
> >> I can't see why you want this change. This field is private, and I think
> >> its value never changes during the lifetime of the VM, Actually, even in
> >> the JDK the path separator is a field in the File class that is cached
> >> at class initialisation time, and finally, the field is static anyway. I
> >> can't really see how this impacts the functionality, can you please
> >> provide a test?
> >
> > See below.
> >
> >> > > >        * java/io/PrintStream.java (error_occurred): Remove an 
> >> > > > unnecessary
> >> > > >        initialization to false.
> >> > >
> >> > > I think having it explicit does make things clearer.
> >> >
> >> > Ok. (Although, I don't think that explicitly setting it to false after 
> >> > it was implicitly set to the same value makes things clearer.)
> >>
> >> I don't completely share Andrew's point of view on this one, and
> >> although in this specific case the line of code is harmless, I believe
> >> that we should always trust the VM to initialise class and instance
> >> fields for us to the default values (which is mandatory anyway), so your
> >> change is OK for me.
> >> > >
> >> > > >        * java/io/PrintStream.java (PrintStream): Use
> >> > > >        "new FileOutputStream(fileName)" instead of
> >> > > >        "new FileOutputStream(new File(fileName))".
> >> > >
> >> > > Ok (there are two instances of this, your ChangeLog only lists one)
> >> >
> >> > I know that. What's the preferred way of listing such things? specify 
> >> > with the arguments? I'll change the log.
> >>
> >> Not sure about this, I guess we use a couple of different ways, I would
> >> add in parenthesis the arguments of the two constructors.
> >>
> >> > > >        * java/io/PrintStream.java (PrintStream): Throw NPE if out is 
> >> > > > null.
> >> > >
> >> > > Where is this specified? The Javadoc for this don't specify an NPE is 
> >> > > thrown.
> >> > > If OpenJDK does, then the documentation needs updating.  We should 
> >> > > include
> >> > > a message saying what the problem was.
> >> >
> >> > the undocumented RI behavior - throw NPE for a null stream (instead of 
> >> > UnsupportedEncodingException one).
> >>
> >> I would add a comment so (and, again, a test in Mauve).
> >
> > Adding a comment is ok but I don't think adding a test for an poorly or 
> > undocumented behavior is good. Only just to discover/show the difference...
> >
> >> > >
> >> > > >        * java/io/PrintStream.java (print): Call out.flush() only
> >> > if needed
> >> > > >        (to match the RI).
> >> > >
> >> > > Where are these extra conditions coming from?
> >> >
> >> > Again, the RI behaves differently (the RI flushes the underlying
> >> > stream only in case of '\n' while the Classpath flushes self (not the
> >> > underlying stream directly) every time (provided auto_flush is on).
> >>
> >> Can you provide some minimal functional test to prove that different
> >> behaviour affects user code (sometime those things do affect
> >> functionality, but at times not)?
> >
> > Ok.
> >
> > 1. test case for flush() -> out.flush() replacement:
> >
> > import java.io.*;
> > class MyPrintStream extends PrintStream {
> >  MyPrintStream(OutputStream out) {
> >    super(out, true);
> >  }
> >
> >  public void flush() {
> >    throw new Error("flush shouldn't be called");
> >  }
> > }
> >
> > class TestCase1 {
> >
> >  public static void main(String[] args) {
> >    (new MyPrintStream(System.out)).println();
> >    System.out.println("PASSED");
> >  }
> > }
> >
> > 2. test case for flushing only if the data contains '\n':
> >
> > import java.io.*;
> > class MyPrintStream extends PrintStream {
> >  MyPrintStream(OutputStream out) {
> >    super(out);
> >  }
> >
> >  public void flush() {
> >    throw new Error("shouldn't be called");
> >  }
> > }
> >
> > class TestCase2 {
> >
> >  public static void main(String[] args) {
> >    PrintStream out = new PrintStream(new MyPrintStream(System.out), true);
> >    out.print("");
> >    out.print(new char[] {});
> >  }
> > }
> >
> > I've just looked into OpenJDK source and found its write() implementation 
> > is not optimal - see line 476 in 
> > http://hg.openjdk.java.net/nio/nio/jdk/file/a11b4063024f/src/share/classes/java/io/PrintStream.java
> >
> > You can see that a break statement should be placed after out.flush().
> >
> > IMHO, my code is a bit cleaner here. Or not? :
> > +        if (auto_flush
> > +            && (println || chars == line_separator
> > +                || lastIndexOfNewLine(chars) >= 0))
> > +          out.flush();
> >
> >>
> >> > > The append change looks good as do the flush() changes.  I don't see
> >> > > the reason for changing the ordering of the && statement as checking
> >> > > auto_flush is the cheaper check.
> >> >
> >> > Ok. Reordering here is not a big deal for performance (IMHO, local var 
> >> > fetch should be faster than global memory access).
> >>
> >> Well, this depends...
> >
> > Depends on, but not limited to, the probabilities... ;)
> > Ok.
> >
> >>
> >> Cheers,
> >> Mario
> >
> > Tue, 15 Jun 2010 22:09:28 +0200 Mario Torre <neug...@limasoftware.net>:
> >
> >> Il giorno mar, 15/06/2010 alle 21.58 +0200, Mario Torre ha scritto:
> >>
> >> > Actually, even in the JDK the path separator is a field in the File 
> >> > class that is cached
> >>
> >> Ops, forget about this, this is not path, but line separator...
> >>
> >> Anyway, the rest of the argument stays, I don't see how this can affect
> >> the behaviour, and I don't think a line separator changes in any system
> >> I have ever seen at runtime.
> >
> > Agreed. I'd like someone to tell the OpenJDK group about this. ;)
> >
> > The test case (for Unix):
> >
> >  new PrintStream(System.out).println(); // prints "\n"
> >  System.setProperty("line.separator", "\r\n");
> >  new PrintStream(System.out).println(); // prints "\r\n"
> >
> >>
> >> Cheers,
> >> Mario
> >
> > Wed, 16 Jun 2010 13:37:45 +0100 Andrew John Hughes <ahug...@redhat.com>:
> >
> >> 2010/6/15 Ivan Maidanski <iv...@mail.ru>:
> >> > Hi!
> >> >
> >> > How fast!
> >> >
> >> > Tue, 15 Jun 2010 19:07:18 +0100 Andrew John Hughes <ahug...@redhat.com>:
> >> >
> >> >> 2010/6/15 Ivan Maidanski <iv...@mail.ru>:
> >> >> > Hi!
> >> >> >
> >> >> > Here, the proposed changes of PrintStream are:
> >> >> > 1. to be closer to the RI behavior (see changes for line_separator, 
> >> >> > out.flush(), append());
> >> >>
> >> >> While some of these are sensible changes, how are you discovering the
> >> >> 'RI behaviour' in the cases where it wouldn't be visible from simple
> >> >> runtime testing?
> >> >
> >> > There are a lot of places in the Classpath saying that while the JDK 
> >> > docs say nothing (or, even, the opposite) about some feature, the RI 
> >> > behavior is followed. (For the samples, search for "RI" word in the 
> >> > Classpath source files.)
> >> >
> >> > Which one can't be visible, you think, from a test?
> >>
> >> I'm well aware of our general guidelines on implementation
> >> differences, though, as with anything, things differ on a case-by-case
> >> basis.  This doesn't mean that Classpath becomes a copy of the 'RI' or
> >> that the RI's way of doing things is sacrosanct; it means that if you
> >> have a testcase for some behaviour which is unspecified in the
> >> Javadocs and this is noticeably different for users, then it should
> >> change to the reference implementation behaviour.
> >
> > Ok.
> >
> >>
> >> Mario touched on my main concern which was about what you are
> >> referring to as the RI and how you are testing it.  Looking at
> >> proprietary source code is not allowed:
> >> http://www.gnu.org/software/classpath/docs/cp-hacking.html#SEC2.  The
> >> situation is now a little different than it was in previous years, as
> >> there are now effectively two reference implementations: the
> >> proprietary JDK binaries provided by Oracle (the 'old RI') and
> >> OpenJDK, which is under a number of FOSS licenses.  It's still
> >> preferable not to consult the source code if possible, but it is
> >> possible to consult the OpenJDK source code and still work on
> >> Classpath.  That doesn't mean chunks should be copied verbatim for
> >> OpenJDK.  I believe it does mean you can get the gist of how something
> >> undocumented works in OpenJDK and then reimplement the ideas in GNU
> >> Classpath, something which wasn't possible before.  I am not a lawyer,
> >> and any situation where a substantial amount of OpenJDK code was
> >> referenced would probably involve contacting FSF legal to be sure.
> >
> > Ok. Let me not consult the OpenJDK sources.
> >
> >>
> >> There's obviously no problem with running a test against OpenjDK and
> >> GNU Classpath+VM to determine the difference.  If this is what has
> >> been done, we need to see the tests and preferably have them
> >> integrated into Mauve for future reference.
> >
> > Ok.
> >
> >>
> >> I'm sorry if this is a little draconian, but we have to be careful of
> >> what sources are being consulted for work on GNU Classpath to avoid
> >> getting sued.
> >>
> >> >
> >> >>
> >> >> > 2. some code optimizations (like removal of writeChars() pos/len 
> >> >> > parameters).
> >> >> >
> >> >> > ChangeLog entries:
> >> >> >        * java/io/PrintStream.java (line_separator): Convert static 
> >> >> > field to
> >> >> >        an instance one (to match the RI functionality).
> >> >>
> >> >> I don't see why you would want to make this an instance field.
> >> >
> >> > only to match the RI functionality ;) - test how BufferedWriter is 
> >> > working in the RI.
> >>
> >> Can you provide such a test?  I can't immediately see why this would
> >> make a difference given the field is final.  It seems like someone
> >> just forgot to make the field static in 'the RI'.  It's also (more
> >> worryingly) not clear how you determined this about the RI.  I guess
> >> reflection would work, but still seems dubious unless there is good
> >> reason.
> >
> > No, no reflection is needed (just use System.setProperty()). See the test 
> > above.
> >
> > I could agree there's no need to change this thing. (Hopefully, someone 
> > will change the OpenJDK behavior.)
> >
> 
> Ok, for some reason I was reading this as it was just setting the
> value to '\n' not reading a property.
> The change does make sense.  If you change the property, you'd expect
> PrintStreams created after to reflect that.
> 
> >>
> >> >
> >> >>
> >> >> >        * java/io/PrintStream.java (error_occurred): Remove an 
> >> >> > unnecessary
> >> >> >        initialization to false.
> >> >>
> >> >> I think having it explicit does make things clearer.
> >> >
> >> > Ok. (Although, I don't think that explicitly setting it to false after 
> >> > it was implicitly set to the same value makes things clearer.)
> >> >
> >>
> >> It's clearer in the sense that you otherwise have to remember that
> >> booleans default to being false in Java.
> >> I would happily let through the line without the initialiser in a new
> >> patch.  I just don't see any reason to change this from how it already
> >> is.
> >
> > Ok. Not a big deal.
> >
> >>
> >> >>
> >> >> >        * java/io/PrintStream.java (PrintStream): Use
> >> >> >        "new FileOutputStream(fileName)" instead of
> >> >> >        "new FileOutputStream(new File(fileName))".
> >> >>
> >> >> Ok (there are two instances of this, your ChangeLog only lists one)
> >> >
> >> > I know that. What's the preferred way of listing such things? specify 
> >> > with the arguments? I'll change the log.
> >> >
> >>
> >> I don't understand what Mario is referring to with us using 'a couple
> >> of different ways'.  The format is specified in
> >> http://www.gnu.org/software/classpath/docs/cp-hacking.html#SEC13 and
> >> there are many examples in the existing ChangeLogs in GNU Classpath.
> >> You list each file only once, and under it you list each method in
> >> brackets, with a brief description of the change made.  With the one
> >> you listed, there seem to be mismatches between the ChangeLog and the
> >> patch, with me seeing patch changes that weren't explained, and
> >> over-repetition of the filename.
> >>
> >> e.g.
> >>
> >> 2009-02-05  Mark Wielaard  <m...@klomp.org>
> >>
> >>         PR classpath/38912:
> >>       * gnu/xml/stream/XMLParser.java:
> >>         (getLocalName()): Respect stringInterning.
> >>       (getName()): Likewise.
> >>         (getPrefix()): Likewise.
> >>
> >> (deliberately chose a good one that wasn't mine ;-) )
> >
> > Ok. One question: is it possible to use "(getName(), getPrifix()): 
> > Likewise." for the above sample?
> >
> 
> I think it's clearer to write them separately as in the example.
> emacs has a nice mode for editing ChangeLogs with correct syntax
> highlighting.

Ok.

> >> >> >        * java/io/PrintStream.java (PrintStream): Throw NPE if out is 
> >> >> > null.
> >> >>
> >> >> Where is this specified? The Javadoc for this don't specify an NPE is 
> >> >> thrown.
> >> >> If OpenJDK does, then the documentation needs updating.  We should 
> >> >> include
> >> >> a message saying what the problem was.
> >> >
> >> > the undocumented RI behavior - throw NPE for a null stream (instead of 
> >> > UnsupportedEncodingException one).
> >>
> >> Again, this needs a testcase.  When I mentioned the documentation, I
> >> was thinking more that OpenJDK probably needs a fix here too if the
> >> NPE isn't mentioned.
> >
> > Agreed.
> >
> > The test case code:
> >
> >  boolean passed = false;
> >  try {
> >    new PrintStream(null);
> >  } catch (NullPointerException e) {
> >    passed = true;
> >  }
> >  System.out.println((passed ? "PASSED" : "FAILED")
> >        + ": new PrintStream(null)");
> >
> >>
> >> >
> >> >>
> >> >> >        * java/io/PrintStream.java (writeChars, print, println): Don't 
> >> >> > pass
> >> >> >        "pos" and "len" parameters to writeChars() (as they are always 
> >> >> > set
> >> >> >        to 0 and buf.length, respectively).
> >> >>
> >> >> I think the original version should stay.  It's clearer and avoids a
> >> >> method call.
> >> >
> >> > Why is it clearer? (Is writeChars(char[] buf, int offset, int count) 
> >> > clearer than writeChars(char[] buf) (equipped with the appropriate doc)?
> >> > And what's the method call is avoided?
> >>
> >> If those are the default values, then I presume writeChars(char[])
> >> just results in the same call to writeChars(char[], 0, len) in another
> >> method.
> >> It's clearer in the sense you know the arguments being used rather
> >> than them being implicit.
> >
> > Still not clear. writeChars(char[] buf) writes all characters in buf. Are 
> > there any implicit arguments?
> >
> 
> You are saying by your change that writeChars(buf) == writeChars(buf,
> 0, buf.length); 0 and buf.length would be implicit in the first call.
> I didn't realise writeChars was local to this class, but some other
> method implemented as:
> 
> writeChars(char[] buf)
> {
> writeChars(buf, 0, buf.length);
> }
> 
> which would introduce another method call.  As these are private
> methods and all calls pass 0 and length as arguments, yes you can do
> the suggested change.  I guess pos and length were used in the past
> and the method was never simplified when this changed.
> 
> You could actually now make writeChars(char[]) call writeChars(String)
> to simplify further.

Yes.

> Didn't spot this before:
> 
> -  public synchronized void print (char ch)
> +  public void print (char ch)
> 
> Why is synchronized being dropped?

Yes, I've overlooked this. Synchronized is redundant here because it calls 
private print() which is synchronized.

> Also:
> 
> -        out.write (oneByte & 0xff);
> +        out.write (oneByte);
> 
> seems wrong; oneByte is an int so the higher bits need masking AFAICS.
>  Was there a reason for this change?

Consider the following pattern:

abstract class MyOutputStream extends FilterOutputStream {

  public void write (int oneByte) throws IOException
  {
    out.write(oneByte & 0xff); // here we call void write(int)
  }
  ...
}

Why should we transform (truncate) oneByte instead of just passing it thru 
as-is?

> 
> >>
> >> >
> >> > IMHO, it's arguable that it would be good to still keep for the private 
> >> > functions additional unnecessary arguments (and process them - e.g. call 
> >> > substring(offset, offset+count)).
> >> >
> >> >>
> >> >> This change occurs twice but is listed here only once.
> >> >
> >> > Ok. I'll change.
> >> >
> >> >>
> >> >> >        * java/io/PrintStream.java (print): Call out.flush() only if 
> >> >> > needed
> >> >> >        (to match the RI).
> >> >>
> >> >> Where are these extra conditions coming from?
> >> >
> >> > Again, the RI behaves differently (the RI flushes the underlying stream 
> >> > only in case of '\n' while the Classpath flushes self (not the 
> >> > underlying stream directly) every time (provided auto_flush is on).
> >> >
> >>
> >> Again, a testcase would be appreciated.  At least with this and the
> >> NPE, I can see how a test would reveal them.  With the removal of
> >> static, I don't see that.
> >
> > See above (for '\n' and for line_separator).
> >
> >>
> >> >>
> >> >> >        * java/io/PrintStream.java (lastIndexOfNewLine): New private 
> >> >> > static
> >> >> >        method (called from print() only).
> >> >>
> >> >> >        * java/io/PrintStream.java (write): Remove unnecessary "&" 
> >> >> > operation.
> >> >> >        * java/io/PrintStream.java (print, write): Directly call 
> >> >> > out.flush()
> >> >> >        instead of flush() (the same as in the RI).
> >> >> >        * java/io/PrintStream.java (append): Call subSequence() also 
> >> >> > for
> >> >> >        "null" string (the same as in the RI).
> >> >> >
> >> >> >
> >> >>
> >> >> The append change looks good as do the flush() changes.  I don't see
> >> >> the reason for changing the ordering of the && statement as checking
> >> >> auto_flush is the cheaper check.
> >> >
> >> > Ok. Reordering here is not a big deal for performance (IMHO, local var 
> >> > fetch should be faster than global memory access).
> >> >
> >>
> >> Depends on the VM.  It's not just a variable fetch, but also casting
> >> the '\n' to an integer and then doing the comparison.
> >
> > No cast is needed (since '\n' is an int const according to JVM spec).
> >
> 
> There's still a comparison between two integers.

Both comparing against const integer and testing an integer (against zero) 
typically have equal cost (I mean the number of a processor cycles) on modern 
CPUs (and not only on modern ;) ).

> 
> > (Even with a cast it might be faster on modern CPUs.)
> >
> >> But either way it's splitting heirs and depends on the VM; there seems
> >> to be no good reason to change it.
> >
> > Agreed.
> >
> >>
> >> >> Additional arguments should stay.
> >> >
> >> > Still, I don't agree. See above.
> >> >
> >> >>
> >> >> The ChangeLog needs some work as it was hard to follow and doesn't
> >> >> match up to the patch.
> >> >
> >> > Why hard? due to changes grouping? And, please point me the places where 
> >> > it doesn't match.
> >> >
> >>
> >> Answered above.  Parts of the patch were missing and the overuse of
> >> the filename made it noisy.
> >
> > Ok.
> >
> >>
> >> > Regards. Thanks for reviewing the code (and for being the opponent).
> >> >
> >> > I'll resubmit the modified/final patch/changelog upon we agree the 
> >> > changes.
> >> >
> >> >> --
> >> >> Andrew :-)
> >> >>
> >> >> Free Java Software Engineer
> >> >> Red Hat, Inc. (http://www.redhat.com)
> >> >
> >> >
> >> >
> >>
> >> Thanks,
> >> --
> >> Andrew :-)
> >
> > Regards.
> 
> -- 
> Andrew :-)
> 
> Free Java Software Engineer
> Red Hat, Inc. (http://www.redhat.com)

I'll change and resubmit the patch (and log entries) a bit later.

I also resubmit the log entries for my other patches (according your 
suggestions) as soon as you look thru that patches.

Regards.

Re: [cp-patches] Minor refactoring of PrintStre am

Reply via email to