Hi, thanks for the updates.

Capturing the Client's output in a file seems preferable to putting a shell pipeline in backquotes. But the old code piped both stdout and stderr into grep using 2>&1, whereas the new code redirects only stdout, which is then searched by grep. Was that intentional? I'm not sure whether the error message in question appears on stdout or stderr.
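
To spell out the difference (the output file name here is a placeholder):

    # old: both streams searched
    ${TESTJAVA}/bin/java Client "$JMXURL" 2>&1 | grep "the error message"

    # new: only stdout captured and searched
    ${TESTJAVA}/bin/java Client "$JMXURL" > client.out
    grep "the error message" client.out

If both streams should be searched, the capture would need to be:

    ${TESTJAVA}/bin/java Client "$JMXURL" > client.out 2>&1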

Otherwise, it looks like you've taken care of the common failure modes. It just goes to show how deceptively simple shell programming can seem, when in fact catching all the potential errors is quite tedious.

The following discussion is mainly for your information, to consider the next time you decide to write a shell test. :-) I'm not suggesting any action for this changeset.

You had mentioned earlier (see thread below) that you needed to compile incompatible versions of the same class. It's indeed difficult to coerce jtreg to do that, so this is the rationale for writing a shell test.

It's possible, though somewhat hacky, to use jtreg for this; see

http://hg.openjdk.java.net/lambda/lambda/jdk/file/tip/test/java/io/Serializable/defaultSVID/

Basically it uses the @compile tag to compile the first version of the file, then @run with an arg that tells the test to move the generated classfile somewhere, another @compile tag to compile the second version of the file, and finally an @run tag to run the test for real.
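
From memory, the tag sequence looks roughly like this (the file, class, and argument names are invented here, not copied from that test):

    /* @test
     * @compile v1/A.java
     * @run main DefaultSVIDTest stash
     * @compile v2/A.java
     * @run main DefaultSVIDTest run
     */

where the "stash" argument tells the test's main() to move the freshly generated A.class out of the way before the second @compile overwrites it.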

I don't suggest that you copy this technique. :-)

Jon Gibbons suggested invoking the compiler API directly from java instead of writing a shell script. Doing this seems fairly simple, and I think it would be advantageous to keep things entirely in Java. I may attempt to rewrite the defaultSVID test using the compiler API.
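
For reference, the in-process entry point is the javax.tools API. A minimal sketch, with placeholder class and path names:

    import javax.tools.JavaCompiler;
    import javax.tools.ToolProvider;

    public class CompileTwice {
        public static void main(String[] args) {
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            // compile each incompatible version of A into its own directory
            compile(javac, "v1/A.java", "classes/server");
            compile(javac, "v2/A.java", "classes/client");
            // ... then run the test proper against the two class paths
        }

        static void compile(JavaCompiler javac, String src, String destDir) {
            int rc = javac.run(null, null, null, "-d", destDir, src);
            if (rc != 0)
                throw new RuntimeException("javac failed (" + rc + ") for " + src);
        }
    }

One caveat: getSystemJavaCompiler() returns null when running on a JRE without the compiler, so this needs a full JDK at test-execution time.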

s'marks


On 2/5/13 6:40 AM, Jaroslav Bachorik wrote:
Updates in http://cr.openjdk.java.net/~jbachorik/8005472/webrev.05

Comments inline

On 01/10/2013 10:44 PM, Stuart Marks wrote:
On 1/10/13 7:20 AM, Jaroslav Bachorik wrote:
Update: http://cr.openjdk.java.net/~jbachorik/8005472/webrev.04

Thanks for the update.

Note that argv[0] is used before argv.length is checked, so if no args are passed, this throws an index-out-of-bounds exception instead of printing the usage message.

It's gone. Shouldn't have been left there anyway.


I see you take pains to write and flush the URL to stdout before writing
the signaling file. Good. The obvious alternative (which I started
writing but then erased) is just to put the URL into the signaling file.
But this has a race between creation of the file and the writing of its
contents. So, what you have works. (This kind of rendezvous problem
occurs a lot; it seems like there ought to be a simpler way.)
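
For what it's worth, the server-side ordering in sketch form (class and variable names invented):

    import java.io.File;
    import java.io.IOException;

    public class Announce {
        // Print the URL and flush it *before* creating the signaling file, so
        // the client can never observe the signal without the URL being there.
        static void announce(String jmxUrl, File signalFile) throws IOException {
            System.out.println(jmxUrl);  // goes to stdout, captured by the script
            System.out.flush();
            if (!signalFile.createNewFile())
                throw new IOException("stale signal file: " + signalFile);
        }
    }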

I suspect the -e option caused hangs because if something failed, it
would leave the server running, spoiling the next test run. The usual
way to deal with this is to use the shell 'trap' statement, to kill
subprocesses and remove temp files before exiting the shell. Probably a
good practice in general, but perhaps too much shell hackery for this
change. (Up to you if you want to tackle it.)
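
For the record, the usual incantation is a one-liner along these lines (variable names illustrative):

    # kill the background server and remove temp files however the script exits
    trap 'kill $SERVER_PID 2>/dev/null; rm -f "$URL_PATH" "$SIGNAL_FILE"' EXIT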

I would rather not ...


Regarding how the test is detecting success/failure, the concern is that
if the client fails for some reason other than the failure being checked
for, the test will still report passing. Since the error message is
coming out of the client JVM, in principle it ought to be possible to
redirect it somehow in order to do the assertion checking in Java. With the current shell scheme, not only are other failures reported as the test passing, but they are also erased in the grep pipeline, so they're not even visible in the test log.

I've changed the logic slightly to check for the java process exit
status as well as for the presence of the trigger text in the process
output. This should catch all the regular failures.
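
Roughly like this (the output file name and trigger text are placeholders):

    ${TESTJAVA}/bin/java Client "$JMXURL" > client.out
    status=$?
    if [ $status -ne 0 ]; then
        echo "Test FAILED: client exited with status $status"
        exit 1
    fi
    if grep "the trigger text" client.out > /dev/null; then
        echo "Test FAILED: trigger text found in client output"
        exit 1
    fi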

Unfortunately, the way the notification handling is implemented now, it is not really possible to check for the error from within the application - the error state is captured by the *NotificationForwarder class and only reported to the logger.

-JB-


This last issue is rather far afield from this webrev, and fixing it
will probably require some rearchitecting of the test. So maybe it
should be considered independently. I just happened to notice this going
on, and I noticed the similarity to what's going on in the RMI tests.

s'marks



On 01/10/2013 09:52 AM, Stuart Marks wrote:
On 1/7/13 3:23 AM, Jaroslav Bachorik wrote:
On 01/04/2013 11:37 PM, Kelly O'Hair wrote:
I suspect it is not hanging because it does not exist, but because some other Windows process has its hands on it.
This is the stdout file from the server being started up right?
Could the server from a previous test run be still running?

Exactly. Amy confirmed this and provided a patch which resolves the
hanging problem.

The update patch is at
http://cr.openjdk.java.net/~jbachorik/8005472/webrev.01

Hi Jaroslav,

The change to remove the parentheses from around the server program
looks right. It avoids forking an extra process (at least in some
shells) and lets $! refer to the actual JVM, not an intermediate shell
process. The rm -f from Kelly's suggestion is good too.
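
That is, roughly:

    # before: $! was the pid of an intermediate shell, not the JVM
    ( ${TESTJAVA}/bin/java Server > "$URL_PATH" ) &
    SERVER_PID=$!

    # after: $! is the JVM itself, so it can be killed reliably
    ${TESTJAVA}/bin/java Server > "$URL_PATH" &
    SERVER_PID=$!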

But there are other things wrong with the script. I don't think they
could cause hanging, but they could cause the script to fail in
unforeseen ways, or even to report success incorrectly.

One problem is introduced by the change: the Server's stderr is also redirected into $URL_PATH along with stdout. This means that if the Server program reports any errors, they'll get mixed into the URL_PATH file instead of appearing in the test log. The URL_PATH file's contents are never reported, so these error messages will be invisible.

Fixed; only stdout is redirected to $URL_PATH.


The exit status of some of the critical commands (such as the
compilations) isn't checked, so if javac fails for some reason, the test
might not report failure. Instead, some weird error might or might not
be reported later (though one will still see the javac errors in the
log).

Fixed; I introduced the check. The "set -e" was hanging the script, so I had to check the exit status manually.
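
I.e., something like this (variable names approximate):

    ${TESTJAVA}/bin/javac -d "$SERVER_CLASSES" "$SERVER_SRC"/*.java
    if [ $? -ne 0 ]; then
        echo "Test FAILED: server compilation failed"
        exit 1
    fi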


I don't think the sleep at line 80 is necessary, since the client runs
synchronously and should have exited by this point.

And it's gone.


The wait loop checking for the existence of the URL_PATH file doesn't actually guarantee that the server is running or has initialized yet. The file is created by the shell before the Server JVM even starts up, so runClient might try to read from it before the server has written anything to it. Or, as mentioned above, the server might have written error messages into the URL_PATH file instead of the expected contents. Either way, the contents of the JMXURL variable may well be incorrect.

Stderr is not redirected to the file. A separate file is used to signal the availability of the server, and that file is created from the Java code after the server has started. Also, the stderr and stdout streams are flushed to make sure the JMX URL makes it into the file.


If this occurs, what will happen when the client runs? It may emit some
error message, and this will be filtered out by the grep pipeline. Thus,
HAS_ERRORS might end up empty, and the test will report passing, even
though everything has failed!

Shouldn't happen with only the controlled stdout redirected to the file.


For this changeset I'd recommend at a minimum removing the redirection
of stderr to URL_PATH. If the server fails we'll at least see errors in
the test log.

For checking the notification message, is there a way to modify the
client to report an exit status or throw an exception? Throwing an
exception from main() will exit the JVM with a nonzero status, so this
can be checked more easily from the script. I think this is less
error-prone than grepping the output for a specific error message. The
test should fail if there is *any* error; it should not succeed if an
expected error is absent.
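
In sketch form (the helper method is hypothetical):

    public class Client {
        public static void main(String[] args) throws Exception {
            boolean ok = runScenario(args);  // stands in for the real client logic
            if (!ok) {
                // an uncaught exception exits the JVM with a nonzero status,
                // which the script can then check via $?
                throw new RuntimeException("Test FAILED");
            }
        }

        private static boolean runScenario(String[] args) {
            return true;  // placeholder for the real work
        }
    }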

This is unfortunately not possible. The notification processing needs to be robust enough to prevent the JVM from exiting in cases like this. Therefore it only reports the problem, dumps the notification, and carries on. The only place one can find out that something went wrong is the stderr stream.


You might consider having jtreg build the client and server classes.
This might simplify some of the setup. Also, jtreg is meticulous about
aborting the test if any compilations fail, so it takes care of that for
you.

I need same-named classes with incompatible code compiled to two different locations - client and server. I was not able to figure out how to use jtreg to accomplish that.

-JB-


It would be nice if there were a better way to have the client
rendezvous with the server. I hate to suggest it, but sleeping
unconditionally after starting the server is probably necessary.
Anything more robust probably requires rearchitecting the test, though.

Sorry to dump all this on you. But one of the shell-based RMI tests
suffers from *exactly* the same pathologies. (I have yet to fix it.)
Unfortunately, I believe that there are a lot of other shell-based tests
in the test suite that have similar problems. The lesson here is that
writing reliable shell tests is a lot harder than it seems.

Thanks,

s'marks

