>>> I tried not modifying libpng but still ended up with lingering references 
>>> to longjmp in pngread.o, despite libpng having png_ptr->longjmp_fn (bug in 
>>> libpng?). pngread.c calls setjmp to set a default location to jump to in 
>>> case the caller doesn't call setjmp, so if we continue down this path 
>>> something in libpng must be modified. The only other option is to create 
>>> our own setjmp.h and order it before /usr/include/setjmp.h, which seems 
>>> dubious at best.
>>> 
>>> I'm curious if the libpng changes are even needed since it's only used for 
>>> splashscreen, which happens very early in the launch process. Also note 
>>> that we didn't originally even call png_set_longjmp_fn, so any error should 
>>> have resulted in an abort() instead of a call to longjmp... it appears we 
>>> could retain the functionality we have today and #undef 
>>> PNG_SETJMP_SUPPORTED (pngconf.h?). That would put the onus on developers to 
>>> make sure their pngs don't have errors in them, or libsplashscreen will 
>>> abort()...
>>> 
>>> 
>> 
>> That's an interesting question and the answer might extend to the 
>> splashscreen changes too.
>> Its platform specific code and on MAC, the thread is created using pthreads 
>> directly and that
>> thread goes away once splashscreen is done. But its running at the same time 
>> as the VM
>> is booting up and creating threads and setting their signal masks. So I 
>> don't think you can
>> guarantee that it won't mess up the masks on the JRE threads if the PNG is 
>> bad. And I'm
>> also not sure you want to remove error handling from the library either.
>> So a HIGHLY VISIBLE DO NOT REMOVE comment might be the best you can do here.
> 
> I have a better idea:
> 
> png_default_error is the only place where png_longjmp is called. We could 
> call png_set_error_fn to set up our own error handler (for Mac only), compile 
> with PNG_SETJMP_SUPPORTED unset so it doesn't pull in setjmp/longjmp and our 
> own implementation of the error handler would call _longjmp, which would jump 
> back to where we call setjmp currently.

Ok, I figured out what's going on. It's not quite intuitive...

png_jmpbuf is a macro defined in png.h, this calls png_set_longjmp_fn with 
longjmp, which is why I was seeing references to longjmp in the object file. 
That's what was throwing me off as it seems like it should only be getting the 
jmp_buf ptr stored in the png_ptr. I guess the intention was that 
setjmp/longjmp was optional, if you don't call setjmp then it just abort()s.


I changed splashscreen_png.c to:
#ifdef __APPLE__
    if (_setjmp(png_set_longjmp_fn(png_ptr, _longjmp, sizeof(jmp_buf)))) {
#else
    if (setjmp(png_jmpbuf(png_ptr))) {
#endif

and it calls _longjmp instead. I verified this works by changing the macro to 
set png_longjmp to exit() and without the above change it does indeed exit 
prematurely with a bad png, with the change it reports the error but continues 
to load the application as would be expected.

pngread.o still has a symbol table entry for _longjmp instead of __longjmp, but 
it's benign since we're ultimately forcing it to use the correct function. So 
I've left libpng completely unchanged.


With the change and using a bad png for splashscreen, I was able to get a stack 
trace once the application was running. Without the change to 
splashscreen_png.c, jstack was unable to connect to the process. So 
splashscreen absolutely can interfere with the signal handling.


Updated webrev:
http://cr.openjdk.java.net/~ddehaven/8026385/jdk.1/

I can look into writing a regression test for this. It might not be trivial 
though since we're dealing with signal handlers, and if timing is a factor the 
test may not be reliable.

-DrD-

Reply via email to