> The behaviour only seems to trigger when you configure --without-zlib. I 
> don't know why yet, but there are zlib specific #ifdefs in the loading 
> and URL mangling code, so there could be something funny going on that 
> isn't triggered when zlib is disabled.

Okay, I found the bug, it's very simple.

Here is the comment for xmlFileOpen:

  * Wrapper around xmlFileOpen_real that try it with an unescaped
  * version of @filename, if this fails fallback to @filename

However, the code does *not* do this:

     unescaped = xmlURIUnescapeString(filename, 0, NULL);
     if (unescaped != NULL) {
         retval = xmlFileOpen_real(unescaped);
         xmlFree(unescaped);
     } else {
         retval = xmlFileOpen_real(filename);
     }
     return retval;

The code unescapes the filename first and tries to load that. If that 
fails, it gives up: the else branch only runs when the unescaping itself 
returns NULL. Shouldn't it try to load the filename as-is first, and 
only if *that* fails try unescaping it? Or better yet, not try 
unescaping it at all; I mean, since when did filenames use % escapes 
anyway?

So I suggest this patch to xmlFileOpen in xmlIO.c:

     retval = xmlFileOpen_real(filename);
     if (retval == NULL) {
         unescaped = xmlURIUnescapeString(filename, 0, NULL);
         if (unescaped != NULL) {
             retval = xmlFileOpen_real(unescaped);
             xmlFree(unescaped);
         }
     }
     return retval;

With this code the file "hello%2Fworld.xml" will be tried first, and 
only if it cannot be opened will "hello/world.xml" be tried. But yeah, I 
would rather delete that entire if test, as it seems to me that any URL 
unescaping should be handled a lot earlier, before xmlFileOpen sees it.
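
For anyone who wants to try the ordering outside of libxml2, here is a 
rough standalone sketch of the same idea in plain C. The names 
sketch_unescape, sketch_open_real and sketch_open are made up for this 
example; they are only stand-ins for xmlURIUnescapeString, 
xmlFileOpen_real and xmlFileOpen, and the unescaping is a simplified 
%XX decoder, not the real libxml2 routine.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for xmlURIUnescapeString: decodes %XX sequences.
 * Returns a malloc'd string, or NULL if allocation fails. */
char *sketch_unescape(const char *s)
{
    char *out = malloc(strlen(s) + 1);
    char *p = out;

    if (out == NULL)
        return NULL;
    while (*s != '\0') {
        if (s[0] == '%' && isxdigit((unsigned char)s[1]) &&
            isxdigit((unsigned char)s[2])) {
            char hex[3] = { s[1], s[2], '\0' };
            *p++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *p++ = *s++;
        }
    }
    *p = '\0';
    return out;
}

/* Hypothetical stand-in for xmlFileOpen_real: just opens the file. */
FILE *sketch_open_real(const char *filename)
{
    return fopen(filename, "rb");
}

/* Try the literal name first; only if that fails, try the unescaped name. */
FILE *sketch_open(const char *filename)
{
    FILE *retval;

    assert(filename != NULL);
    retval = sketch_open_real(filename);
    if (retval == NULL) {
        char *unescaped = sketch_unescape(filename);
        if (unescaped != NULL) {
            retval = sketch_open_real(unescaped);
            free(unescaped);
        }
    }
    return retval;
}
```

With this ordering, a file whose name literally contains a % sequence 
wins over the unescaped name whenever both exist, and the unescaped name 
is only a fallback.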

Michael

-- 
Print XML with Prince!
http://www.princexml.com
