Re: [xml] Minor improvement possible
I blogged about the importance of adding const in order to get best results here: https://randomascii.wordpress.com/2017/01/08/add-a-const-here-delete-a-const-there/ Some of the post talks about a VC++ compiler bug - ignore that. Note that having constant data in the read-only data segment is a best practice even for single-processing. It's not a huge advantage, but it is strictly better than having logically-const globals in the read/write data segment. On Thu, Jan 5, 2017 at 3:00 PM, Bruce Dawson wrote: > The constant array xmlUnicodeBlocks is not marked as 'const' so it ends up > in the read/write data segment instead of the read-only data segment, which > can lead to it becoming per-process private data instead of image-backed > shared data. > > A patch file (made in the context of Chrome, but it should be applicable > to libxml2) is attached. > > -- > Bruce Dawson > -- Bruce Dawson ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
[xml] Minor improvement possible
The constant array xmlUnicodeBlocks is not marked as 'const' so it ends up in the read/write data segment instead of the read-only data segment, which can lead to it becoming per-process private data instead of image-backed shared data. A patch file (made in the context of Chrome, but it should be applicable to libxml2) is attached. -- Bruce Dawson commit eafa8bbee03e8ebba366be1fb8d5b1596a989a0a Author: Bruce Dawson Date: Thu Jan 5 11:25:58 2017 -0800 Add const in five places to move 1 KiB to .rdata xmlUnicodeBlocks is logically const but was not marked as such. This fixes that, thus moving it to the read-only data segment. BUG=677351 diff --git a/third_party/libxml/src/genUnicode.py b/third_party/libxml/src/genUnicode.py index 56e4e9b..4487eeb 100755 --- a/third_party/libxml/src/genUnicode.py +++ b/third_party/libxml/src/genUnicode.py @@ -267,14 +267,14 @@ typedef struct { } xmlUnicodeRange; typedef struct { -xmlUnicodeRange *table; +const xmlUnicodeRange *table; intnumentries; } xmlUnicodeNameTable; static xmlIntFunc *xmlUnicodeLookup(xmlUnicodeNameTable *tptr, const char *tname); -static xmlUnicodeRange xmlUnicodeBlocks[] = { +static const xmlUnicodeRange xmlUnicodeBlocks[] = { """ % (webpage, date, sources)); flag = 0 diff --git a/third_party/libxml/src/xmlunicode.c b/third_party/libxml/src/xmlunicode.c index ce6e9a4..6d0a96a 100644 --- a/third_party/libxml/src/xmlunicode.c +++ b/third_party/libxml/src/xmlunicode.c @@ -29,14 +29,14 @@ typedef struct { } xmlUnicodeRange; typedef struct { -xmlUnicodeRange *table; +const xmlUnicodeRange *table; intnumentries; } xmlUnicodeNameTable; static xmlIntFunc *xmlUnicodeLookup(xmlUnicodeNameTable *tptr, const char *tname); -static xmlUnicodeRange xmlUnicodeBlocks[] = { +static const xmlUnicodeRange xmlUnicodeBlocks[] = { {"AegeanNumbers", xmlUCSIsAegeanNumbers}, {"AlphabeticPresentationForms", xmlUCSIsAlphabeticPresentationForms}, {"Arabic", xmlUCSIsArabic}, @@ -945,7 +945,7 @@ static xmlUnicodeNameTable xmlUnicodeCatTbl = {xmlUnicodeCats, 36}; static xmlIntFunc *xmlUnicodeLookup(xmlUnicodeNameTable *tptr, const char *tname) { int low, high, mid, cmp; -xmlUnicodeRange *sptr; +const xmlUnicodeRange *sptr; if ((tptr == NULL) || (tname == NULL)) return(NULL); ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
[xml] VS 2015 code-gen bug triggered by libxml
I hit a code-gen bug in libxml when it is compiled with VS 2015 Update 2 and optimized for size. I filed the bug here: https://connect.microsoft.com/VisualStudio/feedback/details/2582138 The tentative workaround for Chromium can be found here: https://codereview.chromium.org/1878963005/ A similar fix upstream and perhaps a test of xmlUTF8Strsize()'s behavior might be a good idea. The bug does not occur when optimizing for speed. FWIW. -- Bruce Dawson ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
Re: [xml] win32\VC10\config.h and VS 2015
Now is an odd time to be writing snprintf fallbacks for older MSVC versions - I look forward to never building with such versions again! Could be useful for some people of course, but it is a diminishing audience. On Fri, Mar 11, 2016 at 1:24 PM, Nick Wellnhofer wrote: > On 07/03/2016 22:46, Bruce Dawson wrote: > >> The config.h file in win32\VC10 is incompatible with VS 2015 because VS >> 2015 >> provides an implementation of snprintf and prohibits using the >> preprocessor to >> define it. Therefor an ifdef check is needed around that definition in >> VC10\config.h. Here is a patch that is compatible with the change that we >> made >> to Chromium's copy: >> >> >> diff --git a/win32/VC10/config.h b/win32/VC10/config.h >> index 8629944..891b57e 100644 >> --- a/win32/VC10/config.h >> +++ b/win32/VC10/config.h >> @@ -96,7 +96,9 @@ static int isnan (double d) { >> >> #if defined(_MSC_VER) >> #define mkdir(p,m) _mkdir(p) >> +#if _MSC_VER < 1900 // Cannot define this in VS 2015 and above! >> #define snprintf _snprintf >> +#endif >> #if _MSC_VER < 1500 >> #define vsnprintf(b,c,f,a) _vsnprintf(b,c,f,a) >> #endif >> >> >> It would be great to get that same patch in libxml2, and other developers >> will >> need it if they use VC10\config.h. >> > > If you're at it, can you implement a better snprintf fallback for older > MSVC versions? Here's the change I made to libxslt: > > > https://git.gnome.org/browse/libxslt/commit/?id=e75b5da121cdc67cb2303b1e2b77d5dd1cdf2784 > > For background information see: > > https://bugzilla.gnome.org/show_bug.cgi?id=756691 > http://stackoverflow.com/a/8712996/1956010 > > Nick > > -- Bruce Dawson ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
Re: [xml] VC10\config.h and include\win32config.h
The change to VC10\config.h looks good and will avoid problems with VS 2015 if anybody uses that file in the future. Rotting in multiple directories is the remaining problem. I don't know if libxml still supports VS 10, or if anybody is using VC10\config.h, but Chromium would certainly have no objections to that directory being deleted and include\win32config.h being the canonical version. If so then this line should probably be added to include\win32config.h: #define HAVE_STDINT_H These two lines create defines that are needed for some features. Their presence or absence does not affect us and I don't know if the NOP cast they create is appropriate, but I'd say leave them in include\win32config.h: #define SEND_ARG2_CAST #define GETHOSTBYNAME_ARG_CAST If anybody has any more information or recommendations about these defines, please share. So... 1) Committed change looks good. 2) Could also add #define HAVE_STDINT_H to include\win32config.h 3) Could also delete VC10 directory. On Fri, Mar 11, 2016 at 12:13 AM, Daniel Veillard wrote: > On Mon, Mar 07, 2016 at 02:15:28PM -0800, Peter Kasting wrote: > > I echo Bruce's request for a single upstream Windows config file. It > looks > > like include/win32config.h is older; win32/VC10/config.h seems to have > only > > its initial addition in its log. It looks like that addition happened > here: > > > > > https://git.gnome.org/browse/libxml2/commit/?id=066c69777207436e3517d1b930a97f0e0a8aa060 > > > > This seems to have been a convenience for > > https://bugzilla.gnome.org/show_bug.cgi?id=666491 (see > > https://bugzilla.gnome.org/show_bug.cgi?id=666491#c3 , "This is a > different > > and more convenient approach to libxml2 usage, than the nmake approach"), > > but it's not clear to me that libxml wants to do this sort of thing. In > > https://bugzilla.gnome.org/show_bug.cgi?id=666491#c5 , Daniel notes > "Adding > > a VC10 subdir under win32 is a bit heavy, I hope there won't be a new one > > for each new version of the Microsoft tools." Those fears seem to me to > be > > somewhat confirmed; because Microsoft has changed things about the > project > > format, I believe the .vcxproj files in this directory wouldn't serve > > directly for all later versions of MSVC. So this is really a "Visual > > Studio 2010 only" directory. > > > > > To me, this whole directory should be nuked, and include/win32config.h > > updated (see my notes below). People using this directory should either > > carry their own project/build files (as Chromium does) or use the > existing > > nmake method. libxml2 shouldn't be responsible for carrying this sort of > > thing, especially when it's for one version of one compiler on one > platform. > > I never use Windows, so really can't comment on the *right* approach > there > > > Another possibility is for this directory to remain, and further > > directories be added for other MSVC versions (which seems unappealing). > > instead of having code rot in one directory we would have multiple > version rotting in multiple directories, unappealing indeed ! > > > In any case, I think win32config.h should be updated; see below for my > > responses to Bruce's comments about the differences between the configs. > > Sure, if people who are using the Windows platform could comment > I will follow what people suggest, > > so feedback welcome ! > > Daniel > > > On Mon, Mar 7, 2016 at 1:47 PM, Bruce Dawson > wrote: > > > > > #define HAVE_STDINT_H > > > > > > > I believe this line is correct for MSVC 2010+. If libxml doesn't support > > anything earlier, then that line should be in win32config.h as-is. > > Otherwise, it should probably be in but with an appropriate _MSC_VER > check. > > > > #define SEND_ARG2_CAST > > > #define GETHOSTBYNAME_ARG_CAST > > > > > > > I think these must be defined if nano{http,ftp}.* are built. If these > are > > ever built on Windows, win32config.h should include these lines. > > > > PK > > > ___ > > xml mailing list, project page http://xmlsoft.org/ > > xml@gnome.org > > https://mail.gnome.org/mailman/listinfo/xml > > > -- > Daniel Veillard | Open Source and Standards, Red Hat > veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ > http://veillard.com/ | virtualization library http://libvirt.org/ > -- Bruce Dawson ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
[xml] VC10\config.h and include\win32config.h
Related to my previous e-mail, we noticed that VC10\config.h differs from include\win32config.h. The differences (in addition to the lack of wrappers around snprintf in VC10\config.h) are that VC10\config.h includes this line: #define HAVE_STDINT_H and include\win32config.h includes these two lines: #define SEND_ARG2_CAST #define GETHOSTBYNAME_ARG_CAST Which version should we prefer, and can we get the other version deleted from libxml2 to avoid confusion? -- Bruce Dawson ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
[xml] win32\VC10\config.h and VS 2015
The config.h file in win32\VC10 is incompatible with VS 2015 because VS 2015 provides an implementation of snprintf and prohibits using the preprocessor to define it. Therefor an ifdef check is needed around that definition in VC10\config.h. Here is a patch that is compatible with the change that we made to Chromium's copy: diff --git a/win32/VC10/config.h b/win32/VC10/config.h index 8629944..891b57e 100644 --- a/win32/VC10/config.h +++ b/win32/VC10/config.h @@ -96,7 +96,9 @@ static int isnan (double d) { #if defined(_MSC_VER) #define mkdir(p,m) _mkdir(p) +#if _MSC_VER < 1900 // Cannot define this in VS 2015 and above! #define snprintf _snprintf +#endif #if _MSC_VER < 1500 #define vsnprintf(b,c,f,a) _vsnprintf(b,c,f,a) #endif It would be great to get that same patch in libxml2, and other developers will need it if they use VC10\config.h. -- Bruce Dawson ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
Re: [xml] Possible 64-bit issues in xml 2.9.2
Sure. I'll do that. On old architectures it would be easy enough to use a typedef to 'create' intptr_t, so any compatibility problems that such a patch introduces will be easily fixed. On Tue, Nov 10, 2015 at 5:32 PM, Daniel Veillard wrote: > On Tue, Nov 10, 2015 at 01:04:51PM -0800, Bruce Dawson wrote: > > Resending now that I've joined the mailing list... > > > > While building 64-bit Chromium with VC++ 2015 Update 1 I noticed a > > significant number of pointer truncation warnings in libxml, especially > in > > xpath.c. A typical warning is: > > > > warning C4311: 'type cast': pointer truncation from 'xmlChar *' to 'long' > > > > which triggers on the last two lines of this block: > > > > case XML_ELEMENT_NODE: > > if (node2->type == XML_ELEMENT_NODE) { > > if ((0 > (long) node1->content) && /* TODO: Would a != 0 suffice here? */ > > (0 > (long) node2->content) && > > > > The intent is not entirely clear but if these are supposed to be NULL > > checks then they could easily give the wrong result. In the VC++ world > > 'long' is always a 32-bit type, so converting a pointer to a long both > > throws away the top 32 bits and also makes it a signed type. That means > > that pointers to the first 2 GB of address space will behave as expected, > > and pointers beyond that will have a 50% chance of behaving incorrectly! > > > > Another example is these two conversions. The code is apparently trying > to > > sort by content address, but on 64-bit Windows builds it will just be > > sorting by the bottom 32 bits of the content address. > > l1 = -((long) node1->content); > > l2 = -((long) node2->content); > > if (l1 < l2) > >return(1); > > if (l1 > l2) > >return(-1); > > } > > > > This is not a problem with gcc or clang because their 64-bit builds have > > 64-bit long, but the C/C++ standard do not mandate and that is not the > case > > with VC++. > > > > I think that the correct type would be ptrdiff_t, or size_t, or intptr_t, > > or uintptr_t. I've attached the full set of warnings in case that is of > any > > assistance. Even though some of these warnings do not indicate actual > bugs > > it would still be best to use the correct type in order to avoid > confusion > > and warnings. > > > > Sounds like intptr_t would be the proper type to use for those, I just > wonder what kind of old arches will break if they don't have it. > > Can you come up with a patch which just change the casts and possibly the > target > types ? > > Daniel > > -- > Daniel Veillard | Open Source and Standards, Red Hat > veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ > http://veillard.com/ | virtualization library http://libvirt.org/ > -- Bruce Dawson ___ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml
Re: [xml] Possible 64-bit issues in xml 2.9.2
Thanks for the reply. I work on Chromium and there many good reasons why it should be a 64-bit application - security being one of them. Since libxml is being compiled into Chromium we need it to support 64-bit. > I run AIX mostly and it's compiler defaults to unsigned. I'm not sure that's relevant. The only problem I'm hitting is with the assumption that sizeof(long) == sizeof(void*), which is not a portable assumption. I can't fix with compiler switches because there is no option to change the size of 'long'. I can suppress the warnings, but since some of the warnings appear to be real issues, that would just be ignoring the problem, not solving it. I don't know how serious the issues are since it's not clear to me what the pointer comparisons are for. On Tue, Nov 10, 2015 at 1:26 PM, Eric S. Eberhard wrote: > I am not 100% sure this applies but I'd check into it. I run AIX mostly > and it's compiler defaults to unsigned. When moving my products to VC++ I > had to set some switches so that unspecified (e.g. char x; instead of > unsigned char x;) would default to unsigned. Then my stuff compiles fine. > I did make my own projects for VC++ and libxml2 and they do have the > unsigned/signed switches set. It is fairly current libxml2 on VS 2008. > Which you are welcome to the projects if you want I can drop a tarball onto > my public FTP site. > > My point is you may be able to fix with compiler switches. > > Another idea -- I believe that I also compiled them as 32 bit apps. > libxml2 has no need to be a 64 bit app. It need to run on a 64 bit > computer which is no problem. But libxml2 will never address that much > space and the O/S is decent in Windows at handling that (and great in AIX > and pretty good in Linux and OS/X and HP/UX). Windows simply puts 32 bit > apps at the bottom of memory and 64 bit at the top ... which is fine if you > don't run out of space. I heard that later Windows does it more like > Unix. AIX actually intercepts addresses on 32 bit apps and makes a > translation -- so I may or may not be using 64 bit addresses and it does > not matter, it is always translated to a 32 bit. Again, I don't need to > access a pointer more than 32 bits (excessive in all but the largest of > systems). And horsepower on these boxes is so high that the translation is > not hurting. I have sites that literally exchange 1.5 million - 2.0 > million XML files per day (most in 10 hours) on a weenie little IBM using > libxml2 to parse and create XML. > > My point on this is you may be able to fix by making it a 32 bit > application. > > E > > On 11/10/2015 2:04 PM, Bruce Dawson wrote: > > Resending now that I've joined the mailing list... > > While building 64-bit Chromium with VC++ 2015 Update 1 I noticed a > significant number of pointer truncation warnings in libxml, especially in > xpath.c. A typical warning is: > > warning C4311: 'type cast': pointer truncation from 'xmlChar *' to 'long' > > which triggers on the last two lines of this block: > > case XML_ELEMENT_NODE: > if (node2->type == XML_ELEMENT_NODE) { > if ((0 > (long) node1->content) && /* TODO: Would a != 0 suffice here? */ > (0 > (long) node2->content) && > > The intent is not entirely clear but if these are supposed to be NULL > checks then they could easily give the wrong result. In the VC++ world > 'long' is always a 32-bit type, so converting a pointer to a long both > throws away the top 32 bits and also makes it a signed type. That means > that pointers to the first 2 GB of address space will behave as expected, > and pointers beyond that will have a 50% chance of behaving incorrectly! > > Another example is these two conversions. The code is apparently trying to > sort by content address, but on 64-bit Windows builds it will just be > sorting by the bottom 32 bits of the content address. > l1 = -((long) node1->content); > l2 = -((long) node2->content); > if (l1 < l2) >return(1); > if (l1 > l2) >return(-1); > } > > This is not a problem with gcc or clang because their 64-bit builds have > 64-bit long, but the C/C++ standard do not mandate and that is not the case > with VC++. > > I think that the correct type would be ptrdiff_t, or size_t, or intptr_t, > or uintptr_t. I've attached the full set of warnings in case that is of any > assistance. Even though some of these warnings do not indicate actual bugs > it would still be best to use the correct type in order to avoid confusion > and warnings. > > -- > Bruce Dawson > > > _____
[xml] Possible 64-bit issues in xml 2.9.2
Resending now that I've joined the mailing list... While building 64-bit Chromium with VC++ 2015 Update 1 I noticed a significant number of pointer truncation warnings in libxml, especially in xpath.c. A typical warning is: warning C4311: 'type cast': pointer truncation from 'xmlChar *' to 'long' which triggers on the last two lines of this block: case XML_ELEMENT_NODE: if (node2->type == XML_ELEMENT_NODE) { if ((0 > (long) node1->content) && /* TODO: Would a != 0 suffice here? */ (0 > (long) node2->content) && The intent is not entirely clear but if these are supposed to be NULL checks then they could easily give the wrong result. In the VC++ world 'long' is always a 32-bit type, so converting a pointer to a long both throws away the top 32 bits and also makes it a signed type. That means that pointers to the first 2 GB of address space will behave as expected, and pointers beyond that will have a 50% chance of behaving incorrectly! Another example is these two conversions. The code is apparently trying to sort by content address, but on 64-bit Windows builds it will just be sorting by the bottom 32 bits of the content address. l1 = -((long) node1->content); l2 = -((long) node2->content); if (l1 < l2) return(1); if (l1 > l2) return(-1); } This is not a problem with gcc or clang because their 64-bit builds have 64-bit long, but the C/C++ standard do not mandate and that is not the case with VC++. I think that the correct type would be ptrdiff_t, or size_t, or intptr_t, or uintptr_t. I've attached the full set of warnings in case that is of any assistance. Even though some of these warnings do not indicate actual bugs it would still be best to use the correct type in order to avoid confusion and warnings. -- Bruce Dawson third_party\libxml\src\parser.c(11675): warning C4311: 'type cast': pointer truncation from 'void *' to 'long' third_party\libxml\src\parser.c(1318): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\parser.c(1337): warning C4311: 'type cast': pointer truncation from 'void *' to 'long' third_party\libxml\src\parser.c(1849): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\parser.c(9190): warning C4311: 'type cast': pointer truncation from 'void *' to 'long' third_party\libxml\src\relaxng.c(4401): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\relaxng.c(4409): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\relaxng.c(4413): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\relaxng.c(4420): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\relaxng.c(4424): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\relaxng.c(9384): warning C4311: 'type cast': pointer truncation from 'void *' to 'long' third_party\libxml\src\sax2.c(1911): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\tree.c(4579): warning C4311: 'type cast': pointer truncation from 'void *const ' to 'long' third_party\libxml\src\xmlio.c(3006): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\xmlio.c(3112): warning C4312: 'type cast': conversion from 'long' to 'void *' of greater size third_party\libxml\src\xmlio.c(829): warning C4311: 'type cast': pointer truncation from 'void *' to 'long' third_party\libxml\src\xmlio.c(850): warning C4311: 'type cast': pointer truncation from 'void *' to 'long' third_party\libxml\src\xmlio.c(868): warning C4311: 'type cast': pointer truncation from 'void *' to 'long' third_party\libxml\src\xmlmemory.c(468): warning C4311: 'type cast': pointer truncation from 'void *' to 'unsigned long' third_party\libxml\src\xpath.c(170): warning C4311: 'type cast': pointer truncation from 'xmlChar *' to 'long' third_party\libxml\src\xpath.c(171): warning C4311: 'type cast': pointer truncation from 'xmlChar *' to 'long' third_party\libxml\src\xpath.c(174): warning C4311: 'type cast