Re: qsort_r again
In article <20160827201952.gb6...@netbsd.org>, David Hollandwrote: >(Cc: tech-kern because of kheapsort()) > >Some time back I made a set of patches for qsort_r(), that is, a >version of qsort that allows passing a data pointer through, along >with corresponding mergesort_r() and heapsort_r(). > >After some discussion the conclusion was that instead of using thunks >to implement qsort in terms of qsort_r, it was better to implement >both versions in terms of a common internal version. The following >patchset does that. > >heapsort() is used in two places in the kernel as kheapsort(), which >takes an extra argument so that the heapsort code itself doesn't have >to know how to malloc in the kernel. I have done the following: > - add kheapsort_r() > - change the signature of plain kheapsort() to move the extra > argument to a place it is less likely to cause confusion with the > data argument; > - update the callers for the signature change; >but I have not changed either caller to call kheapsort_r >instead. Based on the code, this should probably be done later; then >the plain kheapsort can be removed. > >Note that this set of names and signatures matches what's in Linux. >Apparently FreeBSD did it differently, but there are reasons to prefer >this arrangement of arguments. > >Joerg also says that the _r names are inappropriate because the >originals are reentrant, but I think that's tilting at windmills and >it's more useful to be compatible. I agree with joerg, but that's water under the bridge... Go for it. christos
Re: qsort_r
On Mon, 09 Dec 2013, Mouse wrote: I actually don't see anything that promises that a pointer to a function type may be converted to a pointer to void, nor back again (except, in each direction, when the original pointer is nil), much less promising anything about the results if it is done. But I haven't read over the whole thing recently enough to be sure there isn't such a promise hiding somewhere. Sorry, I did not express myself clearly enough. C does not promise that function pointers can be converted to or from void* pointers, but I believe that all existing NetBSD implementations do allow such conversions. --apb (Alan Barrett)
Re: qsort_r
Date:Mon, 9 Dec 2013 02:37:30 -0500 (EST) From:Mouse mo...@rodents-montreal.org Message-ID: 201312090737.caa22...@chip.rodents-montreal.org | According to the manpage, their values are relevant: Just in case someday someone ever thinks of a use for them. Very unlikely. But in any case, since the doc explicitly shows init by memset(..,0,..), and that's what everyone does, the implementation would need to keep that working, even if it ever happened that the null pointer was not all 0 bits (and personally, I'm not going to every worry about that happening - it won't.) | In any case, it's hardly the only example of that assumption; it's just | the first one that came to mind. Yes, there are certainly plenty of cases where structs are cleared (by calloc, or explicit memset) and then it's assumed that any pointers in them are NULL. That's the kind of economic imperative that I referred to - any implementation that makes that code fail (however legal it might be according to the rules of C) will simply be ignored in the world - if hardware/software manufacturers (including us who give stuff away for free) want it to be used, we have to make stuff that the users are happy with - and all my working code fails on this system isn't going to cut it. | That's exactly what such a checkout compiler would be for. I have no problem at all with tools that check/validate assumptions. It is the changes (including places where it can be argued that there was no explicit specification previously) from accepted practice over the past (almost) 40 years to the C language spec that I don't like. Had all this been done in 1980, it might have made sense, by 1990, the ship had already sailed, it was already too late for anyone to even consider changing any of the basic (written or unwritten) assumptions about how hardware works. Pretending it is even remotely possible that arithmetic might go back to 1's complement, or anything other than 0 bits will ever be the null pointer, or even that memory won't be byte addressable (for whatever size byte ends up being in use, but it is almost inconceivable anything other than 8 bits will ever be supported) is just insane. Making programmers work harder, just in case their code happens to want to run on such a nonexistent computer is ludicrous. That stuff is all extinct. kre
re: qsort_r
On Mon, 09 Dec 2013, Mouse wrote: I actually don't see anything that promises that a pointer to a function type may be converted to a pointer to void, nor back again (except, in each direction, when the original pointer is nil), much less promising anything about the results if it is done. But I haven't read over the whole thing recently enough to be sure there isn't such a promise hiding somewhere. Sorry, I did not express myself clearly enough. C does not promise that function pointers can be converted to or from void* pointers, but I believe that all existing NetBSD implementations do allow such conversions. if you include the powerpc64 port in the list of existing netbsd implementations, then we do have one.. or at least sort of. i don't recall the details, but they're not just simple pointers you jump to, and the descriptors take up more space than a normal pointer. .mrg.
Re: qsort_r
On Mon, Dec 09, 2013 at 22:20:47 +1100, matthew green wrote: On Mon, 09 Dec 2013, Mouse wrote: I actually don't see anything that promises that a pointer to a function type may be converted to a pointer to void, nor back again (except, in each direction, when the original pointer is nil), much less promising anything about the results if it is done. But I haven't read over the whole thing recently enough to be sure there isn't such a promise hiding somewhere. Sorry, I did not express myself clearly enough. C does not promise that function pointers can be converted to or from void* pointers, but I believe that all existing NetBSD implementations do allow such conversions. if you include the powerpc64 port in the list of existing netbsd implementations, then we do have one.. or at least sort of. i don't recall the details, but they're not just simple pointers you jump to, and the descriptors take up more space than a normal pointer. Itanium has fat function pointers - they point to a struct of code address plus gp register value. -uwe
Re: qsort_r
Sneaking in because of a reason.. | On Mon, 09 Dec 2013, Mouse wrote: | I actually don't see anything that promises that a pointer to a | function type may be converted to a pointer to void, nor back | again (except, in each direction, when the original pointer is | | Sorry, I did not express myself clearly enough. C does not | promise that function pointers can be converted to or from void* | pointers, but I believe that all existing NetBSD implementations Rich Felker corrected my worldview and correctly pointed out that POSIX actually does add an additional constraint in [1]: Note that conversion from a void * pointer to a function pointer as in: fptr = (int (*)(int))dlsym(handle, my_function); is not defined by the ISO C standard. This standard requires this conversion to work correctly on conforming implementations. [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html --steffen
Re: qsort_r
In article 20131209061036.ge2...@apb-laptoy.apb.alt.za, Alan Barrett a...@cequrux.com wrote: On Sun, 08 Dec 2013, David Holland wrote: My irritation with not being able to pass a data pointer through qsort() boiled over just now. Apparently Linux and/or GNU has a qsort_r() that supports this; so, following is a patch that gives us a compatible qsort_r() plus mergesort_r(), and heapsort_r(). Apparently FreeBSD [1] and GNU [2] have incompatible versions of qsort_r, passing the extra 'thunk' or 'data' argument in a different position. [1]: FreeBSD qsort_r http://www.manpagez.com/man/3/qsort_r/ [2]: Linux qsort_r http://man7.org/linux/man-pages/man3/qsort.3.html If we have to pick one, let's pick the FreeBSD version. Actually let's not (fortunately dh@ chose the right one). We should pick the linux one: http://sourceware.org/ml/libc-alpha/2008-12/msg7.html christos
Re: qsort_r
On Mon, Dec 09, 2013 at 03:55:30AM +, David Holland wrote: On Sun, Dec 08, 2013 at 11:26:47PM +, David Laight wrote: I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. I'd prefer to not have another indirect call. The only difference is the definition and expanding a CMP macro differently? Is just casting the function pointers safe in C (well in NetBSD)? (with the calling conventions that Unix effectively requires) No. Well, it is, but it's explicitly illegal C and I don't think we should do it. Actually given that these functions are in libc, their interface is defined by the architecture's function call ABI, not by the C language. Consider what you would do if you wrote an asm wrapper for qsort(a,b) in terms of an asm qsort_r(a,b,d)? For ABI where the the first 3 arguments are passed in registers (eg: amd64, sparc, sparc64) and for ABI where arguments are stacked and cleared by the caller (eg i386) I don't you'd consider doing anything other than putting an extra label on the same code. There might be ABI where the this isn't true - in which case the 'thunk' is an option, but I don't think NetBSD has one. FWIW I think Linux is moving to an alternate ppc64 ABI that doesn't use 'fat pointers'. David -- David Laight: da...@l8s.co.uk
Re: qsort_r
On Sun, Dec 08, 2013 at 11:44:28PM +0100, Joerg Sonnenberger wrote: I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. I'd prefer to not have another indirect call. The only difference is the definition and expanding a CMP macro differently? Yes. But I'd rather not duplicate the code... -- David A. Holland dholl...@netbsd.org
Re: qsort_r
On Sun, Dec 08, 2013 at 11:44:28PM +0100, Joerg Sonnenberger wrote: On Sun, Dec 08, 2013 at 10:29:53PM +, David Holland wrote: I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. I'd prefer to not have another indirect call. The only difference is the definition and expanding a CMP macro differently? Is just casting the function pointers safe in C (well in NetBSD)? (with the calling conventions that Unix effectively requires) Can anything slightly less nasty be done with varags functions? David -- David Laight: da...@l8s.co.uk
Re: qsort_r
In article 20131208222953.gb25...@netbsd.org, David Holland dholland-t...@netbsd.org wrote: (Cc: tech-kern because of kheapsort()) My irritation with not being able to pass a data pointer through qsort() boiled over just now. Apparently Linux and/or GNU has a qsort_r() that supports this; so, following is a patch that gives us a compatible qsort_r() plus mergesort_r(), and heapsort_r(). I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. Note that the thunks use an extra struct to hold the function pointer; this is to satisfy C standards pedantry about void pointers vs. function pointers, and if we decide not to care it could be simplified. This patch was supposed to have all the necessary support widgetry, like namespace.h changes, but there's at least one more thing not in it: MLINKS for the new functions and corresponding setlist changes. If I've forgotten anything else, let me know. heapsort() is used in one place in the kernel as kheapsort(), which takes an extra argument so that the heapsort code itself doesn't have to know how to malloc in the kernel. I have done the following: - add kheapsort_r() - change the signature of plain kheapsort() to move the extra argument to a place it is less likely to cause confusion with the userdata argument; - update the caller for the signature change; but I have not changed the caller to call kheapsort_r instead. Based on the code, this should probably be done later as like many sort calls it's using a global for context. At this point the plain kheapsort can be removed. LGTM. christos
Re: qsort_r
My irritation with not being able to pass a data pointer through qsort() boiled over just now. This sort of thing is one reason I do not consider nested function support optional in any compiler I consider using: they make it easy to curry away function arguments to generate something suitable for passing to (say) qsort, thereby fixing the problem once, for all uses, rather than depending on each called thing being fixed separately. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: qsort_r
On Sun, Dec 08, 2013 at 10:29:53PM +, David Holland wrote: I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. Note that the thunks use an extra struct to hold the function pointer; this is to satisfy C standards pedantry about void pointers vs. function pointers, and if we decide not to care it could be simplified. On most architectures I think just: __weak_alias(heapsort_r,heapsort) __weak_alias(heapsort_r,_heapsort) will work. David -- David Laight: da...@l8s.co.uk
Re: qsort_r
On Sun, Dec 08, 2013 at 06:21:16PM -0500, Mouse wrote: My irritation with not being able to pass a data pointer through qsort() boiled over just now. This sort of thing is one reason I do not consider nested function support optional in any compiler I consider using: they make it easy to curry away function arguments to generate something suitable for passing to (say) qsort, thereby fixing the problem once, for all uses, rather than depending on each called thing being fixed separately. Nested functions the way GCC implements them are broken. For once, they introduce security issues by requiring an executable stack. Anonymous functions with lexical scope can be done properly, lambdas in C++11 are an example. They shouldn't be shoe horned into C this way though. Joerg
Re: qsort_r
On Sun, Dec 08, 2013 at 11:26:47PM +, David Laight wrote: I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. I'd prefer to not have another indirect call. The only difference is the definition and expanding a CMP macro differently? Is just casting the function pointers safe in C (well in NetBSD)? (with the calling conventions that Unix effectively requires) No. Well, it is, but it's explicitly illegal C and I don't think we should do it. Can anything slightly less nasty be done with varags functions? Don't immediately see how... -- David A. Holland dholl...@netbsd.org
Re: qsort_r
Is just casting the function pointers safe in C No. As soon as you call through a pointer to the wrong type you're off in nasal demon territory. (Loosely put; I'd have to look up the exact wording - there is a little wiggle room, but, if I've understood the subject of the discussion correctly, not enough.) (well in NetBSD)? (with the calling conventions that Unix effectively requires) How does Unix (to the extent that there is a single such thing) restrict the compiler-and-machine's calling conventions? If you can find me a description of what NetBSD assumes beyond what C promises, I can have a stab at answering that question. (I know a few of the things, such as there is an exactly-8-bit type, but I feel fairly sure I don't know them all; I'm even now still discovering such things occasionally.) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: qsort_r
[Should this maybe be moved to tech-toolchain?] My irritation with not being able to pass a data pointer through qsort() boiled over just now. This sort of thing is one reason I do not consider nested function support optional in any compiler I consider using: [...] Nested functions the way GCC implements them are broken. I didn't say nested function support the way gcc implements them. Also, I disagree that they are broken. They work; I use them routinely. You may find them inappropriate for your tradeoffs, but that does not make them broken; it just means they're not a right tool for you. For once, they introduce security issues by requiring an executable stack. I think that's overstating the case; they require an executable stack, which in some cases (an important qualification) can introduce security issues. Yes. I don't like stack trampolines. I would much prefer fat function pointers. One of my blue-sky projects has long been to build a compiler that uses fat function pointers to do nested functions. (I haven't looked much at teaching gcc to do fat function pointers; the little I've seen from when I _have_ had my fingers in gcc's internals makes me suspect doing it in gcc would involve major surgery.) Anonymous functions with lexical scope can be done properly, gcc-style nested functions are not anonymous. (Admittedly, this is a quibble.) Yes, they can. For many purposes, gcc's implementation is an example, the above quibble aside. But nothing says it's the only way; even I, as inexpert as I am at compilers, can think of another implementation technique I'm certain could be made to work which avoids the execution out of the stack that bothers you so, another technique which probably could be made to work (though I'm less sure), and there probably are plenty of others I haven't thought of. They shouldn't be shoe horned into C this way though. What's wrong with gcc's method? Aside from implementation issues, that is; I'm trying to see what you think is wrong with the language, not the implementation of it, since you seem to think there _is_ something wrong, at the language level, with gcc's version. Or did I misread? /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: qsort_r
On Sun, Dec 08, 2013 at 11:51:15PM -0500, Mouse wrote: If you can find me a description of what NetBSD assumes beyond what C promises, I can have a stab at answering that question. (I know a few of the things, such as there is an exactly-8-bit type, but I feel fairly sure I don't know them all; I'm even now still discovering such things occasionally.) I don't think we have such a list. In practice though it's defined by what properties all normal architectures share, filtered by which of these properties gcc reliably allows to show through to the source level. This is a moving target; for example, not all that long ago you could rely on signed shifts to behave properly, but nowadays you can't. Historically the C standard was understood to provide certain wiggle room for the architecture and other wiggle room for the compiler, such that people writing MD code could reasonably assume that properties of their architecture would be reflected through by the compiler. The gcc guys have repeatedly demonstrated that they have no use for this model, no interest in preserving such properties, and are willing to break them arbitrarily. In this environment I don't think it's a good idea to make any assumptions at all about things the C standard doesn't guarantee. (Also, as far as 8-bit types, while POSIX promises that char is 8 bits wide, there's still from time to time loose talk about doing a port to a 36-bit machine, where char would presumably be 9 bits. I think doing this and making it work this would be good for the general quality of the code base and I'd rather not introduce unnecessary complications.) The one exception to all this that I can think of is that we don't care about platforms where int is 16 bits wide. Or 18, either. -- David A. Holland dholl...@netbsd.org
Re: qsort_r
If you can find me a description of what NetBSD assumes beyond what C promises, [...] I don't think we have such a list. I didn't think so either, but wasn't nearly sure enough to assume there wasn't one. [...]. In this environment I don't think it's a good idea to make any assumptions at all about things the C standard doesn't guarantee. Then there are a lot of things that need fixing; the one I'm most aware of is the assumption that all-bits-0 is a nil pointer. (This most often shows up in the use of calloc, bzero, etc on space that includes a pointer and then assuming the pointer is nil; see, for example, the example code in 5.2's getaddrinfo(3) manpage.) (Also, as far as 8-bit types, while POSIX promises that char is 8 bits wide, [...] Exactly 8 bits, or at least 8 bits? C promises it's at least 8 bits; that isn't one of the things I had heard that POSIX said anything about. [...] there's still from time to time loose talk about doing a port to a 36-bit machine, where char would presumably be 9 bits. I think doing this and making it work this would be good for the general quality of the code base and I'd rather not introduce unnecessary complications.) I agree. (I've often thought it would be nice to have a checkout compiler that went out of its way to break as many as possible of the usual assumptions that are true everywhere but that C does not actually promise) The one exception to all this that I can think of is that we don't care about platforms where int is 16 bits wide. Or 18, either. Or anything else less than 32, I suspect. I would even hazard a guess that there's code in the tree that assumes int is exactly 32 bits, though code assuming long is also 32 has most probably been eradicated by now. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: qsort_r
On Sun, 08 Dec 2013, David Holland wrote: My irritation with not being able to pass a data pointer through qsort() boiled over just now. Apparently Linux and/or GNU has a qsort_r() that supports this; so, following is a patch that gives us a compatible qsort_r() plus mergesort_r(), and heapsort_r(). Apparently FreeBSD [1] and GNU [2] have incompatible versions of qsort_r, passing the extra 'thunk' or 'data' argument in a different position. [1]: FreeBSD qsort_r http://www.manpagez.com/man/3/qsort_r/ [2]: Linux qsort_r http://man7.org/linux/man-pages/man3/qsort.3.html If we have to pick one, let's pick the FreeBSD version. I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. I'd probably duplicate the code via CPP, to trade time for space, but your way is fine. Note that the thunks use an extra struct to hold the function pointer; this is to satisfy C standards pedantry about void pointers vs. function pointers, and if we decide not to care it could be simplified. That adds more run-time overhead. Could you make it conditional on whether it's really necessary? All existing NetBSD platforms can convert back and forth between void * and function pointers without any trouble. --apb (Alan Barrett)
Re: qsort_r
On Sun, 08 Dec 2013, Mouse wrote: Is just casting the function pointers safe in C No. As soon as you call through a pointer to the wrong type you're off in nasal demon territory. (Loosely put; I'd have to look up the exact wording - there is a little wiggle room, but, if I've understood the subject of the discussion correctly, not enough.) You can't call through a function pointer of the wrong type, but you can cast from one type to another. I think that's enough, provided that void * is large enough to be converted to and from a function pointer. If you can find me a description of what NetBSD assumes beyond what C promises, I can have a stab at answering that question. There is no such list. That's a bug in NetBSD's documentation. --apb (Alan Barrett)
Re: qsort_r
On Mon, Dec 09, 2013 at 12:59:49AM -0500, Mouse wrote: [...]. In this environment I don't think it's a good idea to make any assumptions at all about things the C standard doesn't guarantee. Then there are a lot of things that need fixing; the one I'm most aware of is the assumption that all-bits-0 is a nil pointer. (This most often shows up in the use of calloc, bzero, etc on space that includes a pointer and then assuming the pointer is nil; see, for example, the example code in 5.2's getaddrinfo(3) manpage.) Yes. I think the official stance on that is to worry about it when a real platform appears where it matters. Myself, I don't write such code (either for pointers or floats) -- in most cases it is clearer to initialize explicitly and it's not hard for the compiler to turn a bunch of explicit initializations into a bzero call if it thinks that's better. The one family of cases where this isn't so clear is kernel things where in some cases you get zeroed pages that you know are zeroed and it's stupid and slow to write more zeros over them unnecessarily. (Also, as far as 8-bit types, while POSIX promises that char is 8 bits wide, [...] Exactly 8 bits, or at least 8 bits? C promises it's at least 8 bits; that isn't one of the things I had heard that POSIX said anything about. POSIX says that CHAR_BIT is 8. [...] there's still from time to time loose talk about doing a port to a 36-bit machine, where char would presumably be 9 bits. I think doing this and making it work this would be good for the general quality of the code base and I'd rather not introduce unnecessary complications.) I agree. (I've often thought it would be nice to have a checkout compiler that went out of its way to break as many as possible of the usual assumptions that are true everywhere but that C does not actually promise) Many people think that, but nobody's really done it, as it's a bit of a large project. Go for it :-) The one exception to all this that I can think of is that we don't care about platforms where int is 16 bits wide. Or 18, either. Or anything else less than 32, I suspect. I would even hazard a guess that there's code in the tree that assumes int is exactly 32 bits, Yeah probably. It's a big tree and there aren't really adequate checking tools for this. though code assuming long is also 32 has most probably been eradicated by now. Most of that's even gone from pkgsrc at this point. I've been hunting it down. -- David A. Holland dholl...@netbsd.org
Re: qsort_r
Is just casting the function pointers safe in C No. As soon as you call through a pointer to the wrong type you're off in nasal demon territory. You can't call through a function pointer of the wrong type, but you can cast from one type to another. True. I wrote that with more context than the above quote gives. I was responding to something I might flesh out as Is just casting the function pointers enough to make [the thing under discussion] safe in C?. I think that's enough, Maybe. It's possible I misunderstood something; I didn't go through the whole patch in detail. But C99 does promise that [#8] A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined. (6.3.2.3 in the version I have). provided that void * is large enough to be converted to and from a function pointer. I actually don't see anything that promises that a pointer to a function type may be converted to a pointer to void, nor back again (except, in each direction, when the original pointer is nil), much less promising anything about the results if it is done. But I haven't read over the whole thing recently enough to be sure there isn't such a promise hiding somewhere. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: qsort_r
[...] I don't think it's a good idea to make any assumptions at all about things the C standard doesn't guarantee. Then there are a lot of things that need fixing; the one I'm most aware of is the assumption that all-bits-0 is a nil pointer. Yes. I think the official stance on that is to worry about it when a real platform appears where it matters. So, rather than start fixing it now, at leisure, a big scramble when it actually becomes important is preferable? Well, I suppose it's their collective neck - or, rather, their project's. The one family of cases where this isn't so clear is kernel things where in some cases you get zeroed pages that you know are zeroed and it's stupid and slow to write more zeros over them unnecessarily. Also possible in userland, though less commonly. (I've often thought it would be nice to have a checkout compiler that went out of its way to break as many as possible of the usual assumptions that are true everywhere but that C does not actually promise) Many people think that, but nobody's really done it, as it's a bit of a large project. Go for it :-) Heh. Yeah, perhaps some one of these years I'll find the spare time to start seriously hacking compilers. I would even hazard a guess that there's code in the tree that assumes int is exactly 32 bits, Yeah probably. It's a big tree and there aren't really adequate checking tools for this. Checkout compiler! NetBSD/pdp10! :-) /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: qsort_r
On Sun, Dec 08, 2013 at 10:50:15PM +, David Holland wrote: I have done it by having the original, non-_r functions provide a thunk for the comparison function, as this is least invasive. If we think this is too expensive, an alternative is generating a union of function pointers and making tests at the call sites; another option is to duplicate the code (hopefully with cpp rather than CP) but that seems like a bad plan. I'd prefer to not have another indirect call. The only difference is the definition and expanding a CMP macro differently? Yes. But I'd rather not duplicate the code... So, to ground this a bit, the realistic options are: starting from this (various details elided): void sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *)) { ... r = cmp(a, b); ... } (1) thunks [what's in my current patch] void sort_r(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *, void *), void *data) { ... r = cmp(a, b, data); ... } struct sort { int (*cmp)(const void *, const void *); }; static int thunk(const void *a, const void *b, void *data) { return ((struct sort *)data)-cmp(a, b) } void sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *)) { struct sort x = { .cmp = cmp }; sort_r(p, n, sz, thunk, x); } pros: + already written + noninvasive + reasonably tidy + does not duplicate the output code cons: - extra indirect calls (2) union of function pointers union sort { int (*cmp)(const void *, const void *); int (*cmp_r)(const void *, const void *, void *); }; static void sort_internal(void *p, size_t n, size_t sz, bool is_r, union sort cmpu, void *data) { ... r = is_r ? cmpu.cmp(a, b) : cmpu.cmp_r(a, b, data); ... } void sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *)) { union sort cmpu = { .cmp = cmp }; sort_internal(p, n, sz, true, cmpu, NULL); } void sort_r(void *p, size_t n, size_t sz, int (*cmp_r)(const void *, const void *, void *), void *data) { union sort cmpu = { .cmp_r = cmp }; sort_internal(p, n, sz, true, cmpu, data); } pros: + not excessively invasive + no indirect calls + does not duplicate the output code + most likely to have unnecessary logic removed by the compiler cons: + a bit ugly (3) cloning the code with cpp --- in sort.c --- #ifdef MAKE_ME_REENTRANT #define SORTsort_r #define EXTRA , void *data #define CMP(a, b) cmp(a, b, data) #else #define SORTsort #define EXTRA #define CMP(a, b) cmp(a, b) #endif void SORT(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void * EXTRA) EXTRA) { ... r = CMP(a, b); ... } --- in sort_r.c --- #define MAKE_ME_REENTRANT #include sort.c pros: + noninvasive + no runtime overhead cons: + ugly + duplicates the output code Of these I think if we're concerned about the overhead of the thunks in #1, #2 is the best bet. While with normal calling conventions sort can end up being identical object code to sort_r, as various people have noted, I think this kind of optimization is best left to the compiler. With #2, the compiler has a fighting chance of being able to do this. (I haven't checked; I'd be moderately, but not greatly, surprised if any of the compilers we have do.) -- David A. Holland dholl...@netbsd.org
Re: qsort_r
Date:Mon, 9 Dec 2013 00:59:49 -0500 (EST) From:Mouse mo...@rodents-montreal.org Message-ID: 201312090559.aaa21...@chip.rodents-montreal.org | see, for example, the | example code in 5.2's getaddrinfo(3) manpage.) I assume you mean memset(hints, 0, sizeof(hints)); That one is perfectly fine, in any C, the values of the pointer fields of hints are irrelevant, they're never used (we may wonder at the interface design that used a struct addrinfo for this purpose, but that's beyond fixing now). Only the int fields of the hints struct have any defined use - the others must be zero or the null pointer. (It doesn't say that the pointer fields must be the null pointer, and other fields must be zero, though that would be a sane implementation, just that the other fields must each be 0, or be the null pointer, the memset() makes them all be 0 - which conforms with the requirements.) | I agree. (I've often thought it would be nice to have a checkout | compiler that went out of its way to break as many as possible of the | usual assumptions that are true everywhere but that C does not | actually promise) Of academic use only - in the real world economics (that existing code, however badly written, that works today, must also work on future systems, or no-one will buy them) puts all kinds of limitations on what hardware designers are able to do, other than for very special purpose uses, which NetBSD is never going to care about. What the C standards people, and then the gcc implementors, are doing is apalling - changing (or even just tightening) the spec to allow for processor designs that once existed, but will never be seen again, or that allow optimisations to make code run raster, when what needs optimisation most is programmer effort, if we have programmers fighting against arcane language rules that compiler implementors have used to make some code segment run a few seconds faster, rather than doing the obvious intended operation, then everyone suffers, and but the rules say we can do that, and you shouldn't have assumed that we wouldn't are no consolation at all. I'm all for additions/corrections/... that aid writing more correct, and more portable code - that's always a good thing. But changes that force the writing of any of that, rather than simply promote it, are evil. C used to be a great language, precisely because it didn't get in the way, and make programmers much more productive. These days, much of that has gone, and it is becoming just another obstruction to getting work actually done. kre
Re: qsort_r
On Mon, Dec 09, 2013 at 01:49:29AM -0500, Mouse wrote: Then there are a lot of things that need fixing; the one I'm most aware of is the assumption that all-bits-0 is a nil pointer. Yes. I think the official stance on that is to worry about it when a real platform appears where it matters. So, rather than start fixing it now, at leisure, a big scramble when it actually becomes important is preferable? Well, I suppose it's their collective neck - or, rather, their project's. Well... what is a plausible scenario where this becomes important? In what situation is someone going to try to change this convention, or more accurately, in what situation is someone going to try to change this convention in a way that makes it time-critical to fix problems that arise, such that necks become involved? The only cases I can think of are deliberately working this issue out with suitable tools, and porting to some vintage machine where you can't for some reason use 0 as the null pointer representation. Neither is exactly an urgent situation. No new machine will ever have this property because there's too much existing code that would break and there's no plausible reason to bother. Similarly, for all-bits-0 floats, no new machine is going to use anything but IEEE floats. The one family of cases where this isn't so clear is kernel things where in some cases you get zeroed pages that you know are zeroed and it's stupid and slow to write more zeros over them unnecessarily. Also possible in userland, though less commonly. Vanishingly rarely; I suppose calloc can avoid rezeroing when it gets fresh pages from the kernel, but does any calloc actually *do* that rather than just calling malloc and bzero? (I've often thought it would be nice to have a checkout compiler that went out of its way to break as many as possible of the usual assumptions that are true everywhere but that C does not actually promise) Many people think that, but nobody's really done it, as it's a bit of a large project. Go for it :-) Heh. Yeah, perhaps some one of these years I'll find the spare time to start seriously hacking compilers. If you keep worrying about how porky gcc is, sooner or later you'll get irritated enough to write your own... I would even hazard a guess that there's code in the tree that assumes int is exactly 32 bits, Yeah probably. It's a big tree and there aren't really adequate checking tools for this. Checkout compiler! NetBSD/pdp10! :-) One of the reasons I proposed/started on mint was to enable checks for stuff like this. But I haven't really put any time into it yet... -- David A. Holland dholl...@netbsd.org
Re: qsort_r
On Mon, Dec 09, 2013 at 06:10:36AM +, Alan Barrett wrote: My irritation with not being able to pass a data pointer through qsort() boiled over just now. Apparently Linux and/or GNU has a qsort_r() that supports this; so, following is a patch that gives us a compatible qsort_r() plus mergesort_r(), and heapsort_r(). Apparently FreeBSD [1] and GNU [2] have incompatible versions of qsort_r, passing the extra 'thunk' or 'data' argument in a different position. [1]: FreeBSD qsort_r http://www.manpagez.com/man/3/qsort_r/ [2]: Linux qsort_r http://man7.org/linux/man-pages/man3/qsort.3.html If we have to pick one, let's pick the FreeBSD version. Given that it's just left vs. right side, the only technical reason to favor one over the other is that the right-side one (the Linux one) allows sharing the same object code for both versions, at least on common platforms, as discussed elsewhere in this thread. Also, all else being equal the Linux one de facto has greater market penetration. So I think picking the Linux one is a marginally better choice. Note that the thunks use an extra struct to hold the function pointer; this is to satisfy C standards pedantry about void pointers vs. function pointers, and if we decide not to care it could be simplified. That adds more run-time overhead. Could you make it conditional on whether it's really necessary? All existing NetBSD platforms can convert back and forth between void * and function pointers without any trouble. I think switching approaches (as in the other post I just made) is probably a better idea, because it also gets rid of this issue. -- David A. Holland dholl...@netbsd.org
Re: qsort_r
corrections from Mouse: On Mon, Dec 09, 2013 at 07:02:39AM +, David Holland wrote: (2) union of function pointers r = is_r ? cmpu.cmp(a, b) : cmpu.cmp_r(a, b, data); this test is backwards. void sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *)) { union sort cmpu = { .cmp = cmp }; sort_internal(p, n, sz, true, cmpu, NULL); and that should be false. sigh. pseudocode... -- David A. Holland dholl...@netbsd.org
Re: qsort_r
see, for example, the example code in 5.2's getaddrinfo(3) manpage.) I assume you mean memset(hints, 0, sizeof(hints)); Yes. That one is perfectly fine, in any C, the values of the pointer fields of hints are irrelevant, According to the manpage, their values are relevant: All other elements of the addrinfo structure passed via hints must be zero or the null pointer. (It doesn't say that the pointer fields must be the null pointer, and other fields must be zero, though that would be a sane implementation, just that the other fields must each be 0, or be the null pointer, the memset() makes them all be 0 - which conforms with the requirements.) I would disagree that all zero bits is a reasonable choice for what it means for a pointer to be zero, unless all zero bits happens to be the implementation's choice for a nil pointer. It's a C interface; I have trouble interpreting be zero as anything but hold a value which compares equal to zero, which for pointers means a nil pointer. In any case, it's hardly the only example of that assumption; it's just the first one that came to mind. Of academic use only - [...] I'm all for additions/corrections/... that aid writing more correct, and more portable code - that's always a good thing. That's exactly what such a checkout compiler would be for. Indeed, to be really useful it would need to have some kind of controls for exactly which assumptions it breaks, so that, for example, you could tell it to make nil pointers all-0-bits but break the assumption that int has no padding bits in it. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: qsort_r
No new machine will ever have this property because there's too much existing code that would break and there's no plausible reason to bother. Similarly, for all-bits-0 floats, no new machine is going to use anything but IEEE floats. If you really believe those, there's no point in carrying this part of the thread any farther. I find each of them ludicrously implausible; ever is way too long a time, and nothing will use anything but IEEE floats is about as certain as everything is little-endian. Indeed, I suspect C is piling up trouble for itself by mandating binary (when (not if) a non-binary computer arises, it makes C that much more likely to be abandoned than adapted). The only question in my mind is how far off these changes are. I'm reasonably sure none of them will hit the mass market within the next decade or so, but that's about all I'm confident of. If you keep worrying about how porky gcc is, sooner or later you'll get irritated enough to write your own... True. That it will ever happen, though, depends on a bunch of assumptions which are looking less and less plausible these years. /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B