Re: qsort_r again

2016-08-28 Thread Christos Zoulas
In article <20160827201952.gb6...@netbsd.org>,
David Holland   wrote:
>(Cc: tech-kern because of kheapsort())
>
>Some time back I made a set of patches for qsort_r(), that is, a
>version of qsort that allows passing a data pointer through, along
>with corresponding mergesort_r() and heapsort_r().
>
>After some discussion the conclusion was that instead of using thunks
>to implement qsort in terms of qsort_r, it was better to implement
>both versions in terms of a common internal version. The following
>patchset does that.
>
>heapsort() is used in two places in the kernel as kheapsort(), which
>takes an extra argument so that the heapsort code itself doesn't have
>to know how to malloc in the kernel. I have done the following:
>   - add kheapsort_r()
>   - change the signature of plain kheapsort() to move the extra
> argument to a place it is less likely to cause confusion with the
> data argument;
>   - update the callers for the signature change;
>but I have not changed either caller to call kheapsort_r
>instead. Based on the code, this should probably be done later; then
>the plain kheapsort can be removed.
>
>Note that this set of names and signatures matches what's in Linux.
>Apparently FreeBSD did it differently, but there are reasons to prefer
>this arrangement of arguments.
>
>Joerg also says that the _r names are inappropriate because the
>originals are reentrant, but I think that's tilting at windmills and
>it's more useful to be compatible.

I agree with joerg, but that's water under the bridge...
Go for it.

christos



Re: qsort_r

2013-12-09 Thread Alan Barrett

On Mon, 09 Dec 2013, Mouse wrote:
I actually don't see anything that promises that a pointer to a 
function type may be converted to a pointer to void, nor back 
again (except, in each direction, when the original pointer is 
nil), much less promising anything about the results if it is 
done.  But I haven't read over the whole thing recently enough 
to be sure there isn't such a promise hiding somewhere.


Sorry, I did not express myself clearly enough.  C does not 
promise that function pointers can be converted to or from void* 
pointers, but I believe that all existing NetBSD implementations 
do allow such conversions.


--apb (Alan Barrett)



Re: qsort_r

2013-12-09 Thread Robert Elz
Date:Mon, 9 Dec 2013 02:37:30 -0500 (EST)
From:Mouse mo...@rodents-montreal.org
Message-ID:  201312090737.caa22...@chip.rodents-montreal.org

  | According to the manpage, their values are relevant:

Just in case someday someone ever thinks of a use for them.
Very unlikely.   But in any case, since the doc explicitly shows
init by memset(..,0,..), and that's what everyone does, the
implementation would need to keep that working, even if it ever
happened that the null pointer was not all 0 bits (and personally,
I'm not going to every worry about that happening - it won't.)

  | In any case, it's hardly the only example of that assumption; it's just
  | the first one that came to mind.

Yes, there are certainly plenty of cases where structs are cleared
(by calloc, or explicit memset) and then it's assumed that any pointers
in them are NULL.

That's the kind of economic imperative that I referred to - any implementation
that makes that code fail (however legal it might be according to the
rules of C) will simply be ignored in the world - if hardware/software
manufacturers (including us who give stuff away for free) want it to be
used, we have to make stuff that the users are happy with - and all my
working code fails on this system isn't going to cut it.

  | That's exactly what such a checkout compiler would be for.

I have no problem at all with tools that check/validate assumptions.

It is the changes (including places where it can be argued that there
was no explicit specification previously) from accepted practice over the
past (almost) 40 years to the C language spec that I don't like.

Had all this been done in 1980, it might have made sense, by 1990, the
ship had already sailed, it was already too late for anyone to even consider
changing any of the basic (written or unwritten) assumptions about how
hardware works.  Pretending it is even remotely possible that arithmetic
might go back to 1's complement, or anything other than 0 bits will ever
be the null pointer, or even that memory won't be byte addressable (for
whatever size byte ends up being in use, but it is almost inconceivable
anything other than 8 bits will ever be supported) is just insane.  Making
programmers work harder, just in case their code happens to want to run
on such a nonexistent computer is ludicrous.  That stuff is all extinct.

kre



re: qsort_r

2013-12-09 Thread matthew green

 On Mon, 09 Dec 2013, Mouse wrote:
  I actually don't see anything that promises that a pointer to a 
  function type may be converted to a pointer to void, nor back 
  again (except, in each direction, when the original pointer is 
  nil), much less promising anything about the results if it is 
  done.  But I haven't read over the whole thing recently enough 
  to be sure there isn't such a promise hiding somewhere.
 
 Sorry, I did not express myself clearly enough.  C does not 
 promise that function pointers can be converted to or from void* 
 pointers, but I believe that all existing NetBSD implementations 
 do allow such conversions.

if you include the powerpc64 port in the list of existing
netbsd implementations, then we do have one.. or at least
sort of.  i don't recall the details, but they're not just
simple pointers you jump to, and the descriptors take up
more space than a normal pointer.


.mrg.


Re: qsort_r

2013-12-09 Thread Valery Ushakov
On Mon, Dec 09, 2013 at 22:20:47 +1100, matthew green wrote:

  On Mon, 09 Dec 2013, Mouse wrote:
   I actually don't see anything that promises that a pointer to a 
   function type may be converted to a pointer to void, nor back 
   again (except, in each direction, when the original pointer is 
   nil), much less promising anything about the results if it is 
   done.  But I haven't read over the whole thing recently enough 
   to be sure there isn't such a promise hiding somewhere.
  
  Sorry, I did not express myself clearly enough.  C does not 
  promise that function pointers can be converted to or from void* 
  pointers, but I believe that all existing NetBSD implementations 
  do allow such conversions.
 
 if you include the powerpc64 port in the list of existing
 netbsd implementations, then we do have one.. or at least
 sort of.  i don't recall the details, but they're not just
 simple pointers you jump to, and the descriptors take up
 more space than a normal pointer.

Itanium has fat function pointers - they point to a struct of code
address plus gp register value.

-uwe


Re: qsort_r

2013-12-09 Thread Daode
Sneaking in because of a reason..

 | On Mon, 09 Dec 2013, Mouse wrote:
 | I actually don't see anything that promises that a pointer to a 
 | function type may be converted to a pointer to void, nor back 
 | again (except, in each direction, when the original pointer is 
 | 
 | Sorry, I did not express myself clearly enough.  C does not 
 | promise that function pointers can be converted to or from void* 
 | pointers, but I believe that all existing NetBSD implementations 

Rich Felker corrected my worldview and correctly pointed out that
POSIX actually does add an additional constraint in [1]:

  Note that conversion from a void * pointer to a function pointer
  as in:

  fptr = (int (*)(int))dlsym(handle, my_function);

  is not defined by the ISO C standard. This standard requires this
  conversion to work correctly on conforming implementations.

  [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/dlsym.html

--steffen


Re: qsort_r

2013-12-09 Thread Christos Zoulas
In article 20131209061036.ge2...@apb-laptoy.apb.alt.za,
Alan Barrett  a...@cequrux.com wrote:
On Sun, 08 Dec 2013, David Holland wrote:
 My irritation with not being able to pass a data pointer through 
 qsort() boiled over just now. Apparently Linux and/or GNU 
 has a qsort_r() that supports this; so, following is a patch 
 that gives us a compatible qsort_r() plus mergesort_r(), and 
 heapsort_r().

Apparently FreeBSD [1] and GNU [2] have incompatible versions 
of qsort_r, passing the extra 'thunk' or 'data' argument in a 
different position.

[1]: FreeBSD qsort_r http://www.manpagez.com/man/3/qsort_r/
[2]: Linux qsort_r  http://man7.org/linux/man-pages/man3/qsort.3.html

If we have to pick one, let's pick the FreeBSD version.

Actually let's not (fortunately dh@ chose the right one).
We should pick the linux one:

http://sourceware.org/ml/libc-alpha/2008-12/msg7.html

christos



Re: qsort_r

2013-12-09 Thread David Laight
On Mon, Dec 09, 2013 at 03:55:30AM +, David Holland wrote:
 On Sun, Dec 08, 2013 at 11:26:47PM +, David Laight wrote:
 I have done it by having the original, non-_r functions provide a
 thunk for the comparison function, as this is least invasive. If we
 think this is too expensive, an alternative is generating a union of
 function pointers and making tests at the call sites; another option
 is to duplicate the code (hopefully with cpp rather than CP) but that
 seems like a bad plan.

I'd prefer to not have another indirect call. The only difference
is the definition and expanding a CMP macro differently?
   
   Is just casting the function pointers safe in C (well in NetBSD)?
   (with the calling conventions that Unix effectively requires)
 
 No. Well, it is, but it's explicitly illegal C and I don't think we
 should do it.

Actually given that these functions are in libc, their interface
is defined by the architecture's function call ABI, not by the C language.

Consider what you would do if you wrote an asm wrapper for qsort(a,b)
in terms of an asm qsort_r(a,b,d)?

For ABI where the the first 3 arguments are passed in registers
(eg: amd64, sparc, sparc64) and for ABI where arguments are stacked
and cleared by the caller (eg i386) I don't you'd consider doing anything
other than putting an extra label on the same code.

There might be ABI where the this isn't true - in which case the
'thunk' is an option, but I don't think NetBSD has one.

FWIW I think Linux is moving to an alternate ppc64 ABI that doesn't
use 'fat pointers'.

David

-- 
David Laight: da...@l8s.co.uk


Re: qsort_r

2013-12-08 Thread David Holland
On Sun, Dec 08, 2013 at 11:44:28PM +0100, Joerg Sonnenberger wrote:
   I have done it by having the original, non-_r functions provide a
   thunk for the comparison function, as this is least invasive. If we
   think this is too expensive, an alternative is generating a union of
   function pointers and making tests at the call sites; another option
   is to duplicate the code (hopefully with cpp rather than CP) but that
   seems like a bad plan.
  
  I'd prefer to not have another indirect call. The only difference
  is the definition and expanding a CMP macro differently?

Yes. But I'd rather not duplicate the code...

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread David Laight
On Sun, Dec 08, 2013 at 11:44:28PM +0100, Joerg Sonnenberger wrote:
 On Sun, Dec 08, 2013 at 10:29:53PM +, David Holland wrote:
  I have done it by having the original, non-_r functions provide a
  thunk for the comparison function, as this is least invasive. If we
  think this is too expensive, an alternative is generating a union of
  function pointers and making tests at the call sites; another option
  is to duplicate the code (hopefully with cpp rather than CP) but that
  seems like a bad plan.
 
 I'd prefer to not have another indirect call. The only difference
 is the definition and expanding a CMP macro differently?

Is just casting the function pointers safe in C (well in NetBSD)?
(with the calling conventions that Unix effectively requires)

Can anything slightly less nasty be done with varags functions?

David

-- 
David Laight: da...@l8s.co.uk


Re: qsort_r

2013-12-08 Thread Christos Zoulas
In article 20131208222953.gb25...@netbsd.org,
David Holland  dholland-t...@netbsd.org wrote:
(Cc: tech-kern because of kheapsort())

My irritation with not being able to pass a data pointer through
qsort() boiled over just now. Apparently Linux and/or GNU has a
qsort_r() that supports this; so, following is a patch that gives us
a compatible qsort_r() plus mergesort_r(), and heapsort_r().

I have done it by having the original, non-_r functions provide a
thunk for the comparison function, as this is least invasive. If we
think this is too expensive, an alternative is generating a union of
function pointers and making tests at the call sites; another option
is to duplicate the code (hopefully with cpp rather than CP) but that
seems like a bad plan. Note that the thunks use an extra struct to
hold the function pointer; this is to satisfy C standards pedantry
about void pointers vs. function pointers, and if we decide not to
care it could be simplified.

This patch was supposed to have all the necessary support widgetry,
like namespace.h changes, but there's at least one more thing not in
it: MLINKS for the new functions and corresponding setlist
changes. If I've forgotten anything else, let me know.

heapsort() is used in one place in the kernel as kheapsort(), which
takes an extra argument so that the heapsort code itself doesn't have
to know how to malloc in the kernel. I have done the following:
   - add kheapsort_r()
   - change the signature of plain kheapsort() to move the extra
 argument to a place it is less likely to cause confusion with the
 userdata argument;
   - update the caller for the signature change;
but I have not changed the caller to call kheapsort_r instead. Based
on the code, this should probably be done later as like many sort
calls it's using a global for context. At this point the plain
kheapsort can be removed.


LGTM.

christos



Re: qsort_r

2013-12-08 Thread Mouse
 My irritation with not being able to pass a data pointer through
 qsort() boiled over just now.

This sort of thing is one reason I do not consider nested function
support optional in any compiler I consider using: they make it easy to
curry away function arguments to generate something suitable for
passing to (say) qsort, thereby fixing the problem once, for all uses,
rather than depending on each called thing being fixed separately.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: qsort_r

2013-12-08 Thread David Laight
On Sun, Dec 08, 2013 at 10:29:53PM +, David Holland wrote:
 
 I have done it by having the original, non-_r functions provide a
 thunk for the comparison function, as this is least invasive. If we
 think this is too expensive, an alternative is generating a union of
 function pointers and making tests at the call sites; another option
 is to duplicate the code (hopefully with cpp rather than CP) but that
 seems like a bad plan. Note that the thunks use an extra struct to
 hold the function pointer; this is to satisfy C standards pedantry
 about void pointers vs. function pointers, and if we decide not to
 care it could be simplified.

On most architectures I think just:
__weak_alias(heapsort_r,heapsort)   
__weak_alias(heapsort_r,_heapsort)   
will work.

David

-- 
David Laight: da...@l8s.co.uk


Re: qsort_r

2013-12-08 Thread Joerg Sonnenberger
On Sun, Dec 08, 2013 at 06:21:16PM -0500, Mouse wrote:
  My irritation with not being able to pass a data pointer through
  qsort() boiled over just now.
 
 This sort of thing is one reason I do not consider nested function
 support optional in any compiler I consider using: they make it easy to
 curry away function arguments to generate something suitable for
 passing to (say) qsort, thereby fixing the problem once, for all uses,
 rather than depending on each called thing being fixed separately.

Nested functions the way GCC implements them are broken. For once, they
introduce security issues by requiring an executable stack. Anonymous
functions with lexical scope can be done properly, lambdas in C++11 are
an example. They shouldn't be shoe horned into C this way though.

Joerg


Re: qsort_r

2013-12-08 Thread David Holland
On Sun, Dec 08, 2013 at 11:26:47PM +, David Laight wrote:
I have done it by having the original, non-_r functions provide a
thunk for the comparison function, as this is least invasive. If we
think this is too expensive, an alternative is generating a union of
function pointers and making tests at the call sites; another option
is to duplicate the code (hopefully with cpp rather than CP) but that
seems like a bad plan.
   
   I'd prefer to not have another indirect call. The only difference
   is the definition and expanding a CMP macro differently?
  
  Is just casting the function pointers safe in C (well in NetBSD)?
  (with the calling conventions that Unix effectively requires)

No. Well, it is, but it's explicitly illegal C and I don't think we
should do it.

  Can anything slightly less nasty be done with varags functions?

Don't immediately see how...

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread Mouse
 Is just casting the function pointers safe in C

No.  As soon as you call through a pointer to the wrong type you're off
in nasal demon territory.  (Loosely put; I'd have to look up the exact
wording - there is a little wiggle room, but, if I've understood the
subject of the discussion correctly, not enough.)

 (well in NetBSD)?  (with the calling conventions that Unix
 effectively requires)

How does Unix (to the extent that there is a single such thing)
restrict the compiler-and-machine's calling conventions?

If you can find me a description of what NetBSD assumes beyond what C
promises, I can have a stab at answering that question.  (I know a few
of the things, such as there is an exactly-8-bit type, but I feel
fairly sure I don't know them all; I'm even now still discovering such
things occasionally.)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: qsort_r

2013-12-08 Thread Mouse
[Should this maybe be moved to tech-toolchain?]

 My irritation with not being able to pass a data pointer through
 qsort() boiled over just now.
 This sort of thing is one reason I do not consider nested function
 support optional in any compiler I consider using: [...]
 Nested functions the way GCC implements them are broken.

I didn't say nested function support the way gcc implements them.
Also, I disagree that they are broken.  They work; I use them
routinely.  You may find them inappropriate for your tradeoffs, but
that does not make them broken; it just means they're not a right tool
for you.

 For once, they introduce security issues by requiring an executable
 stack.

I think that's overstating the case; they require an executable stack,
which in some cases (an important qualification) can introduce security
issues.

Yes.  I don't like stack trampolines.  I would much prefer fat function
pointers.  One of my blue-sky projects has long been to build a
compiler that uses fat function pointers to do nested functions.  (I
haven't looked much at teaching gcc to do fat function pointers; the
little I've seen from when I _have_ had my fingers in gcc's internals
makes me suspect doing it in gcc would involve major surgery.)

 Anonymous functions with lexical scope can be done properly,

gcc-style nested functions are not anonymous.  (Admittedly, this is a
quibble.)

Yes, they can.  For many purposes, gcc's implementation is an example,
the above quibble aside.  But nothing says it's the only way; even I,
as inexpert as I am at compilers, can think of another implementation
technique I'm certain could be made to work which avoids the execution
out of the stack that bothers you so, another technique which probably
could be made to work (though I'm less sure), and there probably are
plenty of others I haven't thought of.

 They shouldn't be shoe horned into C this way though.

What's wrong with gcc's method?  Aside from implementation issues, that
is; I'm trying to see what you think is wrong with the language, not
the implementation of it, since you seem to think there _is_ something
wrong, at the language level, with gcc's version.  Or did I misread?

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: qsort_r

2013-12-08 Thread David Holland
On Sun, Dec 08, 2013 at 11:51:15PM -0500, Mouse wrote:
  If you can find me a description of what NetBSD assumes beyond what C
  promises, I can have a stab at answering that question.  (I know a few
  of the things, such as there is an exactly-8-bit type, but I feel
  fairly sure I don't know them all; I'm even now still discovering such
  things occasionally.)

I don't think we have such a list. In practice though it's defined by
what properties all normal architectures share, filtered by which of
these properties gcc reliably allows to show through to the source
level. This is a moving target; for example, not all that long ago you
could rely on signed shifts to behave properly, but nowadays you
can't.

Historically the C standard was understood to provide certain wiggle
room for the architecture and other wiggle room for the compiler, such
that people writing MD code could reasonably assume that properties of
their architecture would be reflected through by the compiler. The gcc
guys have repeatedly demonstrated that they have no use for this model,
no interest in preserving such properties, and are willing to break
them arbitrarily. In this environment I don't think it's a good idea
to make any assumptions at all about things the C standard doesn't
guarantee.

(Also, as far as 8-bit types, while POSIX promises that char is 8 bits
wide, there's still from time to time loose talk about doing a port to
a 36-bit machine, where char would presumably be 9 bits. I think doing
this and making it work this would be good for the general quality of
the code base and I'd rather not introduce unnecessary complications.)

The one exception to all this that I can think of is that we don't
care about platforms where int is 16 bits wide. Or 18, either.

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread Mouse
 If you can find me a description of what NetBSD assumes beyond what
 C promises, [...]
 I don't think we have such a list.

I didn't think so either, but wasn't nearly sure enough to assume there
wasn't one.

 [...].  In this environment I don't think it's a good idea to make
 any assumptions at all about things the C standard doesn't guarantee.

Then there are a lot of things that need fixing; the one I'm most aware
of is the assumption that all-bits-0 is a nil pointer.  (This most
often shows up in the use of calloc, bzero, etc on space that includes
a pointer and then assuming the pointer is nil; see, for example, the
example code in 5.2's getaddrinfo(3) manpage.)

 (Also, as far as 8-bit types, while POSIX promises that char is 8
 bits wide, [...]

Exactly 8 bits, or at least 8 bits?  C promises it's at least 8 bits;
that isn't one of the things I had heard that POSIX said anything
about.

 [...] there's still from time to time loose talk about doing a port
 to a 36-bit machine, where char would presumably be 9 bits.  I think
 doing this and making it work this would be good for the general
 quality of the code base and I'd rather not introduce unnecessary
 complications.)

I agree.  (I've often thought it would be nice to have a checkout
compiler that went out of its way to break as many as possible of the
usual assumptions that are true everywhere but that C does not
actually promise)

 The one exception to all this that I can think of is that we don't
 care about platforms where int is 16 bits wide.  Or 18, either.

Or anything else less than 32, I suspect.  I would even hazard a guess
that there's code in the tree that assumes int is exactly 32 bits,
though code assuming long is also 32 has most probably been eradicated
by now.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: qsort_r

2013-12-08 Thread Alan Barrett

On Sun, 08 Dec 2013, David Holland wrote:
My irritation with not being able to pass a data pointer through 
qsort() boiled over just now. Apparently Linux and/or GNU 
has a qsort_r() that supports this; so, following is a patch 
that gives us a compatible qsort_r() plus mergesort_r(), and 
heapsort_r().


Apparently FreeBSD [1] and GNU [2] have incompatible versions 
of qsort_r, passing the extra 'thunk' or 'data' argument in a 
different position.


[1]: FreeBSD qsort_r http://www.manpagez.com/man/3/qsort_r/
[2]: Linux qsort_r  http://man7.org/linux/man-pages/man3/qsort.3.html

If we have to pick one, let's pick the FreeBSD version.

I have done it by having the original, non-_r functions 
provide a thunk for the comparison function, as this is least 
invasive. If we think this is too expensive, an alternative is 
generating a union of function pointers and making tests at the 
call sites; another option is to duplicate the code (hopefully 
with cpp rather than CP) but that seems like a bad plan.


I'd probably duplicate the code via CPP, to trade time for space, 
but your way is fine.


Note that the thunks use an extra struct to hold the function 
pointer; this is to satisfy C standards pedantry about void 
pointers vs. function pointers, and if we decide not to care it 
could be simplified.


That adds more run-time overhead.  Could you make it conditional 
on whether it's really necessary?  All existing NetBSD platforms 
can convert back and forth between void * and function pointers 
without any trouble.


--apb (Alan Barrett)


Re: qsort_r

2013-12-08 Thread Alan Barrett

On Sun, 08 Dec 2013, Mouse wrote:

Is just casting the function pointers safe in C


No.  As soon as you call through a pointer to the wrong type 
you're off in nasal demon territory.  (Loosely put; I'd have to 
look up the exact wording - there is a little wiggle room, but, 
if I've understood the subject of the discussion correctly, not 
enough.)


You can't call through a function pointer of the wrong type, but 
you can cast from one type to another.  I think that's enough, 
provided that void * is large enough to be converted to and from a 
function pointer.


If you can find me a description of what NetBSD assumes beyond 
what C promises, I can have a stab at answering that question.


There is no such list.  That's a bug in NetBSD's documentation.

--apb (Alan Barrett)


Re: qsort_r

2013-12-08 Thread David Holland
On Mon, Dec 09, 2013 at 12:59:49AM -0500, Mouse wrote:
   [...].  In this environment I don't think it's a good idea to make
   any assumptions at all about things the C standard doesn't guarantee.
  
  Then there are a lot of things that need fixing; the one I'm most aware
  of is the assumption that all-bits-0 is a nil pointer.  (This most
  often shows up in the use of calloc, bzero, etc on space that includes
  a pointer and then assuming the pointer is nil; see, for example, the
  example code in 5.2's getaddrinfo(3) manpage.)

Yes. I think the official stance on that is to worry about it when a
real platform appears where it matters. Myself, I don't write such
code (either for pointers or floats) -- in most cases it is clearer to
initialize explicitly and it's not hard for the compiler to turn a
bunch of explicit initializations into a bzero call if it thinks
that's better. The one family of cases where this isn't so clear is
kernel things where in some cases you get zeroed pages that you know
are zeroed and it's stupid and slow to write more zeros over them
unnecessarily.

   (Also, as far as 8-bit types, while POSIX promises that char is 8
   bits wide, [...]
  
  Exactly 8 bits, or at least 8 bits?  C promises it's at least 8 bits;
  that isn't one of the things I had heard that POSIX said anything
  about.

POSIX says that CHAR_BIT is 8.

   [...] there's still from time to time loose talk about doing a port
   to a 36-bit machine, where char would presumably be 9 bits.  I think
   doing this and making it work this would be good for the general
   quality of the code base and I'd rather not introduce unnecessary
   complications.)
  
  I agree.  (I've often thought it would be nice to have a checkout
  compiler that went out of its way to break as many as possible of the
  usual assumptions that are true everywhere but that C does not
  actually promise)

Many people think that, but nobody's really done it, as it's a bit of
a large project. Go for it :-)

   The one exception to all this that I can think of is that we don't
   care about platforms where int is 16 bits wide.  Or 18, either.
  
  Or anything else less than 32, I suspect.  I would even hazard a guess
  that there's code in the tree that assumes int is exactly 32 bits,

Yeah probably. It's a big tree and there aren't really adequate
checking tools for this.

  though code assuming long is also 32 has most probably been eradicated
  by now.

Most of that's even gone from pkgsrc at this point. I've been hunting
it down.

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread Mouse
 Is just casting the function pointers safe in C
 No.  As soon as you call through a pointer to the wrong type you're
 off in nasal demon territory.
 You can't call through a function pointer of the wrong type, but you
 can cast from one type to another.

True.  I wrote that with more context than the above quote gives.  I
was responding to something I might flesh out as Is just casting the
function pointers enough to make [the thing under discussion] safe in
C?.

 I think that's enough,

Maybe.  It's possible I misunderstood something; I didn't go through
the whole patch in detail.  But C99 does promise that

   [#8] A pointer to a function of one type may be converted to
   a pointer to a function of another type and back again;  the
   result  shall  compare  equal to the original pointer.  If a
   converted pointer is used to call a function whose  type  is
   not  compatible  with  the  pointed-to type, the behavior is
   undefined.

(6.3.2.3 in the version I have).

 provided that void * is large enough to be converted to and from a
 function pointer.

I actually don't see anything that promises that a pointer to a
function type may be converted to a pointer to void, nor back again
(except, in each direction, when the original pointer is nil), much
less promising anything about the results if it is done.  But I haven't
read over the whole thing recently enough to be sure there isn't such a
promise hiding somewhere.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: qsort_r

2013-12-08 Thread Mouse
 [...] I don't think it's a good idea to make any assumptions at all
 about things the C standard doesn't guarantee.
 Then there are a lot of things that need fixing; the one I'm most
 aware of is the assumption that all-bits-0 is a nil pointer.
 Yes.  I think the official stance on that is to worry about it when a
 real platform appears where it matters.

So, rather than start fixing it now, at leisure, a big scramble when it
actually becomes important is preferable?  Well, I suppose it's their
collective neck - or, rather, their project's.

 The one family of cases where this isn't so clear is kernel things
 where in some cases you get zeroed pages that you know are zeroed and
 it's stupid and slow to write more zeros over them unnecessarily.

Also possible in userland, though less commonly.

 (I've often thought it would be nice to have a checkout compiler
 that went out of its way to break as many as possible of the usual
 assumptions that are true everywhere but that C does not actually
 promise)
 Many people think that, but nobody's really done it, as it's a bit of
 a large project.  Go for it :-)

Heh.  Yeah, perhaps some one of these years I'll find the spare time to
start seriously hacking compilers.

 I would even hazard a guess that there's code in the tree that
 assumes int is exactly 32 bits,
 Yeah probably.  It's a big tree and there aren't really adequate
 checking tools for this.

Checkout compiler!  NetBSD/pdp10!  :-)

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: qsort_r

2013-12-08 Thread David Holland
On Sun, Dec 08, 2013 at 10:50:15PM +, David Holland wrote:
 I have done it by having the original, non-_r functions provide a
 thunk for the comparison function, as this is least invasive. If we
 think this is too expensive, an alternative is generating a union of
 function pointers and making tests at the call sites; another option
 is to duplicate the code (hopefully with cpp rather than CP) but that
 seems like a bad plan.

I'd prefer to not have another indirect call. The only difference
is the definition and expanding a CMP macro differently?
  
  Yes. But I'd rather not duplicate the code...

So, to ground this a bit, the realistic options are:

starting from this (various details elided):

   void
   sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *))
   {
   ...
   r = cmp(a, b);
   ...
   }


(1) thunks [what's in my current patch]

   void
   sort_r(void *p, size_t n, size_t sz,
  int (*cmp)(const void *, const void *, void *), void *data)
   {
   ...
   r = cmp(a, b, data);
   ...
   }

   struct sort {
   int (*cmp)(const void *, const void *);
   };

   static int
   thunk(const void *a, const void *b, void *data)
   {
   return ((struct sort *)data)-cmp(a, b)
   }

   void
   sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *))
   {
   struct sort x = { .cmp = cmp };
   sort_r(p, n, sz, thunk, x);
   }

pros:
   + already written
   + noninvasive
   + reasonably tidy
   + does not duplicate the output code
cons:
   - extra indirect calls


(2) union of function pointers

   union sort {
   int (*cmp)(const void *, const void *);
   int (*cmp_r)(const void *, const void *, void *);
   };

   static void
   sort_internal(void *p, size_t n, size_t sz, bool is_r, union sort cmpu,
 void *data)
   {
   ...
   r = is_r ? cmpu.cmp(a, b) : cmpu.cmp_r(a, b, data);
   ...
   }

   void
   sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *))
   {
   union sort cmpu = { .cmp = cmp };
   sort_internal(p, n, sz, true, cmpu, NULL);
   }

   void
   sort_r(void *p, size_t n, size_t sz,
  int (*cmp_r)(const void *, const void *, void *), void *data)
   {
   union sort cmpu = { .cmp_r = cmp };
   sort_internal(p, n, sz, true, cmpu, data);
   }

pros:
   + not excessively invasive
   + no indirect calls
   + does not duplicate the output code
   + most likely to have unnecessary logic removed by the compiler
cons:
   + a bit ugly


(3) cloning the code with cpp

   --- in sort.c ---

   #ifdef MAKE_ME_REENTRANT
   #define SORTsort_r
   #define EXTRA   , void *data
   #define CMP(a, b)   cmp(a, b, data)
   #else
   #define SORTsort
   #define EXTRA
   #define CMP(a, b)   cmp(a, b)
   #endif

   void
   SORT(void *p, size_t n, size_t sz,
int (*cmp)(const void *, const void *  EXTRA)  EXTRA)
   {
   ...
   r = CMP(a, b);
   ...
   }

   --- in sort_r.c ---

   #define MAKE_ME_REENTRANT
   #include sort.c


pros:
   + noninvasive
   + no runtime overhead
cons:
   + ugly
   + duplicates the output code


Of these I think if we're concerned about the overhead of the thunks
in #1, #2 is the best bet.

While with normal calling conventions sort can end up being identical
object code to sort_r, as various people have noted, I think this kind
of optimization is best left to the compiler. With #2, the compiler
has a fighting chance of being able to do this. (I haven't checked;
I'd be moderately, but not greatly, surprised if any of the compilers
we have do.)

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread Robert Elz
Date:Mon, 9 Dec 2013 00:59:49 -0500 (EST)
From:Mouse mo...@rodents-montreal.org
Message-ID:  201312090559.aaa21...@chip.rodents-montreal.org

  | see, for example, the
  | example code in 5.2's getaddrinfo(3) manpage.)

I assume you mean 
memset(hints, 0, sizeof(hints));

That one is perfectly fine, in any C, the values of the pointer
fields of hints are irrelevant, they're never used (we may wonder
at the interface design that used a struct addrinfo for this purpose,
but that's beyond fixing now).   Only the int fields of the hints struct
have any defined use - the others must be zero or the null pointer.
(It doesn't say that the pointer fields must be the null pointer, and
other fields must be zero, though that would be a sane implementation,
just that the other fields must each be 0, or be the null pointer,
the memset() makes them all be 0 - which conforms with the requirements.)


  | I agree.  (I've often thought it would be nice to have a checkout
  | compiler that went out of its way to break as many as possible of the
  | usual assumptions that are true everywhere but that C does not
  | actually promise)

Of academic use only - in the real world economics (that existing code,
however badly written, that works today, must also work on future systems,
or no-one will buy them) puts all kinds of limitations on what hardware
designers are able to do, other than for very special purpose uses,
which NetBSD is never going to care about.

What the C standards people, and then the gcc implementors, are doing is
apalling - changing (or even just tightening) the spec to allow for
processor designs that once existed, but will never be seen again, or
that allow optimisations to make code run raster, when what needs optimisation
most is programmer effort, if we have programmers fighting against arcane
language rules that compiler implementors have used to make some code
segment run a few seconds faster, rather than doing the obvious intended
operation, then everyone suffers, and but the rules say we can do that,
and you shouldn't have assumed that we wouldn't are no consolation at all.

I'm all for additions/corrections/... that aid writing more correct, and
more portable code - that's always a good thing.  But changes that force
the writing of any of that, rather than simply promote it, are evil.

C used to be a great language, precisely because it didn't get in the
way, and make programmers much more productive.   These days, much of
that has gone, and it is becoming just another obstruction to getting
work actually done.

kre



Re: qsort_r

2013-12-08 Thread David Holland
On Mon, Dec 09, 2013 at 01:49:29AM -0500, Mouse wrote:
   Then there are a lot of things that need fixing; the one I'm most
   aware of is the assumption that all-bits-0 is a nil pointer.
   Yes.  I think the official stance on that is to worry about it when a
   real platform appears where it matters.
  
  So, rather than start fixing it now, at leisure, a big scramble when it
  actually becomes important is preferable?  Well, I suppose it's their
  collective neck - or, rather, their project's.

Well... what is a plausible scenario where this becomes important? In
what situation is someone going to try to change this convention, or
more accurately, in what situation is someone going to try to change
this convention in a way that makes it time-critical to fix problems
that arise, such that necks become involved?

The only cases I can think of are deliberately working this issue out
with suitable tools, and porting to some vintage machine where you
can't for some reason use 0 as the null pointer representation.
Neither is exactly an urgent situation.

No new machine will ever have this property because there's too much
existing code that would break and there's no plausible reason to
bother.

Similarly, for all-bits-0 floats, no new machine is going to use
anything but IEEE floats.

   The one family of cases where this isn't so clear is kernel things
   where in some cases you get zeroed pages that you know are zeroed and
   it's stupid and slow to write more zeros over them unnecessarily.
  
  Also possible in userland, though less commonly.

Vanishingly rarely; I suppose calloc can avoid rezeroing when it gets
fresh pages from the kernel, but does any calloc actually *do* that
rather than just calling malloc and bzero?

   (I've often thought it would be nice to have a checkout compiler
   that went out of its way to break as many as possible of the usual
   assumptions that are true everywhere but that C does not actually
   promise)
   Many people think that, but nobody's really done it, as it's a bit of
   a large project.  Go for it :-)
  
  Heh.  Yeah, perhaps some one of these years I'll find the spare time to
  start seriously hacking compilers.

If you keep worrying about how porky gcc is, sooner or later you'll
get irritated enough to write your own...

   I would even hazard a guess that there's code in the tree that
   assumes int is exactly 32 bits,
   Yeah probably.  It's a big tree and there aren't really adequate
   checking tools for this.
  
  Checkout compiler!  NetBSD/pdp10!  :-)

One of the reasons I proposed/started on mint was to enable checks
for stuff like this. But I haven't really put any time into it yet...

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread David Holland
On Mon, Dec 09, 2013 at 06:10:36AM +, Alan Barrett wrote:
  My irritation with not being able to pass a data pointer through
  qsort() boiled over just now. Apparently Linux and/or GNU has a
  qsort_r() that supports this; so, following is a patch that gives
  us a compatible qsort_r() plus mergesort_r(), and heapsort_r().
  
  Apparently FreeBSD [1] and GNU [2] have incompatible versions of
  qsort_r, passing the extra 'thunk' or 'data' argument in a
  different position.
  
  [1]: FreeBSD qsort_r http://www.manpagez.com/man/3/qsort_r/
  [2]: Linux qsort_r  http://man7.org/linux/man-pages/man3/qsort.3.html
  
  If we have to pick one, let's pick the FreeBSD version.

Given that it's just left vs. right side, the only technical reason to
favor one over the other is that the right-side one (the Linux one)
allows sharing the same object code for both versions, at least on
common platforms, as discussed elsewhere in this thread.

Also, all else being equal the Linux one de facto has greater market
penetration.

So I think picking the Linux one is a marginally better choice.

  Note that the thunks use an extra struct to hold the function
  pointer; this is to satisfy C standards pedantry about void
  pointers vs. function pointers, and if we decide not to care it
  could be simplified.
  
  That adds more run-time overhead.  Could you make it conditional on
  whether it's really necessary?  All existing NetBSD platforms can
  convert back and forth between void * and function pointers without
  any trouble.

I think switching approaches (as in the other post I just made) is
probably a better idea, because it also gets rid of this issue.

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread David Holland
corrections from Mouse:

On Mon, Dec 09, 2013 at 07:02:39AM +, David Holland wrote:
  (2) union of function pointers
 
 r = is_r ? cmpu.cmp(a, b) : cmpu.cmp_r(a, b, data);

this test is backwards.

void
sort(void *p, size_t n, size_t sz, int (*cmp)(const void *, const void *))
{
union sort cmpu = { .cmp = cmp };
sort_internal(p, n, sz, true, cmpu, NULL);
 
and that should be false.

sigh. pseudocode...

-- 
David A. Holland
dholl...@netbsd.org


Re: qsort_r

2013-12-08 Thread Mouse
 see, for example, the example code in 5.2's getaddrinfo(3) manpage.)
 I assume you mean 
   memset(hints, 0, sizeof(hints));

Yes.

 That one is perfectly fine, in any C, the values of the pointer
 fields of hints are irrelevant,

According to the manpage, their values are relevant:

 All other elements of the addrinfo structure passed via hints must be
 zero or the null pointer.

 (It doesn't say that the pointer fields must be the null pointer, and
 other fields must be zero, though that would be a sane
 implementation, just that the other fields must each be 0, or be the
 null pointer, the memset() makes them all be 0 - which conforms with
 the requirements.)

I would disagree that all zero bits is a reasonable choice for what
it means for a pointer to be zero, unless all zero bits happens to be
the implementation's choice for a nil pointer.  It's a C interface; I
have trouble interpreting be zero as anything but hold a value which
compares equal to zero, which for pointers means a nil pointer.

In any case, it's hardly the only example of that assumption; it's just
the first one that came to mind.

 Of academic use only - [...]

 I'm all for additions/corrections/... that aid writing more correct,
 and more portable code - that's always a good thing.

That's exactly what such a checkout compiler would be for.  Indeed, to
be really useful it would need to have some kind of controls for
exactly which assumptions it breaks, so that, for example, you could
tell it to make nil pointers all-0-bits but break the assumption that
int has no padding bits in it.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: qsort_r

2013-12-08 Thread Mouse
 No new machine will ever have this property because there's too much
 existing code that would break and there's no plausible reason to
 bother.

 Similarly, for all-bits-0 floats, no new machine is going to use
 anything but IEEE floats.

If you really believe those, there's no point in carrying this part of
the thread any farther.  I find each of them ludicrously implausible;
ever is way too long a time, and nothing will use anything but IEEE
floats is about as certain as everything is little-endian.  Indeed,
I suspect C is piling up trouble for itself by mandating binary (when
(not if) a non-binary computer arises, it makes C that much more
likely to be abandoned than adapted).

The only question in my mind is how far off these changes are.  I'm
reasonably sure none of them will hit the mass market within the next
decade or so, but that's about all I'm confident of.

 If you keep worrying about how porky gcc is, sooner or later you'll
 get irritated enough to write your own...

True.  That it will ever happen, though, depends on a bunch of
assumptions which are looking less and less plausible these years.

/~\ The ASCII Mouse
\ / Ribbon Campaign
 X  Against HTMLmo...@rodents-montreal.org
/ \ Email!   7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B