Hi Paul,

> > Please don't use identifiers that start with '__gl'. Identifiers that
> > start with '__' belong to the system (= compiler + libc).
> 
> I was a little disappointed to see patches immediately installed to 
> implement this request even though I've recently written here that I 
> disagree with it at least in part.
> 
> Gnulib is supposed to help its users, not get in their way. So when 
> Gnulib is implementing a standard C or POSIX header, it should be viewed 
> as being part of the implementation not part of the application, and its 
> API strive to follow the relevant namespace rules.

Yes, that's the argument you brought up each time we touched this topic.
I should have explained earlier why I feel that this argument is not
convincing. Let me explain it now. I hope I can convince you.

1) POSIX:2024
   <https://pubs.opengroup.org/onlinepubs/9799919799/functions/V2_chap02.html#tag_16_02_02>
   states:
  "The following identifiers are reserved regardless of the inclusion of
   headers:
   1.
     With the exception of identifiers beginning with the prefix _POSIX_ and
     those identifiers which are lexically identical to keywords defined by
     the ISO C standard (for example _Bool), all identifiers that begin with
     an <underscore> and either an uppercase letter or another <underscore>
     are always reserved for any use by the implementation.
   2.
     All identifiers that begin with an <underscore> are always reserved for
     use as identifiers with file scope in both the ordinary identifier and
     tag name spaces.
   3. ...
   4. ...
   5. ..."

It means essentially:
  - Identifiers that start with 2 underscores (or with 1 underscore and
    an uppercase letter) belong to the implementation
    (= libc + compiler, in this context).
  - Identifiers that start with 1 underscore and a lowercase letter may be
    used by programs, but only as local variables/types/tags, not at file
    scope.

The question "what identifiers can applications freely use?" therefore
has the answer "Only identifiers that don't start with an underscore,
and among them only those that don't match several specific patterns".
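
To make the two rules concrete, here is a minimal sketch of a classifier
that applies only rules 1 and 2 as quoted above. The enum and function
names are made up for illustration, and the _POSIX_ and keyword
exceptions as well as rules 3 to 5 are deliberately ignored.

```c
#include <ctype.h>

/* Rough categories derived from POSIX rules 1 and 2 only.
   These names are hypothetical, chosen for this sketch.  */
enum id_class
{
  ID_IMPLEMENTATION,  /* rule 1: reserved for any use by the implementation */
  ID_LOCAL_ONLY,      /* rule 2: reserved at file scope, usable locally */
  ID_UNRESERVED       /* not reserved by rules 1 and 2 (rules 3-5 unchecked) */
};

/* Classify NAME by its leading characters.
   Simplification: the _POSIX_ prefix and ISO C keywords such as _Bool
   are not special-cased here.  */
static enum id_class
classify_identifier (const char *name)
{
  if (name[0] != '_')
    return ID_UNRESERVED;
  if (name[1] == '_' || isupper ((unsigned char) name[1]))
    return ID_IMPLEMENTATION;
  return ID_LOCAL_ONLY;
}
```

With this sketch, '__gl_foo' lands in the implementation class, '_gl_foo'
in the "local use only" class, and 'gl_foo' in neither.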

So, applications that rely on POSIX or Gnulib are *not* supposed to use
identifiers that start with an underscore (except as local entities).

That's also the reason why a current glibc defines
  - more than 600 functions starting with 2 underscores,
  - more than 300 functions starting with 1 underscore,
    such as '_obstack_begin' and '_obstack_free'.

By choosing symbols of the form '_gl_*' instead of '__gl_*' we are
  - picking a libc area that is still forbidden for us, but less risky,
  - not impeding the applications' freedom to choose identifiers.
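
As an illustration, a helper in a replacement header could look roughly
like the following sketch. The function '_gl_example_clamp' is
hypothetical and not an actual Gnulib symbol; it only shows the naming
pattern.

```c
/* Hypothetical fragment of a replacement header.
   '_gl_example_clamp' is an invented name, not a real Gnulib symbol.  */

/* Internal helper named with 1 underscore plus the 'gl' prefix.
   At file scope this name is still reserved (rule 2), but it stays out
   of the '__' area and out of the applications' namespace.  */
static inline int
_gl_example_clamp (int value, int low, int high)
{
  /* Return VALUE limited to the range LOW..HIGH.  */
  return value < low ? low : value > high ? high : value;
}
```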

2) Since Gnulib sits between the system and the application code, it
   needs some distance from both.

   It is more important to keep some distance from the system / libc
   than it is to keep some distance from the application code, because

     * Where Gnulib and the libc have common symbols, things get very
       difficult quickly. For example, lib/cdefs.h (thanks for maintaining
       this difficult one!!!) is quite a tricky thing.

     * When glibc is borrowing code from Gnulib (like they currently do
       with the 'fts' module), there would be a risk that they reuse
       function names like __gl_<something> and that later we have to
       work around that in our *.m4 macros. The risk is smaller if
       the function is named _gl_<something> because that will remind
       the glibc developers that it's not supposed to be glibc-private.

     * We have no influence on what the libc does (globally, from
       Solaris libc to NetBSD libc). But we have some influence on what
       the Gnulib-using applications do, via the documentation and the
       NEWS file. For instance, the need to #include <config.h> in every
       compilation unit is a requirement that we were able to impose.

> Yes, this places a greater burden on the Gnulib developers, as it means 
> we must pick names like "__gl_whatever" that work on every practical 
> Gnulib target. But that's OK

Picking such names in a future-proof manner is harder with 2 underscores
than with 1 underscore, because more of the 2-underscore space is already
taken. (See the symbol counts mentioned above: 600 > 300.)

> it's better for us to take on the minor 
> burden of picking such names, than for users of Gnulib to take on the 
> burden of worrying about Gnulib's intrusion into their namespace.

As explained above, function names _gl_<something> are not in the
applications' namespace. Therefore, we don't even need to document
that Gnulib-using applications should avoid this identifier pattern.

> So as a compromise I propose that although it's fine to stick with 
> ordinary names when implementing Gnulib's own headers, we should avoid 
> this when implementing standard headers.

I agree that a compromise is needed, because there is not much room
between the applications' namespace and the system's namespace. And
I claim that using _gl_<something> is the less risky compromise.

Bruno



