tl;dr - I ask some questions near the end; you will probably need to read
the entire post to answer them insightfully.
The Mozilla codebase, which I define loosely as mozilla-central,
comm-central, JS, NSS, NSPR, LDAP C SDKs, and any other small project
included in mozilla-central that is built and maintained
semi-independently, has organically and continuously grown from its
original kernels which predate many of the modern standards used for C
and C++. Thus, its APIs have impedance mismatches with what is more
common in modern C/C++, which may deter would-be contributors.
For those who do not follow the ongoing tasks of JTC1/SC22/WG14 and 21
with zeal, here are the various relevant versions of standards:
C89 aka ISO C aka ANSI C: This is what most people think of as
"standard" C code, although in practice modern compilers accept features
not present in this version (such as mixed code and declarations,
C++-style comments, long long). What you get nowadays in the default
mode tends to be closer to "the subset of C99 also in C++".
C99: This adds several features present in C++ (a few mentioned above),
but also contains some features that have been controversial
(variable-length arrays are a big one). MSVC doesn't yet fully implement
this, but some things (like designated initializers) will probably be in
the next version of MSVC.
C11: This is basically taking some new features of C++11 and shoving
them in C, such as atomics, threading, and noreturn. There's also some
minor goodies like standardizing the "x" flag in fopen to correlate with
O_CREAT | O_EXCL.
C++98/C++03: These are basically the same thing as far as programmers
are concerned, since the changes in the 2003 version mostly matter only
to language lawyers. This is "traditional" C++.
C++11: This standard has a very large suite of new features added, which
have been incrementally supported in compilers over the last 5 years.
Clang and g++ are feature-complete as of 3.3 and 4.8.1, respectively;
MSVC is not yet feature complete, although things that cannot be worked
around will probably be added within the next two versions [1]. Note
that the standard library support has lagged compiler support,
particularly for <regex> (ES-compatible regular expressions remain
unsupported even on tip-of-trunk libstdc++).
C++14: This is a proposed standard in the final stages of drafting that is
better thought of as "C++11.1", adding a few features that ought to have
been in C++11 had people played with them more. The expected final list
of features boils down to generic lambdas, auto improvements, "correct"
VLAs, and variable templates.
C++14 Technical Specifications: C++ is moving to a modular design for
development in the future! There are currently three main TSs planned
for release in 2014: a filesystem specification, a networking
specification, and a "concepts lite" specification. There may also be
one for "things that just missed C++14" such as std::string_view (or
whatever bikeshedding happens) [2]. A draft of the Filesystem TS is
available here:
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3693.html>.
While most prior discussions about using newer versions of C/C++ have
focused on the compiler support, the topic of using newer versions of
the library has mostly been ignored. Library support is more annoying,
since instead of three major compilers, we have four standard libraries
to concern ourselves with and potential issues with compiler-library
compatibility (particularly Clang and libstdc++)--these are MSVC's
implementation, libc++, libstdc++, and stlport. Since most nice charts
of C++11 compatibility focus on what the compiler needs to do, I've put
together a high-level overview of the major additions to the standard
library [3]:
* std::tuple -- Generalization of std::pair
* std::unique_ptr -- Non-broken replacement for std::auto_ptr
* std::shared_ptr -- Non-intrusive version of nsCOMPtr and friends
* std::function/std::bind -- Generalization of function pointers
* std::type_traits -- Template helpers
* std::ratio -- Compile-time rational arithmetic, mostly used for
std::chrono
* std::chrono -- Time, more specifically a generalization of time stamps
instead of calendaring functions
* std::array -- Generalization of compile-time arrays like int x[10]
* std::forward_list -- Singly-linked list
* std::unordered_map/std::unordered_set -- Hashtables/hashsets
* std::random -- More powerful random number generators than just rand()
* std::codecvt -- UTF-8/UTF-16/UTF-32 conversion classes (ties in with a
bit of the locale support)
* std::regex -- Regular expressions, but probably the least
well-implemented of anything here
* std::atomic, std::thread, std::mutex, std::condition_variable,
std::future -- Major threading interfaces
C++14 will also be adding [3]:
* std::optional -- A template for which "not present" is different than
"empty"/"null"/0/etc.
* std::dynarray -- Generalization of int x[N]
* std::shared_mutex -- R/W locks
* std::exchange -- std::atomic<T>::exchange that's not atomic.
* user-defined literals for std::chrono and std::string types.
Now that you have the background for what is or will be in standard C++,
let me turn to the real question I want to discuss: how much of this
should we be using in Mozilla? I assume that enabling exceptions or RTTI
is untenable and the mere suggestion of it would lead to everything else
I say being ignored :-) . The practical effect of this is that we are
unable to use any function that reports failure by throwing an exception,
unless crashing on that failure is acceptable. Fortunately, the C++
specification appears to be
assuming that this is a situation worth designing for, as the Filesystem
TS draft in particular defines most functions in pairs: one that throws
an exception if something goes wrong, the other that uses an error code.
It feels worth saying that the error code style in use is a
std::error_code out-parameter at the end of the parameter list, not an
nsresult return value [4].
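To make the paired-overload convention concrete, here is a minimal sketch using only <system_error> from C++11. The function ParsePort and its rules are hypothetical, invented purely for illustration; only the overload shape (one version that throws, one that takes a trailing std::error_code&) is meant to mirror the convention described above.

```cpp
#include <cassert>
#include <string>
#include <system_error>

// Hypothetical helper, not part of any standard or Mozilla API.
// Non-throwing overload: failure is reported through the trailing
// std::error_code out-parameter, and 0 is returned.
inline int ParsePort(const std::string& aText, std::error_code& aError) noexcept {
  aError.clear();
  if (aText.empty() || aText.size() > 5) {
    aError = std::make_error_code(std::errc::invalid_argument);
    return 0;
  }
  int value = 0;
  for (char c : aText) {
    if (c < '0' || c > '9') {
      aError = std::make_error_code(std::errc::invalid_argument);
      return 0;
    }
    value = value * 10 + (c - '0');
  }
  if (value > 65535) {
    aError = std::make_error_code(std::errc::result_out_of_range);
    return 0;
  }
  return value;
}

// Throwing overload: a thin wrapper around the non-throwing one.
inline int ParsePort(const std::string& aText) {
  std::error_code error;
  int value = ParsePort(aText, error);
  if (error) {
    throw std::system_error(error, "ParsePort");
  }
  return value;
}
```

Code built without exceptions would call only the first overload and test the error code after each call.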
For purposes of discussion, I think it's worth breaking down the C++
(and C) standard library into the following components:
* Containers--vector, map, etc.
* Strings
* I/O
* Platform support (threading, networking, filesystems, locales)
* Other helpful utilities (std::random, std::tuple, etc.)
We have explicit non-STL implementations of many of the containers
(nsTArray, several hashtable implementations, mozilla::LinkedList), and
we have much more specialized string libraries than the C++ standard
library provides. The iostream library has some issues in practice
(particularly static constructors IIRC), and is not so usable for most
of the things that Gecko needs to do. Platform support ultimately
depends on NSPR (or intl/ for locale stuff) in large part, although we
have C++/IDL wrappers for most of the major things, and the current C++
support for the things we need is lacking. For some of the helpful
utilities, we basically have the C++ implementation but using Mozilla
coding style instead (particularly type_traits stuff).
Using the STL wherever possible comes with drawbacks. We lose control
over the ABI (I believe libstdc++ tries to keep compatibility, but not
MSVC), which can impact people who write binary extensions. We also lose
control over the ability to make performance tweaks to containers. There
is also a critical API mismatch between the STL and the containers we
use: the STL tends to use iterators and templates heavily, which tends
to mean a lot of inlining (std::sort is fully inlined, for example); in
contrast, the containers we use tend to favor using function pointers
for sorting or even enumeration. Strings in particular are extremely
weak in the STL: there is one string class, where we have several for
specialized purposes (O(1) substring, null-terminated, O(1)
concatenation, allocate-on-stack, etc.).
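The inlining difference mentioned above can be illustrated with a small sketch. All names here (CompareInts, SortWithFunctionPointer, DescendingOrder, SortWithTemplate) are hypothetical, and std::qsort stands in for any API that takes a comparator as a function pointer; this is not taken from Mozilla code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Function-pointer style: qsort calls the comparator through a pointer,
// so the compiler usually cannot inline it into the sorting loop.
inline int CompareInts(const void* aLeft, const void* aRight) {
  int lhs = *static_cast<const int*>(aLeft);
  int rhs = *static_cast<const int*>(aRight);
  return (lhs > rhs) - (lhs < rhs);
}

inline void SortWithFunctionPointer(std::vector<int>& aValues) {
  if (aValues.empty()) {
    return;
  }
  std::qsort(aValues.data(), aValues.size(), sizeof(int), CompareInts);
}

// STL style: the comparator's type is a template argument of std::sort,
// so each instantiation sees the comparator body and can inline it.
struct DescendingOrder {
  bool operator()(int aLeft, int aRight) const { return aLeft > aRight; }
};

inline void SortWithTemplate(std::vector<int>& aValues) {
  std::sort(aValues.begin(), aValues.end(), DescendingOrder());
}
```

The template version trades code size (one instantiation per comparator type) for the chance to inline the comparison.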
Looking at a large C++ project that is not constrained by legacy as much
as we are, LLVM, it's clear that the STL by itself is not sufficient.
LLVM defines a large number of helper datastructures:
<http://llvm.org/doxygen/dir_a7dd73f244ee1af3dca2a8723843bc79.html>. Of
particular note is the use of llvm::StringRef (which is roughly
equivalent to everywhere we have const nsA[C]String & in our code) in
many places, as well as the existence of a llvm::SmallString (roughly
equivalent to nsAuto[C]String). There is similarly a large collection of
variants on std::vector, std::set, and std::map for specific purposes.
The downside to this kind of approach is that use of alternative types
tends to leak through APIs, particularly for out parameters; this can be
ameliorated somewhat with templates, but that causes its own set of
problems.
Even if fully using the standard library is untenable from a performance
perspective, usability may be enhanced if we align some of our APIs
which mimic STL functionality with the actual STL APIs. For example, we
could add begin()/end()/push_back()/etc. methods to nsTArray to make it
a fairly drop-in replacement for std::vector, or at least close enough
to one that it could be used in other STL APIs (like std::sort,
std::find, etc.). However, this does create massive incongruities in our
API, since the standard library prefers naming stuff with
this_kind_of_convention whereas most Mozilla style guides prefer
ThisKindOfConvention.
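As a rough sketch of what such alignment could look like, the class below exposes both naming styles. SketchArray is a hypothetical fixed-capacity stand-in written for this example; it is not nsTArray and makes no claim about how nsTArray is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an nsTArray-like container (NOT the real one).
// A fixed capacity keeps the sketch short.
template <typename T, std::size_t Capacity>
class SketchArray {
 public:
  typedef T value_type;
  typedef T* iterator;
  typedef const T* const_iterator;

  SketchArray() : mLength(0) {}

  // Mozilla-style spellings. AppendElement returns false when full.
  bool AppendElement(const T& aItem) {
    if (mLength == Capacity) {
      return false;
    }
    mElements[mLength++] = aItem;
    return true;
  }
  std::size_t Length() const { return mLength; }

  // STL-style spellings, forwarding to the same storage. Raw pointers are
  // random-access iterators, which is all std::sort and std::find need.
  void push_back(const T& aItem) {
    bool appended = AppendElement(aItem);
    assert(appended);
    (void)appended;
  }
  std::size_t size() const { return mLength; }
  iterator begin() { return mElements; }
  iterator end() { return mElements + mLength; }
  const_iterator begin() const { return mElements; }
  const_iterator end() const { return mElements + mLength; }

 private:
  T mElements[Capacity];
  std::size_t mLength;
};
```

With these members in place, calls such as std::sort(array.begin(), array.end()) and std::find(array.begin(), array.end(), value) compile unchanged against the custom container.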
There is also a separate question of how much of the standard library we
should use or mimic. The current strings library is basically a
performance footgun to use as is, although a proposed string_view class
(an encapsulation of const char* + length) would be suitable for most
inparameters. Similarly, std::shared_ptr is moderately useful, but the
libstdc++ implementation appears to assume threadsafe reference counting
always (which would be a perf hit for us), and our intrusive refcounting
doesn't mesh well with it. It is possible (though dirty) to do something
like specializing std::shared_ptr for nsISupports-derived types and
making that specialization inherit from nsCOMPtr. On the other hand,
std::unique_ptr or other small utilities are basically what we would
code up ourselves modulo the style guideline issues.
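For readers unfamiliar with the string_view idea, here is a minimal sketch of a "const char* + length" wrapper used as an in-parameter. StringSpan and HasHttpScheme are hypothetical names made up for this example; the class is far smaller than the proposed standard type and is not meant to match its interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical non-owning view over characters; it never allocates and
// never copies the underlying buffer.
class StringSpan {
 public:
  StringSpan(const char* aData, std::size_t aLength)
      : mData(aData), mLength(aLength) {}
  StringSpan(const char* aCString)
      : mData(aCString), mLength(std::strlen(aCString)) {}
  StringSpan(const std::string& aString)
      : mData(aString.data()), mLength(aString.size()) {}

  const char* data() const { return mData; }
  std::size_t size() const { return mLength; }

  // O(1) substring: only the pointer and length change. Out-of-range
  // arguments are clamped to the end of the view.
  StringSpan substr(std::size_t aPos, std::size_t aCount) const {
    if (aPos > mLength) {
      aPos = mLength;
    }
    if (aCount > mLength - aPos) {
      aCount = mLength - aPos;
    }
    return StringSpan(mData + aPos, aCount);
  }

  bool equals(const StringSpan& aOther) const {
    return mLength == aOther.mLength &&
           std::memcmp(mData, aOther.mData, mLength) == 0;
  }

 private:
  const char* mData;
  std::size_t mLength;
};

// One signature accepts string literals and std::string without copying.
inline bool HasHttpScheme(StringSpan aUrl) {
  return aUrl.substr(0, 7).equals("http://");
}
```

The view does not own its data, so the caller must keep the underlying buffer alive for as long as the view is in use.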
With all of that stated, the questions I want to pose to the community
at large are as follows:
1. How much, and where, should we be using standard C++ library
functionality in Mozilla code?
2. To what degree should our custom ADTs (like nsTArray) be
interoperable with the C++ standard library?
3. How should we handle bridge support for standardized features not yet
universally implemented?
4. When should we prefer our own implementations to standard library
implementations?
5. To what degree should our platform-bridging libraries
(xpcom/mfbt/necko/nspr) use or align with the C++ standard library?
6. Where support for an API we wish to use is not universal, what is the
preferred way to mock that support?
[Note: similar questions also apply to NSPR and NSS with respect to
newer C99 and C11 functionality.]
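One possible shape for the bridging asked about in questions 3 and 6 is sketched below. The namespace compat is hypothetical, and the sketch assumes that a library shipping std::exchange also defines the feature-test macro __cpp_lib_exchange_function in <utility>; where the macro is absent, the local fallback is used instead.

```cpp
#include <cassert>
#include <utility>

namespace compat {  // hypothetical namespace for bridged utilities

#if defined(__cpp_lib_exchange_function)
// The standard library advertises std::exchange, so reuse it directly.
using std::exchange;
#else
// Fallback with the same behavior: store the new value and return the
// old one. Unlike std::atomic<T>::exchange, this is not atomic.
template <typename T, typename U>
T exchange(T& aObject, U&& aNewValue) {
  T oldValue = std::move(aObject);
  aObject = std::forward<U>(aNewValue);
  return oldValue;
}
#endif

}  // namespace compat
```

Callers always write compat::exchange(x, y), so the fallback can be deleted once every supported library provides the standard version.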
Thoughts/comments/corrections/questions/concerns/flames/insightful
discussion?
Footnotes:
[1] Well, strictly speaking, expression SFINAE, universal character
names, and inheriting constructors can't be worked around directly, but
avoiding the first two should be pretty easy, and the last one is
considered by some to be a misfeature anyways.
[2] There is also discussion of adding feature-test macros, such as what
Clang provides, in light of the incremental nature in the way this stuff
gets extended. This is not planned to be an official part of C++, but
rather a common convention that all major compilers will support. The
latest draft of this effort is here:
<http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3694.htm>, but
the full C++ committee has not discussed it yet.
[3] There are lots of minor changes to libraries and smaller things that
I don't think are worth calling out.
[4] This opens up an interesting idea to attempt boiling the ocean to
align API error strategies, and I can think of several benefits this
kind of approach might accrue. That said, given that it's a
boil-the-ocean kind of patch with a potential but unknown impact on
performance, I'm not going to suggest doing it here.
--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform