I think C++ would make working with the DocFormats library, at least in its 
current form, significantly easier. In particular, the explicit support for 
classes, and the ability to use smart pointers (thus avoiding manual reference 
counting) would be a big win in terms of complexity.
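
To make the smart pointer point concrete, here is a rough sketch of what shared ownership could look like. It is only an illustration: Style is a made-up class and the two helper functions are hypothetical, none of it is taken from the DocFormats source. It assumes std::shared_ptr would stand in for the manual retain/release calls.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration only; not part of DocFormats.
class Style {
public:
    explicit Style(std::string name) : name_(std::move(name)) {}
    const std::string &name() const { return name_; }
private:
    std::string name_;
};

// Reports how many owners exist while two pointers refer to one object.
long sharedOwnerCount() {
    std::shared_ptr<Style> a = std::make_shared<Style>("Heading1");
    std::shared_ptr<Style> b = a;  // copying plays the role of an explicit retain
    return a.use_count();
}   // a and b go out of scope here; no explicit release call is needed

// Checks that the object is destroyed once its last owner goes away.
bool styleFreedAfterScope() {
    std::weak_ptr<Style> observer;  // observes without owning
    {
        std::shared_ptr<Style> s = std::make_shared<Style>("Heading1");
        observer = s;
    }   // last owner destroyed, so the Style is freed here
    return observer.expired();
}
```

The count is maintained by the pointer type itself, so there is no retain or release call to forget on an early return or error path.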

As background on why the library is in C and not C++:

The reason is that originally DocFormats was written in Objective C (since I 
was targeting only iOS at the time). Objective C is a superset of C, so when I 
decided I wanted to open source the code and enable it to be used on non-Apple 
platforms, I methodically went through the source tree converting all the 
Objective C classes and reference counting statements into their C equivalents. 
Objective C has automatic reference counting now, but at the time I was not 
using it, so this meant the translation was relatively straightforward.

While it *is* possible to mix Objective C and C++, doing so results in an 
additional layer of complexity, which I wanted to avoid - you have two ways of 
defining classes etc. The conversion to C was simpler than I expect a 
conversion to C++ would have been. However, now that all the code is in C and 
completely free of any Apple-specific dependencies, I think it would be 
reasonable to move to C++ to more concisely express many of the things that are 
currently done explicitly (memory management being the most significant). The 
resulting code would also be more readable.

I don’t volunteer to do the conversion myself, since it’s a lot of work. 
However, for anyone willing to take on the task, this would be an excellent way 
of becoming intimately familiar with the library, which would be of great use 
in developing ODF and other filters.

I kind of have a natural aversion to C++ because of its complexity, and the 
sheer number of features which, if all (or even a significant portion) of them 
are used, can lead to very complicated code. I think we should agree on fairly 
strict guidelines on the subset of the language we use, to avoid things 
“getting out of hand” with the codebase, so to speak.

There are some nice properties of C I like, such as the ability to grep for a 
function name throughout the whole source tree to find out all the places it’s 
used, which is handy for refactoring. Xcode also has some refactoring tools 
which work for Objective C and most of C, which I used a lot doing the original 
conversion, but these do not work with C++ (of course this is a limitation of 
Xcode, not a problem with C++ per se).

There are some specific areas we’ll need to be careful about in terms of 
performance. Actually the first pure C code I had in DocFormats, long before I 
converted the whole library, was the DFNode and DFDocument structures, which 
use a specialised memory allocator that simply allocates a slab of memory and 
frees it all in one go after conversion has finished. Prior to that, every node 
was a separate Objective C object, and freeing a whole document took an 
inordinate amount of time, due to the large number of release messages sent to 
free individual nodes, and the fact that Objective C’s dispatch mechanism is 
not efficient for compute-intensive code. This had a very noticeable impact on 
load times of large documents, which was greatly improved by switching to a 
customised, efficient memory allocation strategy. We should maintain this when 
moving to C++.
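
As a rough sketch of how the slab idea might carry over to C++ (this is not the existing DFNode/DFDocument allocator; Arena, Node and buildChain are hypothetical names made up for illustration), objects can be placed into large slabs that are all released together when the arena is destroyed:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical arena: hands out memory from large slabs and releases
// every slab in one go when the arena itself is destroyed.
class Arena {
public:
    explicit Arena(std::size_t slabSize = 64 * 1024) : slabSize_(slabSize) {}

    template <typename T, typename... Args>
    T *create(Args &&...args) {
        // Objects are never destroyed individually, so only accept types
        // that do not need their destructor to run.
        static_assert(std::is_trivially_destructible<T>::value,
                      "arena objects are never destroyed individually");
        void *mem = allocate(sizeof(T), alignof(T));
        return new (mem) T(std::forward<Args>(args)...);
    }

    std::size_t slabCount() const { return slabs_.size(); }

private:
    // Assumes align does not exceed the default alignment of operator new[].
    void *allocate(std::size_t size, std::size_t align) {
        std::size_t start = (used_ + align - 1) / align * align;
        if (slabs_.empty() || start + size > slabSize_) {
            assert(size <= slabSize_);  // sketch: no oversized objects
            slabs_.push_back(std::make_unique<unsigned char[]>(slabSize_));
            start = 0;
        }
        used_ = start + size;
        return slabs_.back().get() + start;
    }

    std::size_t slabSize_;
    std::size_t used_ = 0;  // bytes used in the most recent slab
    std::vector<std::unique_ptr<unsigned char[]>> slabs_;
};

// Hypothetical node type standing in for a document tree node.
struct Node {
    int tag;
    Node *parent;
    Node(int t, Node *p) : tag(t), parent(p) {}
};

// Builds a chain of nodes in the arena and returns its length.
std::size_t buildChain(Arena &arena, int count) {
    Node *last = nullptr;
    for (int i = 0; i < count; i++)
        last = arena.create<Node>(i, last);
    std::size_t depth = 0;
    for (Node *n = last; n != nullptr; n = n->parent)
        depth++;
    return depth;
}
```

Freeing the document is then a handful of slab deallocations, however many nodes it contains, which is the property we would want to keep. Nodes allocated this way would be referred to by plain pointers rather than smart pointers, since their lifetime is tied to the arena.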

Regarding Flat, I’d like to keep that in C at least for now, because my plan is 
to build a virtual machine for executing Flat programs, with a garbage 
collector, which necessarily requires intimate knowledge of the memory layout 
of objects. While this is possible to do in C++, it’s easier in C as there are 
fewer abstractions in the way. Flat is also about to get its own type system, 
which will be different in many respects from that of C++ (and more tailored 
towards the task of transformation). I’ll post more on this in due course.

But for the bulk of the DocFormats code, I think it makes sense to move to C++: 
we’ll benefit from the improved maintainability, and it will be easier for new 
committers coming into the project to understand the structure of the code.

—
Dr Peter M. Kelly
[email protected]

PGP key: http://www.kellypmk.net/pgp-key
(fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)

> On 10 Aug 2015, at 6:26 pm, jan i <[email protected]> wrote:
> 
> Hi
> 
> Peter and I talked the other day and among others about the benefits of
> using C++ instead of sticking to C99.
> 
> This would be a major change in the project (less in the code, more in the
> "how to"), and it is
> not something we should "just" do.
> 
> I favor C++, but not unlimited, I see 2 places where C++ can give us more
> stable code:
> - Interfaces.
> Using classes to group our functions (like e.g. platform, core, filters/odf
> etc.),
> would make it very clear where the function originates. It would also allow
> group global variables that are private to the rest of the world.
> I would not use real interface classes, for our internal grouping, that is
> not needed. But e.g. the DocFormats API should be a real interface class
> - Automatic.
> At the moment we have a lot of code managing construction/deconstruction,
> that could be totally automated by use of C++ smart pointers.
> - Object model (filters, flat and core)
> would be more logically represented as objects, and suddenly copying etc.
> would be a lot easier.
> 
> I would not like to see big inheritance (especially not multiple
> inheritance).
> 
> I fail to see what we lose by making the change, but please give your
> opinion.
> 
> rgds
> jan i.
> 
> Ps. This is in no way a vote thread, but simply a way to gather opinions.
