The Serialization library submission by Robert Ramey is not accepted into Boost at this time.

First of all, I'd like to acknowledge that this was a *very* difficult review for all concerned. It was tough for the reviewers, for me as review manager, and especially for Robert Ramey, the library author. Rendering a decision on the library was correspondingly difficult. I thank Robert for his work and his patience with the review process, and I hope that he finds the energy to follow through until we have a Boost library.

At one point during the review process, Robert wrote to me privately, expressing the opinion that

After spending the better part of a weekend looking over the library documentation and re-reading all of the review commentary, I can understand why Robert might be tempted to conclude that no single serialization library design would satisfy Boost because there were just too many conflicting desires on the part of reviewers. However, I hope he doesn't. I believe that "no serialization library designed by just one person is likely to satisfy Boost" is much closer to the truth. Fortunately, there was great interest in this library (which is why the scrutiny was so intense) and Robert received many enthusiastic offers of collaboration from reviewers.

I believe the best path for Boost and for this library is as follows:

0. Reconsider the problem domain in a collaborative environment. If there are enough participants, a mailing list would be a good start (I can set up a SourceForge mailing list upon request), and adding a Wiki Page is easy enough. This process should give strong consideration to problem domains other than the ones originally envisioned for the library. It should also reflect a reluctance to begin writing code too early.

1. Agreement on terms. In particular, I strongly suggest beginning with the definitions of serialization and persistence outlined by Augustus Saunders in http://lists.boost.org/MailArchives/boost/msg39598.php.
I realize that Robert didn't like those definitions, but they resonated for most people (including me), and seem to provide an excellent starting point. Robert said "I didn't try to define Persistence as I see it as a more general notion". Distinctions are useful to the extent that they partition the space of things actually being considered. If persistence is defined to be even more general than everything we're talking about, it's not useful to us. Since we get to choose the definition, let's choose one we can apply ;-)

2. Careful description of scope. Answer questions like:

   * Is this a persistence or serialization library?
   * Is it important to be able to plug in arbitrary archive formats?
   * Is it important to be able to use the same UDT serialization code to write several different archive formats?
   * What kinds of applications are we intending to serve?
   * What kinds of applications are we explicitly NOT intending to serve?

3. Careful consideration of the appropriate interface for describing the serialization of UDTs on a conforming compiler. In particular, consider the lexical cost of requiring users to specialize library templates. Also consider that the use of operator<< is going to invoke ADL anyway, so maybe the interface should just use that. Serialization of class template specializations and other classes should use the same mechanisms. Subsequent consideration of how close the interface can come on broken compilers, should the participants decide they wish to serve that user base.

4. Once coding begins, it should go quickly, and proceed in the boost sandbox.

5. Well, Item 3 drifted a bit into technical issues, so here's a more-comprehensive list of technical issues I'd like to see considered carefully and collaboratively. I'm sorry that I didn't take the time to bring some of these up during the review period, which was a bit overwhelming just to watch ;-).

* Dave Harris suggested several times that integers should be written in the binary archive in a variable-length format. This echoes a philosophy on serialization which I've had for years, provides many benefits, and would seem to allow drastic simplification of the library if it is decided that the current scope will be retained, since it entirely obviates the need for a text archive format (the same could be done for floating point numbers). The only application I can imagine this approach being unsuitable for would be extremely fast, relatively small in-memory archives... and I'd have to see benchmarks and a real use-case to be convinced of that.

* Boost already has a mechanism for exploring the internal structure of UDTs. It's called visit_each, and it's used by the signals library to discover bound signal collaborators within function objects. Could this be exploited for serialization of composite types?

* Boost already has a mechanism for registering inheritance relationships and convertibility among classes. It's not part of the public interface, but is an implementation detail of Boost.Python. Should this be exploited for serialization?

* Objects without default constructors really should be deserializable. One possible approach is offered by Python's serialization mechanism ("pickler"). A class' __getinitargs__ function (if defined) will be called to get the arguments that should be passed to the class' constructor to reconstitute an instance of that class. It should be possible to build a similar mechanism around boost::tuple.

* Is it important to allow all UDTs to be separately versioned? Every time I have implemented serialization and started with such a system, I eventually dropped it in favor of a whole-archive version number. Changing the format of a single class always creates a backward compatibility problem for new archives anyway. Allowing the archive to carry the version number also simplifies the [de]serialization interface.
If separate versioning is in fact important and useful, a rationale should be provided.

* Registration of participating classes must not be required to be monolithic. More generally, the library must support users who use polymorphism to insulate themselves from compilation dependencies.

* Strong consideration should be given to a "you don't pay for what you don't use" approach. As Ralf Grosse-Kunstleve pointed out to me, C++ is not really good at serialization, natively. One of the only reasons to use it instead of a language with stronger reflection capabilities has got to be that it is fast. Avoiding virtual function calls for serializing large arrays of small objects (e.g. complex or rational numbers) must be possible.

* I would like to see the requirement to use *only* ANSI/ISO C++ loosened. Serialization is one of those areas which is simply not well-supported by standard C++, IMO. Part of what we're doing here at Boost is expanding the scope of C++ by providing support for things like threading and the filesystem. Much may be gained by allowing some components to use extra-legal constructs that can be easily ported to a majority of platforms. Two areas that spring to mind are pointer comparisons outside a single array for unserializing internal object pointers, and the use of type_info::name() for type identification. Even if these were optional components to the library, they could provide enormous benefit for some applications. [BTW, since the review I have discovered some issues with type_info::name() and EDG compilers which may make it unsuitable for type identification in that context, depending on the application].

Given the enormous interest in addressing this problem domain (or domains) shown by Boost members, and the many offers of participation, it would be a real shame if this review didn't ultimately produce a Boost library that we can all stand behind. Broader collaboration in the Boost tradition seems like the best way to get there.
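
For reference, the variable-length integer format raised under item 5 could take several forms. The following is a minimal sketch of one common choice, a base-128 ("varint") encoding with seven payload bits per byte. It is an illustration only: it is not necessarily the scheme Dave Harris had in mind, it is not part of the reviewed library, the names write_varint and read_varint are hypothetical, and it handles unsigned values only.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: append `value` to `out` as a base-128 varint.
// Each byte carries 7 payload bits, least-significant group first; the
// high bit is set on every byte except the last one.
void write_varint(std::vector<std::uint8_t>& out, std::uint64_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Hypothetical helper: read one varint starting at `pos` and advance `pos`
// past it. Throws if the input ends early or the encoding exceeds 10 bytes.
std::uint64_t read_varint(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    std::uint64_t value = 0;
    for (unsigned shift = 0; shift < 64; shift += 7) {
        if (pos >= in.size()) throw std::runtime_error("truncated varint");
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;  // last byte of this integer
    }
    throw std::runtime_error("varint too long");
}
```

With this sketch, values below 128 occupy a single byte, and the byte stream does not depend on the writer's integer width or byte order, which is presumably the property that would make a separate text format less necessary for portability.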

Thanks to everyone for their participation in this review. Special, extra thanks to Robert Ramey for bringing forward his submission, which stirred up this discussion and, I hope, gave us a start in the right direction.

-- 
David Abrahams
[EMAIL PROTECTED] * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution
