Re: object-model: Wrapping Subversion C-structs in C++
Hyrum K. Wright hyrum_wright_at_mail.utexas.edu wrote:

> For the C++ folks out there, I've got a question about an approach to take on the object-model branch. At issue is how to wrap the various C structures returned to callers, particularly in a backward compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I see it, there are a few options:
>
> 1) Just wrap the pointer to the C struct as a member of the wrapper class.
>    Pros: Easy to implement; lightweight constructor.
>    Cons: Getters would need to translate to C++ types; would need to implement a copy constructor which deep copies the C struct; would also introduce pools, since creating and duplicating C structs requires them.
> 2) Wrap each C struct member individually.
>    Pros: C-C++ complexity is constrained to the constructor; everything else is C++ types.
>    Cons: Hard to extend for future compatibility.
> 3) Just pass the C-struct pointer around; don't even bother with a class.
>    Pros: Dead simple.
>    Cons: Requires more memory management thought by consumers; not C++-y enough; may introduce wrapping difficulties.
>
> I'd like to come up with something consistent, which would be used throughout the C++ bindings. I'm also interested in a solution which ensures the C++ bindings can be used as the basis for other object-oriented bindings models (Python, Perl, etc.)
>
> Thoughts?

One issue that has not been talked about in this thread is strong typing. If you remember the problems with Johan's diff / blame optimizations, the reason behind it was a confusion of type semantics. Some ints were line numbers, others were file offsets. But there was / is no formal way to tell them apart.

Since you decided to use templates in your code, I thought I would give it a try and design a simple template class that allows you to define any number of int-like types that are mutually distinct and require explicit conversion.
It would be nice to have the C++ wrappers use these types instead of plain ints etc. in their signatures.

-- Stefan^2.

    // TypedInts.cpp : Defines the entry point for the console application.
    //

    #include "stdafx.h"

    // extend that enum to define further types / kinds of integers
    enum IntegerTypes
    {
        itLineNumber,
        itFileOffset,
        itRevision
    };

    // a type selection utility struct
    // (X, Y, true)  -> X
    // (X, Y, false) -> Y
    template<class First, class Second, bool get_first>
    struct SelectType { };

    template<class First, class Second>
    struct SelectType<First, Second, true> { typedef First type; };

    template<class First, class Second>
    struct SelectType<First, Second, false> { typedef Second type; };

    // a utility struct mapping a (potentially unsigned) integer
    // to the corresponding signed integer.
    template<class T> struct DiffType { typedef T type; };
    template<> struct DiffType<unsigned char> { typedef char type; };
    template<> struct DiffType<unsigned short> { typedef short type; };
    template<> struct DiffType<unsigned> { typedef int type; };
    template<> struct DiffType<unsigned long> { typedef long type; };
    template<> struct DiffType<unsigned long long> { typedef long long type; };

    // A typed integer:
    // V .. base integer type (e.g. unsigned)
    // T .. formal classification, i.e. this actually separates the int types
    // diff_type .. if true, this is the difference type
    //              if false, this is the absolute value type
    //              TypedInt(X,Y,false) - TypedInt(X,Y,false) -> TypedInt(X,Y,true)
    //
    // The arithmetics, conversions and getters have been carefully
    // designed that only meaningful combinations of arguments are
    // valid and everything else will be rejected by the compiler.
    //
    // In optimized code, this class does not impose any runtime overhead
    // over plain use of built-in types.
    template<class V, IntegerTypes T, bool diff_type>
    class TypedInt
    {
    public:
        // encapsulated int types
        typedef V absolute_value_t;
        typedef typename DiffType<V>::type diff_value_t;
        typedef typename SelectType<diff_value_t, absolute_value_t, diff_type>::type value_t;

        // typed integers
        typedef TypedInt<V, T, false> absolute_t;
        typedef TypedInt<V, T, true> diff_t;
        typedef TypedInt<V, T, diff_type> this_t;

        // expose template parameters
        enum {type = T, is_diff = diff_type};

        // construction, auto-conversion
        TypedInt() : value() { }
        TypedInt(value_t value) : value(value) { }

        // data access
        value_t& get() { return value; }
        const value_t& get() const { return value; }
        value_t* operator&() { return &value; }
        const value_t* operator&() const { return &value; }

        // assignment
        TypedInt& operator=(value_t rhs)
Re: object-model: Wrapping Subversion C-structs in C++
Hyrum K. Wright hyrum_wri...@mail.utexas.edu:

> This could get ugly. Creating and destroying pools all over the place could get ugly, but it's a necessary evil because all of our object creation / duplication functions require a pool. An alternative would be a set of functions returning the size of the object, and then another which puts the object in a pre-allocated memory location. (These could theoretically replace the pool argument version of the API, but that'd be *a lot* of churn.) The approach would let the C++ allocate the memory (of the correct size) using whatever scheme it wants, and then do a placement initialize using the second API. If we do go this route, I'd recommend exposing these as private-to-Subversion, at least initially.

Hm... this sounds very complex and complicated.

What exactly is it you want to do? Have a thin C++ wrapper around C objects, where the C objects do the work?

Why do you want to? I.e., what benefit do you expect to get, compared to just using the C objects from inside your C++ code?

FWIW the pattern I would have looked at initially (without actually studying the problem you're trying to solve... ;-) ) is one of having refcounting smart pointers to a reference object.

The reference object would have one pointer to the underlying C object and a reference count. The constructor of a smart pointer would increase the reference count by one, and the destructor of the smart pointer would decrease it by one. When the reference count in the reference object went to 0, it would assume that it had no users, and call the C pool code to clean up its memory usage.

For creation of the reference objects you have at least two choices:
 - Use some kind of factory
 - Create the C object using the C API's creation functions in the reference object's constructor
(or perhaps a combination of the two)

It's not watertight, but people have been writing good and working applications using this pattern.
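The refcounting pattern described above can be sketched roughly as follows. This is only an illustration: fake_pool_t, fake_info_t and the fake_* functions are hypothetical stand-ins for a pool-based C API (they are not APR or Subversion calls), and std::shared_ptr plays the role of the hand-written refcounting smart pointer.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for a pool-based C API (not real APR/Subversion calls).
struct fake_info_t { long revision; const char* author; };
struct fake_pool_t { std::vector<std::unique_ptr<fake_info_t>> objects; };
static int g_pools_destroyed = 0;

inline fake_pool_t* fake_pool_create() { return new fake_pool_t(); }
inline void fake_pool_destroy(fake_pool_t* pool) { ++g_pools_destroyed; delete pool; }

// Allocates the struct "in" the pool: it is freed when the pool is destroyed.
// The author pointer is stored as-is; a real API would copy it into the pool.
inline fake_info_t* fake_info_create(fake_pool_t* pool, long rev, const char* author) {
    pool->objects.push_back(std::unique_ptr<fake_info_t>(new fake_info_t{rev, author}));
    return pool->objects.back().get();
}

// The "reference object": owns one pool and the one C struct allocated in it.
class InfoRef {
public:
    InfoRef(long rev, const char* author)
        : pool_(fake_pool_create()), info_(fake_info_create(pool_, rev, author)) {}
    ~InfoRef() { fake_pool_destroy(pool_); }  // runs when the last handle goes away
    InfoRef(const InfoRef&) = delete;
    InfoRef& operator=(const InfoRef&) = delete;
    const fake_info_t* get() const { return info_; }
private:
    fake_pool_t* pool_;
    fake_info_t* info_;
};

// The handle callers see: copying it only bumps the reference count.
class Info {
public:
    Info(long rev, const char* author) : ref_(std::make_shared<InfoRef>(rev, author)) {}
    long revision() const { return ref_->get()->revision; }
    std::string author() const { return ref_->get()->author; }
    long use_count() const { return ref_.use_count(); }
private:
    std::shared_ptr<const InfoRef> ref_;
};
```

Copies of Info share one underlying struct and one pool, so the pool is destroyed exactly once, when the last copy goes out of scope. Each such object still pays for a pool of its own.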
Re: object-model: Wrapping Subversion C-structs in C++
Branko Čibej br...@xbc.nu:

> I suggest you take a look at auto_ptr and auto_ptr_ref.

auto_ptr is very limited in that it allows only a single pointer to a single object.

> You need a similar pair of classes that will deal with pools, ...but the idea of smart pointers is the correct one here, yes.
Re: object-model: Wrapping Subversion C-structs in C++
On 25.09.2010 10:34, Steinar Bang wrote:
> Branko Čibej br...@xbc.nu:
>> I suggest you take a look at auto_ptr and auto_ptr_ref.
>
> auto_ptr is very limited in that it allows only a single pointer to a single object.

Which is why I mentioned auto_ptr_ref, which is a reference to an auto_ptr (non-owned).

-- Brane
Re: object-model: Wrapping Subversion C-structs in C++
Steinar Bang s...@dod.no:

> What exactly is it you want to do? Have a thin C++ wrapper around C objects, where the C objects do the work?
>
> Why do you want to? I.e., what benefit do you expect to get, compared to just using the C objects from inside your C++ code?

Another pattern that I've used with some success (e.g. when wrapping the W3C libwww as HTTP/FTP/etc. support in a Qt application):
 - Create a C++ class that corresponds to the C API you would like to wrap
 - Where the C API would return a pointer to some struct, return a C++ object that holds a pointer to the struct and has methods corresponding to the operations that can be done on the struct using the API
 - Life cycle management would be handled by the API and the C++ object wrapping the API (though in some cases the wrapper object destructor might hand the C pointer back to the API)

Variations on the theme:
 - The C++ wrappers have virtual methods
 - The API wrapping object returns pointers to the C++ objects rather than instances (typically for objects with virtual methods)
 - The API wrapping object returns smart pointers to the C++ objects (but as these instances will be pointer-sized objects, this doesn't seem very useful...)

The first trouble spot one will run into is non-primitive argument values and method return values. E.g.:
 - Life cycle of returned wrapper objects
 - Strings
   - Minimize copies in and out of std::string
   - Minimize recoding from UTF-8-coded char* to UTF-16, or UCS-4 in std::wstring

But they can be overcome.
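A minimal sketch of the pointer-holding wrapper from the list above, showing where the string trouble spot appears. The struct c_notify_t is a hypothetical stand-in for a struct returned by a C API, and the wrapper here does not own it; life cycle is assumed to stay with the API.

```cpp
#include <string>

// Hypothetical C-side struct, standing in for something a C API returns.
struct c_notify_t { const char* path; int action; };

class Notify {
public:
    // Holds the pointer only; the C API keeps ownership of the struct.
    explicit Notify(const c_notify_t* raw) : raw_(raw) {}

    // C strings may be NULL; map that to an empty std::string.
    // Every call copies, which is the cost the "Strings" item refers to.
    std::string path() const {
        return raw_->path ? std::string(raw_->path) : std::string();
    }
    int action() const { return raw_->action; }

    // Escape hatch for handing the struct back to the C API.
    const c_notify_t* c_ptr() const { return raw_; }

private:
    const c_notify_t* raw_;
};
```

The wrapper is pointer-sized and cheap to pass by value, but it is only valid for as long as the C struct it points to, which is the life cycle question raised above.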
Re: object-model: Wrapping Subversion C-structs in C++
On Sat, Sep 25, 2010 at 3:30 AM, Steinar Bang s...@dod.no wrote:
> Hyrum K. Wright hyrum_wri...@mail.utexas.edu:
>> This could get ugly. Creating and destroying pools all over the place could get ugly, but it's a necessary evil because all of our object creation / duplication functions require a pool. An alternative would be a set of functions returning the size of the object, and then another which puts the object in a pre-allocated memory location. (These could theoretically replace the pool argument version of the API, but that'd be *a lot* of churn.) The approach would let the C++ allocate the memory (of the correct size) using whatever scheme it wants, and then do a placement initialize using the second API. If we do go this route, I'd recommend exposing these as private-to-Subversion, at least initially.
>
> Hm... this sounds very complex and complicated.
>
> What exactly is it you want to do? Have a thin C++ wrapper around C objects, where the C objects do the work?
>
> Why do you want to? I.e., what benefit do you expect to get, compared to just using the C objects from inside your C++ code?

Returning C objects to the callers of the bindings would require said callers to worry about managing the memory allocated for those objects, thus requiring them to care about pools and their lifetimes. I want to eliminate this requirement of direct management for the callers, and provide them with objects that look like standard C++ objects (with C++ memory management paradigms, etc). Most of these objects are not functional, but rather informational in nature.

> FWIW the pattern I would have looked at initially (without actually studying the problem you're trying to solve... ;-) ) is one of having refcounting smart pointers to a reference object.
>
> The reference object would have one pointer to the underlying C object and a reference count. The constructor of a smart pointer would increase the reference count by one, and the destructor of the smart pointer would decrease it by one. When the reference count in the reference object went to 0, it would assume that it had no users, and call the C pool code to clean up its memory usage.
>
> For creation of the reference objects you have at least two choices:
>  - Use some kind of factory
>  - Create the C object using the C API's creation functions in the reference object's constructor
> (or perhaps a combination of the two)
>
> It's not watertight, but people have been writing good and working applications using this pattern.

And it's this approach that I implemented a couple of days ago. :) See
http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h
for the latest incarnation.

-Hyrum
Re: object-model: Wrapping Subversion C-structs in C++
On 24.09.2010 04:05, Hyrum K. Wright wrote:
> On Thu, Sep 23, 2010 at 2:20 PM, Branko Čibej br...@xbc.nu wrote:
>> On 22.09.2010 21:41, Hyrum K. Wright wrote:
>>> [ apologies for the somewhat stream-of-consciousness nature of these mails ]
>>>
>>> On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:
>>>> On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:
>>>>> For the C++ folks out there, I've got a question about an approach to take on the object-model branch. At issue is how to wrap the various C structures returned to callers, particularly in a backward compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I see it, there are a few options:
>>>>>
>>>>> 1) Just wrap the pointer to the C struct as a member of the wrapper class.
>>>>>    Pros: Easy to implement; lightweight constructor.
>>>>>    Cons: Getters would need to translate to C++ types; would need to implement a copy constructor which deep copies the C struct; would also introduce pools, since creating and duplicating C structs requires them.
>>>>> 2) Wrap each C struct member individually.
>>>>>    Pros: C-C++ complexity is constrained to the constructor; everything else is C++ types.
>>>>>    Cons: Hard to extend for future compatibility.
>>>>> 3) Just pass the C-struct pointer around; don't even bother with a class.
>>>>>    Pros: Dead simple.
>>>>>    Cons: Requires more memory management thought by consumers; not C++-y enough; may introduce wrapping difficulties.
>>>>>
>>>>> I'd like to come up with something consistent, which would be used throughout the C++ bindings. I'm also interested in a solution which ensures the C++ bindings can be used as the basis for other object-oriented bindings models (Python, Perl, etc.)
>>>>
>>>> After lunch, and some thought, it feels like #1 is the best solution. This doesn't change the external class interface, which is good, and can still provide C++ values to callers who want them. The pool issues are a bit messy, but at least the object can manage its own memory (albeit at a significant overhead).
>>>
>>> This could get ugly. Creating and destroying pools all over the place could get ugly, but it's a necessary evil because all of our object creation / duplication functions require a pool. An alternative would be a set of functions returning the size of the object, and then another which puts the object in a pre-allocated memory location. (These could theoretically replace the pool argument version of the API, but that'd be *a lot* of churn.) The approach would let the C++ allocate the memory (of the correct size) using whatever scheme it wants, and then do a placement initialize using the second API. If we do go this route, I'd recommend exposing these as private-to-Subversion, at least initially.
>>>
>>> The other option is just pass the C-struct pointer around everywhere, but then the bindings consumers have to worry about this exact same issue. In other words, it solves it now, but really just pushes the problem elsewhere.
>>>
>>> -Hyrum
>>
>> Memory management with pools and C++ -- keep away from doing it in per-object ctor/dtor pairs is all I can say. Lifetimes would get so messy and mucked up you could hardly believe it.
>>
>> IMHO the best way to wrap the C structures in C++ is to subclass 'em.
>>
>>     class svn_some_thing : public svn_some_thing_t { };
>>
>> This way you can pass C++ pointers directly to the C implementation, most methods become just inlined wrappers. Callbacks are a bit more hairy, but not all that much.
>
> This works for input types, but right now it's the output types (which are much more common) that I'm concerned about. Things like svn_commit_info_t, or svn_wc_notify_t. For my initial hack at it see:
> http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h?p=1000600

If you're strict about the wrapping, i.e., do not add any data members or virtual functions to the wrappers, then a simple static_cast of the returned pointer will solve the type conversion for output parameters. It makes the wrapper methods a bit more complex, but still easily inlinable.

Two things worry me about your approach:
 * Additional dereference overhead, and construction overhead ... might or might not be important performance-wise, and certainly is important bug-wise ...
 * The pool-per-object that you already mentioned, made worse by the fact that you implement operator= ... nice syntactic sugar, but has serious side effects.

The pool-per-object is especially bothersome because it creates the false sense that you don't have to worry about pool lifetime ... but you *do* because every pool has a parent, you've just hidden the fact away very obscurely. If you /do/ want to give each object a reference to its containing pool (by no means a bad thing), there are better ways, see below.

Consider too what you'll do with destructors and pool destruction order. Clearly any destructor for objects such as yours cannot be
Re: object-model: Wrapping Subversion C-structs in C++
On Fri, Sep 24, 2010 at 1:02 AM, Branko Čibej br...@xbc.nu wrote:
> On 24.09.2010 04:05, Hyrum K. Wright wrote:
>> On Thu, Sep 23, 2010 at 2:20 PM, Branko Čibej br...@xbc.nu wrote:
>>> On 22.09.2010 21:41, Hyrum K. Wright wrote:
>>>> [ apologies for the somewhat stream-of-consciousness nature of these mails ]
>>>>
>>>> On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:
>>>>> On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:
>>>>>> For the C++ folks out there, I've got a question about an approach to take on the object-model branch. At issue is how to wrap the various C structures returned to callers, particularly in a backward compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I see it, there are a few options:
>>>>>>
>>>>>> 1) Just wrap the pointer to the C struct as a member of the wrapper class.
>>>>>>    Pros: Easy to implement; lightweight constructor.
>>>>>>    Cons: Getters would need to translate to C++ types; would need to implement a copy constructor which deep copies the C struct; would also introduce pools, since creating and duplicating C structs requires them.
>>>>>> 2) Wrap each C struct member individually.
>>>>>>    Pros: C-C++ complexity is constrained to the constructor; everything else is C++ types.
>>>>>>    Cons: Hard to extend for future compatibility.
>>>>>> 3) Just pass the C-struct pointer around; don't even bother with a class.
>>>>>>    Pros: Dead simple.
>>>>>>    Cons: Requires more memory management thought by consumers; not C++-y enough; may introduce wrapping difficulties.
>>>>>>
>>>>>> I'd like to come up with something consistent, which would be used throughout the C++ bindings. I'm also interested in a solution which ensures the C++ bindings can be used as the basis for other object-oriented bindings models (Python, Perl, etc.)
>>>>>
>>>>> After lunch, and some thought, it feels like #1 is the best solution. This doesn't change the external class interface, which is good, and can still provide C++ values to callers who want them. The pool issues are a bit messy, but at least the object can manage its own memory (albeit at a significant overhead).
>>>>
>>>> This could get ugly. Creating and destroying pools all over the place could get ugly, but it's a necessary evil because all of our object creation / duplication functions require a pool. An alternative would be a set of functions returning the size of the object, and then another which puts the object in a pre-allocated memory location. (These could theoretically replace the pool argument version of the API, but that'd be *a lot* of churn.) The approach would let the C++ allocate the memory (of the correct size) using whatever scheme it wants, and then do a placement initialize using the second API. If we do go this route, I'd recommend exposing these as private-to-Subversion, at least initially.
>>>>
>>>> The other option is just pass the C-struct pointer around everywhere, but then the bindings consumers have to worry about this exact same issue. In other words, it solves it now, but really just pushes the problem elsewhere.
>>>>
>>>> -Hyrum
>>>
>>> Memory management with pools and C++ -- keep away from doing it in per-object ctor/dtor pairs is all I can say. Lifetimes would get so messy and mucked up you could hardly believe it.
>>>
>>> IMHO the best way to wrap the C structures in C++ is to subclass 'em.
>>>
>>>     class svn_some_thing : public svn_some_thing_t { };
>>>
>>> This way you can pass C++ pointers directly to the C implementation, most methods become just inlined wrappers. Callbacks are a bit more hairy, but not all that much.
>>
>> This works for input types, but right now it's the output types (which are much more common) that I'm concerned about. Things like svn_commit_info_t, or svn_wc_notify_t. For my initial hack at it see:
>> http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h?p=1000600
>
> If you're strict about the wrapping, i.e., do not add any data members or virtual functions to the wrappers, then a simple static_cast of the returned pointer will solve the type conversion for output parameters. It makes the wrapper methods a bit more complex, but still easily inlinable.

Well, except for the pool lifetime issues. The wrapped pointer would still be allocated in the result pool of the called function, but by doing a cast, and returning the wrapper to the caller, we've now hidden that fact (and given the consumer more than enough rope to hang their entire team). Hence the desire to duplicate the returned value into a pool managed outside of the one provided to the C API.

> Two things worry me about your approach:
>  * Additional dereference overhead, and construction overhead ... might or might not be important performance-wise, and certainly is important bug-wise ...
>  * The pool-per-object that you already mentioned, made worse by the fact that you implement operator= ... nice syntactic sugar, but has serious side effects.
>
> The
Re: object-model: Wrapping Subversion C-structs in C++
On 24.09.2010 18:43, Hyrum K. Wright wrote:
> All of the Pools used to hold the child objects are children of the global parent (created with NULL as the parent pool). As such, they are independent of each other, and won't have destruction order issues. It's pretty wasteful in terms of the memory overhead, but meets the goal of having each object have its own pool, and control its own lifetime independent of other objects.

I'm questioning that specific goal. What do you gain by it? I'm not convinced that it is a good idea to do this.

If you keep the C++ wrappers minimal, i.e., wrap every C structure in a thin sheet of C++ and hope for the best -- then I have a nagging suspicion that managing pool lifetimes will have to be explicit. On the other hand, you could take a more high-level approach (like JavaHL?) and not tie the object model to the current API too much. Then I could imagine, e.g., having an SVNClient object that does all the pool management. It's generally easier to deal with pools on a slightly higher level than per-object, wouldn't you say?

> Pools are certainly un-C++-like, but our APIs don't help too much either. To allocate (or duplicate) these structures, we require callers to provide a pool, instead of a generic allocation mechanism. If we did the latter, we could easily put these objects into memory managed natively by C++, but because we don't, a Pool is required. That's really the only reason to use pools in these objects.

Actually this pool-instead-of-allocator restriction comes from APR ... early in the design of APR, generic allocators were dropped in favour of pools, which then force their lifetime behaviour on all applications that use them. I believe one of the reasons for this was allocation speed, and another was that pools fit a stateless HTTP server implementation perfectly ... though they're not too handy for other kinds of applications.

-- Brane
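To make the "client object does all the pool management" idea concrete, here is a rough sketch under stated assumptions: fake_pool_t and the fake_* functions are hypothetical stand-ins for a pool-based C API (a real binding would call the APR pool functions instead), and Client::author_of is an invented method, not part of any existing binding.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for a pool-based C API.
struct fake_pool_t { std::vector<std::unique_ptr<std::string>> strings; };
static int g_live_pools = 0;

inline fake_pool_t* fake_pool_create() { ++g_live_pools; return new fake_pool_t(); }
inline void fake_pool_destroy(fake_pool_t* pool) { --g_live_pools; delete pool; }

// "C API": the returned string lives in `pool` and dies with it.
inline const char* fake_get_author(fake_pool_t* pool, long revision) {
    pool->strings.push_back(std::unique_ptr<std::string>(
        new std::string("author-of-r" + std::to_string(revision))));
    return pool->strings.back()->c_str();
}

// Minimal RAII wrapper: the destructor destroys the pool.
class Pool {
public:
    Pool() : pool_(fake_pool_create()) {}
    ~Pool() { fake_pool_destroy(pool_); }
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;
    fake_pool_t* get() const { return pool_; }
private:
    fake_pool_t* pool_;
};

// Pools are handled once per call at the client level, not once per object.
class Client {
public:
    std::string author_of(long revision) {
        Pool request_pool;  // lives only for the duration of this call
        const char* raw = fake_get_author(request_pool.get(), revision);
        return std::string(raw);  // copied out before request_pool is destroyed
    }
};
```

Callers of Client never see a pool. The price is that everything returned has to be converted to a C++ value before the request pool goes away.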
Re: object-model: Wrapping Subversion C-structs in C++
On Fri, Sep 24, 2010 at 12:04 PM, Branko Čibej br...@xbc.nu wrote:
> On 24.09.2010 18:43, Hyrum K. Wright wrote:
>> All of the Pools used to hold the child objects are children of the global parent (created with NULL as the parent pool). As such, they are independent of each other, and won't have destruction order issues. It's pretty wasteful in terms of the memory overhead, but meets the goal of having each object have its own pool, and control its own lifetime independent of other objects.
>
> I'm questioning that specific goal. What do you gain by it? I'm not convinced that it is a good idea to do this.
>
> If you keep the C++ wrappers minimal, i.e., wrap every C structure in a thin sheet of C++ and hope for the best -- then I have a nagging suspicion that managing pool lifetimes will have to be explicit. On the other hand, you could take a more high-level approach (like JavaHL?) and not tie the object model to the current API too much. Then I could imagine, e.g., having an SVNClient object that does all the pool management. It's generally easier to deal with pools on a slightly higher level than per-object, wouldn't you say?

The advantage that JavaHL has is that there is a well-defined boundary between the consumer (written in Java) and the wrappers. Returned structures (such as svn_commit_info_t) are converted to Java before the end of the call down into C++, so we can allocate everything in the per-API pool (the request pool), and not worry about lifetimes.

A C++ application using the C++ bindings doesn't have that limitation, though we could do the analogue by converting the C structs deeply into C++ objects. But this introduces forward compatibility issues in the case of growing structs -- the entire reason we use the pool-based creation functions in the first place.

-Hyrum
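A small sketch of what "converting the C structs deeply into C++ objects" could look like, and where the forward compatibility concern comes from. The struct c_commit_info_t and its fields are a hypothetical, simplified stand-in, not the real svn_commit_info_t definition; CommitInfo and to_cpp are likewise invented names.

```cpp
#include <string>

// Hypothetical, simplified C struct. Real C structs may grow new fields
// in later releases.
struct c_commit_info_t { long revision; const char* date; const char* author; };

// Plain C++ value type: no pool involved, ordinary copy semantics.
struct CommitInfo {
    long revision;
    std::string date;
    std::string author;
};

inline std::string or_empty(const char* s) { return s ? std::string(s) : std::string(); }

// Deep conversion: every field is copied into C++-owned storage.
// A field added to the C struct later is silently dropped here
// until this function and CommitInfo are updated to match.
inline CommitInfo to_cpp(const c_commit_info_t& c) {
    CommitInfo out;
    out.revision = c.revision;
    out.date = or_empty(c.date);
    out.author = or_empty(c.author);
    return out;
}
```

The converted object outlives the C struct and whatever pool it came from, which is the JavaHL-style benefit. The field-by-field copy is the part that has to track every future change to the C struct.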
Re: object-model: Wrapping Subversion C-structs in C++
On 22.09.2010 21:41, Hyrum K. Wright wrote:
> [ apologies for the somewhat stream-of-consciousness nature of these mails ]
>
> On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:
>> On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:
>>> For the C++ folks out there, I've got a question about an approach to take on the object-model branch. At issue is how to wrap the various C structures returned to callers, particularly in a backward compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I see it, there are a few options:
>>>
>>> 1) Just wrap the pointer to the C struct as a member of the wrapper class.
>>>    Pros: Easy to implement; lightweight constructor.
>>>    Cons: Getters would need to translate to C++ types; would need to implement a copy constructor which deep copies the C struct; would also introduce pools, since creating and duplicating C structs requires them.
>>> 2) Wrap each C struct member individually.
>>>    Pros: C-C++ complexity is constrained to the constructor; everything else is C++ types.
>>>    Cons: Hard to extend for future compatibility.
>>> 3) Just pass the C-struct pointer around; don't even bother with a class.
>>>    Pros: Dead simple.
>>>    Cons: Requires more memory management thought by consumers; not C++-y enough; may introduce wrapping difficulties.
>>>
>>> I'd like to come up with something consistent, which would be used throughout the C++ bindings. I'm also interested in a solution which ensures the C++ bindings can be used as the basis for other object-oriented bindings models (Python, Perl, etc.)
>>
>> After lunch, and some thought, it feels like #1 is the best solution. This doesn't change the external class interface, which is good, and can still provide C++ values to callers who want them. The pool issues are a bit messy, but at least the object can manage its own memory (albeit at a significant overhead).
>
> This could get ugly. Creating and destroying pools all over the place could get ugly, but it's a necessary evil because all of our object creation / duplication functions require a pool. An alternative would be a set of functions returning the size of the object, and then another which puts the object in a pre-allocated memory location. (These could theoretically replace the pool argument version of the API, but that'd be *a lot* of churn.) The approach would let the C++ allocate the memory (of the correct size) using whatever scheme it wants, and then do a placement initialize using the second API. If we do go this route, I'd recommend exposing these as private-to-Subversion, at least initially.
>
> The other option is just pass the C-struct pointer around everywhere, but then the bindings consumers have to worry about this exact same issue. In other words, it solves it now, but really just pushes the problem elsewhere.
>
> -Hyrum

Memory management with pools and C++ -- keep away from doing it in per-object ctor/dtor pairs is all I can say. Lifetimes would get so messy and mucked up you could hardly believe it.

IMHO the best way to wrap the C structures in C++ is to subclass 'em.

    class svn_some_thing : public svn_some_thing_t { };

This way you can pass C++ pointers directly to the C implementation, most methods become just inlined wrappers. Callbacks are a bit more hairy, but not all that much.

As to pools ... I'd once thought that a proper C++ wrapper for a pool would be a reference-counted smart pointer. That turns out to be too much overhead and too much of a good thing, since you *cannot* allow pool lifetime to depend on some reference counting order of execution. Best just use plain pool pointers, or maybe wrap them minimally so that destructors do the pool cleanup reliably.

-- Brane
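The subclassing suggestion can be illustrated with a small sketch. The struct c_thing_t and the function c_thing_size are hypothetical stand-ins for a C struct and a C function that takes a pointer to it; Thing is the thin wrapper with no extra data members and no virtual functions.

```cpp
// Hypothetical C struct and "C implementation" function.
struct c_thing_t { int kind; long size; };
inline long c_thing_size(const c_thing_t* t) { return t->size; }

// Thin wrapper: adds methods only, no data members, no virtual functions.
class Thing : public c_thing_t {
public:
    Thing(int k, long s) { kind = k; size = s; }

    // Methods are inlined forwards; `this` converts implicitly to c_thing_t*.
    bool is_big() const { return c_thing_size(this) > 1024; }
};

// Guard against someone adding state to the wrapper later.
static_assert(sizeof(Thing) == sizeof(c_thing_t), "wrapper must not add state");
```

Passing a Thing* where the C code expects a c_thing_t* is an ordinary derived-to-base conversion. Going the other way, casting a c_thing_t* that the C library allocated to Thing*, relies on the two types sharing a layout and is not guaranteed by the C++ standard, so it should be treated as an assumption to verify for each compiler.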
Re: object-model: Wrapping Subversion C-structs in C++
On Thu, Sep 23, 2010 at 2:20 PM, Branko Čibej br...@xbc.nu wrote: On 22.09.2010 21:41, Hyrum K. Wright wrote: [ apologizes for the somewhat stream-of-conscious nature of these mails ] On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote: On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote: For the C++ folks out there, I've got a question about an approach to take on the object-model branch. At issue is how to wrap the various C structures returned to callers, particularly in a backward compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I see it, there are a few options: 1) Just wrap the pointer to the C struct as a member of the wrapper class. Pros: Easy to implement; lightweight constructor. Cons: Getters would need to translate to C++ types; would need to implement a copy constructor which deep copies the C struct; would also introduce pools, since creating and duplicating C structs requires them. 2) Wrap each C struct member individually Pros: C-C++ complexity is constrained to the constructor, everything else is C++ types Cons: Hard to extend for future compatibility 3) Just pass the C-struct pointer around; don't even bother with a class Pros: Dead simple. Cons: Requires more memory management thought by consumers; not C++-y enough; may introduce wrapping difficulties. I'd like to come up with something consistent, which would be used throughout the C++ bindings. I'm also interested in a solution which ensures the C++ bindings can be used as the basis for other object-oriented bindings models (Python, Perl, etc.) After lunch, and some thought, it feels like #1 is the best solution. This doesn't change the external class interface, which is good, and can still provide C++ values to callers who want them. The pool issues are a bit messy, but at least the object can manage it's own memory (albeit at a significant overhead). This could get ugly. 
Creating and destroying pools all over the place could get ugly, but it's necessary evil because all of our object creation / duplication functions all require a pool. An alternative would be a set of functions returning the size of the object, and then another which puts the object in a pre-allocated memory location. (These could theoretically replace the pool argument version of the API, but that'd be *a lot* of churn.) The approach would let the C++ allocate the memory (of the correct size) using whatever scheme it wants, and then do a placement initialize using the second API. If we do go this route, I'd recommend exposing these as private-to-Subversion, at least initially. The other option is just pass the C-struct pointer around everywhere, but then the bindings consumers have to work about this exact same issue. In other words, it solves it now, but really just pushes the problem elsewhere. -Hyrum Memory management with pools and C++ -- keep away from doing it in per-object ctor/dtor pairs is all I can say. Lifetimes would get so messy and mucked up you could hardly believe it. IMHO the best way to wrap the C structures in C++ is to subclass 'em. class svn_some_thing : public svn_some_thing_t { }; This way you can pass C++ pointers directly to the C implementation, most methods become just inlined wrappers. Callbacks are a bit more hairy, but not all that much. This works for input types, but right now it's the output types (which are much more common) that I'm concerned about. Things like svn_commit_info_t, or svn_wc_notify_t. For my initial hack at it see: http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h?p=1000600 As to pools ... I'd once thought that a proper C++ wrapper for a pool would be a reference-counted smart pointer. That turns out to be too much overhead and too much of a good thing, since you *cannot* allow pool lifetime to depend on some reference counting order of execution. 
Best just use plain pool pointers, or maybe wrap them minimally so that destructors do the pool cleanup reliably.

Yeah, we've got a Pool class which destroys the underlying APR pool in the dtor, so that provides a handy way to clean things up (if it weren't for the 8k overhead of the minimum pool size). To the initial hack, I've added a smart pointer to prevent multiple redundant copies of the underlying C struct, since I expect it to be read-only (we could probably add an explicit Clone() method if folks really wanted a deep copy). The smart pointer magic is here: http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h

There's still a self-managed pool for allocating the C-struct, but it's encapsulated enough to be changeable should we have a better way of duplicating objects in the future.

Thanks for the insight, by the way. These classes and paradigms are in no way set
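
[ Illustration of the message above. ] The idea of sharing one read-only copy of the C struct through a smart pointer, with an explicit Clone()-style method for a real deep copy, can be sketched roughly as follows. This is not the code in Types.h: demo_notify_t is a hypothetical stand-in for a struct like svn_wc_notify_t, std::shared_ptr is assumed in place of whatever smart pointer the branch uses, and a std::string takes the place of the self-managed pool.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a read-only C output struct such as
// svn_wc_notify_t; the real struct has many more fields.
struct demo_notify_t {
    const char *path;
    int action;
};

class SharedNotify {
public:
    // Deep-copies the C struct once, at construction time.
    explicit SharedNotify(const demo_notify_t *src) : data_(deep_copy(src)) {}

    // Copy construction and assignment are the compiler-generated ones:
    // they only share the existing storage, no second deep copy is made.

    // Explicit deep copy for callers who really want an independent object.
    SharedNotify clone() const { return SharedNotify(&data_->c); }

    // Getters translate to C++ types.
    std::string path() const { return data_->path; }
    int action() const { return data_->c.action; }

    // For passing back into C code; valid while any object sharing this
    // storage is alive.
    const demo_notify_t *c_struct() const { return &data_->c; }

    bool shares_storage_with(const SharedNotify &other) const {
        return data_ == other.data_;
    }

private:
    // Owns the string the C struct points at (assumption: ordinary C++
    // ownership instead of a per-object pool).
    struct Storage {
        std::string path;
        demo_notify_t c;
    };

    static std::shared_ptr<const Storage> deep_copy(const demo_notify_t *src) {
        assert(src != nullptr);
        std::shared_ptr<Storage> s = std::make_shared<Storage>();
        s->path = src->path ? src->path : "";
        s->c = *src;
        s->c.path = s->path.c_str();  // point at our own copy of the string
        return s;
    }

    std::shared_ptr<const Storage> data_;
};
```

Because the shared storage is const, copies can be handed out freely without one holder changing what another sees; only clone() pays for a second allocation.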
Re: object-model: Wrapping Subversion C-structs in C++
On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:

For the C++ folks out there, I've got a question about an approach to take on the object-model branch. At issue is how to wrap the various C structures returned to callers, particularly in a backward compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I see it, there are a few options:

1) Just wrap the pointer to the C struct as a member of the wrapper class.
   Pros: Easy to implement; lightweight constructor.
   Cons: Getters would need to translate to C++ types; would need to implement a copy constructor which deep copies the C struct; would also introduce pools, since creating and duplicating C structs requires them.

2) Wrap each C struct member individually.
   Pros: C-C++ complexity is constrained to the constructor; everything else is C++ types.
   Cons: Hard to extend for future compatibility.

3) Just pass the C-struct pointer around; don't even bother with a class.
   Pros: Dead simple.
   Cons: Requires more memory management thought by consumers; not C++-y enough; may introduce wrapping difficulties.

I'd like to come up with something consistent, which would be used throughout the C++ bindings. I'm also interested in a solution which ensures the C++ bindings can be used as the basis for other object-oriented bindings models (Python, Perl, etc.).

After lunch, and some thought, it feels like #1 is the best solution. This doesn't change the external class interface, which is good, and can still provide C++ values to callers who want them. The pool issues are a bit messy, but at least the object can manage its own memory (albeit at a significant overhead).

-Hyrum
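
[ Illustration of option #1. ] A minimal sketch of what option #1 implies, under stated assumptions: ToyPool is a hypothetical stand-in for an APR pool (it is not the APR API), and toy_commit_info_t and toy_commit_info_dup are invented stand-ins for a struct like svn_commit_info_t and its pool-based duplication function. Each wrapper owns its own pool, the copy constructor deep copies into a fresh pool, and the getters translate to C++ types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <string>
#include <vector>

// Hypothetical stand-in for an APR pool: frees every allocation at once
// when the pool object is destroyed.
class ToyPool {
public:
    ToyPool() = default;
    ToyPool(const ToyPool &) = delete;
    ToyPool &operator=(const ToyPool &) = delete;
    ~ToyPool() { for (void *p : blocks_) std::free(p); }

    void *alloc(std::size_t n) {
        blocks_.push_back(nullptr);  // reserve the bookkeeping slot first
        void *p = std::malloc(n);
        if (p == nullptr) throw std::bad_alloc();
        blocks_.back() = p;
        return p;
    }

private:
    std::vector<void *> blocks_;
};

// Hypothetical stand-in for a C output struct such as svn_commit_info_t.
struct toy_commit_info_t {
    long revision;
    const char *author;
};

// Hypothetical pool-based deep copy, in the style of the C dup functions.
toy_commit_info_t *toy_commit_info_dup(const toy_commit_info_t *src,
                                       ToyPool &pool) {
    assert(src != nullptr);
    toy_commit_info_t *copy = new (pool.alloc(sizeof(toy_commit_info_t)))
        toy_commit_info_t{src->revision, nullptr};
    if (src->author != nullptr) {
        std::size_t len = std::strlen(src->author) + 1;
        char *author = static_cast<char *>(pool.alloc(len));
        std::memcpy(author, src->author, len);
        copy->author = author;
    }
    return copy;
}

class CommitInfo {
public:
    explicit CommitInfo(const toy_commit_info_t *src)
        : info_(toy_commit_info_dup(src, pool_)) {}

    // Deep copy into this object's own, freshly created pool.
    CommitInfo(const CommitInfo &other)
        : info_(toy_commit_info_dup(other.info_, pool_)) {}

    CommitInfo &operator=(const CommitInfo &) = delete;  // omitted for brevity

    // Getters translate C types to C++ types.
    long revision() const { return info_->revision; }
    std::string author() const { return info_->author ? info_->author : ""; }

    const toy_commit_info_t *c_struct() const { return info_; }

private:
    ToyPool pool_;             // declared first, so it is constructed first
    toy_commit_info_t *info_;  // lives in pool_, freed with it
};
```

The per-object pool is what makes the copy constructor self-contained, and it is also where the "significant overhead" comes from when every small object carries one.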
Re: object-model: Wrapping Subversion C-structs in C++
[ apologizes for the somewhat stream-of-consciousness nature of these mails ]

On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:

On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright hyrum_wri...@mail.utexas.edu wrote:

For the C++ folks out there, I've got a question about an approach to take on the object-model branch. At issue is how to wrap the various C structures returned to callers, particularly in a backward compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I see it, there are a few options:

1) Just wrap the pointer to the C struct as a member of the wrapper class.
   Pros: Easy to implement; lightweight constructor.
   Cons: Getters would need to translate to C++ types; would need to implement a copy constructor which deep copies the C struct; would also introduce pools, since creating and duplicating C structs requires them.

2) Wrap each C struct member individually.
   Pros: C-C++ complexity is constrained to the constructor; everything else is C++ types.
   Cons: Hard to extend for future compatibility.

3) Just pass the C-struct pointer around; don't even bother with a class.
   Pros: Dead simple.
   Cons: Requires more memory management thought by consumers; not C++-y enough; may introduce wrapping difficulties.

I'd like to come up with something consistent, which would be used throughout the C++ bindings. I'm also interested in a solution which ensures the C++ bindings can be used as the basis for other object-oriented bindings models (Python, Perl, etc.).

After lunch, and some thought, it feels like #1 is the best solution. This doesn't change the external class interface, which is good, and can still provide C++ values to callers who want them. The pool issues are a bit messy, but at least the object can manage its own memory (albeit at a significant overhead).

This could get ugly.
Creating and destroying pools all over the place could get ugly, but it's a necessary evil because our object creation / duplication functions all require a pool. An alternative would be a pair of functions: one returning the size of the object, and another which puts the object in a pre-allocated memory location. (These could theoretically replace the pool-argument version of the API, but that'd be *a lot* of churn.) This approach would let the C++ side allocate the memory (of the correct size) using whatever scheme it wants, and then do a placement initialization using the second API. If we do go this route, I'd recommend exposing these as private-to-Subversion, at least initially.

The other option is to just pass the C-struct pointer around everywhere, but then the bindings consumers have to worry about this exact same issue. In other words, it solves the problem now, but really just pushes it elsewhere.

-Hyrum
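
[ Illustration of the size-plus-placement alternative. ] The two-function scheme described above might look something like the following sketch. Nothing here exists in Subversion: toy_notify_t, toy_notify_dup_size and toy_notify_dup_into are hypothetical names invented for illustration. The first function reports how many bytes a deep copy needs; the second builds the copy inside memory the caller allocated, so the C++ side needs no pool at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <string>

// Hypothetical stand-in for a C output struct such as svn_wc_notify_t.
struct toy_notify_t {
    int action;
    const char *path;
};

// Hypothetical API, part 1: bytes needed for a deep copy of *src
// (the struct itself plus its path string, stored right behind it).
std::size_t toy_notify_dup_size(const toy_notify_t *src) {
    assert(src != nullptr);
    return sizeof(toy_notify_t) +
           (src->path ? std::strlen(src->path) + 1 : 0);
}

// Hypothetical API, part 2: deep copy *src into caller-provided memory of
// at least toy_notify_dup_size(src) bytes, suitably aligned for the struct.
toy_notify_t *toy_notify_dup_into(void *mem, const toy_notify_t *src) {
    assert(mem != nullptr && src != nullptr);
    toy_notify_t *copy = new (mem) toy_notify_t{src->action, nullptr};
    if (src->path != nullptr) {
        char *tail = static_cast<char *>(mem) + sizeof(toy_notify_t);
        std::memcpy(tail, src->path, std::strlen(src->path) + 1);
        copy->path = tail;
    }
    return copy;
}

// The C++ wrapper picks its own allocation scheme; here a plain byte array.
class PlacedNotify {
public:
    explicit PlacedNotify(const toy_notify_t *src)
        : buf_(new unsigned char[toy_notify_dup_size(src)]),
          c_(toy_notify_dup_into(buf_.get(), src)) {}

    PlacedNotify(const PlacedNotify &other)
        : buf_(new unsigned char[toy_notify_dup_size(other.c_)]),
          c_(toy_notify_dup_into(buf_.get(), other.c_)) {}

    PlacedNotify &operator=(const PlacedNotify &) = delete;  // omitted

    int action() const { return c_->action; }
    std::string path() const { return c_->path ? c_->path : ""; }
    const toy_notify_t *c_struct() const { return c_; }

private:
    std::unique_ptr<unsigned char[]> buf_;  // declared first: owns the bytes
    toy_notify_t *c_;                       // points into buf_
};
```

A single allocation of exactly the right size replaces the per-object pool, at the cost of adding a second family of functions to the C API.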