Re: object-model: Wrapping Subversion C-structs in C++

2010-09-26 Thread Stefan Fuhrmann
 Hyrum K. Wright hyrum_wright_at_mail.utexas.edu wrote:



For the C++ folks out there, I've got a question about an approach to
take on the object-model branch. At issue is how to wrap the various
C structures returned to callers, particularly in a backward
compatible manner. Currently, I'm looking at svn_wc_notify_t *. As I
see it, there are a few options:

1) Just wrap the pointer to the C struct as a member of the wrapper 
class.

Pros: Easy to implement; lightweight constructor.
Cons: Getters would need to translate to C++ types; would need to
implement a copy constructor which deep copies the C struct; would
also introduce pools, since creating and duplicating C structs
requires them.

2) Wrap each C struct member individually
Pros: C-C++ complexity is constrained to the constructor,
everything else is C++ types
Cons: Hard to extend for future compatibility

3) Just pass the C-struct pointer around; don't even bother with a class
Pros: Dead simple.
Cons: Requires more memory management thought by consumers; not
C++-y enough; may introduce wrapping difficulties.

I'd like to come up with something consistent, which would be used
throughout the C++ bindings. I'm also interested in a solution which
ensures the C++ bindings can be used as the basis for other
object-oriented bindings models (Python, Perl, etc.)

Thoughts? 


One issue that has not been talked about in this thread
is strong typing. If you remember the problems with
Johan's diff / blame optimizations, the reason behind
it was a confusion of type semantics. Some ints were
line numbers, others were file offsets. But there was / is
no formal way to tell them apart.

Since you decided to use templates in your code, I
thought I would give it a try and design a simple template
class that allows you to define any number of int-like
types that are mutually distinct and require explicit
conversion.

It would be nice to have the C++ wrappers use these
types instead of plain ints etc. in their signatures.
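
A minimal sketch of the idea, independent of the attached code: it uses
empty tag structs instead of an enum and leaves out the difference-type
machinery, and unlike the attached class its constructor is explicit.
The names StrongInt, LineNumberTag, FileOffsetTag and next_line are
made up for illustration.

```cpp
#include <cassert>

// Hypothetical tag types; each tag yields an unrelated integer type.
struct LineNumberTag {};
struct FileOffsetTag {};

template <class Tag, class V = long>
class StrongInt
{
public:
    // explicit: a plain integer never converts silently
    explicit StrongInt(V v = V()) : value_(v) {}

    V get() const { return value_; }

    // Only values carrying the same tag can be combined or compared.
    StrongInt operator+(StrongInt rhs) const
    {
        return StrongInt(value_ + rhs.value_);
    }
    bool operator==(StrongInt rhs) const { return value_ == rhs.value_; }

private:
    V value_;
};

typedef StrongInt<LineNumberTag> LineNumber;
typedef StrongInt<FileOffsetTag> FileOffset;

// A signature that can no longer be called with a file offset by mistake.
inline LineNumber next_line(LineNumber current)
{
    return current + LineNumber(1);
}
```

With these definitions, next_line(FileOffset(10)) does not compile,
because FileOffset and LineNumber are unrelated types.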

-- Stefan^2.

// TypedInts.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"

// extend that enum to define further types / kinds of integers

enum IntegerTypes
{
itLineNumber,
itFileOffset,
itRevision
};

// a type selection utility struct
//   (X, Y, true)  -> X
//   (X, Y, false) -> Y

template<class First, class Second, bool get_first>
struct SelectType
{
};

template<class First, class Second>
struct SelectType<First, Second, true>
{
    typedef First type;
};

template<class First, class Second>
struct SelectType<First, Second, false>
{
    typedef Second type;
};

// a utility struct mapping a (potentially unsigned) integer
// to the corresponding signed integer.

template<class T>
struct DiffType
{
    typedef T type;
};

template<>
struct DiffType<unsigned char>
{
    typedef signed char type;
};

template<>
struct DiffType<unsigned short>
{
    typedef short type;
};

template<>
struct DiffType<unsigned>
{
    typedef int type;
};

template<>
struct DiffType<unsigned long>
{
    typedef long type;
};

template<>
struct DiffType<unsigned long long>
{
    typedef long long type;
};

// A typed integer:
//   V .. base integer type (e.g. unsigned)
//   T .. formal classification, i.e. this actually separates the int types
//   diff_type .. if true, this is the difference type
//                if false, this is the absolute value type
//                TypedInt(X,Y,false) - TypedInt(X,Y,false) -> TypedInt(X,Y,true)
//
// The arithmetic, conversions and getters have been carefully
// designed so that only meaningful combinations of arguments are
// valid and everything else will be rejected by the compiler.
//
// In optimized code, this class does not impose any runtime overhead 
// over plain use of built-in types.

template<class V, IntegerTypes T, bool diff_type>
class TypedInt
{
public:

    // encapsulated int types

    typedef V absolute_value_t;
    typedef typename DiffType<V>::type diff_value_t;
    typedef typename SelectType<diff_value_t,
                                absolute_value_t,
                                diff_type>::type value_t;

    // typed integers

    typedef TypedInt<V, T, false> absolute_t;
    typedef TypedInt<V, T, true> diff_t;
    typedef TypedInt<V, T, diff_type> this_t;

    // expose template parameters

    enum {type = T, is_diff = diff_type};

    // construction, auto-conversion

    TypedInt()
        : value()
    {
    }

    TypedInt(value_t value)
        : value(value)
    {
    }

    // data access

    value_t& get()
    {
        return value;
    }

    const value_t& get() const
    {
        return value;
    }

    value_t* operator&()
    {
        return &value;
    }

    const value_t* operator&() const
    {
        return &value;
    }

    // assignment

    TypedInt& operator=(value_t rhs)

Re: object-model: Wrapping Subversion C-structs in C++

2010-09-25 Thread Steinar Bang
 Hyrum K. Wright hyrum_wri...@mail.utexas.edu:

 This could get ugly.

 Creating and destroying pools all over the place could get ugly, but
 it's a necessary evil because our object creation / duplication
 functions all require a pool.  An alternative would be a set of
 functions returning the size of the object, and then another which
 puts the object in a pre-allocated memory location.  (These could
 theoretically replace the pool argument version of the API, but that'd
 be *a lot* of churn.)

 The approach would let the C++ allocate the memory (of the correct
 size) using whatever scheme it wants, and then do a placement
 initialize using the second API.  If we do go this route, I'd
 recommend exposing these as private-to-Subversion, at least initially.
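
A rough sketch of what such a size-plus-initialize pair might look like
from the C++ side. thing_t, thing_size() and thing_init() are
hypothetical stand-ins, not existing Subversion functions; the point is
only that the C++ wrapper chooses the allocator and the C side fills in
memory it was handed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Stand-in for a C struct whose size may grow in later releases.
struct thing_t { long revision; int kind; };

// Hypothetical C-style API pair: one call reports the size, the
// other initializes memory owned by the caller.
std::size_t thing_size() { return sizeof(thing_t); }

thing_t *thing_init(void *mem, long revision)
{
    thing_t *t = static_cast<thing_t *>(mem);
    t->revision = revision;
    t->kind = 0;
    return t;
}

// C++ wrapper: allocates with operator new, no pool involved.
class Thing
{
public:
    explicit Thing(long revision)
        : storage_(::operator new(thing_size())),
          ptr_(thing_init(storage_, revision))
    {
    }

    ~Thing() { ::operator delete(storage_); }

    long revision() const { return ptr_->revision; }

private:
    Thing(const Thing &);             // copying left out of this sketch
    Thing &operator=(const Thing &);

    void *storage_;   // declared first so it is initialized before ptr_
    thing_t *ptr_;
};
```

Because the wrapper asks the library for the size at run time, a newer
library with a larger struct would still get enough memory.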

Hm... this sounds very complex and complicated.

What exactly is it you want to do?  Have a thin C++ wrapper around C
objects, where the C objects do the work?  Why do you want to?
Ie. what benefit do you expect to get, compared to just using the C
objects from inside your C++ code?

FWIW the pattern I would have looked at initially (without actually
studying the problem you're trying to solve... ;-) ) is one of having
refcounting smart pointers to a reference object.  The reference object
would have one pointer to the underlying C object and a reference count.

The constructor of a smart pointer would up the reference count by one,
and the destructor of the smart pointer would decrease the reference
count by one.  When the reference counter in the reference object went
to 0, it would assume that it had no users, and call the C pool code to
clean up its memory usage.

For creation of the reference objects you have at least two choices:
 - Use some kind of factory
 - Create the C object using the C APIs creation functions in the
   reference object's constructor
(or perhaps a combination of the two)

It's not watertight, but people have been writing good and working
applications, using this pattern.
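
The pattern described above might be sketched roughly as follows.
c_object, c_object_create() and c_object_destroy() are stand-ins for a
pool-allocated C object and its cleanup (in the Subversion case the
cleanup would presumably destroy the pool holding the struct); Handle
and RefObject are made-up names. The sketch is not thread-safe.

```cpp
#include <cassert>

// Stand-in for a pool-allocated C object; live_objects lets the
// sketch observe when cleanup runs.
static int live_objects = 0;

struct c_object { int payload; };

c_object *c_object_create(int payload)
{
    c_object *obj = new c_object;
    obj->payload = payload;
    ++live_objects;
    return obj;
}

void c_object_destroy(c_object *obj)
{
    delete obj;
    --live_objects;
}

// The "reference object": one pointer to the C object plus a count.
struct RefObject
{
    explicit RefObject(c_object *o) : obj(o), count(0) {}
    ~RefObject() { c_object_destroy(obj); }

    c_object *obj;
    int count;
};

// The smart pointer: copies share one RefObject.
class Handle
{
public:
    explicit Handle(int payload)
        : ref_(new RefObject(c_object_create(payload)))
    {
        ++ref_->count;
    }

    Handle(const Handle &other) : ref_(other.ref_) { ++ref_->count; }

    Handle &operator=(const Handle &other)
    {
        ++other.ref_->count;   // increment first so self-assignment is safe
        release();
        ref_ = other.ref_;
        return *this;
    }

    ~Handle() { release(); }

    int payload() const { return ref_->obj->payload; }

private:
    // Drop one reference; the last one out cleans up the C object.
    void release()
    {
        if (--ref_->count == 0)
            delete ref_;
    }

    RefObject *ref_;
};
```

Copies of a Handle are cheap, and the C object is cleaned up exactly
once, when the last copy goes away.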



Re: object-model: Wrapping Subversion C-structs in C++

2010-09-25 Thread Steinar Bang
 Branko Čibej br...@xbc.nu:

 I suggest you take a look at auto_ptr and auto_ptr_ref.

auto_ptr is very limited in that it allows only a single pointer to a
single object.  

 You need a similar pair of classes that will deal with pools,

...but the idea of smart pointers is the correct one here, yes.



Re: object-model: Wrapping Subversion C-structs in C++

2010-09-25 Thread Branko Čibej
 On 25.09.2010 10:34, Steinar Bang wrote:
 Branko Čibej br...@xbc.nu:
 I suggest you take a look at auto_ptr and auto_ptr_ref.
 auto_ptr is very limited in that it allows only a single pointer to a
 single object.

Which is why I mentioned auto_ptr_ref, which is a reference to an
auto_ptr (non-owned).

-- Brane



Re: object-model: Wrapping Subversion C-structs in C++

2010-09-25 Thread Steinar Bang
 Steinar Bang s...@dod.no:

 What exactly is it you want to do?  Have a thin C++ wrapper around C
 objects, where the C objects do the work?  Why do you want to?
 Ie. what benefit do you expect to get, compared to just using the C
 objects from inside your C++ code?

Another pattern that I've used with some success (eg. when wrapping the
W3C libwww as HTTP/FTP/etc. support in a Qt application):

 - Create a C++ class that corresponds to the C API you would like to
   wrap 
 - Where the C API would return a pointer to some struct, return a C++
   object that holds a pointer to the struct and has methods
   corresponding to the operations that can be done on the struct using
   the API
 - Life cycle management would be handled by the API and the C++ object
   wrapping the API (though in some cases the wrapper object destructor
   might hand the C pointer back to the API)
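
As an illustration of the list above, here is a small sketch in which
the object wrapping the API owns every struct it hands out, and the
returned C++ objects are cheap, non-owning views. The c_request
functions are stand-ins for a C API, and Api and Request are made-up
names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in C API (hypothetical).
struct c_request { int status; };

c_request *c_request_create(int status)
{
    c_request *r = new c_request;
    r->status = status;
    return r;
}

void c_request_free(c_request *r) { delete r; }

// C++ object holding a pointer to the struct; its methods mirror the
// operations the C API offers on that struct.
class Request
{
public:
    explicit Request(c_request *r) : r_(r) {}

    int status() const { return r_->status; }
    bool ok() const { return r_->status == 200; }

private:
    c_request *r_;   // not owned; the Api object controls its lifetime
};

// C++ class corresponding to the C API; handles life cycle management.
class Api
{
public:
    Api() {}

    ~Api()
    {
        // hand every C pointer back to the API
        for (std::size_t i = 0; i < owned_.size(); ++i)
            c_request_free(owned_[i]);
    }

    Request fetch(int status)
    {
        owned_.push_back(c_request_create(status));
        return Request(owned_.back());
    }

private:
    Api(const Api &);
    Api &operator=(const Api &);

    std::vector<c_request *> owned_;
};
```

A Request must not outlive the Api that produced it; that is the price
of keeping the returned objects pointer-sized.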

Variations on the theme:
 - The C++ wrappers have virtual methods
 - The API wrapping object returns pointers to the C++ objects rather
   than instances (typically for objects with virtual methods)
 - The API wrapping object returns smart pointers to the C++ objects (but
   as these instances will be pointer sized objects, this doesn't seem
   very useful...)

The first trouble spot one will run into is non-primitive argument
values and method return values.  Eg.
 - Life cycle of returned wrapper objects
 - Strings
  - Minimize copies in and out of std::string
  - Minimize recoding from UTF-8-encoded char* to UTF-16 or UCS-4 in
    std::wstring

But these can be overcome.



Re: object-model: Wrapping Subversion C-structs in C++

2010-09-25 Thread Hyrum K. Wright
On Sat, Sep 25, 2010 at 3:30 AM, Steinar Bang s...@dod.no wrote:
 Hyrum K. Wright hyrum_wri...@mail.utexas.edu:

 This could get ugly.

 Creating and destroying pools all over the place could get ugly, but
 it's a necessary evil because our object creation / duplication
 functions all require a pool.  An alternative would be a set of
 functions returning the size of the object, and then another which
 puts the object in a pre-allocated memory location.  (These could
 theoretically replace the pool argument version of the API, but that'd
 be *a lot* of churn.)

 The approach would let the C++ allocate the memory (of the correct
 size) using whatever scheme it wants, and then do a placement
 initialize using the second API.  If we do go this route, I'd
 recommend exposing these as private-to-Subversion, at least initially.

 Hm... this sounds very complex and complicated.

 What exactly is it you want to do?  Have a thin C++ wrapper around C
 objects, where the C objects do the work?  Why do you want to?
 Ie. what benefit do you expect to get, compared to just using the C
 objects from inside your C++ code?

Returning C objects to the callers of the bindings would require said
callers to worry about managing the memory allocated for those
objects, thus requiring them to care about pools and their lifetimes.
I want to eliminate this requirement of direct management for the
callers, and provide them with objects that look like standard C++
objects (with C++ memory management paradigms, etc).  Most of these
objects are not functional, but rather informational in nature.

 FWIW the pattern I would have looked at initially (without actually
 studying the problem you're trying to solve... ;-) ) is one of having
 refcounting smart pointers to a reference object.  The reference object
 would have one pointer to the underlying C object and a reference count.

 The constructor of a smart pointer would up the reference count by one,
 and the destructor of the smart pointer would decrease the reference
 count by one.  When the reference counter in the reference object went
 to 0, it would assume that it had no users, and call the C pool code to
 clean up its memory usage.

 For creation of the reference objects you have at least two choices:
  - Use some kind of factory
  - Create the C object using the C APIs creation functions in the
   reference object's constructor
 (or perhaps a combination of the two)

 It's not watertight, but people have been writing good and working
 applications, using this pattern.

And it's this approach that I implemented a couple of days ago.  :)
See 
http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h
for the latest incarnation.

-Hyrum


Re: object-model: Wrapping Subversion C-structs in C++

2010-09-24 Thread Branko Čibej
 On 24.09.2010 04:05, Hyrum K. Wright wrote:
 On Thu, Sep 23, 2010 at 2:20 PM, Branko Čibej br...@xbc.nu wrote:
  On 22.09.2010 21:41, Hyrum K. Wright wrote:
 [ apologies for the somewhat stream-of-consciousness nature of these mails ]

 On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 For the C++ folks out there, I've got a question about an approach to
 take on the object-model branch.  At issue is how to wrap the various
 C structures returned to callers, particularly in a backward
 compatible manner.  Currently, I'm looking at svn_wc_notify_t *.  As I
 see it, there are a few options:

 1) Just wrap the pointer to the C struct as a member of the wrapper class.
Pros: Easy to implement; lightweight constructor.
Cons: Getters would need to translate to C++ types; would need to
 implement a copy constructor which deep copies the C struct; would
 also introduce pools, since creating and duplicating C structs
 requires them.

 2) Wrap each C struct member individually
Pros: C-C++ complexity is constrained to the constructor,
 everything else is C++ types
Cons: Hard to extend for future compatibility

 3) Just pass the C-struct pointer around; don't even bother with a class
Pros: Dead simple.
Cons: Requires more memory management thought by consumers; not
 C++-y enough; may introduce wrapping difficulties.

 I'd like to come up with something consistent, which would be used
 throughout the C++ bindings.  I'm also interested in a solution which
 ensures the C++ bindings can be used as the basis for other
 object-oriented bindings models (Python, Perl, etc.)
 After lunch, and some thought, it feels like #1 is the best solution.
 This doesn't change the external class interface, which is good, and
 can still provide C++ values to callers who want them.  The pool
 issues are a bit messy, but at least the object can manage its own
 memory (albeit at a significant overhead).
 This could get ugly.

 Creating and destroying pools all over the place could get ugly, but
 it's a necessary evil because our object creation / duplication
 functions all require a pool.  An alternative would be a set of
 functions returning the size of the object, and then another which
 puts the object in a pre-allocated memory location.  (These could
 theoretically replace the pool argument version of the API, but that'd
 be *a lot* of churn.)

 The approach would let the C++ allocate the memory (of the correct
 size) using whatever scheme it wants, and then do a placement
 initialize using the second API.  If we do go this route, I'd
 recommend exposing these as private-to-Subversion, at least initially.

 The other option is just pass the C-struct pointer around everywhere,
 but then the bindings consumers have to worry about this exact same
 issue.  In other words, it solves it now, but really just pushes the
 problem elsewhere.

 -Hyrum
 Memory management with pools and C++ -- keep away from doing it in
 per-object ctor/dtor pairs is all I can say. Lifetimes would get so
 messy and mucked up you could hardly believe it.

 IMHO the best way to wrap the C structures in C++ is to subclass 'em.

 class svn_some_thing : public svn_some_thing_t {  };

 This way you can pass C++ pointers directly to the C implementation,
 most methods become just inlined wrappers. Callbacks are a bit more
 hairy, but not all that much.
 This works for input types, but right now it's the output types (which
 are much more common) that I'm concerned about.  Things like
 svn_commit_info_t, or svn_wc_notify_t.  For my initial hack at it see:
 http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h?p=1000600

If you're strict about the wrapping, i.e., do not add any data members
or virtual functions to the wrappers, then a simple static_cast of the
returned pointer will solve the type conversion for output parameters.
It makes the wrapper methods a bit more complex, but still easily inlinable.
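
A sketch of the cast on an output parameter, with stand-in names
(some_thing_t, c_get_thing). One caveat that is an observation about
the language rather than something from this thread: formally the C++
standard leaves using the result of such a downcast undefined when the
object was not created as the derived type, so this relies on the two
types having identical layout, which holds on mainstream compilers as
long as the wrapper adds no data members or virtual functions. The
cast also does nothing about the lifetime of the returned struct.

```cpp
#include <cassert>

// Stand-in for a C struct such as svn_commit_info_t (fields are
// illustrative only).
struct some_thing_t { long revision; const char *author; };

// Stand-in C function returning a struct the caller does not own.
some_thing_t *c_get_thing()
{
    static some_thing_t thing = { 42, "jrandom" };
    return &thing;
}

// Strict wrapper: no data members, no virtual functions, only
// inline accessors.
class SomeThing : public some_thing_t
{
public:
    long get_revision() const { return revision; }
    bool has_author() const { return author != 0; }
};

// Wrapper for the C call: converts the output pointer with a cast.
inline SomeThing *get_thing()
{
    return static_cast<SomeThing *>(c_get_thing());
}
```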

Two things worry me about your approach:

* Additional dereference overhead, and construction overhead ...
  might or might not be important performance-wise, and certainly is
  important bug-wise ...
* The pool-per-object that you already mentioned, made worse by the
  fact that you implement operator= ... nice syntactic sugar, but
  has serious side effects.

The pool-per-object is especially bothersome because it creates the
false sense that you don't have to worry about pool lifetime ... but you
*do* because every pool has a parent, you've just hidden the fact away
very obscurely. If you /do/ want to give each object a reference to its
containing pool (by no means a bad thing), there are better ways, see below.

Consider too what you'll do with destructors and pool destruction order.
Clearly any destructor for objects such as yours cannot be 

Re: object-model: Wrapping Subversion C-structs in C++

2010-09-24 Thread Hyrum K. Wright
On Fri, Sep 24, 2010 at 1:02 AM, Branko Čibej br...@xbc.nu wrote:
  On 24.09.2010 04:05, Hyrum K. Wright wrote:
 On Thu, Sep 23, 2010 at 2:20 PM, Branko Čibej br...@xbc.nu wrote:
  On 22.09.2010 21:41, Hyrum K. Wright wrote:
 [ apologies for the somewhat stream-of-consciousness nature of these mails ]

 On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 For the C++ folks out there, I've got a question about an approach to
 take on the object-model branch.  At issue is how to wrap the various
 C structures returned to callers, particularly in a backward
 compatible manner.  Currently, I'm looking at svn_wc_notify_t *.  As I
 see it, there are a few options:

 1) Just wrap the pointer to the C struct as a member of the wrapper 
 class.
    Pros: Easy to implement; lightweight constructor.
    Cons: Getters would need to translate to C++ types; would need to
 implement a copy constructor which deep copies the C struct; would
 also introduce pools, since creating and duplicating C structs
 requires them.

 2) Wrap each C struct member individually
    Pros: C-C++ complexity is constrained to the constructor,
 everything else is C++ types
    Cons: Hard to extend for future compatibility

 3) Just pass the C-struct pointer around; don't even bother with a class
    Pros: Dead simple.
    Cons: Requires more memory management thought by consumers; not
 C++-y enough; may introduce wrapping difficulties.

 I'd like to come up with something consistent, which would be used
 throughout the C++ bindings.  I'm also interested in a solution which
 ensures the C++ bindings can be used as the basis for other
 object-oriented bindings models (Python, Perl, etc.)
 After lunch, and some thought, it feels like #1 is the best solution.
 This doesn't change the external class interface, which is good, and
 can still provide C++ values to callers who want them.  The pool
 issues are a bit messy, but at least the object can manage its own
 memory (albeit at a significant overhead).
 This could get ugly.

 Creating and destroying pools all over the place could get ugly, but
 it's a necessary evil because our object creation / duplication
 functions all require a pool.  An alternative would be a set of
 functions returning the size of the object, and then another which
 puts the object in a pre-allocated memory location.  (These could
 theoretically replace the pool argument version of the API, but that'd
 be *a lot* of churn.)

 The approach would let the C++ allocate the memory (of the correct
 size) using whatever scheme it wants, and then do a placement
 initialize using the second API.  If we do go this route, I'd
 recommend exposing these as private-to-Subversion, at least initially.

 The other option is just pass the C-struct pointer around everywhere,
 but then the bindings consumers have to worry about this exact same
 issue.  In other words, it solves it now, but really just pushes the
 problem elsewhere.

 -Hyrum
 Memory management with pools and C++ -- keep away from doing it in
 per-object ctor/dtor pairs is all I can say. Lifetimes would get so
 messy and mucked up you could hardly believe it.

 IMHO the best way to wrap the C structures in C++ is to subclass 'em.

 class svn_some_thing : public svn_some_thing_t {  };

 This way you can pass C++ pointers directly to the C implementation,
 most methods become just inlined wrappers. Callbacks are a bit more
 hairy, but not all that much.
 This works for input types, but right now it's the output types (which
 are much more common) that I'm concerned about.  Things like
 svn_commit_info_t, or svn_wc_notify_t.  For my initial hack at it see:
 http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h?p=1000600

 If you're strict about the wrapping, i.e., do not add any data members
 or virtual functions to the wrappers, then a simple static_cast of the
 returned pointer will solve the type conversion for output parameters.
 It makes the wrapper methods a bit more complex, but still easily inlinable.

Well, except for the pool lifetime issues.  The wrapped pointer would
still be allocated in the result pool of the called function, but by
doing a cast, and returning the wrapper to the caller, we've now
hidden that fact (and given the consumer more than enough rope to hang
their entire team).  Hence the desire to duplicate the returned value
into a pool managed outside of the one provided to the C API.

 Two things worry me about your approach:

    * Additional dereference overhead, and construction overhead ...
      might or might not be important performance-wise, and certainly is
      important bug-wise ...
    * The pool-per-object that you already mentioned, made worse by the
      fact that you implement operator= ... nice syntactic sugar, but
      has serious side effects.

 The 

Re: object-model: Wrapping Subversion C-structs in C++

2010-09-24 Thread Branko Čibej
 On 24.09.2010 18:43, Hyrum K. Wright wrote:
 All of the Pools used to hold the child objects are children of the
 global parent (created with NULL as the parent pool).  As such, they
 are independent of each other, and won't have destruction order
 issues.  It's pretty wasteful in terms of the memory overhead, but
 meets the goal of having each object have its own pool, and control
 its own lifetime independent of other objects.

I'm questioning that specific goal. What do you gain by it?
I'm not convinced that it is a good idea to do this. If you keep the C++
wrappers minimal, i.e., wrap every C structure in a thin sheet of C++
and hope for the best -- then I have a nagging suspicion that managing
pool lifetimes will have to be explicit.

On the other hand, you could take a more high-level approach (like
JavaHL?) and not tie the object model to the current API too much. Then
I could imagine, e.g., having an SVNClient object that does all the
pool management. It's generally easier to deal with pools on a
slightly higher level than per-object, wouldn't you say?

 Pools are certainly un-C++-like, but our APIs don't help too much
 either.  To allocate (or duplicate) these structures, we require
 callers to provide a pool, instead of a generic allocation mechanism.
 If we did the latter, we could easily put these objects into memory
 managed natively by C++, but because we don't, a Pool is required.
 That's really the only reason to use pools in these objects.

Actually this pool-instead-of-allocator restriction comes from APR ...
early in the design of APR, generic allocators were dropped in favour of
pools, which then force their lifetime behaviour on all applications
that use them. I believe one of the reasons for this was allocation
speed, and another was that pools fit a stateless HTTP server
implementation perfectly ... though they're not too handy for other
kinds of applications.

-- Brane



Re: object-model: Wrapping Subversion C-structs in C++

2010-09-24 Thread Hyrum K. Wright
On Fri, Sep 24, 2010 at 12:04 PM, Branko Čibej br...@xbc.nu wrote:
  On 24.09.2010 18:43, Hyrum K. Wright wrote:
 All of the Pools used to hold the child objects are children of the
 global parent (created with NULL as the parent pool).  As such, they
 are independent of each other, and won't have destruction order
 issues.  It's pretty wasteful in terms of the memory overhead, but
 meets the goal of having each object have its own pool, and control
 its own lifetime independent of other objects.

 I'm questioning that specific goal. What do you gain by it?
 I'm not convinced that it is a good idea to do this. If you keep the C++
 wrappers minimal, i.e., wrap every C structure in a thin sheet of C++
 and hope for the best -- then I have a nagging suspicion that managing
 pool lifetimes will have to be explicit.

 On the other hand, you could take a more high-level approach (like
 JavaHL?) and not tie the object model to the current API too much. Then
 I could imagine, e.g., having an SVNClient object that does all the
 pool management. It's generally easier to deal with pools on a
 slightly higher level than per-object, wouldn't you say?

The advantage that JavaHL has is that there is a well-defined boundary
between the consumer (written in Java) and the wrappers.  Returned
structures (such as svn_commit_info_t) are converted to Java before
the end of the call down into C++, so we can allocate everything in
the per-API pool (the request pool), and not worry about lifetimes.
A C++ application using the C++ bindings doesn't have that limitation,
though we could do the analogue by converting the C structs deeply
into C++ objects.  But this introduces forward compatibility issues in
the case of growing structs--the entire reason we use the pool-based
creation functions in the first place.
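
To make the trade-off concrete, a sketch of the deep-conversion
variant. c_commit_info is a stand-in loosely modelled on
svn_commit_info_t, and malloc/free stand in for the request pool; none
of this is the actual bindings code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Stand-in for a pool-allocated C result.
struct c_commit_info { long revision; char *author; };

// C++ value type: owns copies of everything it needs, so it can
// outlive the memory the C struct lived in.
struct CommitInfo
{
    long revision;
    std::string author;

    explicit CommitInfo(const c_commit_info &c)
        : revision(c.revision), author(c.author ? c.author : "")
    {
    }
};

// Stand-in for one API call with request-scoped allocation.
CommitInfo commit()
{
    // "request pool": memory valid only for the duration of this call
    c_commit_info raw;
    raw.revision = 7;
    raw.author = static_cast<char *>(std::malloc(6));
    std::strcpy(raw.author, "hyrum");   // 5 characters plus the NUL

    CommitInfo result(raw);   // deep copy while the C data is still alive
    std::free(raw.author);    // "destroy the request pool"
    return result;
}
```

The compatibility cost shows up in the CommitInfo constructor: a field
added to the C struct in a later release is silently dropped until the
wrapper is updated to copy it.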

-Hyrum


Re: object-model: Wrapping Subversion C-structs in C++

2010-09-23 Thread Branko Čibej
 On 22.09.2010 21:41, Hyrum K. Wright wrote:
 [ apologies for the somewhat stream-of-consciousness nature of these mails ]

 On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 For the C++ folks out there, I've got a question about an approach to
 take on the object-model branch.  At issue is how to wrap the various
 C structures returned to callers, particularly in a backward
 compatible manner.  Currently, I'm looking at svn_wc_notify_t *.  As I
 see it, there are a few options:

 1) Just wrap the pointer to the C struct as a member of the wrapper class.
Pros: Easy to implement; lightweight constructor.
Cons: Getters would need to translate to C++ types; would need to
 implement a copy constructor which deep copies the C struct; would
 also introduce pools, since creating and duplicating C structs
 requires them.

 2) Wrap each C struct member individually
Pros: C-C++ complexity is constrained to the constructor,
 everything else is C++ types
Cons: Hard to extend for future compatibility

 3) Just pass the C-struct pointer around; don't even bother with a class
Pros: Dead simple.
Cons: Requires more memory management thought by consumers; not
 C++-y enough; may introduce wrapping difficulties.

 I'd like to come up with something consistent, which would be used
 throughout the C++ bindings.  I'm also interested in a solution which
 ensures the C++ bindings can be used as the basis for other
 object-oriented bindings models (Python, Perl, etc.)
 After lunch, and some thought, it feels like #1 is the best solution.
 This doesn't change the external class interface, which is good, and
 can still provide C++ values to callers who want them.  The pool
 issues are a bit messy, but at least the object can manage its own
 memory (albeit at a significant overhead).
 This could get ugly.

 Creating and destroying pools all over the place could get ugly, but
 it's a necessary evil because our object creation / duplication
 functions all require a pool.  An alternative would be a set of
 functions returning the size of the object, and then another which
 puts the object in a pre-allocated memory location.  (These could
 theoretically replace the pool argument version of the API, but that'd
 be *a lot* of churn.)

 The approach would let the C++ allocate the memory (of the correct
 size) using whatever scheme it wants, and then do a placement
 initialize using the second API.  If we do go this route, I'd
 recommend exposing these as private-to-Subversion, at least initially.

 The other option is just pass the C-struct pointer around everywhere,
 but then the bindings consumers have to worry about this exact same
 issue.  In other words, it solves it now, but really just pushes the
 problem elsewhere.

 -Hyrum

Memory management with pools and C++ -- keep away from doing it in
per-object ctor/dtor pairs is all I can say. Lifetimes would get so
messy and mucked up you could hardly believe it.

IMHO the best way to wrap the C structures in C++ is to subclass 'em.

class svn_some_thing : public svn_some_thing_t {  };

This way you can pass C++ pointers directly to the C implementation,
most methods become just inlined wrappers. Callbacks are a bit more
hairy, but not all that much.
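
For input structures, the subclassing idea could look something like
this. some_opts_t and c_run() are stand-ins for a C struct and a C
function that takes it; the conversion from the C++ pointer to the C
pointer is the ordinary implicit derived-to-base conversion.

```cpp
#include <cassert>

// Stand-in C struct and a C function taking it as input.
struct some_opts_t { int depth; int ignore_externals; };

int c_run(const some_opts_t *opts)
{
    return opts->depth + opts->ignore_externals;
}

// The wrapper is-a C struct, so it can be handed to C code as-is.
class SomeOpts : public some_opts_t
{
public:
    SomeOpts()
    {
        depth = 0;
        ignore_externals = 0;
    }

    SomeOpts &set_depth(int d)
    {
        depth = d;
        return *this;
    }

    // Inlined wrapper: 'this' converts implicitly to const some_opts_t *.
    int run() const { return c_run(this); }
};
```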

As to pools ... I'd once thought that a proper C++ wrapper for a pool
would be a reference-counted smart pointer. That turns out to be too
much overhead and too much of a good thing, since you *cannot* allow
pool lifetime to depend on some reference counting order of execution.
Best just use plain pool pointers, or maybe wrap them minimally so that
destructors do the pool cleanup reliably.
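
A minimal wrapper of that kind might look like the sketch below. To
keep it self-contained it wraps a toy pool (toy_pool and its functions
are invented here); with APR the constructor and destructor would
presumably call apr_pool_create() and apr_pool_destroy() instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Toy stand-in for a memory pool: frees every allocation at once.
struct toy_pool { std::vector<void *> blocks; };

static int pools_alive = 0;

toy_pool *toy_pool_create()
{
    ++pools_alive;
    return new toy_pool;
}

void *toy_pool_alloc(toy_pool *p, std::size_t n)
{
    void *mem = std::malloc(n);
    p->blocks.push_back(mem);
    return mem;
}

void toy_pool_destroy(toy_pool *p)
{
    for (std::size_t i = 0; i < p->blocks.size(); ++i)
        std::free(p->blocks[i]);
    delete p;
    --pools_alive;
}

// Minimal wrapper: no reference counting, no sharing. The pool lives
// exactly as long as the enclosing scope.
class Pool
{
public:
    Pool() : pool_(toy_pool_create()) {}
    ~Pool() { toy_pool_destroy(pool_); }

    toy_pool *get() const { return pool_; }

private:
    Pool(const Pool &);             // non-copyable
    Pool &operator=(const Pool &);

    toy_pool *pool_;
};
```

Because the wrapper is non-copyable, the point of destruction is
always the end of the scope that declared it, including when an
exception unwinds the stack.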

-- Brane


Re: object-model: Wrapping Subversion C-structs in C++

2010-09-23 Thread Hyrum K. Wright
On Thu, Sep 23, 2010 at 2:20 PM, Branko Čibej br...@xbc.nu wrote:
  On 22.09.2010 21:41, Hyrum K. Wright wrote:
 [ apologies for the somewhat stream-of-consciousness nature of these mails ]

 On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 For the C++ folks out there, I've got a question about an approach to
 take on the object-model branch.  At issue is how to wrap the various
 C structures returned to callers, particularly in a backward
 compatible manner.  Currently, I'm looking at svn_wc_notify_t *.  As I
 see it, there are a few options:

 1) Just wrap the pointer to the C struct as a member of the wrapper class.
    Pros: Easy to implement; lightweight constructor.
    Cons: Getters would need to translate to C++ types; would need to
 implement a copy constructor which deep copies the C struct; would
 also introduce pools, since creating and duplicating C structs
 requires them.

 2) Wrap each C struct member individually
    Pros: C-C++ complexity is constrained to the constructor,
 everything else is C++ types
    Cons: Hard to extend for future compatibility

 3) Just pass the C-struct pointer around; don't even bother with a class
    Pros: Dead simple.
    Cons: Requires more memory management thought by consumers; not
 C++-y enough; may introduce wrapping difficulties.

 I'd like to come up with something consistent, which would be used
 throughout the C++ bindings.  I'm also interested in a solution which
 ensures the C++ bindings can be used as the basis for other
 object-oriented bindings models (Python, Perl, etc.)
 After lunch, and some thought, it feels like #1 is the best solution.
 This doesn't change the external class interface, which is good, and
 can still provide C++ values to callers who want them.  The pool
 issues are a bit messy, but at least the object can manage its own
 memory (albeit at a significant overhead).
 This could get ugly.

 Creating and destroying pools all over the place could get ugly, but
 it's a necessary evil because all of our object creation / duplication
 functions require a pool.  An alternative would be a set of
 functions returning the size of the object, and then another which
 puts the object in a pre-allocated memory location.  (These could
 theoretically replace the pool argument version of the API, but that'd
 be *a lot* of churn.)

 This approach would let the C++ code allocate the memory (of the correct
 size) using whatever scheme it wants, and then do a placement
 initialize using the second API.  If we do go this route, I'd
 recommend exposing these as private-to-Subversion, at least initially.

 The other option is just pass the C-struct pointer around everywhere,
 but then the bindings consumers have to worry about this exact same
 issue.  In other words, it solves it now, but really just pushes the
 problem elsewhere.

 -Hyrum

 Memory management with pools and C++ -- keep away from doing it in
 per-object ctor/dtor pairs is all I can say. Lifetimes would get so
 messy and mucked up you could hardly believe it.

 IMHO the best way to wrap the C structures in C++ is to subclass 'em.

 class svn_some_thing : public svn_some_thing_t {  };

 This way you can pass C++ pointers directly to the C implementation,
 most methods become just inlined wrappers. Callbacks are a bit more
 hairy, but not all that much.
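
A minimal sketch of the subclassing idea above. The struct
example_thing_t and the function example_thing_next_rev() are
hypothetical stand-ins, not real Subversion types or APIs; the point is
only that a subclass which adds methods (but no data members and no
virtual functions) can be handed straight to code expecting the C type.

```cpp
#include <string>

// Hypothetical stand-in for a Subversion C struct (not a real svn type).
struct example_thing_t {
  const char *path;
  long revision;
};

// Hypothetical C-style function that only knows about the C struct.
long example_thing_next_rev(const example_thing_t *thing) {
  return thing->revision + 1;
}

// C++ view of the same data: methods only, no extra members, no virtuals.
class ExampleThing : public example_thing_t {
public:
  ExampleThing(const char *p, long rev) {
    path = p;        // members inherited from the C struct
    revision = rev;
  }

  // Getter that translates the C string into a C++ type.
  std::string path_string() const {
    return path ? std::string(path) : std::string();
  }

  // Inlined wrapper: 'this' converts implicitly to the C base pointer.
  long next_rev() const { return example_thing_next_rev(this); }
};
```

As noted in the reply below, this fits input types most naturally; a
struct that the C library allocates and returns cannot simply be
reinterpreted as the subclass without further care.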

This works for input types, but right now it's the output types (which
are much more common) that I'm concerned about.  Things like
svn_commit_info_t, or svn_wc_notify_t.  For my initial hack at it see:
http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h?p=1000600

 As to pools ... I'd once thought that a proper C++ wrapper for a pool
 would be a reference-counted smart pointer. That turns out to be too
 much overhead and too much of a good thing, since you *cannot* allow
 pool lifetime to depend on some reference counting order of execution.
 Best just use plain pool pointers, or maybe wrap them minimally so that
 destructors do the pool cleanup reliably.
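
A rough sketch of the "wrap them minimally" suggestion. To keep it
self-contained, fake_pool_t and the fake_pool_* functions are
hypothetical stand-ins for an APR pool; with real APR the constructor
and destructor would presumably call apr_pool_create() and
apr_pool_destroy() instead. ScopedPool and live_pools are likewise
illustrative names (C++11 assumed).

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a pool: remembers every allocation and
// releases them all at once when the pool is destroyed.
struct fake_pool_t {
  std::vector<void *> blocks;
};

static int live_pools = 0;  // bookkeeping for the example only

fake_pool_t *fake_pool_create() {
  ++live_pools;
  return new fake_pool_t();
}

void *fake_pool_alloc(fake_pool_t *pool, std::size_t size) {
  void *p = std::malloc(size);
  if (p) pool->blocks.push_back(p);
  return p;
}

void fake_pool_destroy(fake_pool_t *pool) {
  for (void *p : pool->blocks) std::free(p);
  delete pool;
  --live_pools;
}

// Minimal wrapper: owns exactly one pool and destroys it in the
// destructor.  No reference counting, and copying is disabled so the
// pool's lifetime is tied to exactly one scope.
class ScopedPool {
public:
  ScopedPool() : pool_(fake_pool_create()) {}
  ~ScopedPool() { fake_pool_destroy(pool_); }

  ScopedPool(const ScopedPool &) = delete;
  ScopedPool &operator=(const ScopedPool &) = delete;

  fake_pool_t *get() const { return pool_; }  // plain pointer for C calls

private:
  fake_pool_t *pool_;
};
```

The design choice here is that the pool dies when the enclosing scope
ends, rather than whenever the last reference happens to go away.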

Yeah, we've got a Pool class which destroys the underlying APR pool in
the dtor, so that provides a handy way to clean things up (if it
weren't for the 8k overhead of the minimum pool size).  To the initial
hack, I've added a smart pointer to prevent multiple redundant copies
of the underlying C struct, since I expect it to be read-only (we
could probably add an explicit Clone() method if folks really wanted a
deep copy).  The smart pointer magic is here:
http://svn.apache.org/repos/asf/subversion/branches/object-model/subversion/bindings/c++/include/Types.h

There's still a self-managed pool for allocating the C-struct, but
it's encapsulated enough to be changeable should we have a better way
of duplicating objects in the future.
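
One way the sharing scheme described above might look; this is an
illustrative sketch, not the code in Types.h. It assumes C++11's
std::shared_ptr, and example_info_t, InfoStorage and CommitInfo are
hypothetical names. malloc/free stands in for the self-managed pool
that would hold the duplicated C struct.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stand-in for an output struct such as svn_commit_info_t.
struct example_info_t {
  long revision;
  const char *author;
};

// Owns one deep copy of the C struct; plays the role of "struct + pool".
class InfoStorage {
public:
  explicit InfoStorage(const example_info_t &src)
    : owned_author_(dup_cstr(src.author)) {
    info_.revision = src.revision;
    info_.author = owned_author_;
  }
  ~InfoStorage() { std::free(owned_author_); }

  InfoStorage(const InfoStorage &) = delete;
  InfoStorage &operator=(const InfoStorage &) = delete;

  const example_info_t *get() const { return &info_; }

private:
  static char *dup_cstr(const char *s) {
    if (!s) return nullptr;
    std::size_t n = std::strlen(s) + 1;
    char *copy = static_cast<char *>(std::malloc(n));
    if (copy) std::memcpy(copy, s, n);
    return copy;
  }

  char *owned_author_;
  example_info_t info_;
};

// Read-only wrapper: copying shares the storage instead of duplicating it.
class CommitInfo {
public:
  explicit CommitInfo(const example_info_t &src)
    : storage_(std::make_shared<InfoStorage>(src)) {}

  long revision() const { return storage_->get()->revision; }
  std::string author() const {
    const char *a = storage_->get()->author;
    return a ? a : "";
  }

  // Explicit deep copy for callers who really want one.
  CommitInfo clone() const { return CommitInfo(*storage_->get()); }

  bool shares_storage_with(const CommitInfo &other) const {
    return storage_ == other.storage_;
  }

private:
  std::shared_ptr<const InfoStorage> storage_;
};
```

Because the wrapped struct is treated as read-only, sharing is safe, and
the storage class can be swapped out later without touching callers.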

Thanks for the insight, by the way.  These classes and paradigms are
in no way set 

Re: object-model: Wrapping Subversion C-structs in C++

2010-09-22 Thread Hyrum K. Wright
On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright
hyrum_wri...@mail.utexas.edu wrote:
 For the C++ folks out there, I've got a question about an approach to
 take on the object-model branch.  At issue is how to wrap the various
 C structures returned to callers, particularly in a backward
 compatible manner.  Currently, I'm looking at svn_wc_notify_t *.  As I
 see it, there are a few options:

 1) Just wrap the pointer to the C struct as a member of the wrapper class.
    Pros: Easy to implement; lightweight constructor.
    Cons: Getters would need to translate to C++ types; would need to
 implement a copy constructor which deep copies the C struct; would
 also introduce pools, since creating and duplicating C structs
 requires them.

 2) Wrap each C struct member individually
    Pros: C-C++ complexity is constrained to the constructor,
 everything else is C++ types
    Cons: Hard to extend for future compatibility

 3) Just pass the C-struct pointer around; don't even bother with a class
    Pros: Dead simple.
    Cons: Requires more memory management thought by consumers; not
 C++-y enough; may introduce wrapping difficulties.

 I'd like to come up with something consistent, which would be used
 throughout the C++ bindings.  I'm also interested in a solution which
 ensures the C++ bindings can be used as the basis for other
 object-oriented bindings models (Python, Perl, etc.)

After lunch, and some thought, it feels like #1 is the best solution.
This doesn't change the external class interface, which is good, and
can still provide C++ values to callers who want them.  The pool
issues are a bit messy, but at least the object can manage its own
memory (albeit at a significant overhead).

-Hyrum


Re: object-model: Wrapping Subversion C-structs in C++

2010-09-22 Thread Hyrum K. Wright
[ apologizes for the somewhat stream-of-consciousness nature of these mails ]

On Wed, Sep 22, 2010 at 7:16 PM, Hyrum K. Wright
hyrum_wri...@mail.utexas.edu wrote:
 On Wed, Sep 22, 2010 at 5:35 PM, Hyrum K. Wright
 hyrum_wri...@mail.utexas.edu wrote:
 For the C++ folks out there, I've got a question about an approach to
 take on the object-model branch.  At issue is how to wrap the various
 C structures returned to callers, particularly in a backward
 compatible manner.  Currently, I'm looking at svn_wc_notify_t *.  As I
 see it, there are a few options:

 1) Just wrap the pointer to the C struct as a member of the wrapper class.
    Pros: Easy to implement; lightweight constructor.
    Cons: Getters would need to translate to C++ types; would need to
 implement a copy constructor which deep copies the C struct; would
 also introduce pools, since creating and duplicating C structs
 requires them.

 2) Wrap each C struct member individually
    Pros: C-C++ complexity is constrained to the constructor,
 everything else is C++ types
    Cons: Hard to extend for future compatibility

 3) Just pass the C-struct pointer around; don't even bother with a class
    Pros: Dead simple.
    Cons: Requires more memory management thought by consumers; not
 C++-y enough; may introduce wrapping difficulties.

 I'd like to come up with something consistent, which would be used
 throughout the C++ bindings.  I'm also interested in a solution which
 ensures the C++ bindings can be used as the basis for other
 object-oriented bindings models (Python, Perl, etc.)

 After lunch, and some thought, it feels like #1 is the best solution.
 This doesn't change the external class interface, which is good, and
 can still provide C++ values to callers who want them.  The pool
 issues are a bit messy, but at least the object can manage its own
 memory (albeit at a significant overhead).

This could get ugly.

Creating and destroying pools all over the place could get ugly, but
it's a necessary evil because all of our object creation / duplication
functions require a pool.  An alternative would be a set of
functions returning the size of the object, and then another which
puts the object in a pre-allocated memory location.  (These could
theoretically replace the pool argument version of the API, but that'd
be *a lot* of churn.)

This approach would let the C++ code allocate the memory (of the correct
size) using whatever scheme it wants, and then do a placement
initialize using the second API.  If we do go this route, I'd
recommend exposing these as private-to-Subversion, at least initially.
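
To make the size-query / placement-initialize idea concrete, here is a
hedged sketch. Neither function exists in Subversion:
example_notify_sizeof() and example_notify_init_at() are invented names
for the two hypothetical APIs, and example_notify_t is a stand-in
struct. The C++ side picks its own allocation scheme (a plain byte
array here) and never needs a pool.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical stand-in for a C struct such as svn_wc_notify_t.
struct example_notify_t {
  long revision;
  int kind;
};

// Hypothetical API 1: report how much memory one object needs.
std::size_t example_notify_sizeof() { return sizeof(example_notify_t); }

// Hypothetical API 2: build a copy of 'src' in caller-provided memory.
example_notify_t *example_notify_init_at(void *mem,
                                         const example_notify_t *src) {
  return new (mem) example_notify_t(*src);
}

// Wrapper that allocates however it likes, then initializes in place.
class Notify {
public:
  explicit Notify(const example_notify_t &src)
    : storage_(new unsigned char[example_notify_sizeof()]),
      ptr_(example_notify_init_at(storage_.get(), &src)) {}

  // Copying is disabled in this sketch to keep ownership obvious.
  Notify(const Notify &) = delete;
  Notify &operator=(const Notify &) = delete;

  long revision() const { return ptr_->revision; }
  int kind() const { return ptr_->kind; }

private:
  // The stand-in struct is trivially destructible, so releasing the
  // bytes is all the cleanup this example needs.
  std::unique_ptr<unsigned char[]> storage_;
  example_notify_t *ptr_;  // points into storage_
};
```

A real struct with pointer members would also need the second API to
say where the pointed-to data lives, which this sketch leaves out.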

The other option is just pass the C-struct pointer around everywhere,
but then the bindings consumers have to worry about this exact same
issue.  In other words, it solves it now, but really just pushes the
problem elsewhere.

-Hyrum