Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
On Wed, 2003-06-25 at 20:45, Stefan Seefeld wrote: 
  Why should the node-wrappers keep the document alive?
 
 for consistency, and convenience. In the same way you can get down from
 the document to the individual nodes you can get up: node.parent() and
 node.document() provide the means to walk up towards the document root.

And yet it does not keep its parent() alive.  How is that consistent?

In what way would you be unable to walk toward the document if it was
not add reffed?

 Imagine a factory object that is initialized with a dom::node_ptr
 (holding configurational data, say). Whenever you access its method,
 the object looks up data in the node...
 
 
 class factory
 {
 public:
factory(dom::node_ptr n) : my_node(n) {}
foo create_foo(); /* access my_node */
bar create_bar(); /* access my_node */
 private:
dom::node_ptr my_node;
 };
 
 factory 'owns' the node, so wherever you instantiate the factory,
 you'd read in a dom::document, look up the relevant node, and pass
 that to the constructor:
 
 factory *f;
 {
   dom::document_ptr document("config.xml");
   dom::node_set set = document.root_node().find("//factory.info");
   f = new factory(set[0]);
 } // document and set are now deleted, but factory still references
   // the document through its 'my_node' member
 
 
 If the document wasn't ref-counted, you'd need to pass it along with
 the node to the factory, as only the factory would know when to drop
 it (in its destructor, presumably).

One of the following should be ok...

// Each factory has its own document
class factory : boost::noncopyable
{
public:
  factory( ... );
private:
  dom::node_ptr node_;
  std::auto_ptr< dom::document > doc_;
};

// They share documents
class factory
{
public:
  factory( ... );
private:
  dom::node_ptr node_;
  std::shared_ptr< dom::document > doc_;
};

// All factories use the same document
class factory_ref
{
private:
  factory_ref( dom::node_ptr node ) : node_( node )
  {
  }
  dom::node_ptr node_;
  friend class factories;
};

class factories : boost::noncopyable
{
public:
  factory_ref get_factory( ... );

private:
  std::auto_ptr< dom::document > doc_;
};

By making the user manage the document object it forces them to consider
which of these models they want.

With the implicit add ref it is hard to tell what the intent of the code
is.  Your example illustrates this problem well, but consider this
simpler version

{
  xml::document_ptr doc( "config.xml" );
  some_function( doc->root_node() );
}

I cannot tell what the writer intended the scope of doc to be without
examining some_function and understanding what it does.  It might
add-ref the document; it might not.

Now without the add ref stuff we could have 

{
  xml::document doc( "config.xml" );
  some_function( doc.root_node() );
}

and

{
  std::auto_ptr< xml::document > doc( new xml::document( "config.xml" ) );
  some_function( doc, doc->root_node() );
}

Now we know if some_function is allowed to extend the life of the
document or not.
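
A minimal sketch of that point, assuming a modern compiler where
std::unique_ptr stands in for std::auto_ptr.  The names node, document,
peek and keeper are hypothetical and not part of the proposed API; the
only thing shown is that the parameter type tells the reader whether the
callee can extend the document's lifetime.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the types discussed in this thread.
struct node { std::string name; };

class document {
public:
    explicit document(std::string root_name) : root_{std::move(root_name)} {}
    node& root_node() { return root_; }
private:
    node root_;
};

// Receives only a node: it has no way to keep the document alive.
std::string peek(const node& n) { return n.name; }

// Receives the owning pointer: the signature says the callee now
// decides how long the document lives.
class keeper {
public:
    explicit keeper(std::unique_ptr<document> doc) : doc_(std::move(doc)) {}
    std::string root_name() { return doc_->root_node().name; }
private:
    std::unique_ptr<document> doc_;
};
```

After ownership is handed to keeper, the caller's pointer is empty, so
the transfer is visible at the call site as well.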

  Here is the analogy I think works best...
  
  container -- document
  container::value_type -- node
  container::iterator -- node_iterator
  container::pointer_type -- node_pointer
  container::reference_type -- node_reference
 
 hmm, that makes it look simpler than it actually is: is there really
 a single 'value_type' ?

True you need attribute_, element_ etc. variations which you already
have.

value_types could exist but it would require a deep copy to be
consistent.  If you do want to define it then I suggest

typedef void value_type;

Along with an explanation.  That way if I write

node_iterator::value_type temp( *i ); // Make a copy for later
(*i).some_non_const_function();

I would get a compiler error rather than a nasty surprise when I use
temp.

 Is there really a single iterator ? (iterating 
 over all child nodes of a given parent and iterating over all attributes
 is not the same)

Your iterator types look good.  Why is there an extra level of
indirection in basic_element_const_iterator?

 Also, what is a node_reference (as opposed to a pointer) ?
Suppose you renamed your _ptr classes to _ref or _reference and replaced
_ptr with instances of basic_node_pointer.  (Make basic_node_pointer by
taking basic_element_iterator and stripping the ++ and -- operators.)

Then some_pointer->x() and some_iterator->x() would call the same x()
member of the _reference class.

-- 

Hamish Mackenzie [EMAIL PROTECTED]

___
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost


Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Stefan Seefeld
Hamish Mackenzie wrote:
On Wed, 2003-06-25 at 20:45, Stefan Seefeld wrote: 

Why should the node-wrappers keep the document alive?
for consistency, and convenience. In the same way you can get down from
the document to the individual nodes you can get up: node.parent() and
node.document() provide the means to walk up towards the document root.


And yet it does not keep its parent() alive.  How is that consistent?

In what way would you be unable to walk toward the document if it was
not add reffed?

dom::node_ptr root;
{
  dom::document_ptr document("foo.xml");
  root = document.root_node();
}

At this point document would go out of scope and so the tree would
be destructed, making root invalid. With root holding a reference
to the document it is not, and the following will work:

dom::document_ptr document = root.document();
// continue here...
[...]
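
A minimal sketch of the behaviour described above, assuming
std::shared_ptr as a stand-in for the proposal's implicit ref-count.
document_data and use_count() are hypothetical illustration names; the
real wrappers sit on top of libxml2 structures instead.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct document_data { std::string root_name; };  // hypothetical tree storage

class node_ptr;

class document_ptr {
public:
    explicit document_ptr(std::shared_ptr<document_data> d) : data_(std::move(d)) {}
    node_ptr root_node() const;
    long use_count() const { return data_.use_count(); }
private:
    std::shared_ptr<document_data> data_;
};

class node_ptr {
public:
    node_ptr() {}
    explicit node_ptr(std::shared_ptr<document_data> d) : data_(std::move(d)) {}
    const std::string& name() const { return data_->root_name; }
    // Walking up yields another counted handle to the same document.
    document_ptr document() const { return document_ptr(data_); }
private:
    std::shared_ptr<document_data> data_;  // every node handle keeps the tree alive
};

inline node_ptr document_ptr::root_node() const { return node_ptr(data_); }
```

Here a node handle that outlives the scope of the original document
handle still reaches a live tree, which is the convenience being argued
for; the cost is that every handle carries a share of the ownership.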

With the implicit add ref it is hard to tell what the intent of the code
is.  Your example illustrates this problem well, but consider this
simpler version
yes, letting the user explicitly manage the document would be the alternative.
I'm not sure that nodes referencing their document is less clear, though.
{
  xml::document_ptr doc( "config.xml" );
  some_function( doc->root_node() );
}
I cannot tell what the writer intended the scope of doc to be without
examining some_function and understanding what it does.  It might
add-ref the document; it might not.
What about the relationship between the document and dom::document_ptr ?
So even if the scope of 'doc' (a document pointer) is limited, the
callee could set its own member

void some_function(dom::node_ptr node)
{
  my_doc = node->document();
}
and just by looking at the API it is not clear that the 'doc'
variable above is really the master document pointer. Being
able to write things like
dom::document_ptr doc("config.xml");
dom::document_ptr doc2(doc);
somehow suggests that either the document is not managed by
dom::document_ptr at all, or by all of them (and thus by
extension by dom::node_ptr)
[...]

Here is the analogy I think works best...

container -- document
container::value_type -- node
container::iterator -- node_iterator
container::pointer_type -- node_pointer
container::reference_type -- node_reference
hmm, that makes it look simpler than it actually is: is there really
a single 'value_type' ?


True you need attribute_, element_ etc. variations which you already
have.
value_types could exist but it would require a deep copy to be
consistent.  If you do want to define it then I suggest
you mean if I do *not* want to define it ?

typedef void value_type;


Your iterator types look good.  Why is there an extra level of
indirection in basic_element_const_iterator?
the const iterator is non-functional right now. I've been wondering
how to provide one. It seems I would need to define a 'const_node_ptr'
set of classes.
Regards,
Stefan


Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
Ok I think I understand the problem now.  What does node->document()
return and what does it point to?

Well I think as with the node->parent() it should return a proxy
object.  Something like...

// non owning reference
class document_ref
{
public:
  // Define document related methods here
protected:
  document_ref() : raw_( 0 ) {}
  xmlDoc * raw_;
};

// non owning pointer
class document_ptr
{
public:
  document_ptr( document_ref * );
  document_ref & operator *() { return ref_; }
  document_ref * operator ->() { return &ref_; }
private:
  document_ref ref_;
};

// owning object with deep copy
class document : public document_ref
{
public:
  explicit document( const std::string & file );
  document( document_ref & source )
  {
    // Deep copy here
  }
  ~document() { xmlFreeDoc( raw_ ); }
};

root->document() can return document_ptr or document_ref.

  value_types could exist but it would require a deep copy to be
  consistent.  If you do want to define it then I suggest
 
 you mean if I do *not* want to define it ?

Yes, you could
1) define a deep copy value_type
2) typedef void value_type;
3) leave it undefined

  Your iterator types look good.  Why is there an extra level of
  indirection in basic_element_const_iterator?
 
 the const iterator is non-functional right now. I've been wondering
 how to provide one. It seems I would need to define a 'const_node_ptr'
 set of classes.

I think that is right.  I would prefer it to be

typedef node_pointer< node_ref > node_ptr;
typedef node_pointer< const_node_ref > const_node_ptr;

In fact you will probably need const_ versions for all your reference,
pointer and iterator types. Though the pointers and iterators should
just be additional instances of template classes.
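
A rough sketch of what "additional instances of template classes" could
look like.  raw_node is a hypothetical stand-in for xmlNode, and the
members name() and set_name() are invented for illustration only.

```cpp
#include <cassert>
#include <string>

struct raw_node { std::string name; };  // hypothetical stand-in for xmlNode

// Read-only view of a node.
class const_node_ref {
public:
    explicit const_node_ref(const raw_node* raw) : raw_(raw) {}
    const std::string& name() const { return raw_->name; }
private:
    const raw_node* raw_;
};

// Mutable view of a node.
class node_ref {
public:
    explicit node_ref(raw_node* raw) : raw_(raw) {}
    const std::string& name() const { return raw_->name; }
    void set_name(const std::string& n) { raw_->name = n; }
private:
    raw_node* raw_;
};

// One pointer template serves both the const and non-const reference class.
template <class Ref>
class node_pointer {
public:
    explicit node_pointer(Ref ref) : ref_(ref) {}
    Ref& operator*() { return ref_; }
    Ref* operator->() { return &ref_; }
private:
    Ref ref_;
};

typedef node_pointer<node_ref> node_ptr;
typedef node_pointer<const_node_ref> const_node_ptr;
```

Only the reference classes need a hand-written const variant; the
pointer (and, by the same pattern, the iterator) comes from the template.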

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Stefan Seefeld
Hamish Mackenzie wrote:
Ok I think I understand the problem now.  What does node->document()
return and what does it point to?

it returns a dom::document_ptr, which behaves exactly the same way
as the other _ptr types, i.e. it has reference semantics.

Well I think as with the node->parent() it should return a proxy
object.  Something like...

that's what it does indeed.


// non owning reference
class document_ref
{
public:
  // Define document related methods here
protected:
  document_ref() : raw_( 0 ) {}
  xmlDoc * raw_;
};
// non owning pointer
class document_ptr
{
public:
  document_ptr( document_ref * );
  document_ref & operator *() { return ref_; }
  document_ref * operator ->() { return &ref_; }
private:
  document_ref ref_;
};
// owning object with deep copy
class document : public document_ref
{
public:
  explicit document( const std::string & file );
  document( document_ref & source )
  {
    // Deep copy here
  }
  ~document() { xmlFreeDoc( raw_ ); }
};
I don't really understand why we need three different classes to
manage documents. In particular I don't understand why you provide
a 'document_ptr' that is a wrapper around document_ref.
And I don't use a 'document' class, as that is managed implicitly
by my dom::document_ptr:

dom::document_ptr document; // create new document
dom::document_ptr doc(document); // create second reference to it
dom::document_ptr doc2 = document.clone(); // clone it, i.e. make deep copy
root->document() can return document_ptr or document_ref.

indeed, that's what it does.



value_types could exist but it would require a deep copy to be
consistent.  If you do want to define it then I suggest
you mean if I do *not* want to define it ?


Yes, you could
1) define a deep copy value_type
that doesn't work as there is no way to copy nodes 'out of the
document'.
2) typedef void value_type;
3) leave it undefined

Your iterator types look good.  Why is there an extra level of
indirection in basic_element_const_iterator?
the const iterator is non-functional right now. I've been wondering
how to provide one. It seems I would need to define a 'const_node_ptr'
set of classes.


I think that is right.  I would prefer it to be

typedef node_pointer< node_ref > node_ptr;
typedef node_pointer< const_node_ref > const_node_ptr;
In fact you will probably need const_ versions for all your reference,
pointer and iterator types. Though the pointers and iterators should
just be additional instances of template classes.
yes

Stefan





Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Stefan Seefeld
Stefan Seefeld wrote:

And I don't use a 'document' class, as that is managed implicitly
by my dom::document_ptr:
dom::document_ptr document; // create new document

that should actually become

dom::document_ptr document = dom::make_document("1.0");

or similar to indicate that a new document is to be created.
Then the default constructor can just create an empty pointer.

dom::document_ptr doc(document); // create second reference to it
dom::document_ptr doc2 = document.clone(); // clone it, i.e. make deep copy
Stefan



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
On Thu, 2003-06-26 at 16:04, Stefan Seefeld wrote:
 I don't really understand why we need three different classes to
 manage documents. In particular I don't understand why you provide
 a 'document_ptr' that is a wrapper around document_ref.

The document_ref and document_ptr would only be used when a non owning
reference or pointer is required.  Even then you could use
dom::document * and dom::document & instead in most cases.

One big difference between a reference and a pointer is that a reference
must contain a valid non null value.

dom::document_ref doc1; // Error
dom::document_ref doc2( 0 ); // Error
dom::document_ptr doc3; // Ok
dom::document_ptr doc4( 0 ); // Ok

This means you do not have to check references for null values.  A
pointer can be useful if you wish to be able to delay initialisation or
if a value is optional.

void some_function( document_ref doc ); // You must pass a doc
void some_function( document_ptr doc ); // You could pass 0
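
A small sketch of that distinction.  raw_doc is a hypothetical stand-in
for xmlDoc, and version(), is_null(), version_of() and version_or() are
invented names used only to make the example self-contained.

```cpp
#include <cassert>

struct raw_doc { int version; };  // hypothetical stand-in for xmlDoc

// Reference-like proxy: no default constructor and it is built from a
// C++ reference, so it can never hold null.
class document_ref {
public:
    explicit document_ref(raw_doc& raw) : raw_(&raw) {}
    int version() const { return raw_->version; }
private:
    raw_doc* raw_;
};

// Pointer-like proxy: may be empty, so callers have to check.
class document_ptr {
public:
    document_ptr() : raw_(nullptr) {}
    explicit document_ptr(raw_doc* raw) : raw_(raw) {}
    bool is_null() const { return raw_ == nullptr; }
    document_ref operator*() const {
        assert(raw_ != nullptr);
        return document_ref(*raw_);
    }
private:
    raw_doc* raw_;
};

// A document is required here: no null check is needed.
int version_of(document_ref doc) { return doc.version(); }

// A document is optional here: the function has to handle "none".
int version_or(document_ptr doc, int fallback) {
    return doc.is_null() ? fallback : (*doc).version();
}
```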

 And I don't use a 'document' class, as that is managed implicitly
 by my dom::document_ptr:
 
 dom::document_ptr document; // create new document
 dom::document_ptr doc(document); // create second reference to it
 dom::document_ptr doc2 = document.clone(); // clone it, i.e. make deep copy

This is not consistent with the standard library or C++ in general.  It
will seem strange that the pointer class
1) Does not require dereferencing
2) Contains a valid and non null value after default construction
3) Has a constructor such as document_ptr( "config.xml" )
4) Has member functions such as write_to_file

The alternative would allow both...

boost::shared_ptr< dom::document > doc( new dom::document() );
boost::shared_ptr< dom::document > doc1( doc );
dom::document doc2( *doc1 );

and if the 'doc1' reference was non-owning...

dom::document doc;  // Create new doc
dom::document & doc1( doc ); // Second reference
dom::document doc2( doc1 );  // Deep copy

Again this makes it clear when reference counting is required by the
design and when it isn't.

  Yes, you could
  1) define a deep copy value_type
 
 that doesn't work as there is no way to copy nodes 'out of the
 document'.

I am not suggesting we need this but it is possible...
class node
{
  ...
private:
  document doc_;
  node_ptr node_;
};

  2) typedef void value_type;
  3) leave it undefined

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Stefan Seefeld
Hamish Mackenzie wrote:
On Thu, 2003-06-26 at 16:04, Stefan Seefeld wrote:

I don't really understand why we need three different classes to
manage documents. In particular I don't understand why you provide
a 'document_ptr' that is a wrapper around document_ref.


The document_ref and document_ptr would only be used when a non owning
reference or pointer is required.  Even then you could use
dom::document * and dom::document & instead in most cases.
One big difference between a reference and a pointer is that a reference
must contain a valid non null value.
ok, but are these types really needed ?

My current proposal only provides dom::document_ptr, and I use implicit
refcounting on the underlying document tree. It seems to work quite
fine. I provide a bool operator () that tells me whether the
document_ptr is referring to a document or not.

dom::document_ref doc1; // Error
dom::document_ref doc2( 0 ); // Error
dom::document_ptr doc3; // Ok
dom::document_ptr doc4( 0 ); // Ok
This means you do not have to check references for null values.  A
pointer can be useful if you wish to be able to delay initialisation or
if a value is optional.
void some_function( document_ref doc ); // You must pass a doc
void some_function( document_ptr doc ); // You could pass 0
ok, I can see that as useful.

And I don't use a 'document' class, as that is managed implicitly
by my dom::document_ptr:
dom::document_ptr document; // create new document
dom::document_ptr doc(document); // create second reference to it
dom::document_ptr doc2 = document.clone(); // clone it, i.e. make deep copy


This is not consistent with the standard library or C++ in general.  It
will seem strange that the pointer class
1) Does not require dereferencing
would you say the same if the class name was spelled 'document_ref'
instead ?
2) Contains a valid and non null value after default construction
right, see my followup post to that mail.

3) Has a constructor such as document_ptr( "config.xml" )
4) Has member functions such as write_to_file
The alternative would allow both...

boost::shared_ptr< dom::document > doc( new dom::document() );
boost::shared_ptr< dom::document > doc1( doc );
dom::document doc2( *doc1 );
and if the 'doc1' reference was non-owning...

dom::document doc;  // Create new doc
dom::document & doc1( doc ); // Second reference
dom::document doc2( doc1 );  // Deep copy
right, but given such an approach, what would nodes return in their
'parent()' method ?
Regards,
Stefan


Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
On Thu, 2003-06-26 at 16:18, Stefan Seefeld wrote:
 Stefan Seefeld wrote:
 
  And I don't use a 'document' class, as that is managed implicitly
  by my dom::document_ptr:
  
  dom::document_ptr document; // create new document
 
 that should actually become
 
 dom::document_ptr document = dom::make_document("1.0");

Hm now

dom::document doc( "1.0" );

looks even nicer :-)

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
On Thu, 2003-06-26 at 18:32, Stefan Seefeld wrote:
 Hamish Mackenzie wrote:
  On Thu, 2003-06-26 at 16:04, Stefan Seefeld wrote:
  
 I don't really understand why we need three different classes to
 manage documents. In particular I don't understand why you provide
 a 'document_ptr' that is a wrapper around document_ref.
  
  
  The document_ref and document_ptr would only be used when a non owning
  reference or pointer is required.  Even then you could use
  dom::document * and dom::document & instead in most cases.
  
  One big difference between a reference and a pointer is that a reference
  must contain a valid non null value.
 
 ok, but are these types really needed ?

IMHO yes

 My current proposal only provides dom::document_ptr, and I use implicit
 refcounting on the underlying document tree. It seems to work quite
 fine. I provide a bool operator () that tells me whether the
 document_ptr is referring to a document or not.

I would rather have a debug version that tracked
pointers/iterators/references and flagged (at runtime) their use when
they are invalid.  This could catch a wider range of problems including
the use of nodes whose parents have been erased.

 And I don't use a 'document' class, as that is managed implicitly
 by my dom::document_ptr:
 
 dom::document_ptr document; // create new document
 dom::document_ptr doc(document); // create second reference to it
 dom::document_ptr doc2 = document.clone(); // clone it, i.e. make deep copy
  
  
  This is not consistent with the standard library or C++ in general.  It
  will seem strange that the pointer class
  1) Does not require dereferencing
 
 would you say the same if the class name was spelled 'document_ref'
 instead ?

1 & 4 would be ok, but 3 would stand and having an 'operator bool' would
be added to the list.

  3) Has a constructor such as document_ptr( "config.xml" )
  4) Has member functions such as write_to_file
  
  The alternative would allow both...
  
  boost::shared_ptr< dom::document > doc( new dom::document() );
  boost::shared_ptr< dom::document > doc1( doc );
  dom::document doc2( *doc1 );
  
  and if the 'doc1' reference was non-owning...
  
  dom::document doc;  // Create new doc
  dom::document & doc1( doc ); // Second reference
  dom::document doc2( doc1 );  // Deep copy
 
 right, but given such an approach, what would nodes return in their
 'parent()' method ?

The parent is always an element (is that right?) so it would return
element_ptr or element_ref.

I feel the correct answer is element_ptr, because, presumably,
root.parent() is null and you can't have a null reference.

You can apply the same logic to doc.root().  If a document can have a
null root then doc.root() should return a pointer.  If it can't then it
should return a reference.

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Stefan Seefeld
Hamish Mackenzie wrote:

And I don't use a 'document' class, as that is managed implicitly
by my dom::document_ptr:
dom::document_ptr document; // create new document
dom::document_ptr doc(document); // create second reference to it
dom::document_ptr doc2 = document.clone(); // clone it, i.e. make deep copy


This is not consistent with the standard library or C++ in general.  It
will seem strange that the pointer class
1) Does not require dereferencing
would you say the same if the class name was spelled 'document_ref'
instead ?


1 & 4 would be ok, but 3 would stand and having an 'operator bool' would
be added to the list.

3) Has a constructor such as document_ptr( "config.xml" )
4) Has member functions such as write_to_file
The alternative would allow both...

boost::shared_ptr< dom::document > doc( new dom::document() );
boost::shared_ptr< dom::document > doc1( doc );
dom::document doc2( *doc1 );
and if the 'doc1' reference was non-owning...

dom::document doc;  // Create new doc
dom::document & doc1( doc ); // Second reference
dom::document doc2( doc1 );  // Deep copy
right, but given such an approach, what would nodes return in their
'parent()' method ?


The parent is always an element (is that right?) so it would return
element_ptr or element_ref.
yes. Sorry, I meant to ask what 'document()' would return.

Stefan



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
On Thu, 2003-06-26 at 19:51, Stefan Seefeld wrote:
 Hamish Mackenzie wrote:
 
 And I don't use a 'document' class, as that is managed implicitly
 by my dom::document_ptr:
 
 dom::document_ptr document; // create new document
 dom::document_ptr doc(document); // create second reference to it
 dom::document_ptr doc2 = document.clone(); // clone it, i.e. make deep copy
 
 
 This is not consistent with the standard library or C++ in general.  It
 will seem strange that the pointer class
 1) Does not require dereferencing
 
 would you say the same if the class name was spelled 'document_ref'
 instead ?
  
  
  1 & 4 would be ok, but 3 would stand and having an 'operator bool' would
  be added to the list.
  
  
 3) Has a constructor such as document_ptr( "config.xml" )
 4) Has member functions such as write_to_file
 
 The alternative would allow both...
 
 boost::shared_ptr< dom::document > doc( new dom::document() );
 boost::shared_ptr< dom::document > doc1( doc );
 dom::document doc2( *doc1 );
 
 and if the 'doc1' reference was non-owning...
 
 dom::document doc;  // Create new doc
 dom::document & doc1( doc ); // Second reference
 dom::document doc2( doc1 );  // Deep copy
 
 right, but given such an approach, what would nodes return in their
 'parent()' method ?
  
  
  The parent is always an element (is that right?) so it would return
  element_ptr or element_ref.
 
 yes. Sorry, I meant to ask what 'document()' would return.

Assuming xmlNode::doc cannot be null it would return document_ref.

You might be worried about...

dom::document dom;
assert( &dom.root().document() == &dom );

I think this can be made to work with

bool operator ==( document * p1, document_ref * p2 )
{
  return p1->raw_ == p2->raw_;
}

bool operator ==( document_ref * p1, document * p2 )
{
  return p1->raw_ == p2->raw_;
}

etc.

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
On Thu, 2003-06-26 at 21:00, Hamish Mackenzie wrote:
 You might be worried about...
 
 dom::document dom;
 assert( &dom.root().document() == &dom );
 
 I think this can be made to work with
 
 bool operator ==( document * p1, document_ref * p2 )
 {
   return p1->raw_ == p2->raw_;
 }
 
 bool operator ==( document_ref * p1, document * p2 )
 {
   return p1->raw_ == p2->raw_;
 }

Actually that wouldn't work as &dom.root().document() would fail to
compile.  (not without caching document_ref in node_ref)  But this might
be ok..

dom::document doc;
dom::document_ref doc2( doc.root().document() );
assert( &doc2 == &doc );

and...

assert( doc2 == doc );

Can be implemented but ideally it would compare all the nodes in the
document.

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Stefan Seefeld
Hamish Mackenzie wrote:

dom::document doc;
dom::document_ref doc2( doc.root().document() );
assert( &doc2 == &doc );
and...

assert( doc2 == doc );

Can be implemented but ideally it would compare all the nodes in the
document.
well, that's different. Do you want to know whether both documents are 
equal, or whether they are identical, i.e. whether both references point
to the same document ?

Hmm, just to check whether we are still talking about the same thing
here: do we agree that there can't be a 'node' type, i.e. just a 
'node_ref'/'node_ptr' ?

Else it would be impossible to make that API a wrapper around libs like
libxml2.
Stefan



Re: [boost] API Review request: XML API for C++, second round

2003-06-26 Thread Hamish Mackenzie
On Thu, 2003-06-26 at 21:39, Stefan Seefeld wrote:
 Hamish Mackenzie wrote:
 
  dom::document doc;
  dom::document_ref doc2( doc.root().document() );
  assert( &doc2 == &doc );
  
  and...
  
  assert( doc2 == doc );
  
  Can be implemented but ideally it would compare all the nodes in the
  document.
 
 well, that's different. Do you want to know whether both documents are 
 equal, or whether they are identical, i.e. whether both references point
 to the same document ?

&doc2 == &doc would test if doc and doc2 refer to the same libxml2 doc
doc2 == doc would compare documents to see if they are equal

I think you could do the same for nodes.  The equality check would look
only at child nodes (otherwise you might as well compare the documents)
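
To make the two checks concrete, here is a toy sketch.  The document
struct below is hypothetical (a flat list of node names rather than a
libxml2 tree), and same_document() is an invented helper name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy document: owns a flat list of node names.
struct document {
    std::vector<std::string> nodes;
};

// Equality: both documents have the same content.
bool operator==(const document& a, const document& b) { return a.nodes == b.nodes; }
bool operator!=(const document& a, const document& b) { return !(a == b); }

// Identity: both names refer to the very same document object.
bool same_document(const document& a, const document& b) { return &a == &b; }
```

A deep copy compares equal to its source but is not identical to it,
while a plain C++ reference to the source is both.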

 Hmm, just to check whether we are still talking about the same thing
 here: do we agree that there can't be a 'node' type, i.e. just a 
 'node_ref'/'node_ptr' ?

 Else it would be impossible to make that API a wrapper around libs like
 libxml2.

I can live without a node class.  And I don't know if it would be any
easier to make a deep copy node in MSXML or any other xml lib.

I don't think it is impossible just not very easy or efficient.  You
would have to copy the entire document into the node or at least part of
it.  Even then you would still need node_ptr and node_ref.  node could
look like this...

class node_ref
{
public:
  // All the node members
protected:
  xmlNode * raw_node_;
};

class node : public node_ref
{
public:
  node( node_ref source ) : doc_( source.document() )
  {
raw_node_ = find_same_node_somehow( doc_, source );
// find_same_node_somehow would look up the copy of
// the source node in the copy of the document.
  }

private:
  document doc_;
};

document doc( "test.xml" );
node new_node( doc.root() );

node_ref root = doc.root();
assert( new_node == root );
assert( &new_node != &root );

document_ref new_doc = new_node.document();
assert( new_doc == doc );
assert( &new_doc != &doc );

So even erasing the root node of doc would not invalidate new_node, as
new_node has its own copy of the document which includes all the node's
children, parents and siblings.

I think it would work but it seems a little bonkers for value_type to
require an entire copy of the container.

I suppose you could just implement 'element' and restrict the copying to
the child nodes and not other relatives.  That way the node in question
would also be the root node of the copied document.  Can any element be
made the root of a new libxml2 document?  And does it make sense that
new_node.parent() would return null even if the original was not a root
node?

My gut feeling is that references, pointers and iterators only is best.

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-25 Thread Hamish Mackenzie
On Wed, 2003-06-25 at 01:12, Stefan Seefeld wrote:
 hi there,
 
 some weeks ago I proposed an API for XML, which triggered an interesting
 discussion. Hamish Mackenzie proposed a somewhat simpler mechanism to attach
 the C++ wrapper objects to the C structs from libxml2.
 
 I reworked the API to use that mechanism, so now using an xml document
 looks somewhat like:
 
 dom::document_ptr document("1.0");
 dom::element_ptr root = document.create_root_node("root");
 dom::element_ptr child = root.append_child("child");
 dom::text_ptr text = root.append_text("hello world");
 for (dom::element_ptr::child_iterator i = root.begin_children();
      i != root.end_children();
      ++i)
   std::cout << i->get_name() << std::endl;
 
 As the wrapper objects have reference semantics, I append '_ptr' to
 their name to stress that fact. A practical side-effect of this is
 that the document is now ref-counted, as it doesn't own the node-wrappers
 any more (as was the case in my former API).

Why should the node-wrappers keep the document alive?

Here is the analogy I think works best...

container -- document
container::value_type -- node
container::iterator -- node_iterator
container::pointer_type -- node_pointer
container::reference_type -- node_reference

Consider the following

std::vector< foo > x;
...
foo * y = &x[0];
x.erase( x.begin(), x.end() );

I don't expect y to add_ref x.  I wouldn't mind if it did, but it
wouldn't make y any more valid after the call to erase.

So the problem with add reffing the document is... what happens if the
root node or some parent of a node you have a pointer to is erased from
the document?  libxml2 has no way of knowing you have a pointer to a
child node.
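
A small sketch of why add-reffing the container does not help here.
element_handle is a hypothetical type that shares ownership of a vector
and addresses an element by index, so the example never touches a
dangling pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical handle: keeps its container alive through shared_ptr and
// refers to one element by position.
struct element_handle {
    std::shared_ptr<std::vector<std::string>> owner;
    std::size_t index;

    // The container is guaranteed to exist, but the element may be gone.
    bool valid() const { return index < owner->size(); }
};
```

Even when the handle is the last owner of the container, erasing the
elements leaves it pointing at nothing useful: the reference count kept
the container alive, not the element.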

I think the solution is that the node iterators/pointers/references
should not own anything (as is the convention with containers).  The
document owns the root node, the root node owns its children and so on.

If a particular implementation (such as MSXML) has ref counting built in,
that's fine, but it should not be a requirement.  The boost::xml
interface requirements could simply state that iterators, pointers and
references for nodes are invalidated by deleting the document to which
they belong or by erasing them from the document (erasing a parent node
erases its children).  It doesn't matter if they remain valid in some
implementations.  We can add a remove function, something like...

class document
{
public:
  std::auto_ptr< document > remove( node_iterator i );
};

This would take a node and all its children and put them in a new
document object.  An implementation that has an underlying mechanism for
nodes to exist without the implementation's document object could allow
xml::document to exist with just a root node and optimise remove
accordingly.
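
A rough sketch of what such a remove could look like on a toy tree where each node owns its children. Everything here is an assumption made for illustration: the node and document classes are stand-ins, remove takes a parent and a child index instead of a node_iterator, and std::unique_ptr is used in place of std::auto_ptr (which was removed in C++17).

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy tree: each node owns its children, the document owns the root.
struct node {
    std::string name;
    std::vector<std::unique_ptr<node>> children;
    explicit node(std::string n) : name(std::move(n)) {}

    node* append_child(std::string n) {
        children.push_back(std::make_unique<node>(std::move(n)));
        return children.back().get();
    }
};

class document {
public:
    explicit document(std::unique_ptr<node> root) : root_(std::move(root)) {}
    node* root() const { return root_.get(); }

    // Detach the index-th child of `parent` (with its whole subtree) and
    // return it as the root of a new, independently owned document.
    std::unique_ptr<document> remove(node& parent, std::size_t index) {
        std::unique_ptr<node> detached = std::move(parent.children.at(index));
        parent.children.erase(parent.children.begin()
                              + static_cast<std::ptrdiff_t>(index));
        return std::make_unique<document>(std::move(detached));
    }
private:
    std::unique_ptr<node> root_;
};

// Builds root -> child -> grandchild, removes `child`, and returns the
// name of the root of the extracted document.
inline std::string removed_root_name() {
    document doc(std::make_unique<node>("root"));
    node* child = doc.root()->append_child("child");
    child->append_child("grandchild");

    std::unique_ptr<document> extracted = doc.remove(*doc.root(), 0);
    assert(doc.root()->children.empty());            // gone from the original
    assert(extracted->root()->children.size() == 1); // subtree came along
    return extracted->root()->name;
}
```

Because ownership moves with the subtree, pointers into the removed nodes stay usable for as long as the returned document lives, without any reference counting on the original document.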

-- 
Hamish Mackenzie [EMAIL PROTECTED]



Re: [boost] API Review request: XML API for C++, second round

2003-06-25 Thread Stefan Seefeld
Hamish Mackenzie wrote:

> Why should the node-wrappers keep the document alive?

For consistency and convenience. In the same way you can get down from
the document to the individual nodes you can get up: node.parent() and
node.document() provide the means to walk up towards the document root.

node.begin_children() lets you iterate over all current child nodes,
i.e. the nodes it returns will all be valid, thus this iterator
interface is as stable as that of linked lists.
I expect the same from node.parent() and node.document(), i.e. as long
as there is an API to walk the tree, it should return valid objects.

That is not to say that these objects will remain valid over time.
As you point out, erasing the content of a container will invalidate
iterators you may still hold for that container.
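
For reference, the linked-list stability mentioned here can be sketched with std::list, used purely as a stand-in for a node's child list: an iterator survives insertions and erasures of other elements, and is only invalidated when its own element is erased.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Returns the name an iterator refers to after its siblings were changed.
inline std::string name_after_sibling_changes() {
    std::list<std::string> children = {"a", "b", "c"};

    auto b = std::next(children.begin()); // iterator to "b"

    children.push_front("first");         // inserting elsewhere: b stays valid
    children.push_back("last");
    children.erase(children.begin());     // erasing another element: b stays valid

    // Erasing "b" itself (or clearing the list) WOULD invalidate b, so the
    // iterator is only dereferenced while its element is still present.
    return *b;
}
```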

Imagine a factory object that is initialized with a dom::node_ptr
(holding configuration data, say). Whenever you access its methods,
the object looks up data in the node...

class factory
{
public:
  factory(dom::node_ptr n) : my_node(n) {}
  foo create_foo(); /* access my_node */
  bar create_bar(); /* access my_node */
private:
  dom::node_ptr my_node;
};

factory 'owns' the node, so wherever you instantiate the factory,
you'd read in a dom::document, look up the relevant node, and pass
that to the constructor:

factory *f;
{
  dom::document_ptr document("config.xml");
  dom::node_set set = document.root_node().find("//factory.info");
  f = new factory(set[0]);
} // document and set are now deleted, but factory still references
  // the document through its 'my_node' member

If the document wasn't ref-counted, you'd need to pass it along with
the node to the factory, as only the factory would know when to drop
it (in its destructor, presumably).
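
The lifetime behaviour described above can be sketched with std::shared_ptr. The document and node_ptr types below are hypothetical stand-ins, not the proposed dom:: classes: a node handle holds a reference on its document and looks its data up by path on each access.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for dom::document: a flat path -> text table.
struct document {
    std::map<std::string, std::string> nodes;
};

// Hypothetical stand-in for dom::node_ptr.
class node_ptr {
public:
    node_ptr(std::shared_ptr<document> doc, std::string path)
        : doc_(std::move(doc)), path_(std::move(path)) {}

    // Looks the node up on every access; empty string if it was erased.
    std::string text() const {
        auto it = doc_->nodes.find(path_);
        return it == doc_->nodes.end() ? std::string() : it->second;
    }
private:
    std::shared_ptr<document> doc_; // the "add-ref" on the document
    std::string path_;
};

class factory {
public:
    explicit factory(node_ptr n) : my_node(std::move(n)) {}
    std::string create_foo() const { return "foo:" + my_node.text(); }
private:
    node_ptr my_node;
};

inline std::string make_foo_after_document_scope_ends() {
    std::unique_ptr<factory> f;
    {
        auto doc = std::make_shared<document>();
        doc->nodes["/factory.info"] = "config";
        f = std::make_unique<factory>(node_ptr(doc, "/factory.info"));
    } // `doc` goes out of scope, but the node handle still holds a reference
    return f->create_foo();
}
```

The document is destroyed only when the factory (and with it the last node handle) is destroyed, so the caller never has to pass the document along separately.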

> Here is the analogy I think works best...
>
> container -- document
> container::value_type -- node
> container::iterator -- node_iterator
> container::pointer_type -- node_pointer
> container::reference_type -- node_reference

Hmm, that makes it look simpler than it actually is: is there really
a single 'value_type'? Is there really a single iterator? (Iterating
over all child nodes of a given parent and iterating over all attributes
are not the same.)
Also, what is a node_reference (as opposed to a pointer)?

> Consider the following
>
> std::vector< foo > x;
> ...
> foo * y = &x[0];
> x.erase( x.begin(), x.end() );
>
> I don't expect y to add_ref x.  I wouldn't mind if it did, but it
> wouldn't make y any more valid after the call to erase.

Agreed. However, you explicitly erase elements, while all I want to
prevent is implicit object destruction just because a reference to it
goes out of scope.

> So the problem with add reffing the document is... what happens if the
> root node or some parent of a node you have a pointer to is erased from
> the document?  libxml2 has no way of knowing you have a pointer to a
> child node.

That's right. And adding such a feature may be quite expensive in terms
of memory and performance. As I said, I don't feel it's a problem, as you
would explicitly remove the node, so you should know what you are doing
anyway. (As you said, you wouldn't even expect an iterator to be valid
after you erase the container's content.)

Regards,
Stefan


Re: [boost] API Review request: XML API for C++, second round

2003-06-25 Thread Daryle Walker
On Tuesday, June 24, 2003, at 8:12 PM, Stefan Seefeld wrote:

[SNIP]
> As the wrapper objects have reference semantics, I append '_ptr' to
> their name to stress that fact. A practical side-effect of this is
[TRUNCATE]

Shouldn't the type names use a suffix of _ref instead?  (I don't need 
to know that they're [possibly] implemented as pointers.)

Daryle



Re: [boost] API Review request: XML API for C++, second round

2003-06-25 Thread Stefan Seefeld
Daryle Walker wrote:
> On Tuesday, June 24, 2003, at 8:12 PM, Stefan Seefeld wrote:
>
> [SNIP]
>
>> As the wrapper objects have reference semantics, I append '_ptr' to
>> their name to stress that fact. A practical side-effect of this is
> [TRUNCATE]
>
> Shouldn't the type names use a suffix of _ref instead?  (I don't need
> to know that they're [possibly] implemented as pointers.)

It seems 'pointer' has a very precise (C/C++) meaning for you.
I just used _ptr the same way it is used in CORBA (i.e. the C++
mapping), where it doesn't imply anything about the implementation.
I believe _ptr and _ref are fairly equivalent.

Regards,
Stefan


Re: [boost] API Review request: XML API for C++, second round

2003-06-25 Thread Stefan Seefeld
Glen Knowles wrote:

> _ptr has a very specific meaning in CORBA as well: you must explicitly
> manage the deletion of the object yourself, like, well... a pointer. If
> you must use the CORBA namings, this, at my first look, seems closer to
> _var than _ptr. At which point I also think _ref is a better choice.

Well, right, I was thinking of the relationship between _ptr (proxy) and
servant, not of the _ptr - proxy relationship.
I don't think _ptr means that the object in question is a pointer to a
proxy (and the _ptr - _var distinction really is about the proxy
management), but pointer-to-the-servant. In that sense there is no
semantic difference between _ptr and _var.

Anyway, I don't really have any preference, so if everybody here would
be more comfortable with _ref I can certainly switch. I just don't want
to go back and forth all day :-)

Regards,
Stefan


[boost] API Review request: XML API for C++, second round

2003-06-24 Thread Stefan Seefeld
hi there,

some weeks ago I proposed an API for XML, which triggered an interesting
discussion. Hamish Mackenzie proposed a somewhat simpler mechanism to attach
the C++ wrapper objects to the C structs from libxml2.

I reworked the API to use that mechanism, so now using an xml document
looks somewhat like:

dom::document_ptr document("1.0");
dom::element_ptr root = document.create_root_node("root");
dom::element_ptr child = root.append_child("child");
dom::text_ptr text = root.append_text("hello world");
for (dom::element_ptr::child_iterator i = root.begin_children();
     i != root.end_children();
     ++i)
  std::cout << i->get_name() << std::endl;

As the wrapper objects have reference semantics, I append '_ptr' to
their name to stress that fact. A practical side-effect of this is
that the document is now ref-counted, as it doesn't own the node-wrappers
any more (as was the case in my former API).

Please review the package 'xml++-2003-06-24.tgz' at
http://groups.yahoo.com/group/boost/files/xml/

Kind regards,
Stefan