Yes.  I think we are in substantial agreement.  I was trying to be careful in 
my wording, choosing to describe the documents in the collection as being 
similar in content and structure (there exists a natural association among 
them).

I would also point out that there is not anything that compels them to be 
similar.  If validation is not required (as you point out), documents of any 
arbitrary structure or content can be placed in a collection.

The same argument for refactoring similar classes into a super class and its 
sub-classes can be made here.  There's something vexing about a collection of 
similar documents each with its own DTD or schema.  Because the documents that 
the DTDs or schemas are meant to describe are similar, the DTDs/schemas must 
themselves be similar.  It's the valueless redundancy that nags at us and 
causes us to look for a way to "clean" it up with a single definition that 
describes them all.

-----Original Message-----
From: Mark J. Stang [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 03, 2002 9:52 PM
To: [EMAIL PROTECTED]
Subject: Re: Data or Documents for Xindice 2.0 (was Re: XSD or DTD validation?)

I think one view is that a collection is homogeneous.  And it appears that this
is Mike's view; correct me if I am wrong.   In general, most of the documents
in my collections could be described by a DTD.   However, I have one collection
that is a collection of different types of documents.   I would prefer not to
have to create an individual collection for a single document.   I would also
not want to be constrained in having only one type of document in a collection.

In one of my collections, I am storing RepairOrders.   I also have a document
that is a list of all the "Open" RepairOrders.   I either have to have a very
slick DTD to cover both or put my list in another collection.   Seems artificial
to have a single collection for one document.

DTDs can be very useful when you are receiving documents from the outside
world.   They can help maintain the correctness of the data.   They remind me
of flexible schemas.

I for one do not need any of the above or the overhead that comes with it.
I don't have outside documents.   I will rely on developers and QA for the
correctness of my documents.   And I choose XML because it gives me
the flexibility to morph my document into any form that fits my Customer, not
my DBAs or Developers.

I see XML documents as data that comes in many formats.   In so many formats
that a DTD would be useless.

+1 for NOT requiring DTDs and defining collections as being ANY document.
DTDs as optional with no overhead for not using them is fine with me.
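
To make the "optional, with no overhead" idea concrete, here is a rough sketch
of a collection whose validator is optional.  It is not the Xindice API: the
Collection class, the is_repair_order check, and the element names are all
hypothetical, and the validator only looks at the root element name as a
stand-in for real DTD validation (which Python's standard library does not
provide).

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Optional

# Hypothetical names throughout; this is not the Xindice API.
Validator = Callable[[ET.Element], bool]


class Collection:
    """A document collection with an optional validator."""

    def __init__(self, validator: Optional[Validator] = None) -> None:
        self.validator = validator
        self.documents: Dict[str, ET.Element] = {}

    def insert(self, key: str, xml_text: str) -> None:
        # Parsing enforces well-formedness only (raises ParseError otherwise).
        root = ET.fromstring(xml_text)
        # Validation runs only when a validator was configured.
        if self.validator is not None and not self.validator(root):
            raise ValueError(f"document {key!r} rejected by validator")
        self.documents[key] = root


def is_repair_order(root: ET.Element) -> bool:
    # Stand-in for DTD validation: checks only the root element name.
    return root.tag == "RepairOrder"


# No validator: the collection accepts ANY well-formed document.
mixed = Collection()
mixed.insert("ro-1", "<RepairOrder id='1'/>")
mixed.insert("open", "<OpenRepairOrders><ref id='1'/></OpenRepairOrders>")

# With a validator: the collection is homogeneous.
strict = Collection(validator=is_repair_order)
strict.insert("ro-1", "<RepairOrder id='1'/>")
```

With no validator, the list of open RepairOrders sits next to the RepairOrders
themselves; with one, the same insert into strict would raise ValueError and
the list would have to go into its own collection.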

Mike Mortensen wrote:

> I believe that the choice made (validating the collection) is the correct one.  
> It most closely represents what happens in the real world.  Even with widely 
> different data and usage, it still makes sense to validate the collection.
>
> For example, if used in business, invoices generated and sent to customers 
> could be stored in Xindice.  It is appropriate to validate the collection 
> since all documents are, by their nature, similar in content and structure.  
> The case is likewise true if Xindice were used instead to store the chapters 
> of a book.  Each chapter has similar content and structure.  We could just as 
> easily throw the periodic table of the elements into Xindice and science 
> would give "thumbs up" to the collection approach.
>
> I recognize that there may be occasions where the content is substantially 
> dissimilar.  In this case, we simply put the documents into separate 
> collections and still get the desired validation outcome.
> +1 for the path taken as the logical choice.
