I believe the choice made (validating the collection) is the correct one. It most closely represents what happens in the real world. Even with widely different data and usage, it still makes sense to validate the collection.
For example, in a business setting, invoices generated and sent to customers could be stored in Xindice. It is appropriate to validate the collection since all the documents are, by their nature, similar in content and structure. The same is true if Xindice were used instead to store the chapters of a book: each chapter has similar content and structure. We could just as easily throw the periodic table of the elements into Xindice and science would give a "thumbs up" to the collection approach. I recognize that there may be occasions where the content is substantially dissimilar. In that case, we simply put the documents into separate collections and still get the desired validation outcome. +1 for the path taken as the logical choice.
