Stefano Mazzocchi wrote:
> I see a native XML database as an incredibly great DBMS for
> semi-structured data and an incredibly poor DBMS for structured data.

I don't think anyone's debating that, though I wouldn't use the label
'incredibly poor' for structured data, especially since relational DBs
can't answer the question of what structured data is either... I don't
consider normalization and joins to be structure so much as a rigid
decomposition of structure.

> Corba? no thanks, I need WebDAV.

As much as all of us hate it, CORBA absolutely has its uses. We could
never get away with wire-compression if we were using a 'service the
world' WebDAV style approach. Wire compression has bought us performance
gains, though not enough to justify keeping it exclusively.

> Joins? no thanks, I need document fragment aggregation.

In the context of XML, I think these are the same.

> XMLSchemas? no thanks, I need infoset-neutral RelaxNG validation.

Personally, and I'm just reiterating things I've said in the past, I hate
W3C XML Schemas, and many others do as well. I don't want to put ourselves
in a position where we're forced to choose any one validation mechanism to
the detriment of our users. So if we can continue to push validation to
the client application, that's the track we should take... for a couple of
important reasons: (1) Performance: validation is slow, and bogging down
the server to perform it can only cause problems, and (2) Choice: if we
standardize on W3C Schemas, then we exclude support for other schema
specifications. I think that's unwise, especially with the major backlash
that XML Schemas has received.

> If you have structured data, you can't beat the relational model. This
> is the result of 50 years of database research: do we *really* believe
> we are smarter/wiser/deeper-thinkers than all the people that worked on
> the database industry since the 50's?

One might argue that the relational database industry hasn't learned very
much in the decades that it's been around.
Not that I'm saying XML databases are better, but relational databases
were created to solve the problems of the databases of their time. That
time has passed. There are still a lot of applications that have the
problem that relational databases are trying to solve, but there are many
applications that have the problem that XML databases are trying to solve.
Further still, there are apps whose problems no database can adequately
solve.

> I see two big fields where XIndice can make a difference (and this is
> the reason why I wanted this project to move under Apache in the first
> place!):
>
> - web services
> - content management systems

Don't forget health care, legal documents, and scientific applications.
These are three areas where Xindice has organically found a home since
its creation.

> - one big tree with nodes flavor (following .NET blue/red nodes):
> follows the design patterns of file systems with folders, files,
> symlinks and such. [great would be the ability to dump the entire thing
> as a huge namespaced XML file to allow easy backup and duplication]
> - node-granular and ACL-based authorization and security [great would
> be the ability to make nodes 'transparent' for those people who don't
> have access to see them]
>
> - file system-like direct access (WebDAV instead of useless XUpdate!)
> [great for editing solutions since XUpdate requires the editor to get
> the document, perform the diff and send the diff, while the same
> operation can be performed by the server with one less connection, this
> is what CVS does!]

Woah! Stop right there. XUpdate is far from useless, and your explanation
of how it works, in the context of Xindice, is incorrect. When you perform
an XUpdate query, it's sent to the server, which performs all of the work.
No document is ever sent to the client; all the client gets back is a
summary of how many nodes were touched by the update.
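
To make the server-side flow described above concrete, here is a toy
Python sketch. It is not Xindice's API and it does not parse real XUpdate
syntax: the ToyCollection class, its update() method, and the sample
documents are all made up for illustration, and Python's ElementTree only
supports a small subset of XPath. The only point is the shape of the
exchange: one command goes in, every document in the collection is
modified in place, and nothing but a count comes back.

```python
import xml.etree.ElementTree as ET


class ToyCollection:
    """Hypothetical stand-in for a server-side collection (NOT Xindice)."""

    def __init__(self, documents):
        # Parsed documents live "on the server", keyed by document name.
        self._docs = {key: ET.fromstring(xml) for key, xml in documents.items()}

    def update(self, select, new_text):
        """Set the text of every node matching `select` in every document.

        Only the number of touched nodes is returned; no document
        content travels back to the caller.
        """
        touched = 0
        for root in self._docs.values():
            for node in root.findall(select):
                node.text = new_text
                touched += 1
        return touched

    def get(self, key):
        # Separate, explicit retrieval; update() never needs it.
        return ET.tostring(self._docs[key], encoding="unicode")


collection = ToyCollection({
    "a": "<person><name>Ann</name><status>new</status></person>",
    "b": "<person><name>Bob</name><status>new</status></person>",
    "c": "<person><name>Cy</name></person>",
})

# One command, applied across the whole collection.
touched = collection.update("./status", "reviewed")
print(touched)
```

Documents "a" and "b" each have a status element and "c" has none, so the
caller sees only the count of touched nodes and never fetches, diffs, or
resends a document.
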
XUpdate actually performs very well, because you can modify every single
document in a collection, taking several different actions, with a single
command.

> - internal aggregation of document fragments (the equivalent of file
> system symlinks) [content aggregation at the database level will be much
> faster than aggregation at the publishing level, very useful for content
> that must be included in the same place... should replace the notion of
> XML entities]

We have this functionality in a very experimental form. It's called
AutoLinking. It's been around for a while, but it's going away at some
point, to be replaced by XQuery. The problem with it is that you have to
modify the structure of your XML content, so it can't be treated as data.
XQuery will allow this aggregation using the data in the documents rather
than instructions within the document. Beyond that, there's nothing
stopping somebody from using XLink; it's just not a task that the server
will perform, because of the passive nature of XLinks.

> - native metadata support (last modified time, author, etc..) [vital
> for any useful caching system around the engine!]

Some of this is already available, but there's currently no way to expose
it.

> - node-granular event triggers [inverts the control of the database:
> when something happens the database does something, useful mostly to
> avoid expensive validity lookup for cached resources]

We talked about this early on in developing the product, but decided to
put it on a back burner for a while... probably for the same reason we
decided to shelve any specification validation system.

> In short: I'd like to have a file system able to decompose XML documents
> and store each single node as a file, scale to billions of nodes and
> perform fast queries with XPath-like syntaxes.

This is not too far from where we are at the moment.
Nodes are individually addressable, but we cluster them into Documents
for atomicity, much like an object database will cluster objects together
in a way that ensures optimal I/O performance.

> This is my vision.

Now if this can work within the framework of my vision then nobody'll get
hurt. :-)

> Now, with my years-old asbesto underwear on, I'll be ready for your
> comments :)

--
Tom Bradford - http://www.tbradford.org
Developer - Apache Xindice (formerly dbXML)