Everybody who has tried it knows that using a relational database to store semi-structured content is a suicidal move. You can use all the object-relational or XML-relational or text-indexing database-specific capabilities in the world and still feel like there is something wrong.

It's like storing relational data on the file system directly. Sure, it can be done, but it feels wrong, no matter what.

With linotype, I thought about what repository to use, went over the whole stinking range and decided, WTF, I'm going to write on the stinking file system and forget it.

It was lazy-driven design. Less is more to the very bone. But it might be the most revolutionary move in content management over the last few years. Let me explain why.

I gained tons of flexibility and an access API named java.io, which is very practical and simple to use. Moving stuff around? mv. Copying? cp. Copying remotely through a secure channel? scp. Backup? cronned tar. Replication on multiple machines? untar.
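To make that concrete, here is a minimal sketch of those operations on a throwaway repository. The directory layout and file names are made up for illustration, the "second machine" is just another temp directory, and scp is left out because it needs a real host.

```shell
#!/bin/sh
# Sketch: a content repository that is nothing more than a directory tree.
# All paths are hypothetical and live under throwaway temp directories.
set -eu

repo=$(mktemp -d)      # stands in for the repository root
mirror=$(mktemp -d)    # stands in for a second machine

mkdir -p "$repo/articles"
echo "<p>hello</p>" > "$repo/articles/draft.xml"

# moving stuff around
mv "$repo/articles/draft.xml" "$repo/articles/story.xml"

# copying
cp "$repo/articles/story.xml" "$repo/articles/story-copy.xml"

# backup (in practice this line would be run from cron)
tar -cf "$repo.tar" -C "$repo" .

# replication: untar the backup somewhere else
tar -xf "$repo.tar" -C "$mirror"

# the replica holds the same content as the original
cat "$mirror/articles/story.xml"
```

Nothing here is specific to content management, which is the whole point: the tools already exist.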

Consider doing the same thing over, say, a WebDAV repository. You need to write the clients that do that... or... you need a (Mac OS X-like) WebDAV file system mounter.

now.

If you think about it, there are not many file systems around. FAT? crap. NTFS? better but still crap (doesn't even have symlinks, go figure) and then we have the entire range of POSIX filesystems that differ in implementation but all share the same design.

This morning I found this

http://www.namesys.com/v4/pseudo.html

and I was *literally* blown away.

The concept seems borrowed right from our sitemap:

 - foo -> a file
 - foo/ -> a directory

The stupid default behavior of web servers, interpreting foo as one or the other depending on the file system layout, prevented people from understanding the "semantic" difference between the two on the web, but it also prevented the web people from influencing file system design.

I admit that I thought POSIX was so carved in stone that I never even considered extending it to provide the functionality that was missing right at the OS level! So I'm as guilty as anybody else.

But think about it:

/document

is the full document

/document/html/body/p

is the content of the p node, so

echo "whatever" > /document/html/body/p

changes the content of the node! Right from the command line. You can use symlinks to share content between your documents!!! You can have ACLs at the node level!!! Just write an xgrep and you are able to do XPath searches from the command line! Just add an XML parser and you are able to do something like

cat xhtml.dtd > /document/@schema

and when you save your stuff as /document, the file system will trigger an error if the file is not valid!
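You can get a feel for the addressing today with plain directories. This is only an emulation under my own assumptions: each node is an ordinary file or directory in a temp tree, /document is a directory and not also a readable file, there is no schema validation, and recursive grep is a crude stand-in for the xgrep that doesn't exist yet.

```shell
#!/bin/sh
# Emulation only: ordinary directories and files standing in for XML nodes.
# A real pseudo-file system would expose these paths by itself.
set -eu

root=$(mktemp -d)
mkdir -p "$root/document/html/body"

# write the content of the p node
echo "whatever" > "$root/document/html/body/p"

# share the same node with a second (hypothetical) document via a symlink
mkdir -p "$root/other/html/body"
ln -s "$root/document/html/body/p" "$root/other/html/body/p"

# a change made through one path is visible through the other
echo "changed" > "$root/other/html/body/p"
cat "$root/document/html/body/p"

# crude stand-in for xgrep: list the node files whose content matches
grep -rl "changed" "$root/document"
```

Node-level ACLs would fall out of the same layout, since each node is just something you can chmod.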

Imagine all the stuff that you can do with this! And the file system can be journalled, offer node-level transactional capabilities, you name it! All right in the OS. fast fast fast!

What? it's not remote, you say? get over it: scp and duct tape can make it as remote as anything.

Forget WebDAV, forget JSR 170: your perfect repository might be, thanks to the pseudo-file concept, just a new Linux module away.

--
Stefano.


