> I'd really be interested in getting a hold of a C implementation of
> an RFC-822 parser. I could even help you package it up and distribute
> it or something.

I'm not confident that the thing is fully RFC-822 compliant yet, but I have never had problems processing emails and HTTP requests with it. I "just hacked" it together when my prototype code became too slow.

> However: I hate surrogate keys, that is, unique IDs that are made up.
> I don't want to support them, I don't want them at all. They are not

I'm not sure that this will work out. See, when you malloc something, you don't care about the address. The same goes for those ids. Look at them as if they were virtual addresses in a document address space.

> inherently meaningful to people, therefore they are not meaningful to
> the user, and make debugging and maintenance very difficult.

Sure, they are not meaningful to people. But my first experience taught me that debugging is not that hard. What you want to do is hide them from people. This is done by giving names, i.e., aliases. One id can have any number of names/aliases, and those look like file names. You (or rather, I) also want to put maintenance under software control.

BTW: I don't think you have a chance of getting rid of the unique id after all. Or don't your files have inode numbers? Well, inode numbers get reused, just as file names do, which is a bad idea for documents.

> Furthermore: I foresee the document ID (in my scheme, something like
> debian-policy/policy.html/index.html) enabling multiple
> interpretations! For instance, "in the documentation area of the
> package debian-policy, the file policy.html/index.html". This is

Right. The naming *must* be local; the ids are supposed to be unique and global. Global naming never works: some unforeseen issue will always come up and break the whole thing. You will have to have both: a unique id, which is not for people but for the machine, and a name, which is interpreted by both the human and the machine.

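The id/alias split described above can be illustrated with a minimal Python sketch. It is not the actual implementation discussed in this thread; the class `DocumentStore` and all of its method names are hypothetical, and it only assumes what the text states: ids are opaque and never reused, and one id can carry any number of human-readable aliases.

```python
import itertools


class DocumentStore:
    """Toy store (hypothetical): opaque ids for the machine, aliases for people."""

    def __init__(self):
        self._next_id = itertools.count(1)  # monotonically increasing, so ids are never reused
        self._docs = {}     # id -> document body
        self._aliases = {}  # alias (looks like a file name) -> id

    def store(self, body):
        """Store a document and return its freshly assigned id."""
        doc_id = next(self._next_id)
        self._docs[doc_id] = body
        return doc_id

    def alias(self, name, doc_id):
        """Attach one more human-readable name to an existing id."""
        if doc_id not in self._docs:
            raise KeyError(doc_id)
        self._aliases[name] = doc_id

    def lookup(self, name):
        """Resolve an alias to the document body."""
        return self._docs[self._aliases[name]]

    def names_of(self, doc_id):
        """Return all aliases of an id, sorted."""
        return sorted(n for n, i in self._aliases.items() if i == doc_id)

    def delete(self, doc_id):
        """Remove a document and its aliases; the id itself is retired for good."""
        del self._docs[doc_id]
        for name in [n for n, i in self._aliases.items() if i == doc_id]:
            del self._aliases[name]
```

Unlike an inode number, an id retired by `delete` is never handed out again, so a stale reference can never silently point at a different document.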
When I was working on my thesis, I experimented for some time with those issues and found two related things. The first is some work I don't have at hand, from human language research (structuralism). My English skills, and the fact that this was too long ago, both prevent me from giving any details right now. The second is the naming solution in VSTa (an experimental microkernel OS). It has no global names in the file system, and it makes it really easy to configure things that are hard to do with Unix etc. To put it short, the lesson I learned was: "drop all global naming, it is meaningless". (Sure, that's an overgeneralization; don't flame, I just want to give a picture here.)

Within the WrapBit design, a path such as the one you gave (debian-policy/policy.html/index.html) is interpreted step by step. That is, 'debian-policy' knows somehow where to find 'policy.html', which in turn might use a completely different mechanism to figure out what 'index.html' really is.

Or, for a totally unrelated real-world example: there is a [EMAIL PROTECTED] cc-ed. This will deliver the mail into the first piece of such a path. That piece will just filter out some mailing lists, find the mail not related to anything, and forward it to another thingy which gets all the remaining mail. That in turn will simply store it, assigning a pretty meaningless number. I can reference the mail via an md5sum, automatically assigned when the message is stored, or via that number. But on another page I can put a link, assign a different name, and find it by a totally different search mechanism. (Even better, as long as there is one such link, the document never gets lost.)

> important to do because I think the documentation area is going to
> move from /usr/doc to /usr/share/doc . Furthermore, it's important for
> ultimate URN support. This would enable me to say
> "debian:debian-policy/policy.html/index.html" or however one would
> formulate it as a URN, and if the user doesn't have the package
> installed, the URN server would redirect to a suitable mirror.

Agreed.
I always envisioned such a scheme, which allows fetching a certain document from wherever if it is not present at a given point. I think that in the example above the 'debian:debian-policy' part, or whatever it ends up being, should find the mirror.

While writing this I've got the impression that we will have to clarify what we are addressing. My concern is a general mechanism for handling documents. For the Debian docs I believe a specialized instance is needed. We should separate those issues, otherwise everything will be confusing.

We should get the ip-phone going. I have a hard time explaining the idea in writing whenever I try. But whenever I follow it and apply it to some real problem, it proves so useful.

> Probably what I should do is force, for relative URLs, the first
> argument, "debian-policy" to be in fact the package name. If the
> actual resource is *not* in the doc area of the package holding the
> metadata, the developer should use an absolute file URL. I'm not sure
> that this isn't too draconian. Let me think about this a bit more.

I didn't get that.

> >> 2.1 Automatic Document Conversion
>
> > This all sounds as if we should tweak the rabbit a little.
>
> I'd love that. 'install-docs' *has* to be a quick fast program, and

Hm, I'm not sure that it is really quick yet. Too much shell. Too many fork/execs where threads could be used. But it is good enough for interactive use if you can bear a little waiting.

> small, since it's on or near abouts the base system. So you think we

It's *still* small (170k gzipped tar for the source + 700k tgz of test data & docs [which are actually the same]), but it needs a lot of other things we would have to get rid of.

> could modify rabbit to reference some local document store (i.e., a
> little database of metadata). I'll have to take a look at your code.

But it is actually nothing but a (so far) local document store, holding data and metadata and enforcing a little bit of policy (at a level where I don't know whether this is policy or mechanism).

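The step-by-step interpretation of a path such as debian-policy/policy.html/index.html, mentioned earlier in this reply, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not WrapBit's code: the classes `Directory` and `Computed` and the function `lookup` are hypothetical names, and the only idea taken from the text is that each component resolves the next one with a mechanism of its own, so no global name space is needed.

```python
class Directory:
    """Hypothetical node that resolves the next component by table lookup."""

    def __init__(self, entries):
        self.entries = entries  # local name -> next node

    def resolve(self, name):
        return self.entries[name]


class Computed:
    """Hypothetical node that resolves the next component with any function."""

    def __init__(self, func):
        self.func = func

    def resolve(self, name):
        return self.func(name)


def lookup(root, path):
    """Walk the path one component at a time; each node interprets only its own step."""
    node = root
    for part in path.split("/"):
        node = node.resolve(part)
    return node


# 'debian-policy' uses a plain table, 'policy.html' uses a different mechanism.
policy_html = Computed(lambda name: "generated:" + name)
root = Directory({"debian-policy": Directory({"policy.html": policy_html})})
print(lookup(root, "debian-policy/policy.html/index.html"))  # prints generated:index.html
```

A node that cannot find a name locally could just as well fetch it from a mirror inside its own `resolve`; the caller would not see the difference.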
so long
/Jerry

