> this is possible, but it assumes, essentially, that one doesn't run into
> such a limit.
>
> if one gets to a point where every "fundamental" concept is only ever
> expressed once, and everything is built from preceding fundamental
> concepts, then this is a limit, short of dropping fundamental concepts.
Yes, but I don't think any theoretical framework can tell us a priori how
close we are to that limit. The fact that we run out of ideas doesn't mean
there are no more new ideas waiting to be discovered. Maybe if we change our
choice of fundamental concepts, we can further simplify our systems. For
instance, it was assumed that the holy grail of Lisp would be to get to the
essence of lambda calculus, and then John Shutt did away with lambda as a
fundamental concept: he derived it from vau, doing away with macros and
special forms in the process. I don't know whether Kernel will live up to
its promise, but in any case it was an innovative line of inquiry.

> theoretically, about the only way to really do much better would be using
> a static schema (say, where the sender and receiver have a predefined set
> of message symbols, predefined layout templates, ...). personally though,
> I really don't like these sorts of compressors (they are very brittle,
> inflexible, and prone to version issues).
>
> this is essentially what "write a tic-tac-toe player in Scheme" implies:
> both the sender and receiver of the message need to have a common notion
> of both "tic-tac-toe player" and "Scheme". otherwise, the message can't
> be decoded.

But nothing prevents you from reaching this common notion via previous
messages. So, I don't see why this protocol would have to be any more
brittle than a more verbose one.

> a more general strategy is basically to build a model "from the ground
> up", where the sender and receiver have only basic knowledge of basic
> concepts (the basic compression format), and most everything else is
> built on the fly based on the data which has been seen thus far (old data
> is used to build new data, ...).

Yes, but, as I said, old data is used to build new data, so there's no need
to repeat old data over and over again.
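To make this concrete, here is a toy sketch (not anyone's actual protocol):
both endpoints keep a dictionary built from the messages exchanged so far
and seed zlib's preset-dictionary mechanism with it, so earlier traffic
makes later traffic cheaper without any out-of-band agreement. The
`Endpoint` class and the sample messages are invented for the example.

```python
import zlib

class Endpoint:
    """One side of the conversation; both sides run the same code."""

    def __init__(self):
        self.history = b""  # shared context: all plaintext seen so far

    def _ctx(self) -> bytes:
        return self.history[-32768:]  # zlib preset dictionaries cap at 32 KiB

    def send(self, message: bytes) -> bytes:
        ctx = self._ctx()
        c = zlib.compressobj(zdict=ctx) if ctx else zlib.compressobj()
        packet = c.compress(message) + c.flush()
        self.history += message  # old data becomes context for new data
        return packet

    def receive(self, packet: bytes) -> bytes:
        ctx = self._ctx()
        d = zlib.decompressobj(zdict=ctx) if ctx else zlib.decompressobj()
        message = d.decompress(packet) + d.flush()
        self.history += message  # receiver tracks the same shared context
        return message

alice, bob = Endpoint(), Endpoint()
first = b"a tic-tac-toe player is a program that plays tic-tac-toe"
second = b"write a tic-tac-toe player in Scheme"
bob.receive(alice.send(first))
p2 = alice.send(second)
# The second message can reference "tic-tac-toe player" from the shared
# context instead of repeating it verbatim.
assert bob.receive(p2) == second
```

Nothing here is brittle in the static-schema sense: the "dictionary" is just
the conversation itself, so both sides stay in sync by construction.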
When two people communicate with each other, they don't introduce
themselves and share their personal details again and again at the
beginning of each conversation.

> and, of course, such a system would likely be, itself, absurdly
> complex...

The system wouldn't have to be complex. Instead, it would *represent*
complexity through first-class data structures. The aim would be to make
the implicit complexity explicit, so that this simple system can reason
about it. More concretely, the implicit complexity is the actual use of
competing, redundant standards, and the explicit complexity is an ontology
describing those standards, so that a reasoner can transform, translate and
find duplicates with dramatically less human attention. Developing such an
ontology is by no means trivial; it's hard work, but in the end I think it
would be very much worth the trouble.

> and this is also partly why making everything smaller (while keeping its
> features intact) would likely end up looking a fair amount like data
> compression (it is compression of code and semantic space).

Maybe, but I prefer to think of it in terms of machine translation. There
are many different human languages, some of them more expressive than
others (for instance, with a larger lexicon, or a more fine-grained tense
system). If you want to develop an interlingua for machine translation, you
have to take a superset of all "features" of the supported languages, and a
convenient grammar to encode it (in GF it would be an "abstract syntax").
Of course, it may be tricky to support translation from any language to any
other, because you may need neologisms or long clarifications to express
some ideas in the least expressive languages, but let's leave that aside
for the moment.
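A toy version of the interlingua idea, in Python rather than GF, with a
deliberately tiny invented "abstract syntax" (real abstract syntaxes are
far richer than this): each language needs only one linearizer to and from
the shared tree, so n languages cost n components instead of n*(n-1)
pairwise translators.

```python
from dataclasses import dataclass

@dataclass
class Pred:
    """Language-neutral abstract syntax: 'subject verbs object'."""
    subject: str
    verb: str
    obj: str

# One linearizer per language; the reasoner never sees these.
def lin_english(p: Pred) -> str:
    verbs = {"eat": "eats"}
    words = {"cat": "the cat", "fish": "the fish"}
    return f"{words[p.subject]} {verbs[p.verb]} {words[p.obj]}"

def lin_spanish(p: Pred) -> str:
    verbs = {"eat": "come"}
    words = {"cat": "el gato", "fish": "el pescado"}
    return f"{words[p.subject]} {verbs[p.verb]} {words[p.obj]}"

tree = Pred("cat", "eat", "fish")  # the reasoner works on this, only
print(lin_english(tree))           # the cat eats the fish
print(lin_spanish(tree))           # el gato come el pescado
```

The complexity of each concrete language is packaged inside its linearizer;
the abstract tree stays trivially parseable.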
My point is that, once you do that, you can feed a reasoner literature in
any language, and the reasoner doesn't have to understand all of those
languages; it only has to understand the interlingua, which may well be
easier to parse than any of the target languages. You haven't eliminated
the complexity of human languages, but now it's tidily packaged in an
ontology, where it doesn't get in the reasoner's way.

> some of this is also what makes my VM sub-project as complex as it is: it
> deals with a variety of problem cases, and each adds a little complexity,
> and all this adds up. likewise, some things, such as interfacing
> (directly) with C code and data, add more complexity than others (simpler
> and cleaner FFI makes the VM itself much more complex).

Maybe that's because you are trying to support everything "by hand", with
all this knowledge and complexity embedded in your code. On the other hand,
it seems that the VPRI team is trying to develop new, powerful standards
with all the combined "features" of the existing ones, while actually
supporting only a very small subset of those legacy standards. This
naturally leads to a small, powerful and beautiful system, but then you
miss a lot of knowledge which is trapped in legacy code. What I would like
to see is a way to rescue this knowledge and treat it like data, so that
the mechanism to interface to those other systems doesn't have to be built
into yours, and so that you don't have to interface with them that often to
begin with.

_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc