If we're just naming values, I'd like to avoid the complexity and share the value directly. Rather than a "foo" function vs. a "bar" function, we'd just have a block of anonymous code. If we have a large sound file that attracts a lot of references, then explicitly using a content-distribution and caching model would perhaps be appropriate in that case, though it might be better to borrow from Tahoe-LAFS for security reasons.
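To make the caching idea concrete, here's a minimal sketch in Python (the store and its API are invented purely for illustration): the "name" of a large value is simply its secure hash, so two parties holding the same bytes derive the same name independently. A design that borrows from Tahoe-LAFS would additionally derive the encryption key from the content (convergent encryption) so the distribution layer only ever holds ciphertext; that part is only noted in a comment here.

    import hashlib

    class ContentStore:
        """Toy content-addressed store: values are named by their SHA-256 digest.

        A real design (e.g. borrowing from Tahoe-LAFS) would also derive an
        encryption key from the content before upload, so the cache/CDN holds
        only ciphertext; that is omitted here for brevity.
        """
        def __init__(self):
            self._blobs = {}

        def put(self, data: bytes) -> str:
            key = hashlib.sha256(data).hexdigest()
            self._blobs[key] = data          # idempotent: same bytes, same key
            return key

        def get(self, key: str) -> bytes:
            data = self._blobs[key]
            assert hashlib.sha256(data).hexdigest() == key   # self-verifying
            return data

    # Two independent parties naming the same sound file arrive at the same key.
    alice, bob = ContentStore(), ContentStore()
    sound = b"...large sample data..."
    assert alice.put(sound) == bob.put(sound)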
For identity, I prefer to formally treat uniqueness as a semantic feature, not a syntactic one. Uniqueness can be formalized using substructural types, i.e. we need an uncopyable (affine typed) source of unique values. I envision a uniqueness source being used for:

 1) creating unique sealer/unsealer pairs
 2) creating initially 'exclusive' bindings to external state
 3) creating GUID-like values that afford equality testing

In a sense, these are three different responsibilities for identity, and each involves different types. It seems that what you're calling 'identity' corresponds to item 2.

If I assume those responsibilities are handled, and that tacit programming eliminates local variable and parameter names, the remaining uses of 'names' I'm likely to encounter are:

 * names for dynamic scope, config, or implicit params
 * names for associative lookup in shared spaces
 * names as human shorthand for values or actions

It is this last item that I think corresponds most directly to what Sean and Matt call names, though there might also be a bit of 'independent maintenance' (external state via the programming environment) mixed in.

Regarding shorthand, I'm quite interested in alternative designs, such as binding human names to values based on pattern matching (so when you write 'foo' I might read 'bar'), but Sean is against this due to out-of-band communication concerns. To address those concerns, an extended dictionary that tracks different origins for words seems reasonable.

Regarding your 'foo' vs. 'bar' equivalence argument, I believe hashing is not associative (a toy sketch further below illustrates the point). Ultimately, `foo bar baz` might have the same expansion-to-bytecode as `nitwit blubber oddment tweak` due to different factorings, but I think it will have a different hash, unless you completely expand and rebuild the 'deep' hashes each time. Of course, we might want to do that anyway, e.g. for optimization across words.

> If I were to enter 3 characters a second into a computer for 40 years,
> assuming a byte per character, I'd have generated ~3.8 GiB of information,
> which would fit in memory on my laptop. I'd say that user input at least is
> well worth saving.

Huh, I think you underestimate how much data you generate, and how much that will grow with different input devices. Entering characters on a keyboard is minor compared to the info-dump caused by a Leap Motion. The mouse is cheap while it sits still, but its movements carry spatial-temporal patterns. If you add information from your cell phone, you've got GPS, accelerometers, temperature, touch, and voice. If you get an AR setup, you'll have six-axis motion for your head, plus GPS, voice, and gestures. It adds up. And all of that is still small compared to what we'd accumulate if we kept a continuous stream of microphone or camera data.

I think any history will inevitably be lossy. But I agree that it would be convenient to keep high-fidelity data available for a while, and preferably to extract the most interesting operations from it.
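Here is the toy sketch of the non-associativity point, in Python (the words "foo" and "bar" and the primitives are made up): two factorings of the same program expand to identical bytecode, yet their 'deep' hashes differ, unless the hash is rebuilt from the complete expansion.

    import hashlib

    def h(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    # Two factorings of the same expanded program "ABC", using made-up words.
    defs1, prog1 = {"foo": ["A", "B"]}, ["foo", "C"]   # foo = A B ; program: foo C
    defs2, prog2 = {"bar": ["B", "C"]}, ["A", "bar"]   # bar = B C ; program: A bar

    def deep_hash(prog, defs):
        """Hash built from the hashes of each word's definition, without expanding."""
        def word_hash(w):
            if w in defs:
                return h(b"".join(word_hash(x) for x in defs[w]))
            return h(w.encode())
        return h(b"".join(word_hash(w) for w in prog))

    def expand(prog, defs):
        """Fully expand a program to a string of primitive codes."""
        out = ""
        for w in prog:
            out += expand(defs[w], defs) if w in defs else w
        return out

    print(expand(prog1, defs1) == expand(prog2, defs2))        # True:  both expand to "ABC"
    print(deep_hash(prog1, defs1) == deep_hash(prog2, defs2))  # False: the factoring leaks into the hash
    print(h(expand(prog1, defs1).encode()) ==
          h(expand(prog2, defs2).encode()))                    # True:  rebuilt from the full expansion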
On Wed, Sep 25, 2013 at 2:45 PM, Sam Putman <atmanis...@gmail.com> wrote:
> Well, since we're talking about a concatenative bytecode, I'll try to
> speak Forthfully.
>
> Normally when we define a word in a stack language we make up an ASCII
> symbol and say "this symbol refers to all these other symbols, in this
> definite order". Well and good, with two potential problems: we have to
> make up a symbol, and that symbol might conflict with someone else's
> symbol.
>
> Name clashes are an obvious problem. The fact that we must make up a
> symbol is less obviously a problem, except that the vast majority of our
> referents should be generated by a computer. A computer-generated symbol
> may as well be a hash, at which point a user-generated symbol may as well
> be a hash also, in the special case where the data hashed includes an
> ASCII handle for user convenience.
>
> This is fine for immutable values, but for identities (referents to a
> series of immutable values, essentially) we need slightly more than this:
> a master hash, taken from the first value the identity refers to, the time
> of creation, and perhaps other useful information. This master hash then
> points to the various values the identity refers to, as they change.
>
> There are a few things that are nice about this approach, all of which
> derive from the fact that identical values have identical names and that
> relatively complex relationships between identities and values may be
> established and modified programmatically.
>
> As an example, if I define a "foo" function which is identical to someone
> else's "bar" function, they should have the same "name" (hash) despite
> having different handles. With a little work, we should be able to retrieve
> all the contexts where a value appears, as well as all the handles and
> other metadata associated with that value in those contexts.
>
> [continued to second e-mail]
>
> What we gain relative to URLs is that a hash is not arbitrary. If two
> programs are examining the same piece of data, say a sound file, it would
> be nice if they came to the same, independent conclusion as to what to call
> it.
>
> Saving total state at all times is not necessary, but there are times when
> it may be convenient. If I were to enter 3 characters a second into a
> computer for 40 years, assuming a byte per character, I'd have generated
> ~3.8 GiB of information, which would fit in memory on my laptop. I'd say
> that user input at least is well worth saving.
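For concreteness, here's a minimal sketch of the identity scheme described in the quoted message, as I read it; the exact inputs to the master hash (first value plus creation time) and the Python shape of it are my own guesses, not a fixed design. The master hash gives the identity a stable name, and the history records each content-addressed value it has referred to.

    import hashlib, time

    def sha(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    class Identity:
        """An identity names a series of immutable, content-addressed values.

        The master hash is taken from the first value plus the time of creation
        (plus whatever other information proves useful); the history then maps
        that stable name to each value hash the identity has referred to.
        """
        def __init__(self, first_value: bytes, created=None):
            self.created = created if created is not None else time.time()
            self.master = sha(first_value + repr(self.created).encode())
            self.history = [sha(first_value)]   # value hashes, oldest first

        def update(self, new_value: bytes) -> None:
            self.history.append(sha(new_value))

        def current(self) -> str:
            return self.history[-1]

    # The handle we attach ("foo", "bar", ...) never enters the value hash,
    # so identical values get identical names regardless of handle.
    ident = Identity(b"version 1 of the sound file")
    ident.update(b"version 2 of the sound file")
    print(ident.master, ident.current())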