On 3/19/2012 5:24 AM, Martin Baldan wrote:
but, hmm... one could always have 2 stacks: create a stack over the stack,
in turn reversing the RPN into PN, and also get some "meta" going on...
Uh, I'm afraid one stack is one too many for me. But then again, I'm
not sure I get what you mean.

in traditional RPN (PostScript, Forth, ...), one directly executes commands from left to right.

in this case, one pushes commands left to right, and then pops and executes them in a loop. so, there is a stack for values, and a stack for "the future" (commands awaiting execution).

naturally enough, it flips the notation (since sub-expressions are executed first).

+ 2 * 3 4 =>  24
Wouldn't that be "+ 2 * 3 4 =>  14" in Polish notation? Typo?

or mental arithmetic fail, either way...
I vaguely remember writing this, and I think the mental arithmetic came out wrong.
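
fwiw, here is a minimal sketch of the two-stack scheme in C (the token handling and names are made up, just to illustrate the idea): commands are pushed left to right onto one stack, then popped and executed against a value stack, which ends up evaluating Polish notation (and gives 14 here).

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        const char *toks[] = { "+", "2", "*", "3", "4" };
        const char *cmd[32];  int ncmd = 0;  /* "the future": commands awaiting execution */
        double val[32];       int nval = 0;  /* values */
        int i;

        /* push commands left to right */
        for (i = 0; i < 5; i++)
            cmd[ncmd++] = toks[i];

        /* pop/execute loop, runs until the command stack is empty */
        while (ncmd > 0)
        {
            const char *t = cmd[--ncmd];
            if (strchr("+-*/", t[0]) && !t[1])
            {
                /* popping flipped the order, so the operator's first
                   (leftmost) operand is on top of the value stack */
                double a = val[--nval], b = val[--nval], r = 0;
                switch (t[0]) {
                case '+': r = a + b; break;
                case '-': r = a - b; break;
                case '*': r = a * b; break;
                case '/': r = a / b; break;
                }
                val[nval++] = r;
            }
            else
                val[nval++] = atof(t);   /* operand: push its value */
        }

        printf("%g\n", val[0]);   /* prints 14 */
        return 0;
    }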


commands are pushed left to right; execution then consists of popping off and
executing said commands (the pop/execute loop continues until the stack is
empty), so the commands execute in the reverse of the order they were pushed.
Do you have to get busy with "rot", "swap", "drop", "over", etc.?
That's the problem I see with stack-based languages.

if things are designed well, these mostly go away.

mostly it is a matter of making every operation "do the right thing" and expect its arguments in a sensible order.

a problem, for example, in the design of PostScript, is that people tried to give the operations their "intuitive" ordering, but this leads to both added awkwardness and additional need for explicit stack operations.


say, for example, one can have a language with:
/<someName> <someValue> bind
or:
<someValue> /<someName> bind

though seemingly a trivial difference, one form is likely to need more swap/exch calls than the other.

likewise:
<array> <index> <value> setindex
vs:
<value> <array> <index> setindex
...


"dup" is a little harder, but generally I have found that dup tended to appear in places where a higher-level / compound operation was more sensible.

granted, such compound operations make up a large portion of my interpreter's bytecode ISA, but many of them help improve performance by "optimizing" common operations.

as an example, suppose one compiles an operation like:
j=i++;

one could emit, say:
load i; dup; push 1; binary add; store i; store j;

with all of the lookups and type-checks along the way.

also possible is the sequence:
lpostinc_fn 1; lstore_f 2;
(assume 1 and 2 are the lexical variable indices for i and j, both inferred to be fixnums).
or:
postinc_s i; store j;
(collapsed operation, but not knowing/giving exact types or locations).

now, what has happened? the first 5 operations collapse into a single operation, in the former case also specialized for a lexical variable and for fixnums (say, due to type-inference);
what is left is a simple store.
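
to make this concrete, a rough sketch of what the fused case could look like inside a switch-dispatched interpreter loop (the opcode names, encoding, and frame layout here are all made up for illustration, not my actual ISA):

    #include <stdio.h>

    enum {
        OP_LLOAD,       /* push frame[arg]                           */
        OP_DUP,         /* duplicate top of stack                    */
        OP_PUSH1,       /* push the constant 1                       */
        OP_ADD,         /* pop two, push sum                         */
        OP_LSTORE,      /* frame[arg] = pop                          */
        OP_LPOSTINC_F,  /* fused: push frame[arg], then frame[arg]++ */
        OP_HALT
    };

    typedef struct { int op, arg; } Insn;

    long run(Insn *pc, long *frame)
    {
        long stack[64]; int sp = 0;
        for (;;)
        {
            Insn in = *pc++;
            switch (in.op)
            {
            case OP_LLOAD:  stack[sp] = frame[in.arg]; sp++;  break;
            case OP_DUP:    stack[sp] = stack[sp-1];   sp++;  break;
            case OP_PUSH1:  stack[sp] = 1;             sp++;  break;
            case OP_ADD:    sp--; stack[sp-1] += stack[sp];   break;
            case OP_LSTORE: sp--; frame[in.arg] = stack[sp];  break;
            case OP_LPOSTINC_F:
                /* one dispatch instead of five, and no type-checks,
                   since type-inference already proved a fixnum here */
                stack[sp] = frame[in.arg]++; sp++;
                break;
            case OP_HALT:   return frame[2];
            }
        }
    }

    /* "j=i++;" both ways, with i in slot 1 and j in slot 2: */
    Insn naive[] = { {OP_LLOAD,1}, {OP_DUP,0}, {OP_PUSH1,0}, {OP_ADD,0},
                     {OP_LSTORE,1}, {OP_LSTORE,2}, {OP_HALT,0} };
    Insn fused[] = { {OP_LPOSTINC_F,1}, {OP_LSTORE,2}, {OP_HALT,0} };

    int main(void)
    {
        long frame[4] = { 0, 5, 0, 0 };
        printf("%ld\n", run(fused, frame));  /* prints 5; i (slot 1) is now 6 */
        return 0;
    }

the naive sequence costs six trips through the dispatch loop where the fused form costs two (not counting the halt), which is most of where the interpreter-level win comes from.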

as noted, most of this was due to interpreter-specific micro-optimizations, and a lot of it is ignored in the (newer/incomplete) JIT (which mostly decomposes the operations again, and uses more specialized variable-allocation and type-inference logic).

these sorts of optimizations are also to some extent language and use-case specific, but they do help somewhat with performance of a plain interpreter.


something similar could likely be applied to a stack language designed for use by humans, where lots of operations/words are dedicated to common constructions likely to be used by a user of the language.


_ I *hate* infix notation. It can only make sense where everything has
arity 3, like in RDF.

many people would probably disagree.
whether cultural or innate, infix notations seem to be fairly popular.
My beef with infix notation is that you get ambiguity, and then this
ambiguity is usually eliminated with arbitrary policies of operator
priority, and then you still have to use parens, even with fixed
arity. In contrast, with pure Polish notation, once you accept fixed
arity you get unambiguity for free and you get rid of parens for
everything (except, maybe, explicit lists).

For instance, in infix notation, when I see:

2 + 3 * 4

I have to remember that it means:

2 + (3*4)

But then I need the parens when I mean:

(2 + 3) * 4

In contrast, with Polish notation, the first case would be:

+ 2 * 3 4

And the second case would be:

* 4 + 2 3

Which is clearly much more elegant. No parens, no operator priority.

many people are not particularly concerned with elegance though, and tend to take for granted what the operator precedences are and where the parens go.

this goes even for the (arguably poorly organized) C precedence hierarchy:
many new languages don't change it because people expect it a certain way;
in my case, I don't change it, mostly for the sake of making at least some effort to conform to ECMA-262, which defines the hierarchy a certain way.

the advantage is that, assuming the precedences are sensible, the more commonly used operations have higher precedence, and so don't need explicit parentheses. on average, this tends to work out fairly well.

prefix also works, but has the drawback of being marginally more awkward for arithmetic, as well as generally requiring added whitespace:

"2+3*4" vs. "+ 2 * 3 4", where the latter contains 4 additional space characters.


actually, it can be noted that many of the world's languages are SVO (and
many others are SOV), so there could be a pattern here.
I've read a recent study which says the human brain seems to be wired
for SOV. The reason for this conclusion was that two groups of deaf
people had independently developed their own sign languages, and both
were SOV.

By the way, I'm playing with the idea of making a logical conlang with
a concise, highly-regular syntax. The most promising type of syntax
seems to be REBOL-like, that is, Polish notation. The funny thing is
that, at present, the way I'm trying to handle event description makes
it have the verb at the end, but it's not RPN. Here's why:

walk evt-1 == "evt-1 is a walking event"

subj John walk evt-1 == "evt-1 is a walking event with John as its subject"

ex-past walk == "there is a past walking event"

ex-past subj John walk == "there's a past walking event with John as
its subject"

My point is that the correspondence between SOV or SVO and the
underlying syntax may not be so straightforward. Also, the issue of
how to build a good model of spoken language is still open.

I once tried to imagine how a language could be structured so that it could be used both as a natural language and as a programming language (the design was partly Lisp-based). this fell apart.


I later considered the possibility of a "mechanically defined" English subset, but ran into significant problems with the semantic model. it seems it would likely be much easier to write up a parser for a simplified/regularized English grammar than to come up with a reasonable semantic model for how the language concepts are expressed (IIRC, I was trying to fit it onto a variant of Prototype-OO and lexical scoping or similar).

IIRC, the consideration was that it could have been used as a system of expression for game AIs, and to some extent for human/AI interactions (within a game world), but this idea never really went anywhere (basically, the goal was to allow interactions more advanced than merely attacking NPCs or engaging in fixed menus or dialogs, but more casual than the use of a scripting language).


I don't think I ever wrote the parser either, since this would have been fairly pointless without the semantic model.

IIRC, the plan had been to parse the statements into S-Expressions using a recursive-descent parser, hence why a simplified and more rigid grammar was defined. although ideally the statements would be readable/writable by "mere humans", it would not try to deal with the problem of free-form or ambiguous statements (any such "agents" would reject/ignore statements they don't understand, or maybe ask the user to rephrase).

it was also noted during this exercise that most discussion of "grammar" was mostly people nitpicking and defining seemingly arbitrary/pointless "rules of use" (based on pet peeves about "how the language should be written/spoken" and similar), rather than anything which would have been helpful in defining a formal grammar and parser for an English subset; so I think at the time I used "common sense rules" and wrote up something based on those.

I think it did place limitations on which combinations of "parts of speech" a given word was allowed to have (and, I think, made up a few new ones, mostly for words which didn't fit well). the parser would have been largely dictionary-driven (sort of like the declaration parsing in C and C++...).
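
roughly, it would have had a shape like this (a toy sketch; the lexicon, word classes, and output form here are all invented for illustration):

    #include <stdio.h>
    #include <string.h>

    /* toy dictionary-driven parser: each word's part of speech comes from
       a lexicon (much like how a C parser decides what a name means from
       its declaration), and a simple rule emits an S-Expression. */

    typedef struct { const char *word, *pos; } Entry;

    static Entry lexicon[] = {
        { "john", "noun" }, { "mary", "noun" }, { "ball", "noun" },
        { "sees", "verb" }, { "kicks", "verb" },
        { "the",  "det"  },
        { NULL, NULL }
    };

    static const char *pos_of(const char *w)
    {
        Entry *e;
        for (e = lexicon; e->word; e++)
            if (!strcmp(e->word, w)) return e->pos;
        return NULL;
    }

    static int is_pos(char *tok, const char *pos)
    {
        const char *q = tok ? pos_of(tok) : NULL;
        return q && !strcmp(q, pos);
    }

    /* noun-phrase := [det] noun */
    static int parse_np(char **toks, int *i, const char **out)
    {
        if (is_pos(toks[*i], "det")) (*i)++;
        if (!is_pos(toks[*i], "noun")) return 0;
        *out = toks[(*i)++];
        return 1;
    }

    int main(void)
    {
        char buf[] = "john kicks the ball";
        char *toks[16], *t; int n = 0, i = 0;
        const char *subj = NULL, *verb = NULL, *obj = NULL;

        for (t = strtok(buf, " "); t && n < 15; t = strtok(NULL, " "))
            toks[n++] = t;
        toks[n] = NULL;

        /* sentence := noun-phrase verb noun-phrase */
        if (parse_np(toks, &i, &subj) && is_pos(toks[i], "verb") &&
            (verb = toks[i++], parse_np(toks, &i, &obj)))
            printf("(%s %s %s)\n", verb, subj, obj);   /* (kicks john ball) */
        else
            printf("(parse-error)\n");   /* reject what isn't understood */
        return 0;
    }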

but, all this went nowhere...


a reasonable tradeoff IMO is using prefix notation for commands and infix
notation for arithmetic.
You can always use a library for infix instead of having it built into
the interpreter and making life more difficult for those who prefer
the more consistent prefix notation. I would say that's reasonable
enough. For instance:

http://folk.uio.no/jornv/infpre/infpre.html

it is possible.

personally, I tended not to think it was worth worrying about, since most languages for human use can use conventional syntax, and most of those with simpler or more regular syntax (such as S-Expressions) are mostly intended for internal use.


_ Matching parens is a non-issue. Just use Paredit or similar ;)

I am currently mostly using Notepad2, which does have parenthesis matching
via highlighting.

however, the issue isn't so much with just using an editor with parenthesis
matching, but more an issue when quickly typing something interactively: one
may have to make extra mental effort to get the counts of opening and
closing parentheses right, potentially distracting from "the task at hand".
Ah, but that's the whole point of Paredit. It *doesn't let you* have
unmatched parens. That's right, you just can't do it. You don't write
or delete parens, you create an empty sexpr, you delete it, you move
it around, you swallow the following sexpr into it, you barf it, you
fuse, splice, etc, always working with sexprs, never with parens.

fair enough.

something like REBOL could possibly work fairly well here, given it has some
structural similarity to shell-command syntax.
I would say REBOL is better, because it's just as terse if not more,
and it's more regular.

possibly.

something vaguely similar could make sense for a more advanced shell language.
as-is, my console/shell language is fairly naive, but it is generally sufficient for what I use it for (and it allows easily embedding script-language code for more advanced uses).


the main merit it has is that it can reduce the need for commas (and/or
semicolons), since the parser can use whitespace as a separator (and space
is an easier key to hit).
Okay, so it's because the space key is bigger. I see the point, but it
has more to do with keyboard layout than with visual or mental
considerations. I would happily exchange a little typing speed for
stronger visual cues, but other people may have other preferences.

visual cues are more important for reading, but are probably less relevant for interactive use, where one is more often typing commands for some other purpose, such as fiddling with or testing something, directly controlling something, calculating or showing something, ...


decided to leave out some stuff about integrating script code into game maps in my 3D engine, as I wrote it and then couldn't see how it related or was relevant.

or such...

_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
