Way back, a long time ago in the '80s, the old Apple MPW used Option-key characters to make delimiters easier, like ∂ as escape and ¬ for line continuation. Here we are, 30 years later, still using essentially 7-bit ASCII for the textual representation of code. Why not take advantage of all those unused characters? Like perhaps « block » or … block of comment text …
---
Dave Carlton
PolyMicro Systems
da...@polymicro.net
808-220-1727

On Aug 17, 2016, 15:07 -1000, Doug Coleman <doug.cole...@gmail.com>, wrote:
> (disclaimer: sourceforge only accepts 40kb message bodies without approval, i
> might have forgotten the admin password, this email might be sent 3+ times,
> google inbox confuses me with hidden text bloating my replies...)
>
> Hi Александр,
>
> I'm finally able to contain enough rage to answer you! ;)
>
> --Delimiters--
>
> My definition for delimiter is "character(s) that separate the type of the
> payload from the payload".
>
> Some examples are:
>
> url"google.com"
> url`google.com
> url( "google.com" ) ! hypothetical call to url word with string as payload
>
> The delimiter tokens I'm proposing are [ { ( ` " [[ {{ (( and the
> corresponding closing tags.
>
> --Code bloat--
>
> The minimal executable size is based on what the tree-shaker is able to prune
> from the program. If there's no locals usage, in theory it should not include
> any code at all for locals, though this would need to be verified and
> possibly fixed.
>
> As for out of order definitions and circularity, it's hellish right now not
> having these features. If your vocabularies load correctly without
> circularity but then circularity is introduced through code changes, right
> now it will likely reload fine but throw an error upon loading in a fresh
> image. It's hard to track these down in the fep backtrace as it's just a spew
> of badly-formatted factor objects and separators. Also, the circularity can
> be subtle, as it usually isn't as simple as reciprocal usage, it's more
> likely a chain of dependencies and tracking it down is hard without tools
> (which we don't have).
>
> --Loading code from other platforms--
>
> Loading Linux code on Windows etc would really just be loading the textual
> code and possibly running the stack-checker on it. It would use as much
> memory as the text size of the file + the factor slice and syntax objects
> used to contain that text. Of course you would strip these objects on a
> deployed image.
>
> Loading code for the wrong platform means you can do whatever really, as long
> as you stop short of calling functions that don't exist on the platform. The
> advantage you get is being able to use a tool to rename a word completely
> across every Factor file, not just the loaded files or the files that run on
> your current platform. There are so many bugs in the git history where we
> updated one platform but not another because the tools are missing.
>
> --Arbitrary Payloads--
>
> An arbitrary payload for text is the ability to put any text at all in the
> string literal without having to escape it. For instance, the common problem
> in C/C++ is if you have a comment /* ... */ and then you decide to comment
> out a larger block that contains that comment. You can use the ``#if 0 ...
> #endif`` trick, but what if you want to comment that block out? The payload
> (the comment) starts to interfere with the delimiters. Lua has a cool way to
> contain any text in a literal without escaping it where you just make sure
> the delimiters are variable and are not contained in the payload.
>
> If C comments worked different:
> /* */ # first C comment
> /** /* */ **/ # nested C comment, yes this isn't really C
> /*** /** /* */ **/ ***/ # etc, making sure ******/ isn't in the payload
>
> The same principle applies for string payloads. A common use case is copying
> text off a website or out of a hexdump, out of a packet capture, etc, and not
> needing to care about escaping the copied text. You can even generate the
> right delimiters with a tool or the editor if you have the payload. The
> motivation for this feature is to allow the programmer to forget about
> string/comment escaping.
>
> Finally, features like python's triple strings '''string''' and """string"""
> (single and double quotes, tripled) are usually ok, but if you want to write
> docs about them that contain syntax examples, then you have to micromanage
> your string delimiters. It's frustrating and programming should not be
> frustrating in this way!
>
> --Backticks--
>
> I wanted a way to golf the C string syntax for strings that don't have
> spaces. The proposed way is url`google.com, where url is
> the tag and google.com is the payload. Also I thought
> that you had to double or triple-quote markdown with backticks, but this
> doesn't appear to be the case, it supports `thing`.
>
> C++ has something like this with their user-defined literals:
> Kilograms w = 200.5_lb + 100.1_kg; // C++ user-defined literals
>
> Notice that you don't have to use two escape characters, just the one works.
> I'm open to not having the foo`bar form and just making it foo`bar`, but that
> was the motivation -- golfing it one character! Could also use single-quote
> or abandon the idea.
>
> --Delimiter Location--
>
> The <XML XML> vs XML< XML> is a style choice. It's consistent with the
> "tagged payloads" style where you have a 1)
> tag-delimiter-payload-[delimiter]. The other way to do it is by 2)
> delimiter-tag-payload-delimiter, e.g. ``[fry _ + ]`` or ``{H { 1 2 } }``. The
> 2) way is consistent with the <XML XML> syntax. It's whatever, but the
> attempt was at consistency.
>
> --Concatenative lexing tokens--
> …
> V{ 1 2 3 }[ 0 ] ! should desugar to C-style array access
>
> The idea behind the above syntax was that if several lexed tokens are
> together without whitespace, then the first one decides how to handle the
> following ones. So a vector followed by a quotation would attempt to address
> into itself with the return values from the quotation. This might work better
> with delimiter style 2) from above, like ``{V 1 2 3 }[nth 0 ]``. On the other
> hand, this looks "ugly"?
>
> What do you think about V{ 1 2 3 } vs {V 1 2 3 } ?
>
> --Comments--
>
> author# erg ! a "typed/tagged comment"
>
> This is indeed a weird idea. There's nothing preventing it from working, but
> it's probably too confusing. It works better in style 2) perhaps:
> #author erg
>
> --Backtick example--
>
> - How will this case be handled?
> - fixnum``hello```world`````
>
> This one is a mess. According to the lexing rules, it would see fixnum as a
> tag of a double-backticked payload, concatenated with a single-backtick
> "world" with no tag, with an empty double-backtick payload with no tag. I
> dunno. It seems to lex, but the next phase would pattern-match against fixnum
> + the rest of the mess and fixnum wouldn't know how to handle it, so it would
> have a parse error. Something like that.
>
> Or the trailing backticks would say "end of file found but expected
> 4-backticks".
>
> --Delimiters revisited--
>
> Literals have to start with opening delimiters. Things like ))foo(( should
> not work.
>
> For ``$description{ "foo" }`` in documentation, the $ just signals docs. It's
> not really special. Maybe there's a better convention.
>
> --Operators--
>
> I had problems trying to figure out the ``char: a`` form, but I think it's
> just a prefix operator named ``char:``. It should parse as ``char:`` ``a``
> and in another pass they should be joined into a single token, operator-char:
> with a payload "a". Likewise color: hexcolor: pointer: alien: are prefix
> operators, the pair-rocket H{ 1 => 2 3 => 4 } is an infix operator, and
> something like ``a++`` could be a postfix operator, e.g. ``1 a++`` where it
> could increment it at compile-time if it's a literal, or dispatch at run-time
> otherwise. (Postfix operators are unnecessary?)
>
> Prefix and infix operators fix the problem of "related text should parse to a
> single literal" and the char: prefix operator means that lower-case-colon
> words don't have to be baked into the lexer. Also, the assignment :> doesn't
> have to change, it's just a prefix operator!
>
> Something like:
> PREFIX-OPERATOR: \ char:
> INFIX-OPERATOR: \ =>
>
> --Fry and make--
>
> It seems that '[ _ , % ] syntax is fine for these, and any other ways to
> write it, like $[ obj% seq% ] are hard to type into the editor and look weird.
>
> --Roots--
> I haven't figured out the best repository/directory structure, but it should
> handle adding repositories with arbitrary URIs like ``@erg/factor``, handle
> versions of Factor libraries, etc.
>
> --Final thoughts--
>
> The parser should give better error messages with enough work and syntax
> error examples. There's really no reason for it to be worse.
>
> Sorry for the mishmash of replies.
>
> Cheers,
> Doug
> ------------------------------------------------------------------------------
> _______________________________________________
> Factor-talk mailing list
> Factor-talk@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/factor-talk
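
The "generate the right delimiters with a tool" idea from the Arbitrary Payloads section of the quoted message can be sketched roughly as follows. This is only an illustration in Python, not Factor, and it assumes Lua-style long brackets ([[ ]], [=[ ]=], [==[ ]==], ...) rather than any syntax actually agreed on in this thread. The function names close_delim, pick_level, wrap and unwrap are hypothetical.

```python
import re


def close_delim(level: int) -> str:
    """Closing delimiter for a given level: ]] , ]=] , ]==] , ..."""
    return "]" + "=" * level + "]"


def pick_level(payload: str) -> int:
    """Smallest level whose closing delimiter cannot end the payload early."""
    level = 0
    while True:
        close = close_delim(level)
        # The first closer found in payload+closer must be the real one.
        # This also rejects payloads whose tail overlaps the closer, e.g. "x]".
        if (payload + close).find(close) == len(payload):
            return level
        level += 1


def wrap(tag: str, payload: str) -> str:
    """Build tag + opening delimiter + unescaped payload + closing delimiter."""
    level = pick_level(payload)
    return f"{tag}[{'=' * level}[{payload}{close_delim(level)}"


# Hypothetical opener: an optional tag, then '[', zero or more '=', then '['.
_OPEN = re.compile(r"([^\[\s]*)\[(=*)\[")


def unwrap(literal: str) -> tuple:
    """Split a literal produced by wrap() back into (tag, payload)."""
    m = _OPEN.match(literal)
    if m is None:
        raise ValueError("no opening delimiter")
    tag, level = m.group(1), len(m.group(2))
    end = literal.find(close_delim(level), m.end())
    if end == -1:
        raise ValueError("unterminated literal")
    return tag, literal[m.end():end]
```

With this sketch, wrap("url", "google.com") gives url[[google.com]], while a payload that already contains ]] is bumped to level 1 and comes out as [=[ ... ]=]. The payload itself is never escaped or modified; only the delimiter grows.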