Re: [Factor-talk] New parser discussion (continued)

Jon Harper Wed, 30 Nov 2016 05:22:31 -0800

Hi,
I just heard today about tagged template literals in ES6:
https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Template_literals


tag`foo ${bar}`

which seem to work like what Doug had in mind. That's nice.

I want to find some time to help with the new parser, which solves real
problems (universal comments, inconsistencies between parsing words, and
many more) and adds a lot of value to Factor!
Cheers,
Jon

On Wed, Aug 24, 2016 at 8:15 PM, Dave Carlton <da...@polymicro.net> wrote:

> Way back in the '80s, the old Apple MPW used Option-key characters to
> make delimiters easier, like ∂ as the escape and ¬ for line continuation.
> Here we are, 30 years later, still using essentially 7-bit ASCII for the
> textual representation of code. Why not take advantage of all those unused
> characters? Like perhaps « block » or
> …
> block of comment text
> …
>
> ---
> Dave Carlton
> PolyMicro Systems
> da...@polymicro.net
> 808-220-1727
>
> On Aug 17, 2016, 15:07 -1000, Doug Coleman <doug.cole...@gmail.com>,
> wrote:
>
> (Disclaimer: SourceForge only accepts 40 KB message bodies without
> approval, I might have forgotten the admin password, this email might be
> sent 3+ times, Google Inbox confuses me with hidden text bloating my
> replies...)
>
>
>
> Hi Александр,
>
> I'm finally able to contain enough rage to answer you! ;)
>
> --Delimiters--
>
> My definition for delimiter is "character(s) that separate the type of the
> payload from the payload".
>
> Some examples are:
>
> url"google.com"
> url`google.com
> url( "google.com" )  ! hypothetical call to url word with string as payload
>
> The delimiter tokens I'm proposing are [ { ( ` " [[ {{ (( and the
> corresponding closing tags.
>
>
> --Code bloat--
>
> The minimal executable size is based on what the tree-shaker is able to
> prune from the program. If there's no locals usage, in theory it should not
> include any code at all for locals, though this would need to be verified
> and possibly fixed.
>
> As for out-of-order definitions and circularity, it's hellish right now
> not having these features. If your vocabularies load correctly without
> circularity but then circularity is introduced through code changes, right
> now it will likely reload fine but throw an error upon loading in a fresh
> image. These errors are hard to track down in the FEP backtrace, as it's
> just a spew of badly formatted Factor objects and separators. Also, the
> circularity can be subtle: it usually isn't as simple as reciprocal usage.
> It's more likely a chain of dependencies, and tracking it down is hard
> without tools (which we don't have).
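
A minimal sketch of the kind of tool that could track such a chain down, assuming the vocabulary dependencies have already been extracted into a plain mapping. The function name `find_cycle` and the vocabulary names are hypothetical, and this is ordinary depth-first search in Python, not something that exists in Factor today:

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None.

    deps maps a vocabulary name to the names it uses (hypothetical data).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in deps.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: report the loop.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


if __name__ == "__main__":
    example = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
    print(find_cycle(example))
```

The returned list spells out the whole chain, which is the part that is hard to read off a backtrace.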
>
>
> --Loading code from other platforms--
>
> Loading Linux code on Windows etc would really just be loading the textual
> code and possibly running the stack-checker on it. It would use as much
> memory as the text size of the file + the factor slice and syntax objects
> used to contain that text. Of course you would strip these objects on a
> deployed image.
>
> Loading code for the wrong platform means you can do whatever really, as
> long as you stop short of calling functions that don't exist on the
> platform. The advantage you get is being able to use a tool to rename a
> word completely across every Factor file, not just the loaded files or the
> files that run on your current platform. There are so many bugs in the git
> history where we updated one platform but not another because the tools are
> missing.
>
>
> --Arbitrary Payloads--
>
> An arbitrary payload for text is the ability to put any text at all in the
> string literal without having to escape it. For instance, the common
> problem in C/C++ is if you have a comment /* ... */ and then you decide to
> comment out a larger block that contains that comment. You can use the
> ``#if 0 ... #endif`` trick, but what if you want to comment that block out?
> The payload (the comment) starts to interfere with the delimiters. Lua has
> a cool way to contain any text in a literal without escaping it: the
> delimiters are variable-length, so you just pick ones that are not
> contained in the payload.
>
> If C comments worked differently:
> /*  */        # first C comment
> /**  /* */  **/  # nested C comment, yes this isn't really C
> /*** /** /* */ **/ ***/  # etc, making sure ******/ isn't in the payload
>
> The same principle applies for string payloads. A common use case is
> copying text off a website or out of a hexdump, out of a packet capture,
> etc, and not needing to care about escaping the copied text. You can even
> generate the right delimiters with a tool or the editor if you have the
> payload. The motivation for this feature is to allow the programmer to
> forget about string/comment escaping.
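
Generating the right delimiters from the payload could look roughly like this. It is a sketch in Python that uses Lua's long-bracket form (`[[ ]]`, `[=[ ]=]`, `[==[ ]==]`, ...) as the variable delimiter; the function name `wrap_long_bracket` is made up for illustration and nothing here is part of the proposed Factor parser:

```python
def wrap_long_bracket(payload: str) -> str:
    """Wrap payload in the shortest Lua-style long bracket that is safe.

    The level (number of '=' signs) is raised until the closing
    delimiter cannot be found early inside the wrapped text.
    """
    level = 0
    while True:
        closer = "]" + "=" * level + "]"
        # Also check the boundary: a payload ending in ']' could combine
        # with the start of the closer and terminate the literal early.
        if closer not in payload + closer[:-1]:
            break
        level += 1
    opener = "[" + "=" * level + "["
    return opener + payload + closer


if __name__ == "__main__":
    print(wrap_long_bracket("plain text"))
    print(wrap_long_bracket("text with ]] inside"))
```

One caveat if the output is fed to real Lua: Lua drops a newline that immediately follows the opening bracket, which this sketch does not compensate for.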
>
> Finally, features like Python's triple-quoted strings '''string''' and
> """string""" (single and double quotes, tripled) are usually OK, but if you
> want to write docs about them that contain syntax examples, then you have
> to micromanage your string delimiters. It's frustrating, and programming
> should not be frustrating in this way!
>
>
> --Backticks--
>
> I wanted a way to golf the C string syntax for strings that don't have
> spaces. The proposed way is url`google.com, where url is the tag and
> google.com is the payload. Also, I thought that you had to double or
> triple the backticks to quote in Markdown, but this doesn't appear to be
> the case; it supports `thing`.
>
> C++ has something like this with their user-defined literals:
> Kilograms w = 200.5_lb + 100.1_kg; // C++ user-defined literals
>
> Notice that you don't have to use two escape characters, just the one
> works. I'm open to not having the foo`bar form and just making it foo`bar`,
> but that was the motivation -- golfing it one character! Could also use
> single-quote or abandon the idea.
>
>
> --Delimiter Location--
>
> The <XML XML> vs XML< XML> question is a style choice. XML< XML> is
> consistent with the "tagged payloads" style, which is
> 1) tag-delimiter-payload-[delimiter]. The other way to do it is
> 2) delimiter-tag-payload-delimiter, e.g. ``[fry _ + ]`` or
> ``{H { 1 2 } }``. The 2) way is consistent with the <XML XML> syntax.
> It's whatever, but the attempt was at consistency.
>
>
> --Concatenative lexing tokens--
> …
> V{ 1 2 3 }[ 0 ]  ! should desugar to C-style array access
>
> The idea behind the above syntax was that if several lexed tokens are
> together without whitespace, then the first one decides how to handle the
> following ones. So a vector followed by a quotation would attempt to
> address into itself with the return values from the quotation. This might
> work better with delimiter style 2) from above, like ``{V 1 2 3 }[nth 0
> ]``. On the other hand, this looks "ugly"?
>
> What do you think about V{ 1 2 3 } vs {V 1 2 3 } ?
>
>
> --Comments--
>
> author# erg  ! a "typed/tagged comment"
>
> This is indeed a weird idea. There's nothing preventing it from working,
> but it's probably too confusing. It works better in style 2) perhaps:
> #author erg
>
>
> --Backtick example--
>
> - How will this case be handled?
> - fixnum``hello```world`````
>
> This one is a mess. According to the lexing rules, it would see fixnum as
> a tag of a double-backticked payload, concatenated with a single-backtick
> "world" with no tag, with an empty double-backtick payload with no tag. I
> dunno. It seems to lex, but the next phase would pattern-match against
> fixnum + the rest of the mess and fixnum wouldn't know how to handle it, so
> it would have a parse error. Something like that.
>
> Or the trailing backticks would say "end of file found but expected
> 4-backticks".
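
To make the two readings concrete, here is a small Python sketch of one possible rule: an optional tag, then the longest run of N backticks as the opener, then the payload up to the next N backticks. The rule itself is an assumption, as are the names `lex_backticks` and `LexError`; it only handles the closed foo`bar` form, not the unclosed foo`bar form:

```python
import re


class LexError(Exception):
    """Raised when the input cannot be split into tagged payloads."""


# Optional tag (no backticks or whitespace), then a maximal backtick run.
_OPENER = re.compile(r"([^`\s]*)(`+)")


def lex_backticks(text):
    """Split text into (tag, backtick_count, payload) triples."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = _OPENER.match(text, pos)
        if match is None:
            raise LexError(f"expected a backtick payload at offset {pos}")
        tag, ticks = match.group(1), match.group(2)
        start = match.end()
        end = text.find(ticks, start)   # first run of the same length
        if end == -1:
            raise LexError(
                f"end of input found but expected {len(ticks)} backticks"
            )
        tokens.append((tag, len(ticks), text[start:end]))
        pos = end + len(ticks)
    return tokens


if __name__ == "__main__":
    print(lex_backticks("url`google.com`"))
    try:
        lex_backticks("fixnum``hello```world`````")
    except LexError as err:
        print("error:", err)
```

Under this particular rule the example lexes the tagged double-backtick "hello" and the untagged single-backtick "world", then treats the four trailing backticks as an opener and fails at end of input, which is the second outcome described above. Splitting the trailing run into an empty double-backtick payload instead would need a different opener rule.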
>
>
> --Delimiters revisited--
>
> Literals have to start with opening delimiters. Things like ))foo(( should
> not work.
>
> For ``$description{ "foo" }`` in documentation, the $ just signals docs.
> It's not really special. Maybe there's a better convention.
>
> --Operators--
>
> I had problems trying to figure out the ``char: a`` form, but I think it's
> just a prefix operator named ``char:``. It should parse as ``char:`` ``a``
> and in another pass they should be joined into a single token,
> operator-char: with a payload "a". Likewise color: hexcolor: pointer:
> alien: are prefix operators, the pair-rocket H{ 1 => 2 3 => 4 } is an infix
> operator, and something like ``a++`` could be a postfix operator, e.g.
> ``1 a++`` where it could increment it at compile-time if it's a literal, or
> dispatch at run-time otherwise. (Postfix operators are unnecessary?)
>
> Prefix and infix operators fix the problem of "related text should parse
> to a single literal" and the char: prefix operator means that
> lower-case-colon words don't have to be baked into the lexer. Also, the
> assignment :> doesn't have to change, it's just a prefix operator!
>
> Something like:
> PREFIX-OPERATOR: \ char:
> INFIX-OPERATOR: \ =>
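
The "join in another pass" step might be sketched like this in Python. The operator tables, the tuple shapes, and the name `join_operators` are all assumptions made for illustration; the real pass would work on Factor's lexer output rather than on whitespace-split strings:

```python
# Hypothetical operator tables, mirroring the declarations above.
PREFIX_OPERATORS = {"char:", "color:", "hexcolor:", "pointer:", "alien:"}
INFIX_OPERATORS = {"=>"}


def join_operators(tokens):
    """Second pass: fold operators and their operands into single nodes.

    Prefix operators take the next token as payload; infix operators
    take the tokens on both sides. Other tokens pass through unchanged.
    """
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        has_next = i + 1 < len(tokens)
        if tok in PREFIX_OPERATORS and has_next:
            out.append(("prefix", tok, tokens[i + 1]))
            i += 2
        elif has_next and tokens[i + 1] in INFIX_OPERATORS and i + 2 < len(tokens):
            out.append(("infix", tokens[i + 1], tok, tokens[i + 2]))
            i += 3
        else:
            out.append(tok)
            i += 1
    return out


if __name__ == "__main__":
    print(join_operators("char: a".split()))
    print(join_operators("H{ 1 => 2 3 => 4 }".split()))
```

The point of the sketch is that the first pass can stay ignorant of `char:` and `=>`; only the tables consulted by the second pass need to know about them.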
>
> --Fry and make--
>
> It seems that '[ _ , % ] syntax is fine for these, and any other ways to
> write it, like $[ obj% seq% ] are hard to type into the editor and look
> weird.
>
> --Roots--
> I haven't figured out the best repository/directory structure, but it
> should handle adding repositories with arbitrary URIs like ``@erg/factor``,
> handle versions of Factor libraries, etc.
>
> --Final thoughts--
>
> The parser should give better error messages with enough work and syntax
> error examples. There's really no reason for it to be worse.
>
> Sorry for the mishmash of replies.
>
> Cheers,
> Doug
> ------------------------------------------------------------------------------
> _______________________________________________
> Factor-talk mailing list
> Factor-talk@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/factor-talk
>

