Re: [Factor-talk] Safe parsing

Eduardo Cavazos Sun, 16 Nov 2008 07:30:51 -0800

On Saturday 15 November 2008 22:49:17 Slava Pestov wrote:

> But, if syntax is split up into two vocabularies, you'd be out of luck
> for library data types. Eg, parsing a float array (using F{ in
> float-arrays) is safe.


Yeah, words that provide literal syntax for library data would have to go in a 
separate library. But that actually sounds good for use with words 
like 'parse-with-vocabs'. It seems desirable to say:

        "... FACTOR DATA ..."
        { "syntax.literal" "syntax.literal.library" }
        parse-with-vocabs

and have control over what syntax is available.
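
To make the idea concrete, here is a rough model of what 'parse-with-vocabs' could do, written in Python rather than Factor since the word itself is only a proposal. Everything in it is an assumption for illustration: the contents of each vocabulary, the whitespace tokenizer, and the names `VOCABS`, `SyntaxNotAllowed`, and `parse_with_vocabs`. It only checks which parsing words are visible; it does not build any data.

```python
# Toy model only: these vocabulary contents are illustrative and do not
# reflect Factor's real vocabulary layout.
VOCABS = {
    "syntax.literal": {"{", "}", "[", "]", "V{", "H{"},
    "syntax.literal.library": {"F{"},
    "syntax.defining": {":", ";", "GENERIC:", "TUPLE:"},
    "syntax.parser": {"USING:", "USE:", "IN:"},
}


class SyntaxNotAllowed(Exception):
    """Raised when the input uses a parsing word outside the allowed set."""


def parse_with_vocabs(source, vocab_names):
    """Tokenize `source`, rejecting parsing words not in `vocab_names`."""
    # Parsing words the caller has explicitly made available.
    allowed = set()
    for name in vocab_names:
        allowed |= VOCABS[name]

    # Every parsing word known to the toy registry.
    known = set().union(*VOCABS.values())

    tokens = source.split()
    for tok in tokens:
        # Ordinary tokens (numbers, plain words) pass through untouched;
        # only known parsing words are subject to the whitelist.
        if tok in known and tok not in allowed:
            raise SyntaxNotAllowed(tok)
    return tokens
```

With both literal vocabularies enabled, `parse_with_vocabs("F{ 1.0 2.0 }", ["syntax.literal", "syntax.literal.library"])` accepts the input, while passing only `"syntax.literal"` would reject `F{`.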

> Instead we could have a flag on parsing words, specifying if they're
> safe or not. [ is safe, F{ is safe, but : is not safe and neither is
> <<. USING: is safe, but IN: is not.

Well, that's one way to do it.

I was actually thinking of categorizing 'syntax' even further:

        syntax.defining         Defining words like ':' and 'GENERIC:'
        syntax.literal          Literal syntax for data
        syntax.parser           Stuff that affects 'use' and 'in'

It would be nice to categorize parsing words by various properties. For 
example, I think there's a notion of a "functional parsing word", i.e. a 
parsing word which has no side effects besides pushing onto the 'accum'. 
Perhaps another category of parsing word is one which removes elements from 
the 'accum' (an odd kind of word...). Finally, a non-functional parsing word 
is one which has side effects, such as 'TUPLE:'.

So, you can organize them into vocabularies, tag them individually as you 
mention above, or do the *really* elite thing and have a word which analyzes a 
word for you and tells you if it's functional, defining, side-effecting, etc.
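
The tagging option can be sketched as a table of effect flags plus a small classifier. This is a Python model, not Factor code, and the flag names, the `EFFECTS` table, and the tags assigned to each word are assumptions made for illustration. It does not attempt the "elite" option of analyzing a word's definition automatically.

```python
from enum import Flag, auto


class Effect(Flag):
    """Hypothetical effect flags a parsing word could be tagged with."""
    NONE = 0
    PUSHES_ACCUM = auto()         # only appends to the accumulator
    POPS_ACCUM = auto()           # removes elements from the accumulator
    DEFINES = auto()              # creates or changes word/class definitions
    CHANGES_SEARCH_PATH = auto()  # touches 'use' or 'in'


# Illustrative tags only; not taken from the real Factor sources.
EFFECTS = {
    "[": Effect.PUSHES_ACCUM,
    "F{": Effect.PUSHES_ACCUM,
    ":": Effect.DEFINES,
    "TUPLE:": Effect.DEFINES,
    "USING:": Effect.CHANGES_SEARCH_PATH,
    "IN:": Effect.CHANGES_SEARCH_PATH,
}


def classify(word):
    """Map a tagged parsing word onto the categories from the discussion."""
    effects = EFFECTS[word]
    if effects == Effect.PUSHES_ACCUM:
        return "functional"
    if effects & Effect.DEFINES:
        return "defining"
    if effects & Effect.POPS_ACCUM:
        return "accum-removing"
    return "side-effecting"
```

Under these assumed tags, `classify("[")` gives `"functional"` and `classify("TUPLE:")` gives `"defining"`.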

As far as safe code goes, I think defining words isn't so bad; it's changing 
the using list that is dangerous. So now you're getting into 
labeling vocabularies as safe or unsafe. So what's unsafe? To start with, 
stuff like:

        Launching external processes

        Initiating network connections

        Monkey patching existing vocabularies
        (i.e. set 'in' to an existing system vocabulary and begin scribbling on
         stuff)

To achieve sandboxed listeners, it would be nice to have a set of capabilities 
and restrictions which can be composed. For example:

        Only allow literal syntax for data structures in core

        Cannot add things to 'use'

        Cannot change 'in'

        May add to use, but only from this specific list: ...

        May add to use, any vocabulary marked as "safe"

        May create new vocabularies

        May not set 'in' to an existing vocabulary which you did not create
        (i.e. cannot "monkey patch" a system vocabulary)

        May define new words

        May *not* define new words

        etc...

How these are enforced is up for design.
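
One possible shape for composable restrictions, sketched in Python rather than Factor: each rule is a predicate over a requested action, and a sandbox is the conjunction of its rules. The `Action` type, the rule constructors, and the vocabulary names used in the usage note are all hypothetical; this says nothing about where a real listener would hook such checks in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str    # "use", "in", or "define"
    target: str  # vocabulary name or word name


def deny(kind):
    """Rule: forbid every action of the given kind."""
    return lambda action: action.kind != kind


def use_only_from(allowed):
    """Rule: 'use' may only add vocabularies from an explicit list."""
    allowed = frozenset(allowed)
    return lambda action: action.kind != "use" or action.target in allowed


def no_monkey_patching(system_vocabs):
    """Rule: 'in' may not be set to an existing system vocabulary."""
    system_vocabs = frozenset(system_vocabs)
    return lambda action: (
        action.kind != "in" or action.target not in system_vocabs
    )


def compose(*rules):
    """A sandbox permits an action only if every rule permits it."""
    return lambda action: all(rule(action) for rule in rules)
```

For example, `compose(use_only_from({"math", "sequences"}), no_monkey_patching({"kernel", "syntax"}), deny("define"))` yields a sandbox that allows `Action("use", "math")` but rejects `Action("in", "kernel")` and any `"define"` action.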

I think breaking down 'syntax' into the 'literal', 'defining', and 'parser' 
sub-vocabularies might be a step towards composing those capabilities in 
limited listeners. However, it may not be as simple as that. For example, 
even if you deny access to 'IN:', code might still be able to set 'in' 
using 'set'. (But would that be possible at parse time?) Since 'use' and 'in' 
affect things in such a powerful way, and since they are accessed via 
the 'namespaces' subsystem, controlling the ability to change variables 
generally should also be considered.

Ed