On 8/11/2011 8:16 PM, David Barbour wrote:

On Thu, Aug 11, 2011 at 5:06 PM, BGB <cr88...@gmail.com> wrote:


    the big problem though:
    to try to implement this as a sole security model, and expecting
    it to be effective, would likely impact language design and
    programming strategy, and possibly lead to a fair amount of effort
    WRT "hole plugging" in an existing project.


A problem with language design is only a "big problem" if a lot of projects are using the language. Security is a big problem today because a lot of projects use languages that were not designed with effective security as a requirement.

or:
if the alteration would make the language unfamiliar to people;
or if one has, say, a large pile of code (for example, 500 kloc or 1 Mloc or more), where fundamental design changes could impact a non-trivial amount of said code.

for example, for a single developer, a fundamental redesign of a 750 kloc project is not a small task. it is much easier to find quick-and-dirty ways to patch up problems as they arise, or to find a few good (strategic and/or centralized) locations to add security checks, than to adopt a strategy which would essentially require lots of changes all over the place.




    how to effectively prevent spoofing (say, one manages to "extract"
    the key from a trusted app, and then signs a piece of malware with
    it).


Reason about security /inductively/. Assume the piece holding the key is secure up to its API. If you start with assumptions like: "well, let's assume the malware has backdoor access to your keys and such", you're assuming insecurity - you'll never reach security from there.


the problem, though, is that the person making the piece of malware may be able to get at the keys indirectly...

a simple example would be a login-style system:
the malware author wants to get the login key from, say, GoodApp;
they make a dummy or hacked version of the VM ("BadVM"), and run GoodApp on it;
GoodApp does its thing, and authorizes itself (revealing its key to BadVM);
the malware author takes this key, and puts it into "BadApp";
BadApp, when run on GoodVM, presents GoodApp's key, and so can do whatever GoodApp can do.

these types of problems are typically addressed (partially) by having the VM/... log into a server and authenticate keys over the internet, but there are drawbacks to this as well.
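(to make the replay concrete, here is a minimal sketch in plain C, with hypothetical names and a hard-coded key rather than anything from an actual VM: if the check only asks "can you present the raw key?", then a key lifted from GoodApp via a hacked VM works just as well when presented by BadApp. the usual partial fix is challenge-response, where the verifier sends a fresh nonce and the app returns a keyed hash of it, so the raw key never crosses the boundary; that is roughly the online-authentication route above, with its round-trip cost.)

#include <string.h>
#include <stdio.h>

/* hypothetical: the key GoodApp was authorized with */
static const char *good_app_key = "SECRET-1234";

/* naive check: any caller that can present the raw key is "trusted".
   a key extracted by running GoodApp on a hacked VM can simply be
   replayed here by BadApp. */
int app_key_matches(const char *presented_key)
{
    return strcmp(presented_key, good_app_key) == 0;
}

int main(void)
{
    /* BadApp replaying the key it lifted from GoodApp */
    if (app_key_matches("SECRET-1234"))
        printf("caller granted GoodApp's rights\n");
    return 0;
}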


Phrases such as 'trusted app' or 'trusted code' smell vaguely of brimstone - like a road built of good intentions. What is the app trusted with? How do we answer this question with a suitably fine-grained executable policy?

the terminology mostly comes from what I have read regarding the .NET and Windows security architecture...

but, generally the "trust" is presumably spread between several parties:
the vendors of the software (VM, apps, ...);
the user of the software.



    yes, there is still always the risk of a naive user confirming a
    piece of malware, but this is their own problem at this point.


I disagree. Computer security includes concerns such as limiting and recovering from damage, and awareness. And just 'passing the blame' to the user is a rather poor justification for computer security.

this is, however, how it is commonly handled with things like Windows.
if something looks suspect to the OS (bad file signing, the app trying to access system files, ...) then Windows pops up a dialogue "Do you want to allow this app to do this?"

if the user then confirms this, then yes, it is their problem.

the only "real" alternative is to assume that the user is "too stupid for their own good", and essentially disallow them from using the software outright. in practice, this is a much bigger problem, as then one has taken away user rights (say, they can no longer install non-signed apps on their system...).

systems which have taken the above policy have then often been manually "jailbroken" or "rooted" by their users, essentially gaining personal freedom at the risk of (potentially) compromising their personal security (or voiding their warranty, or breaking the law).

better I think to make the system do its best effort to keep itself secure, but then delegate to the user for the rest.



    if trying to use a feature simply makes code using it invalid
    ("sorry, I can't let you do that"), this works.


When I first got into language design, I thought as you did. Then I realized:
* With optional features, I have 2^N languages no matter how I implement them.
* I pay implementation, maintenance, debugging, documentation, and design costs for those features, along with different subsets of them.
* Library developers are motivated to write for the Least Common Denominator (LCD) language anyway, for reusable code.
* Library developers can (and will) create frameworks, interpreters, EDSLs to support more features above the LCD.
* Therefore, it is wasteful to spend my time on anything but the LCD features, and make life as cozy as possible for library developers and their EDSLs.

/The only cogent purpose of general purpose language design is to raise the LCD./

Optional features are a waste of time and effort, BGB - yours, and of everyone who as a result submits a bug report or wades through the documentation.

optional features are very common in most systems, though, and in this case most of the optional features are intended for library development and "low-level" programming, notably pointers, ...

so, code which doesn't need to use pointers, shouldn't use pointers, and if people choose not to use them, that is no problem for me.

however, for some tasks, like C interop, they can be fairly useful...

anyways, C# and C++ do basically the same thing...



    with a language/VM existing for approx 8 years and with ~ 540
    opcodes, ... I guess things like this are inevitable.


I think this is a property of your language design philosophy, rather than inherent to language development.


well, this language isn't exactly the same as something like Lua or Scheme, and is not intended to strive for elegance or minimalism or similar.

it is at this point sort of "between" lighter-weight languages (such as JavaScript) and heavier languages (such as C# or C++).

in an "ideal" world, it would be able to be usable in a similar abstraction domain roughly between C++ and JavaScript.



    but whitelisting is potentially much more effort than
    blacklisting, even if potentially somewhat better from a security
    perspective.


Effectiveness for effort, whitelisting is typically far better than blacklisting. In most cases, it is less effort. Always, it is far easier to reason about. I think you'll need to stretch to find rare counter-examples.


it depends I think.

I mostly figure one can blacklist most of the obvious holes (direct access to OS-level C APIs and unrestrained pointers, for example), and probably leave the rest for later.
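(for illustration, a minimal sketch with made-up symbol tables rather than anything from the actual VM: applied to, say, FFI symbol names, a blacklist leaves anything it didn't anticipate open by default, whereas a whitelist closes it by default.)

#include <string.h>

/* hypothetical tables, for illustration only */
static const char *ffi_blacklist[] = { "system", "exec", "fopen", NULL };
static const char *ffi_whitelist[] = { "printf", "sin", "cos", NULL };

/* blacklist: anything not explicitly forbidden is allowed
   (unknown or forgotten holes stay open) */
int blacklist_allows(const char *sym)
{
    int i;
    for (i = 0; ffi_blacklist[i]; i++)
        if (!strcmp(sym, ffi_blacklist[i]))
            return 0;
    return 1;
}

/* whitelist: anything not explicitly permitted is denied
   (unknown or forgotten holes stay closed) */
int whitelist_allows(const char *sym)
{
    int i;
    for (i = 0; ffi_whitelist[i]; i++)
        if (!strcmp(sym, ffi_whitelist[i]))
            return 1;
    return 0;
}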



    LambdaMoo found a MUD, if this is what was in question...


LambdaMoo is a user-programmable MUD, with prototype based objects and a Unix-like security model.


    as for "simple" or "efficient", a Unix-style security model
    doesn't look all that bad.


Unix security model is complicated, inefficient, and ineffective compared to object capability model. But I agree that you could do worse.

most of the security checking amounts to if/else and bit-masking and other things.

this then is wrapped in "CanIDoX()" style function calls.

if(!CanIHasCheezeburger(...))
{
    ... BARF and throw something...
}

really, it is not much different from (or worse than) dynamic type-checking...
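(as a concrete illustration, a minimal sketch with made-up struct, flag, and function names, not the actual VM code: a Unix-style check of this sort is just a few compares and a bit-mask.)

/* hypothetical permission bits and struct, roughly in the Unix "rwx" spirit */
#define ACC_UR 0400   /* owner may read   */
#define ACC_UW 0200   /* owner may write  */
#define ACC_GR 0040   /* group may read   */
#define ACC_GW 0020   /* group may write  */
#define ACC_OR 0004   /* others may read  */
#define ACC_OW 0002   /* others may write */

typedef struct {
    int uid, gid;     /* owner of the object/function */
    int mode;         /* permission bits, as above    */
} ObjAcl;

/* returns nonzero if a caller running as uid/gid may perform 'want',
   where 'want' is given as owner-column bits (ACC_UR, ACC_UW, ...) */
int CanIAccess(const ObjAcl *acl, int uid, int gid, int want)
{
    if (uid == 0)         return 1;                              /* "root" bypasses the checks */
    if (uid == acl->uid)  return (acl->mode & want) != 0;        /* owner column  */
    if (gid == acl->gid)  return (acl->mode & (want >> 3)) != 0; /* group column  */
    return (acl->mode & (want >> 6)) != 0;                       /* others column */
}

(something like CanIAccess(&fn_acl, ctx_uid, ctx_gid, ACC_UR) would then sit inside the "CanIDoX()"-style wrappers mentioned above; again, the names here are hypothetical.)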



    luckily, there are only a relatively small number of places I
    really need to put in security checks (mostly in the object system
    and similar). most of the rest of the typesystem or VM doesn't
    really need them.


I recommend you pursue control of the toplevel capabilities (FFI, and explicit forwarding of math, etc.) as you demonstrated earlier, perhaps add some support for 'freezing' objects to block implicit delegation of assignment, and simply forget about Unix or permissions checks.


the permissions are currently intended as the underlying model by which a lot of the above can be achieved.

granted, yes, there are potential differences between, say, making the FFI not visible (by setting the reference to null), causing the FFI object to be a no-op, or doing both, but either way...

I initially considered just making a single state flag, basically a system/user flag (similar to the Ring 0/1/2/3 concept in x86, or the "safe/unsafe" concept in .NET), but figured a naive Unix-like model could also provide limited protection of apps from each other, even when both are running in the same address space and may have access to many of the same objects, ...

so, for the most part, it amounts to a "root/non-root" and "A vs B" issue. as noted, I am currently making no attempt to implement either ACLs or the full scope of the POSIX security model (which does use ACLs and similar).

also, these checks will currently only apply to objects and functions; most other data types (lists, arrays, ...) will not have them. (for technical reasons, array-based checks would be considerably more frequent than toplevel checks: essentially, an array-based check would be invoked on every array access, rather than, say, only the first time a given UID+GID tries to access a particular function or method.) (or at least, this will be the case after I get around to: adding the relevant "access" field to the main VM thread contexts, locating the code for the hash table, and adding the access value into the hash function and entry match-check, ...).

technically, the access checks are done during the "recursive search" phase of lookups, whereas there is actually a "lookup hash" which is consulted before that, and a cache hit is assumed sufficient evidence that one has access (if one didn't have access, there would have been an exception, and the hash slot would have been set to "undefined" or similar).

(note that the hash is kept from getting "stale" by having certain operations, such as assigning to delegates, ..., essentially flush the hash).

(note: there is a similar hash used mostly for implementing per-class slot and method lookups and interface-method dispatches, but this is technically a different hash table in a different part of the codebase, and is flushed under different criteria).

this means, say, for:
for(i=0; i<100; i++)
    printf("test %d\n", i);

"printf()" may only needs a single access-rights check, rather than 100 such checks.


the reason so much hashing/caching/... is used is that, when the FFI gets involved, some of this stuff (database queries, code generation, ...) can actually get kind of slow (previously I was looking at 1-2s stalls during queries, although IIRC I went and optimized something, mostly a trivial reorganization of the DB structure and query mechanism, which unexpectedly resulted in a drastic speedup).

one can actually debate which is slower:
security access checks;
or performing queries against a largish database (~250k entries, IIRC), and dynamically generating glue code (writing out assembler, assembling it, and linking it into the program image), ...

really, it doesn't seem like all that big of a deal...

actually, a bit of trivia: even with the power of AVL trees, 250k entries in a single list can be a bit costly. splitting the DB into a number of smaller per-library lists seemed to notably speed up the query times (for whatever reason, 30 lists of ~8k entries each can be queried more quickly than 1 list of 250k entries). it is a mystery...

granted, yes, DB queries are typically implemented using for-loops, strings, "sprintf()", and sequential probing, ... but it works...
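(again purely for illustration, with made-up structs rather than the actual DB layout: a query in the "for-loops and sprintf()" style, where splitting the metadata into per-library lists means each lookup only walks one small list instead of the whole ~250k entries.)

#include <stdio.h>
#include <string.h>

/* hypothetical flattened metadata entry: "libname:symbol" -> signature string */
typedef struct { const char *key; const char *sig; } DbEnt;
typedef struct { const char *lib; const DbEnt *ents; int n_ents; } DbLib;

const char *QuerySig(const DbLib *libs, int n_libs,
                     const char *lib, const char *sym)
{
    char key[256];
    int i, j;
    snprintf(key, sizeof(key), "%s:%s", lib, sym);
    for (i = 0; i < n_libs; i++) {
        if (strcmp(libs[i].lib, lib) != 0)
            continue;                  /* only walk the one matching per-library list */
        for (j = 0; j < libs[i].n_ents; j++)
            if (!strcmp(libs[i].ents[j].key, key))
                return libs[i].ents[j].sig;
    }
    return NULL;                       /* not found */
}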


or such...

_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc
