On 10/12/2012, at 12:45 AM, srean wrote:

> Hi John

Before I completely terrify you .. 

In practice Felix works pretty well.
I know this because I've been writing code with it.
Generally it's better than any other programming language.

OMG C++? Yes I can do it but after writing Felix I don't want to!

Python? Well, I can do that too, but the useful dynamics just
don't scale -- look at fbuild. There's a design bug in the way
the system compiler is selected (based on the OS) and it's
almost impossible to fix.

Ocaml: sure, very nice, but for an FPL it is really BAD at
handling recursion.

Haskell: I can't do Haskell, but roughly here you pay
a price in knowledge of performance for the strong
semantics (I don't mean it is slow: I mean you just cannot
tell if it is going to be slow or not).

You have to remember what the alternatives are!

There is a bit of a steep learning curve, due to the lack
of documentation, the convolution of high and low
level concepts, and the fact that it's an experimental, non-stable
language with few resources behind it, so sorting out the
concepts proceeds more slowly than I would like.
There are "gotchas" to know about, and choices still to be
made about which ones to fix and how.

> It seems the keyword val does two orthogonal things, expresses notions of (i) 
> immutability and (ii) "will get evaluated or not whenever the compiler 
> chooses to".

Indeed. As does var: mutable + eager evaluation.

This convolution just dropped out of the development process.
Vals are immutable because they're just names for expressions,
and we needed something which could safely be used to factor
complex expressions by naming bits of them.

But now, you have to remember that even in C, the order
of evaluation of function arguments isn't determinate,
and those function arguments are themselves expressions.

So actually the Felix rule for vals is natural in the sense that
lazy evaluation is what you actually get "by definition" and
permission to eagerly evaluate is a convenience that should
work most of the time in a C-like language. Of course it doesn't
if you're relying on lazy evaluation in C, e.g. you cannot factor

        c1 ? ( c2 ? e1 : e2) : e3

in C into

        int tmp1 = e1;
        int tmp2 = e2;
        int tmp3 = c2 ? tmp1 : tmp2;
        int tmp4 = c1 ? tmp3 : e3;

because here eager evaluation *destroys* the semantics.

So actually, it's not like you might think: eager evaluation
doesn't make the semantics more determinate, it actually
destroys the semantics. The natural (safe) evaluation strategy
is always lazy in functional code. Except that when you introduce
variables (and pointers!), which are eagerly evaluated, you are stuck
between a rock and a hard place: neither eager nor lazy is a safe rule!

What's the safe rule? 

Duh! Felix's current rule is the safe rule!
The safe rule is the one that tells you that you'd better
only use vals when you're sure the evaluation strategy
doesn't matter.

> Have you thought separating the concerns,

Of course. But it is not so easy. If you separate everything, you get extreme
complexity.

> for example use "=" and "<-" to indicate eager vs whenever semantics, and let 
> val denote immutable as is common in other languages.

That particular breakdown wouldn't work so well when you consider that
the concept also applies to binding arguments to function parameters.
In that case the argument and parameter can be annotated, but the
linkage between them cannot, because it is spatially separated.
In other words there's no way to write the "=" or "<-" when you're passing
an argument to a function.

I recently did something like this:

        f ( var something)

which is supposed to be:

        var tmp = something;
        f tmp

and so forces eager evaluation.
        
> 
> I find it terrifying that a val can be inited by an impure function, and that 
> means I need to follow the data/dependency path to the leaves to be sure that 
> such a definition cannot have multiple meanings.

Me too. However doing this right is extremely difficult. 

If you look at some of Dobes' complaints you will be even more terrified.
It's much worse than you think. Consider:

        val x = y;
        val y = 1;

That's valid Felix. Who knows what value x has? Here, even though both x and y
are initialised, the order matters, and the order of writing produces undefined
results. But the code is legal because Felix uses set-wise lookup, not
linear lookup: things do not have to be defined before they're used,
they only have to be defined in the same space, or an enclosing space,
or even a contained space (if you use qualifiers or open the space).

These rules allow libraries to be organised arbitrarily. You can
call any function from anywhere, if you don't like the breakup,
just move a function from one place to another. You don't have
to worry about where things are defined. Refactoring is a breeze.

The downside is that it can be hard to find things.
And in the case of variables, you can use them before they're
initialised. [I got caught by this myself recently. I was using
regular definitions to define a regex and accidentally did
a forward reference. The code looks declarative. But it isn't,
it actually reduces to executable code that builds regexps ..]

The problem in general is much harder than you think.

For example pure functions seem like the holy grail,
until you realise that they might not be total!
In that case you need yet another concept, such as strictness:

        f ( error ) = error

in Haskell before you can safely convert lazy evaluation
into eager evaluation. For example the conditional operator
and C's lazy && operator are NOT strict. On the other
hand most math functions like sin are strict. Note in particular
that "tan" isn't total even though "sin" is. It's all quite messy!

Again, for impure functions, i.e. ones that depend on a non-parameter
variable, you can do this:

        pure fun f ...
        impure fun f ...

and if you just write

        fun f...

then the compiler calculates the purity anyhow (at least, it tries to).

But not totality, and not strictness: the evaluation strategy matters 
EVEN FOR PURE FUNCTIONS because pure functions don't have to
be total. Such as "tan" or division.

Or we could get more detailed and follow the C++ lambda route
where you have to SAY which variables you depend on.
But as noted -- it STILL doesn't make the evaluation strategy
a mere optimisation.

You need to realise in cases like this that having pointers makes
things even more complicated!  Throw in function closures,
so you only have the function type available, or at best a guess
if you do data flow analysis .. and you can see how difficult
semantics in ANY algol like language is.

> 
> I understand your motivation: to create opportunities for optimization, and 
> do appreciate that. Performance-focused languages are not a populated niche, 
> so it's good. Yet this is quite terrifying, a consequence of which would be 
> that I would distrust code that has val with non-literal initialization, and 
> minimize its use in my code should I write in Felix. The latter would defeat 
> the purpose of having val in the language.

I know. But using var everywhere still doesn't make it safe (see above).

In practice, vals get used everywhere that you're doing functional
stuff. If you're doing control stuff or mutating things, vars make things a bit
clearer.

var evaluation gets optimised too, but the compiler is more
conservative ("as if rule" has to be followed).

> 
> C also famously has undefined order of execution, but it is well known enough 
> that programmers (are expected to) exercise vigilance to avoid them.
> 
> I will feel a lot safer if the meaning is defined such that the 
> initializations of val is eager if the rhs is tainted with a var somewhere in 
> its path, failing that a separation of concerns, so that I can keep things 
> immutable but force eagerness.


I don't know how to do this calculation.
The compiler is conservative here but not conservative enough.

At first it seems easy. Dobes thought it was easy. But it is not.
It requires full scale global data flow analysis .. and even then I'm not sure!
At first you think:

        var x = ...
        val y = 1 + x;  // Duh, obvious

Sure, x is a variable here. But what about this:

        val y = 1 + f();

Woops! Now we have to know all about what f() does and you can
be fairly sure it is LIKELY to depend on a variable because it doesn't
have any arguments!

What if f is itself a variable, i.e. a function closure?
Then we don't even know what f does.

Well, you could say: if you're not sure, eagerly evaluate.
But this is not so easy to do either. Consider that
ALL C bindings are lazily evaluated. Sure,

        fun f : int -> int = "f ($1)";

looks like a function but it is actually a macro: it's lazily
evaluated. Its argument is *substituted* for the parameter $1
textually. To eagerly evaluate this, we would have to break
down EVERY expression into three-address code.

You may think, well

        val a = 1;
        val b = a + 1;

ah, no variables there! Sure, there are no variables,
but there IS a function call: remember

        fun + :  int * int -> int = "$1+$2";


is a library function. The compiler doesn't know what it does.
So if it's conservative we'd have to use eager evaluation again.

Now you may think all this is hopeless!

What's the answer???

Simple. Just use vars. And watch out for macros (C bindings).

Want better semantics? Sure, well you can use Haskell.
Purely functional with lazy evaluation, optimisations
only when the semantics aren't broken. Kind of makes
fast array operations tricky though.

The real answer is: don't be afraid. Expect to get errors.
Just get on with it, and wait until you have stuffed up enough
to have an idea how to work around it, or even better,
how to automate a check for the common problems.

It's not like you cannot do infinite loops, even in Haskell!

IMHO whilst the semantics are a bit problematic at the moment,
the ability to predict performance is MORE of a problem.

Generally copying stuff about -- ints, floats, etc. --
is quite fast, so the evaluation strategy doesn't matter.
When the values are 100 x 100 matrices it's a different
story altogether. I need actual (simple) use cases where
the compiler fails to do an obvious optimisation, so as to
try to figure out how to fix it. And I need a user to whom
it matters so I can prioritise.


--
john skaller
skal...@users.sourceforge.net
http://felix-lang.org




_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language
