Re: [HelenOS-devel] What should a C replacement be like?

Jiří Zárevúcky Mon, 17 Feb 2014 14:09:06 -0800

Thanks for the feedback

On 17 February 2014 18:53, Martin Decky <[email protected]> wrote:


> Hello Jiri,
>
>
>  Requirements:
>>   - strong static typing, compiled to native code
>>   - ability to statically prove absence of invalid dereferences,
>>     integer overflows, out of bounds array accesses... any condition
>> that would cause
>>     the program to misbehave
>>   - similar to C (superficially, and in simplicity)
>>   - minimal requirements for runtime support
>>     - programs without dynamic memory must be possible
>>   - ability to call functions written in assembly/C
>>   - ability to write assembly inline and do check it (i.e. not just "put
>> this string in the generated code")
>>   - builtin support for asynchronous code (fibrils) and tools to
>> provably avoid races in SMP environment
>>
>
> These requirements seem rather reasonable.
>
> Maybe except the last point which is perhaps too high-level for the core
> of the language -- the fibrils specifically, which we only have as a pure
> user space concept. The kernel uses regular preemptible threads and I don't
> think that the core of the language should really prefer any specific
> scheduling paradigm.
>

Point taken.
The idea behind this is that the language should have a construct for
invoking a separate thread of control (implementation nonspecific), so that
it can statically check that anything that is shared between the threads is
safe to share. It could easily be made to invoke a specific function that
may or may not be implemented at all. Alternatively, it can just be part of
the preconditions (i.e. something like "require
can_implement_thread(closure)").


>
> However, I can imagine that the core of the language might support (a)
> some primitives and intrinsics (e.g. atomics, test-and-set, etc.), and (b)
> advanced stack manipulation (e.g. coroutines) that you can use to implement
> any scheduling paradigm.
>
>
As for primitives, I am playing with the idea of shareable and
non-shareable types. For integer types and pointers, those could have a
modifier "atomic" that would make it shareable and at the same time force
any access to be an atomic intrinsic (which would be a part of the
language).

Similar intrinsics could be present for said stack manipulation, since the
compiler should know how to do it anyway.

For the other shareable types, I am contemplating a "locked" modifier, with
a dedicated "lock (x) { ... }" statement, which would strip the modifier
for the inner block only. This would again call upon functions optionally
implemented by the runtime support.

To make this simpler, I am currently trying to avoid global variables
completely. They will not be supported by the first iteration of the
language. Whether or not that is a practical approach is open for debate.


Above I mentioned "stripping the modifier". I make a distinction between a
declared static type and the situational static type.
I.e. imagine you have a local variable i, which can be an arbitrary
integer. Normally, any array indexing of the form "arr[i]" would be
statically invalid. However, if arr is known to have size 5, then the
following is valid:

if i >= 0 && i < 5 {
   arr[i] = 0
}

Here, i in the true branch of the if statement would have a situational
type of "range 0..4", which is valid to use for the index here.
This becomes a little more messy when you want to deal with dynamic values,
as in  i < len(arr), but I suppose extending the definition of a "type"
with a set of predicates is not that much of a stretch.

By the way, I expect an argument of the form that the static type should
not depend on complicated analysis of condition expressions. I agree. You
can, however, only interpret for type annotations a certain form of
expressions, put those expressions in there as assertions, and let the SMT
solver prove (or disprove, or timeout for) the rest.



>  Extending from above (mostly my preferences):
>>   - arbitrary ranges instead of fixed machine types; the compiler will
>>     check that the range can't overflow (as in Ada)
>>
>
> Yes. Although fixed machine types can also come handy, especially if you
> can really specify them precisely (e.g. uint32_aligned_little_endian_t).
>
>
Things like alignment and endianity can be present in the form of
user-provided attributes, instead of something predefined. Even the exact
memory layout of a number could potentially be specified by providing a
custom load and store methods, but I don't want to go in that direction
just yet. Suffice to provide alignment and endianity attributes, and
specify how integer ranges map to memory by default in a way compatible
with native-endian two's complement. "int16_t" would become synonymous to
"range -32768..32767". It could be defined as an alias for convenience.



>
>    - owned, borrowed and managed pointers (as in Rust)
>>   - statement-based syntax, packages (as in Go)
>>   - parser plugins for machine code
>>   - contracts (preconditions, postconditions, invariants)
>>     - statically verified (on a function level - a method call is
>>       specified by its interface, no inlining)
>>     - integration with exchangable SMT solver (SMT-LIB 2.0 provides
>>       suitable interface)
>>     - (for an example see Dafny)
>>   - no classes, statically bound methods, type-tagged interface pointers
>>     to provide dynamic binding (as in Go)
>>   - dynamic stack size, unless the maximum can be determined statically
>>     (also Go)
>>   - compiled (at least) to ANSI C
>>
>
> Oh, this would be Christmas :-D But when speaking of the contracts, what
> kind of properties do you have on your mind (ownership, safety, temporal)?
> What kind of logic?
>
>
Any valid boolean expression in the language that only contains calls to
functions with no side effects.
In general, truthfulness of such expressions is already algorithmically
undecidable, but since you need to teach programmers how to write provable
stuff anyway, I don't see that as much of a drawback.
I suspect it will be somewhat necessary to allow to simply assume an
expression is correct (at the cost of technically possible runtime
failure), but that's still good enough if it's used sparingly and the
compiler can report all occurrences.

For extra-functional properties, I only thought of using the Until LTL
operator for method invocations. To express e.g. "After this call, it is
not called again until this other one is called with the same argument,
which must eventually happen" (think lock() and unlock(): lock(x) requires
"!lock(x) U unlock(x)"). Possibly also the Since operator, to specify for
unlock(x): "!unlock(x) S lock(x)".


> A few additional ideas of my own:
>
> - some kind of safe meta-programming (something more sane than both the
>   C preprocessor and C++ templates), at least for the implementation of
>   the most useful polymorphic functions (e.g. max())
>

Definitely a good idea, with "some kind" being the biggest problem. I don't
really know how to go about it right now. If someone can create (or adapt
from another language that already has) a nice and sane system, I am buying
that person a few rounds of beer.



> - some basic support for more advanced programming concepts, preferably
>   implemented only as optional syntactic sugar without a heavy run-time
>   (foreach loops, iterators, operator overloading, yield [coroutines])
>

I am not a big fan of explicit asynchronicity, so I would definitely avoid
coroutines in any form other than as an implementation detail of a
userspace thread scheduler.

As for the other three: syntax sugar for foreach loops and iterators would
definitely be nice, if done right. Operator overloading I am a bit
sceptical about, as it would interfere with the ability to understand the
code, as frequently demonstrated by C++.
In any case, both can be added after the core of the language is finished.


> - precise control for type alignment and array/structure packing (with
>   the possibility to use thr C alignment and packing rules)
>

Definitely needed.


> - type system with support for physical dimensions (e.g. m*s^-2, bytes,
>   pages = bytes/PAGE_SIZE, etc.), ideally with safety assignment checks
>   or explicit conversion functions
>

That is not really useful for the purpose I am aiming for, I think.


> - (goes with the previous one) preferably no hard-wired implicit value
>   and type conversions, implicit conversions only using explicitly
>   defined conversion functions
>

I'd rather completely avoid implicit calls to user-defined functions. I
might change my mind if I see it improves readability, but I have a feeling
it would hurt it (code lying to the reader by omission).


> - contracts with object ownership and life-cycle notion (as in VCC)
>

Not familiar with that one, will check out.


> - possibility to easily inline binary blobs as data (not that
>   necessary, but still a handy feature)
>

Very handy. Maybe should be part of the hypothetical meta-programming
support?



>
> If you don't stop me, I might generate similar ideas for a long time :-)


Please do. :)


> But, of course, you have the liberty to reject all of them as your own
> list is already good for a dissertation or two :-)
>

Mgr., RNDr., Phd., there's some space left still. ;)


>
> Also have a look on Sysel [1]. Granted, the goals and starting points are
> quite different, but you might find some inspiration there.
>
>
Already did before I sent the first e-mail. I think I diverge too far for
there to be much common ground. Funny thing though.. I independently
started my project with a bootstrap interpreter, which I intend to follow
by a self-hosting compiler toolkit (with a clear, if unwritten, rationale
why that's a good way to go about it). Maybe I read the page some time in
the past and was inspired unconsciously.



> A discussion between Jiri S. and Jiri Z. about designing the perfect
> programming language might be rather .. interesting :-)
>
>
IMHO the only measure of how perfect a language is, is how it fits its
goals (heck, C is close to perfect if you travel 40 years back in time and
develop original UNIX with it). With that kept in mind, all ideas are fair
game. ;)

-- Jirka Z.

_______________________________________________
HelenOS-devel mailing list
[email protected]
http://lists.modry.cz/listinfo/helenos-devel

Re: [HelenOS-devel] What should a C replacement be like?

Reply via email to