Hi all,

As you may already know, I've been working on a project I've called Ocean. The summary is that I want to create a source-level replacement for GNU C that provides language safety, introspection, user-extensible syntax/semantics, and sane concurrency support. I will be using the COLA infrastructure to build the compiler, and I want to help produce useful modules that can be integrated in the basic COLA system.

I intend to provide the majority of Ocean under the GPLv2, but I will be amenable to relicensing portions of it under MIT (especially infrastructure that might be useful to COLA in general). I want to have an open development process, and will be using GitHub to host the project.

The core language would be mostly-backwards-compatible with C (and in the future, possibly C++ and/or Objective-C). However, the ABI is rather different, so in order to use Ocean with a C library, you would have to recompile that library. Some wizard pointer manipulation would either fail to compile or fail at run time under Ocean (since it is a safe language), but I think that would not be a problem for most applications, especially if they can put an "#ifdef __OCEAN__" in the needed places.

Below is a laundry list of the features I would like to provide. I would appreciate any feedback you have to offer.

Thanks,

-- 
Michael FIG <[EMAIL PROTECTED]> //\
   http://michael.fig.org/    \//

Ocean Core Language
*******************

Michael FIG <[EMAIL PROTECTED]>, 2008-09-05

These are the features of Ocean's ABI, and they work with Ocean's default C-like core language. Much of the design here is inspired by the COLA and Erlang systems.

* Wide oops (object-oriented pointers)

There is no such thing as a pointer containing an arbitrary address. Every pointer is an oop: it points to a valid object + offset, and is associated with a compile-time size as in C (with the exception of the "void *" oop). This is accomplished by making oops a double word, with the high word holding the base address of an object and the low word holding an offset to be added to it. The compiler forbids the conversion of any value type to an oop, but allows oop-to-integer conversions for compatibility with C. Oop arithmetic has the same semantics as C pointer arithmetic, but properly preserves the base object address instead of mixing it with the offset.
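
To make the wide-oop idea concrete, here is a minimal sketch in plain C of how a base + offset pair might behave. It is only an illustration: the names oop_t, oop_add, oop_to_int and wide_oop_demo are made up for this example and are not part of Ocean, and a real compiler would generate this representation itself rather than expose it as a struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wide oop: two words, a base address and a byte offset.
 * These names are assumptions for illustration, not Ocean's actual ABI. */
typedef struct {
    unsigned char *base;   /* high word: base address of the object */
    ptrdiff_t      offset; /* low word: byte offset to add to the base */
} oop_t;

/* Equivalent of C's "p + n" for elements of elem_size bytes.
 * Only the offset moves; the base is carried along unchanged. */
oop_t oop_add(oop_t p, ptrdiff_t n, size_t elem_size)
{
    p.offset += n * (ptrdiff_t)elem_size;
    return p;
}

/* Oop-to-integer conversion, which stays legal for C compatibility.
 * The reverse direction (integer to oop) is deliberately not provided. */
uintptr_t oop_to_int(oop_t p)
{
    return (uintptr_t)(p.base + p.offset);
}

/* Usage: step an oop two ints into an array and read through it. */
int wide_oop_demo(void)
{
    int array[4] = {10, 20, 30, 40};
    oop_t p = { (unsigned char *)array, 0 };
    oop_t q = oop_add(p, 2, sizeof(int));

    /* q still remembers which object it belongs to. */
    if (q.base != p.base)
        return -1;
    return *(int *)(q.base + q.offset);   /* element 2 of the array */
}
```

Because the base is never folded into the offset, a runtime can always answer "which object does this pointer belong to?", which is what the safety and GC features below rely on.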

* Runtime object metadata

Every object has an oop header in the double-word immediately preceding the base pointer which indicates a metadata object. The metaobject describes any extra oop object headers (such as are needed to preserve the allocated size, field layout, object vtable, locks, versioning, ownership, etc). All functions also have an oop metadata header.

* Object safety

Every metaobject declares a read/write barrier, which the compiler forces clients to use. An optional metaobject layer can, for example, validate all object access (no indexing off the beginning of an object or the end of an array or assigning a value type to a pointer member). More aggressive layers can implement object permissions, such as denying access based on the caller's context. These barriers can also provide hooks for garbage collection.

There are also no uninitialized variables. Stack variables are zeroed before the frame is entered, just like heap allocations.

* Discriminated unions

A layout function must be declared for every union that contains both value types and oops, so that its oops can be correctly located.

* Malloc support

Malloc returns a zero-filled object with "unknown layout" in its oop header. A call to the Ocean primitive "layout(struct MyStruct, ptr)" updates the object at "ptr" to have the layout corresponding to the named type. The following code fragment can allow C compatibility:

  #ifndef __OCEAN__
  # define layout(TYPE, PTR) ((TYPE *)(PTR))
  #endif

The C++ "new" operator will probably be introduced as a non-C-compatible Ocean extension.

* Precise GC

Copying garbage collection is possible because every object has an associated layout, so pointers can be identified and updated whether on the stack or in malloced memory. Oops make this possible by allowing the garbage collector to alter the base but not the offset of each pointer when relocating an object.
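
As a rough sketch of the relocation step described above, the following plain C moves one object and patches the oops that refer to it. It makes simplifying assumptions: the names oop_t, relocate and relocate_demo are hypothetical, and the oops to update are passed in as an explicit array, whereas a real collector would find them by walking object layouts and stack frames.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wide oop: base address of the object plus a byte offset. */
typedef struct {
    unsigned char *base;
    ptrdiff_t      offset;
} oop_t;

/* Copy an object of `size` bytes to fresh storage and fix up the `count`
 * oops in `roots` that refer to it. Only the base changes; each offset is
 * kept, so interior pointers still address the same field afterwards.
 * Returns the new base, or NULL if allocation fails (nothing is changed). */
unsigned char *relocate(unsigned char *old_base, size_t size,
                        oop_t *roots, size_t count)
{
    unsigned char *new_base = malloc(size);
    if (new_base == NULL)
        return NULL;
    memcpy(new_base, old_base, size);
    for (size_t i = 0; i < count; i++) {
        if (roots[i].base == old_base)
            roots[i].base = new_base;
    }
    return new_base;
}

/* Usage: an interior oop to element 2 survives the move. */
int relocate_demo(void)
{
    int *obj = malloc(4 * sizeof *obj);
    if (obj == NULL)
        return -1;
    for (int i = 0; i < 4; i++)
        obj[i] = (i + 1) * 10;

    oop_t roots[1] = { { (unsigned char *)obj, (ptrdiff_t)(2 * sizeof(int)) } };
    unsigned char *moved = relocate((unsigned char *)obj, 4 * sizeof *obj,
                                    roots, 1);
    if (moved == NULL) {
        free(obj);
        return -1;
    }
    free(obj);   /* the old copy is no longer referenced by any root */

    int value;
    memcpy(&value, roots[0].base + roots[0].offset, sizeof value);
    free(moved);
    return value;   /* element 2 of the relocated object */
}
```

With a conventional single-word pointer, an interior pointer would have to be mapped back to its object before it could be adjusted; here the collector only rewrites the base word.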

* Tasklets and kernel threads (NxM threading)

Every function receives a hidden oop argument to chain stack frames together and provide stack and thread-specific data. Tasklet creation and manipulation functions are available.

Tasklets can be declared as having a reserved kernel thread (i.e. no other tasklet runs on that thread). The compiler inserts rescheduling requests into code so that CPU-bound tasklets don't block other tasklets. Otherwise, rescheduling is requested at every "receive" (see below).

Tasklets by default run in a thread pool with at least one thread, and at most the number of physical cores allocated to the application by the system administrator (default all cores) minus the reserved threads.

I/O requests (i.e. blocking system calls) are performed by sending a message to a tasklet that is running in a special I/O thread pool, then waiting to receive a result message from the I/O thread.

The generated code tracks stack usage so that tasklets can be created with a tiny stack object, and larger stack objects can be added as necessary. Large stack objects could be reclaimed if the stack space becomes unused.

* Message passing

Every tasklet has a private mailbox. Tasklets can send messages asynchronously, and wait for messages from other sources with a timeout in milliseconds.

The "send" construct recursively changes the ownership of an oop and places it in the specified tasklet's mailbox.

The "receive" construct loops through messages in the current tasklet's mailbox, evaluating the body until a "break". If there was no "break" and the timeout is nonzero, it waits that many milliseconds for more incoming messages for the body to process. If the timeout expires without reaching a "break", then the message is set to NULL. If there was a "break", the current message is removed from the mailbox. Message order is preserved by the "receive" construct.

  /* Append MY_MSG to tasklet1's mailbox. */
  send(tasklet1, my_msg);

  void *msg;
  receive (msg, 0) /* Only process messages already in our queue (0ms). */
    {
      /* If the message matches, exit the receive clause. */
      if (((MyMsg *)msg)->zot == 123)
        break;
    }

  receive (msg, 1000) break; /* Receive any message within one second. */

  receive (NULL, 1000); /* Wait one second. */

* Software Transactional Memory

Rather than using locks, STM allows the programmer to start a transaction, read and write objects without affecting other tasklets, then atomically commit or roll back the transaction. Again, this is made possible with wide oops and read/write barriers.

Each object can be "owned" by a given tasklet (no other tasklet is allowed to touch it directly: the write barrier prevents it), and if there is no owner, the STM metaobject layer only allows write access from within a transaction.

* Metaprogramming

Everything within an "#ifdef __OCEAN_META__" is evaluated as COLA code at compile time. This is how the Ocean compiler can be modified to extend syntax or semantics.

  #ifdef __OCEAN_META__
  (printf "this is COLA code!\n")
  #endif

End.

_______________________________________________
fonc mailing list
fonc@vpri.org
http://vpri.org/mailman/listinfo/fonc