Thanks for the status report. Some comments that may or may not help...
First of all, let me say that I think C-- did hit the nail on the head.
It was successful in doing something that I think needed to be done.
Speaking as someone who has been wandering around looking at alternatives
for implementing a functional language, the pros and cons of the
various options are:
- GCC: Still quite complicated to work with, still requires you to write
your compiler in C. Implementing a decent type system is going to be
interesting enough in Ocaml or Haskell, I'll pass on doing that in C.
Which means a hybrid compiler, with a lot more complexity. Also,
functional languages are definitely still second class citizens in GCC
world- things like tail call optimization are still not where they need to
be. Which means implementing an optimization layer above GCC to deal with
tail calls. Plus you still have all the run time library issues you need
to deal with- you still need to write a GC, exception handlers, threading,
etc. On the plus side, you do get a lot of fancy optimizations- SSE use,
etc.
- Java/C#: The big advantage of targeting these run times is that with a
little work in the language, you can take advantage of the huge libraries
these runtimes have. This is the big advantage of an F# or Acute. And you
get at least middling decent optimization. But only middling- Haskell's
pure laziness costs about as much performance as Java's virtual machine
architecture, as an approximation.
You don't have to write your own gc or exception handlers, but this is
because you don't have a choice- you get to use what the runtime provides
you. And the ones the runtime provides you are decidedly suboptimal for
functional languages, especially in the GC department. Don Syme of F#
asked the CLR people if they could maybe tune the GC to work better with
programs with high rates of allocation (aka functional programs), and got
told that high rates of allocation were a bug. Plus, I think tail call
optimization is even less advanced in these environments than GCC. So
again you're hitting the functional programming as a second class citizen
problem.
- Write your own back end. This is a boatload of work (as you guys well
know)- approximately as much work as everything else in the compiler put
together. Also note that you still have all the run time issues to deal
with.
- Write your own virtual machine. And just eat the performance. Note
that again you still need to implement your own GC, threading, etc. For
performance, you're aiming at Ruby/PHP/TCL/Bourne shell level of
performance- the level below Java/C#/Haskell. Where functional
programming really shines, I think, is programming in the large- word
processors and CAD/CAM systems etc. It's when you start dealing with
things like maintenance and large scale reuse and multithreading that
functional programming really spreads its wings and flies. And, unlike
scripting/web programming, performance really does matter. Even stepping
down to Java-level performance becomes a problem (see OpenOffice). For
"proof of concept" languages this may not be a problem, but if you ever
want your language to have even a remote possibility of being used by
anyone other than yourself, this isn't acceptable.
- Use C as a back-end. You're writing your own runtime again, tail
recursion is poorly supported again, and a lot of functional programming
constructs don't map well to C.
- Use C--. You still have to implement your runtime, but you're basically
going to have to do that anyways. You get decent optimization, you get to
write your compiler in the language you want to, and functional languages
are first class languages.
Of these options, I think C-- (assuming it's not a dead project) is the
best of the lot. Even if it needs some work (an x86-64 back end, the
ability to move a stack frame from one stack to another), it'll be no more
work than any other option. My second choice would be GCC as a back end,
I think. But the point here is that the fundamental niche C-- fills is
still useful and needed.
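To make the tail call complaint concrete, here is a minimal sketch of the
usual workaround when targeting C: a trampoline. Each function returns a
description of the next call instead of making it, and a driver loop does
the calling, so the C stack stays flat whether or not the C compiler
optimizes tail calls. The names (step, even_step, odd_step, trampoline,
is_even) are all made up for illustration, and this is not how any
particular compiler does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sketch: a "step" is either a finished result or the
   next function to call together with its argument. */
typedef struct step step;
typedef step (*step_fn)(unsigned long n);

struct step {
    step_fn next;       /* NULL means the computation is finished */
    unsigned long arg;  /* argument to pass to next */
    bool result;        /* meaningful only when next == NULL */
};

step even_step(unsigned long n);
step odd_step(unsigned long n);

/* In the source language these two would tail-call each other.
   Here they return the next step instead of calling it. */
step even_step(unsigned long n)
{
    if (n == 0)
        return (step){ NULL, 0, true };
    return (step){ odd_step, n - 1, false };
}

step odd_step(unsigned long n)
{
    if (n == 0)
        return (step){ NULL, 0, false };
    return (step){ even_step, n - 1, false };
}

/* The trampoline: bounce from step to step in a loop, so stack
   depth stays constant no matter how long the chain of calls is. */
bool trampoline(step s)
{
    while (s.next != NULL)
        s = s.next(s.arg);
    return s.result;
}

bool is_even(unsigned long n)
{
    return trampoline(even_step(n));
}
```

Calling is_even(1000000) runs a million "tail calls" in constant stack
space. The price is an indirect call and a struct return per step, which
is the kind of overhead a back end with real tail calls avoids.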
Success in open source projects is mainly a matter of luck. This is
especially true for infrastructure projects, whose success depends upon
some other successful project using them. I've been involved in
uncounted email "discussions" (aka flamewars) along the lines of "what
does this project need to do to be successful", and I've seen all sorts
of proposals and seen pretty much all of them shot down. To my knowledge,
there seem to be two requirements: 1) write code that works, and 2) get
lucky.
Some forms of luck you can make for yourself, however. One of the big
strokes of luck that C-- lacks is a big, well known project that uses it.
To give an example not quite at random, if a functional language luminary
like Simon Peyton Jones were to start a project to design the successor to
Haskell and Ocaml using C-- as the back end, that'd be a huge boost to
C--. Success of the Haskell++ language using C-- would bring visibility,
credibility, and developers to C--.
I'm not sure if he'd be interested in such a project (even if he had
volunteers, like yours truly, to do a lot of lifting). I mention this
because this is effectively why I'm looking at C-- (modulo SPJ's
involvement).
I am less convinced that converting C-- to C is all that important. Maybe
C-- to GCC's back end. But the main benefit that would bring is support
for more architectures. If anything, the architecture realm has gotten
simpler since 1998. The Alpha, PA-RISC, and MIPS architectures are dead,
Itanium is a no-show, and Apple has dropped the PowerPC for the x86. By
the time there is any demand for the Power/PowerPC or Sparc architectures
to be supported, there will be more than enough developer interest to do
so. So the only architectures that really matter are x86-32 and x86-64.
The fancy optimizations (autovectorization, etc.) you'd get from engaging
a C backend would be nice, but I don't think they're necessary. The vast
bulk of the performance boost native code gets is due to the lack of JIT
compiling costs, register allocation, and simple peephole optimization.
Well, and functional programming specific optimizations (uncurrying,
etc.). The "fancy optimizations" give you maybe 10-30% over basic
optimizations. You could easily hit Ocaml-level speeds with C--.
As for the generic run-time library, I think this is fraught with dangers.
The biggest of which is the "you didn't do things exactly the way I wanted
them done in my language, so it's worthless" syndrome. The original
design of C-- had it dead right, IMHO. Things like exception handling and
garbage collection are too heavily impacted by the language design and
goals to have generic implementations. For example, how do you know
what's a pointer and what isn't? An Ocaml-like language may want to
decide to take the low order bit of every integer as a tag bit. A more
Java-like language may decide to use the reflection capabilities of the
language to do GC. Do you allow objects in older generations to hold
pointers to objects in younger generations? A functional language might
say no, an imperative language might say yes. Are destructors common or
uncommon? Try to please everyone, and you'll likely end up with neither
fish nor fowl, or a huge white elephant ("a mouse built to government
specifications")- something that's equally useless to everyone.
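As an illustration of how much the "what's a pointer" question is a
language-level decision, here is a rough sketch in C of the low-order tag
bit scheme mentioned above. It assumes heap pointers are at least 2-byte
aligned, so their low bit is always 0, and that right-shifting a negative
signed integer is an arithmetic shift (implementation-defined in C, but
what mainstream compilers do). All names (value, box_int, unbox_int,
box_pointer, count_pointers) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One machine word: either a tagged integer or a heap pointer. */
typedef uintptr_t value;

/* Low bit 1 means immediate integer, low bit 0 means pointer. */
bool is_int(value v)     { return (v & 1u) != 0; }
bool is_pointer(value v) { return (v & 1u) == 0; }

/* Integers lose one bit of range: shift left and set the tag bit. */
value box_int(intptr_t n)
{
    return ((uintptr_t)n << 1) | 1u;
}

/* Assumes >> on a negative intptr_t is an arithmetic shift. */
intptr_t unbox_int(value v)
{
    return (intptr_t)v >> 1;
}

/* Pointers are stored unchanged; this relies on alignment keeping
   the low bit clear. */
value box_pointer(void *p)
{
    assert(((uintptr_t)p & 1u) == 0);
    return (uintptr_t)p;
}

/* What a collector gets out of this: it can scan a block of fields
   and find the pointers with no type information at all. */
size_t count_pointers(const value *fields, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (is_pointer(fields[i]))
            count++;
    return count;
}
```

A language that wants full-width machine integers, or unboxed floats in
records, cannot use this layout, and a collector written around it is of
no use to that language. That is the problem with a generic runtime in
miniature.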
Were I to implement such a library, I'd be inclined to do it primarily as
the runtime to my language. Which would at least guarantee one user, and
guarantee it fits the needs of one language. But again, the success of
the library would be primarily based on the success of the language built
on top of it.
I think the three new things I'd like to see out of C-- are (in rough
order of priority):
1) x86-64 support
2) the ability to move/copy a stack frame from one stack to another, and
3) Some form of inline assembler without having to go to C (necessary for
writing threading primitives in C--)
I am contemplating just adding those capabilities.
Brian