On Thu, Jun 6, 2013 at 12:47 PM, Colin Fleming
<colin.mailingl...@gmail.com>wrote:

> I'm not sure this is true, Don Syme has written several times about how
> difficult it would be to implement F# on the JVM - I believe tail recursion
> and not being able to define new intrinsic types (i.e. new primitives) are
> the sticking points.


Yes, both of these are issues people have had to confront on the JVM.
On the other hand, tail calls in the CLR went largely unused by C#, which
meant the feature sat unoptimized and was even broken on the 64-bit CLR for
years.

Also, tail recursion is not such a big issue: self-tail recursion is easy to
optimize by compiling it down to a simple loop (Scala does it, Clojure does
it), and the more complicated cases do indeed require trampolines. That's
not to underestimate their importance, tail calls being a useful feature for
modeling state machines, but you can still get by with trampolines (Clojure
solves this elegantly with syntactic sugar, and it's workable in Scala
thanks to its good type system [1]).
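To make the trampoline idea concrete, here is a minimal sketch in plain Java
(the Step/run names are my own, not from any library): instead of recursing
on the stack, each step returns either a finished result or a thunk for the
next step, and a driver loop runs the steps one at a time.

```java
import java.util.function.Supplier;

public class TrampolineDemo {
    // Either a finished result or a deferred next step.
    static final class Step<T> {
        final T result;
        final Supplier<Step<T>> next;
        Step(T result, Supplier<Step<T>> next) { this.result = result; this.next = next; }
        static <T> Step<T> done(T result) { return new Step<>(result, null); }
        static <T> Step<T> more(Supplier<Step<T>> next) { return new Step<>(null, next); }
    }

    // Mutually recursive even/odd written in trampoline style: constant stack depth.
    static Step<Boolean> isEven(int n) { return n == 0 ? Step.done(true)  : Step.more(() -> isOdd(n - 1)); }
    static Step<Boolean> isOdd(int n)  { return n == 0 ? Step.done(false) : Step.more(() -> isEven(n - 1)); }

    // The driver loop that replaces the call stack.
    static <T> T run(Step<T> step) {
        while (step.next != null) step = step.next.get();
        return step.result;
    }

    public static void main(String[] args) {
        // Deep enough that naive mutual recursion would blow the stack.
        System.out.println(run(isEven(1_000_000))); // prints: true
    }
}
```

Clojure's `trampoline` function works on the same principle, with functions
returning functions instead of an explicit Step type.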

On the other hand, the lack of support for new primitives has been hurting
language implementations like JRuby. Note that both features have
experimental implementations as part of OpenJDK, so it's possible we'll see
them in the JVM eventually. Plus, why stop at tail calls? There are many
languages that can hardly be implemented on top of either .NET or the JVM:
the lack of continuation support, useful for Scheme or Smalltalk, is a PITA,
and Haskell wouldn't be possible on either, because of its laziness by
default. You also mentioned primitive types, but the CLR, for example, lacks
a union data type, so implementing something like lazy integers comes with a
lot of overhead.
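To illustrate that overhead, here is a sketch of how a lazy, memoized
integer ends up being encoded on a VM without union types (the LazyInt class
is hypothetical, just for illustration): every value costs a heap-allocated
wrapper around a thunk, instead of being a raw machine word.

```java
import java.util.function.IntSupplier;

public class LazyIntDemo {
    // A lazy, memoized integer: roughly what a lazy language's integer thunks
    // compile down to on the JVM. Each value is a heap object wrapping either
    // an unevaluated computation or its cached result.
    static final class LazyInt {
        private IntSupplier thunk;   // set to null once forced
        private int value;

        LazyInt(IntSupplier thunk) { this.thunk = thunk; }

        int force() {
            if (thunk != null) {     // evaluate at most once
                value = thunk.getAsInt();
                thunk = null;
            }
            return value;
        }
    }

    public static void main(String[] args) {
        LazyInt x = new LazyInt(() -> 21 + 21); // nothing computed yet
        System.out.println(x.force());          // prints: 42
    }
}
```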


> I think a lot of people believe that from a functionality point of view
> the CLR is better than the JVM - as far as I know it's not missing any
> functionality from the JVM and it has significant advantages (reified
> generics as well as the functionality mentioned above).


That's not true. The CLR does have a couple of advantages (like the ones you
mentioned), but that list is shrinking, while the advantages the JVM has are
huge and the list grows with each new release.

Reified generics are actually a PITA for new languages. Scala's generics are
much more flexible, with a much better design than the ones in Java; you
cannot implement Scala's type system on top of .NET's generics without
cutting down on its features. Erasure is not bad per se, and how useful
reified generics are really depends on what language you're talking about.
Erasure has only been bad for Java, because Java's type system is too weak
and because Java expresses variance through wildcards at the use site, on
individual method signatures, rather than at the class level, which has been
a big mistake. Scala's type system, by contrast, is much stronger, much more
static and much more expressive; I never felt the need for reification in
Scala. It even has the option of specializing for primitives, to avoid
boxing/unboxing, as an opt-in optimization. Haskell also implements generics
by erasure.
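Erasure is easy to observe from plain Java: the type parameters exist only
at compile time, so differently-parameterized lists share one runtime class,
and that is exactly the freedom that lets a compiler like scalac map a
richer type system onto ordinary JVM classes.

```java
import java.util.Arrays;
import java.util.List;

public class ErasureDemo {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("a");
        List<Integer> ints = Arrays.asList(1);
        // Under erasure both lists have the same runtime class; the element
        // types were checked by the compiler and then discarded.
        System.out.println(strings.getClass() == ints.getClass()); // prints: true
    }
}
```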

Reified generics are bad for other languages when you need to work around
them; for dynamic languages they're something of a disaster. On the JVM the
bytecode is pretty dynamic: except when you work with primitives (to invoke
operations optimized for primitives), or when you call a method and need to
name the interface it belongs to (something that changed in OpenJDK 7 with
invokedynamic), you don't have static types in the actual bytecode (e.g.
casting an object to something is just a runtime assertion that you can
often do without). But with reified generics, suddenly you have more static
typing to take care of and work around.
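A dynamic-language runtime on the JVM can exploit this by resolving methods
by name at runtime, with no static interface in sight; here is a sketch
using plain reflection, which is essentially what such runtimes did before
invokedynamic made this kind of dispatch fast:

```java
import java.lang.reflect.Method;

public class DynamicCallDemo {
    public static void main(String[] args) throws Exception {
        // The receiver's static type is just Object; the actual method is
        // looked up by name at runtime and invoked, no interface required.
        Object receiver = "hello";
        Method m = receiver.getClass().getMethod("length");
        System.out.println(m.invoke(receiver)); // prints: 5
    }
}
```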

Generating bytecode is also easier on the JVM. The final packages (JARs) are
just zip files containing .class files, plus a text-based manifest. The
.class files themselves contain the debugging symbols, and those symbols are
part of the standard spec, whereas the CLR's debug-symbol format was never
part of the ECMA standard and remained private; for a long time Mono used
its own format for debugging symbols, having to slowly reverse-engineer
whatever Microsoft was doing. Because Java libraries tend to do a lot of
bytecode generation, the tools available for it are amazing (like ASM),
whereas on .NET something like System.Reflection.Emit covers only a subset,
so the Mono people had to come up with their own alternatives (like Cecil).
For parsers you can use mature libraries such as ANTLR, at least for an
initial prototype. Speaking of ANTLR, it does have a C#-generating backend,
but it's a poor and badly maintained port.
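The "JARs are just zip files" point can be demonstrated with nothing but the
JDK's standard library: build a tiny JAR in memory, then read it back with
the generic zip API. (The `Hello.class` entry and its contents are dummies,
just to have something in the archive.)

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class JarIsZipDemo {
    public static void main(String[] args) throws IOException {
        // A JAR is zip entries plus a text-based manifest.
        Manifest mf = new Manifest();
        mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(buf, mf)) {
            jar.putNextEntry(new JarEntry("Hello.class"));
            jar.write(new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE}); // .class magic number
            jar.closeEntry();
        }

        // Read it back with the plain zip API: no JAR-specific code needed.
        ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(buf.toByteArray()));
        for (ZipEntry e; (e = zip.getNextEntry()) != null; )
            System.out.println(e.getName());
        // prints: META-INF/MANIFEST.MF
        //         Hello.class
    }
}
```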

The JVM is awesome in what its optimizations can do at runtime. VMs have a
tendency to be optimized for their primary languages: in C#, methods are
final by default, whereas in Java methods are virtual by default. This
seemingly small and insignificant difference meant that the JVM had to learn
to optimize virtual method calls at runtime, so the JVM can inline such
virtual calls while the CLR cannot. The C# compiler is where most
optimizations happen, whereas more and more optimizations have been moved
into the JVM itself, at runtime; Scala code runs as fast as equivalent Java
code, even though Scala's compiler is much newer and less mature. The JVM
does other runtime optimizations as well, based on actual profiling and
heuristics. It can do escape analysis to eliminate locks, and if it sees
that a short-lived object doesn't escape its scope, it can allocate it on
the stack instead of the heap (compensating for Java's inability to define
new stack-allocated value types). If an optimization isn't performing well,
it has the ability to deoptimize that piece of code and try something else.
invokedynamic from JDK 7 gives you the ability to override the normal
method-dispatch resolution, so you don't need to know a method's interface
when calling it, and such dynamic calls get the same optimizations as normal
virtual method calls. This reduces overhead for languages such as JRuby.
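The virtual-by-default point in a nutshell: in Java no keyword is needed to
make a method overridable, so every instance-method call site is potentially
polymorphic, and the JIT profiles such sites and inlines them when it sees a
single receiver class (the class names here are just illustrative).

```java
public class VirtualDemo {
    static class Animal { String sound() { return "..."; } }            // overridable by default, no "virtual" keyword
    static class Dog extends Animal { @Override String sound() { return "woof"; } }

    public static void main(String[] args) {
        Animal a = new Dog();
        // Dispatch goes through the runtime type (Dog), not the static type
        // (Animal); the JIT can still inline this after profiling the site.
        System.out.println(a.sound()); // prints: woof
    }
}
```

Marking a Java method `final` opts back into the C# default and lets the JIT
skip the profiling, which is why the asymmetry shaped each VM's optimizer.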

The JVM has been used a lot in server-side contexts. This means long-running
processes that generate garbage and yet must maintain near-realtime low
latencies, so a lot of effort has gone into building garbage collectors that
cope with such workloads. As a result, you have the concurrent mark-sweep
collector (CMS), the new G1 from JDK 7, and Azul's Pauseless GC (expensive,
but amazing, or so I hear [2]). Alternative languages (such as Scala, JRuby
or Clojure) tend to generate a lot of short-lived junk, and the JVM's
garbage collectors can cope with it efficiently.
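For the record, on a JDK 7 HotSpot JVM those collectors are selected with
command-line flags like these (`app.jar` is a placeholder for your own
application):

```shell
# Concurrent mark-sweep, tuned for low pause times:
java -XX:+UseConcMarkSweepGC -jar app.jar

# The newer G1 collector, with a target pause time:
java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 -jar app.jar
```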

I can probably think of other stuff, but this email is already too long :-)

[1]
http://blog.richdougherty.com/2009/04/tail-calls-tailrec-and-trampolines.html
[2] http://www.artima.com/lejava/articles/azul_pauseless_gc.html

-- 
Alexandru Nedelcu
https://bionicspirit.com
