Recently I needed to analyze some mail logs: I wanted to find the hosts that were sending mail and how many log lines existed for each host. Thinking this would be a perfect job for natively compiled D code, I first wrote a D app. I then wrote it in Perl and was amazed at how much faster Perl was. I then expanded to Java and Scala. All four implementations used the same input files and produced the same output.

The four input files totalled 430 MB and 1.69 million lines. All four implementations read the input one line at a time and applied the same regex to extract the desired data. The output was 6,857 bytes.
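For readers without the attachments, here is a minimal Java sketch of the approach described above: read each file a line at a time, apply one regex, and count lines per host. It is not the attached RelayHosts.java. The class name `RelayHostCount`, the `relay=host[ip]` log format and the regex are all assumptions made for illustration, since the post does not show the actual pattern. It uses `Map.merge`, so it needs Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RelayHostCount {
    // Hypothetical pattern: assumes Postfix-style "relay=host[ip]" fields.
    // The regex used by the original programs is not given in the post.
    private static final Pattern RELAY = Pattern.compile("relay=([^\\[,\\s]+)");

    /** Increments the count for the host found on this line, if there is one. */
    public static void countLine(String line, Map<String, Integer> counts) {
        Matcher m = RELAY.matcher(line);
        if (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
    }

    public static void main(String[] args) throws IOException {
        // TreeMap keeps hosts sorted so the output order is stable.
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : args) {
            // ISO-8859-1 maps every byte to a character, so odd bytes in a
            // log file cannot abort the run with a decoding error.
            try (BufferedReader in = Files.newBufferedReader(
                    Paths.get(name), StandardCharsets.ISO_8859_1)) {
                String line;
                while ((line = in.readLine()) != null) {
                    countLine(line, counts);
                }
            }
        }
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

It would be run as `java RelayHostCount` followed by the log file names. The pattern is compiled once, outside the loop, which is the usual first thing to check when a regex-per-line program is unexpectedly slow.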

Here are the run results on a 32-bit Linux 3.0.0 system. The absolute numbers are not important, since I ran this on a very old system; it is the relative numbers that matter.

Java (JVM 7 update 1)
real    0m56.465s
user    0m51.911s
sys     0m3.344s

Perl (5.12.4)
real    1m22.256s
user    1m19.773s
sys     0m2.212s

Scala (2.9.1 on JVM 7 update 1)
real    1m41.187s
user    1m36.566s
sys     0m3.892s

D (2.0.56 compiled with -O -release -inline -noboundscheck)
real    4m21.255s
user    4m14.216s
sys     0m5.940s

Java is the fastest, faster even than Perl. The D version, which is the only natively compiled one, is over 4.6 times slower than the Java version, even though the Java timing includes overhead such as JVM startup.
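As a check on the relative numbers, the following small Java sketch converts the `real` times listed above to seconds and divides each by the Java time. The class name `RelativeTimes` and its helper are made up for this illustration; the timings are the ones reported in this post.

```java
import java.util.Locale;

public class RelativeTimes {
    /** Converts a `time`-style reading (minutes, seconds) to seconds. */
    public static double seconds(int minutes, double secs) {
        return minutes * 60 + secs;
    }

    public static void main(String[] args) {
        // "real" wall-clock times from the results above.
        double java  = seconds(0, 56.465);
        double perl  = seconds(1, 22.256);
        double scala = seconds(1, 41.187);
        double d     = seconds(4, 21.255);

        // Each ratio is the run time expressed as a multiple of the Java time.
        System.out.println(String.format(Locale.ROOT, "Perl  %.2fx", perl / java));
        System.out.println(String.format(Locale.ROOT, "Scala %.2fx", scala / java));
        System.out.println(String.format(Locale.ROOT, "D     %.2fx", d / java));
    }
}
```

This gives roughly 1.46x for Perl, 1.79x for Scala and 4.63x for D, relative to Java.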

The source for each of the four implementations is attached. I admit to being very new to D, so perhaps I'm doing something really wrong.

"bearophile" <bearophileh...@lycos.com> wrote in message news:jb6j8h$12js$1...@digitalmars.com...
Jude:

Got any ideas for code that is currently way less than optimal in D?

Compared to Java running on the OracleVM, D is most of the time slower when it comes to heavily garbage-collected code, and often with floating-point-heavy code. Java exceptions (and synchronized methods) are faster than the D-DMD ones. Sometimes textual I/O is faster in Java than in D-DMD-Phobos. Often the JavaVM is able to de-virtualize and inline virtual calls, while D-DMD is not able to do this (maybe LDC2-LLVM3 will be able to do this a bit), making such code faster. The JavaVM is usually able to dynamically unroll loops that have a loop count known only at runtime; this sometimes speeds up loops a lot compared to D-DMD. Some Java libraries implement important data structures or other things that are currently sometimes significantly faster than the equivalent D ones. This is probably not a complete list.

Bye,
bearophile

Attachments: relayhosts.d, RelayHosts.java, relayhosts.pl, RelayHosts.scala
