Hi,

This is my first experiment with Julia, and I wanted to share some results. I have ported the STREAM benchmark (http://www.cs.virginia.edu/stream/) to Julia. The code is available on GitHub (https://github.com/kapiliitr/JuliaBenchmarks/blob/master/streamp.jl).
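For readers who have not used STREAM: the four kernels it times, and the way the "Best Rate MB/s" column is derived from the minimum time, can be sketched in C roughly as follows. This is a minimal illustration, not the code from my repository or the reference stream.c; the function names are made up for this example, and it assumes 8-byte doubles and 1 MB = 10^6 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Copy: c[j] = a[j]  (touches 2 arrays) */
void stream_copy(double *c, const double *a, size_t n) {
    for (size_t j = 0; j < n; j++) c[j] = a[j];
}

/* Scale: b[j] = scalar * c[j]  (touches 2 arrays) */
void stream_scale(double *b, const double *c, double scalar, size_t n) {
    for (size_t j = 0; j < n; j++) b[j] = scalar * c[j];
}

/* Add: c[j] = a[j] + b[j]  (touches 3 arrays) */
void stream_add(double *c, const double *a, const double *b, size_t n) {
    for (size_t j = 0; j < n; j++) c[j] = a[j] + b[j];
}

/* Triad: a[j] = b[j] + scalar * c[j]  (touches 3 arrays) */
void stream_triad(double *a, const double *b, const double *c,
                  double scalar, size_t n) {
    for (size_t j = 0; j < n; j++) a[j] = b[j] + scalar * c[j];
}

/* Best rate in MB/s: bytes moved by one kernel pass divided by the
 * minimum time over all repetitions. Hypothetical helper name. */
double stream_rate_mbs(int arrays_touched, size_t n, double min_seconds) {
    assert(min_seconds > 0.0);
    double bytes = (double)arrays_touched * (double)sizeof(double) * (double)n;
    return 1.0e-6 * bytes / min_seconds;
}
```

With 5000000 elements, Copy moves 2 * 8 * 5000000 = 80000000 bytes per pass, so a minimum time of 0.009353 s corresponds to roughly 8553 MB/s, and 1.861376 s to roughly 43 MB/s.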
I am getting the following performance results in Julia:

Array size = 5000000 (elements), Offset = 0 (elements)
Memory per array = 38.14697265625 MiB (= 0.03725290298461914 GiB)
Total memory required = 114.44091796875 MiB (= 0.11175870895385742 GiB)

Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:                 43.0    1.885108    1.861376    1.908840
Scale:                37.1    2.166505    2.155083    2.177926
Add:                  48.2    2.532873    2.487158    2.578587
Triad:                43.1    2.787225    2.784426    2.790023

I am getting the following performance results in C:

Array size = 5000000 (elements), Offset = 0 (elements)
Memory per array = 38.1 MiB (= 0.0 GiB).
Total memory required = 114.4 MiB (= 0.1 GiB).
Each kernel will be executed 3 times.

Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:               8553.3    0.009360    0.009353    0.009366
Scale:              8248.4    0.009712    0.009699    0.009726
Add:                9490.6    0.012987    0.012644    0.013329
Triad:              9032.0    0.013540    0.013286    0.013793

Following are the results with 4 processors in Julia:

Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:              11122.2    0.007308    0.007193    0.007423
Scale:               465.5    0.217924    0.171840    0.264008
Add:               12481.8    0.009678    0.009614    0.009742
Triad:               471.3    0.267199    0.254624    0.279775

Following are the results with 4 OpenMP threads in C:

Function    Best Rate MB/s    Avg time    Min time    Max time
Copy:              11077.0    0.007228    0.007222    0.007233
Scale:             10552.7    0.007587    0.007581    0.007594
Add:               11986.9    0.010023    0.010011    0.010036
Triad:             12173.0    0.009865    0.009858    0.009872

As can be seen, with one thread/process Julia is far slower than C for all four functions (roughly 200 times by best rate). For the multi-process runs, however, Julia performs similarly to C for the Copy and Add functions, but its performance drops sharply for the Scale and Triad functions, where it is more than 20 times slower than C.

What could be the reason behind this? Could this be a problem in my implementation, or is this just the way Julia is implemented?

Thanks
-- Kapil