So I had a look at the code. The loops are all fine. replicateM_ isn't
a problem, but getDot is decidedly nontrivial: lots of pattern matching
on different vector forms and, to top it off, FFI calls.
With some inlining in the blas library I was able to cut a few seconds
off the running time, but
aeyakovenko:
> I get the same crappy performance with:
>
> $ cat htestdot.hs
> {-# OPTIONS_GHC -O2 -fexcess-precision -funbox-strict-fields
> -fglasgow-exts -fbang-patterns -lcblas#-}
> module Main where
>
> import Data.Vector.Dense.IO
> import Control.Monad
>
> main = do
>   let size = 10
>
I get the same crappy performance with:
$ cat htestdot.hs
{-# OPTIONS_GHC -O2 -fexcess-precision -funbox-strict-fields
-fglasgow-exts -fbang-patterns -lcblas#-}
module Main where
import Data.Vector.Dense.IO
import Control.Monad
main = do
  let size = 10
  let times = 10*1000*1000
  v1::IOVect
On Friday 27 June 2008, Anatoly Yakovenko wrote:
> $ cat htestdot.hs
> {-# OPTIONS_GHC -O2 -fexcess-precision -funbox-strict-fields
> -fglasgow-exts -fbang-patterns -lcblas#-}
> module Main where
>
> import Data.Vector.Dense.IO
> import Control.Monad
>
> main = do
>   let size = 10
>   let times
> I suspect that it is your initialization that is the difference. For
> one thing, you've initialized the arrays to different values, and in
> your C code you've fused what are two separate loops in your Haskell
> code. So you've not only given the C compiler an easier loop to run
> (since you'r
On 19 Jun 2008, at 4:16 am, Anatoly Yakovenko wrote:
> C doesn't work like that :). Functions always get called.
Not true. A C compiler must produce the same *effect* as if
the function had been called, but if by some means the compiler
knows that the function has no effect, it is entitled to
On Wed, Jun 18, 2008 at 06:03:42PM +0100, Jules Bean wrote:
> Anatoly Yakovenko wrote:
> >>>#include <stdlib.h>
> >>>#include <cblas.h>
> >>>
> >>>int main() {
> >>> int size = 1024;
> >>> int ii = 0;
> >>> double* v1 = malloc(sizeof(double) * (size));
> >>> double* v2 = malloc(sizeof(double) * (size));
> >>> fo
On Wed, Jun 18, 2008 at 09:16:24AM -0700, Anatoly Yakovenko wrote:
> >> #include <stdlib.h>
> >> #include <cblas.h>
> >>
> >> int main() {
> >> int size = 1024;
> >> int ii = 0;
> >> double* v1 = malloc(sizeof(double) * (size));
> >> double* v2 = malloc(sizeof(double) * (size));
> >> for(ii = 0; ii < size*s
Anatoly Yakovenko wrote:
#include <stdlib.h>
#include <cblas.h>
int main() {
    int size = 1024;
    int ii = 0;
    double* v1 = malloc(sizeof(double) * (size));
    double* v2 = malloc(sizeof(double) * (size));
    for(ii = 0; ii < size*size; ++ii) {
        double _dd = cblas_ddot(0, v1, size, v2, size);
    }
    free(v1);
On Wed, Jun 18, 2008 at 9:16 AM, Anatoly Yakovenko
<[EMAIL PROTECTED]> wrote:
> C doesn't work like that :)
Yes it can. You would have to check the disassembly to be sure, but C
compilers can, and do, perform dead code elimination.
AGL
--
Adam Langley [EMAIL PROTECTED] http://www.imperialviole
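
To make the dead code elimination point concrete, here is a minimal sketch in plain C. It assumes nothing about the real BLAS build: `dot` is a hypothetical stand-in for cblas_ddot (same argument order: count, vector, stride, vector, stride), and `bench_dot` is a hypothetical harness. The vectors are initialized before use, and each result is accumulated into a `volatile` variable, so the compiler has to treat the stores as observable and cannot drop the loop body the way it may when the result is ignored.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for cblas_ddot: strided dot product of n elements. */
double dot(size_t n, const double *x, size_t incx,
           const double *y, size_t incy)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; ++i)
        acc += x[i * incx] * y[i * incy];
    return acc;
}

/* Hypothetical harness: run `reps` dot products over vectors of length n.
 * Returns the accumulated total, or -1.0 if allocation fails. */
double bench_dot(size_t n, size_t reps)
{
    double *v1 = malloc(n * sizeof *v1);
    double *v2 = malloc(n * sizeof *v2);
    if (v1 == NULL || v2 == NULL) {
        free(v1);
        free(v2);
        return -1.0;
    }

    /* Give the inputs defined values instead of reading raw malloc memory. */
    for (size_t i = 0; i < n; ++i) {
        v1[i] = 1.0;
        v2[i] = 2.0;
    }

    /* Every access to a volatile object counts as observable behaviour,
     * so each iteration's result must actually be computed and stored. */
    volatile double sink = 0.0;
    for (size_t r = 0; r < reps; ++r)
        sink += dot(n, v1, 1, v2, 1);

    double total = sink;
    free(v1);
    free(v2);
    return total;
}
```

Calling `bench_dot(1024, 1024 * 1024)` from `main` and printing the return value would mirror the loop count in the quoted C. Note also that the first argument of cblas_ddot is the element count, so a call that passes 0 there has no elements to multiply, whatever the compiler does with it.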
>> #include <stdlib.h>
>> #include <cblas.h>
>>
>> int main() {
>> int size = 1024;
>> int ii = 0;
>> double* v1 = malloc(sizeof(double) * (size));
>> double* v2 = malloc(sizeof(double) * (size));
>> for(ii = 0; ii < size*size; ++ii) {
>> double _dd = cblas_ddot(0, v1, size, v2, size);
>> }
>> free
On Tue, Jun 17, 2008 at 9:00 PM, Anatoly Yakovenko
<[EMAIL PROTECTED]> wrote:
> here is the C:
>
> #include <stdlib.h>
> #include <cblas.h>
>
> int main() {
> int size = 1024;
> int ii = 0;
> double* v1 = malloc(sizeof(double) * (size));
> double* v2 = malloc(sizeof(double) * (size));
> for(ii = 0; ii < siz
here is the C:
#include <stdlib.h>
#include <cblas.h>
int main() {
    int size = 1024;
    int ii = 0;
    double* v1 = malloc(sizeof(double) * (size));
    double* v2 = malloc(sizeof(double) * (size));
    for(ii = 0; ii < size*size; ++ii) {
        double _dd = cblas_ddot(0, v1, size, v2, size);
    }
    free(v1);
    fr