Re: [Rd] Improvement of [dpq]wilcox functions

2009-07-22 Thread Ivo Ugrina
Ivo Ugrina wrote:
 Hi,
 
 I believe I have significantly improved [dpq]wilcox
 functions by implementing Harding's algorithm:
 Harding, E.F. (1984): An Efficient, Minimal-storage Procedure
 for Calculating the Mann-Whitney U, Generalized U and Similar
 Distributions, App. Statist., 33, 1-6
 
 Results on my computer show (against R-2.9.1):
 
 system.time( dwilcox( 800, 800, 80) )
user  system elapsed
   0.240   0.180   0.443
 system.time( dwilcox( (1:2800), 80, 80) )
user  system elapsed
   0.290   0.020   0.30
 system.time( dwilcox( (1:2800), 160, 160) )
user  system elapsed
   0.810   0.060   0.868
 system.time( dwilcox( (1:28000), 210, 210) )
user  system elapsed
  17.700   0.600  18.313
 RAM: ~ 700MB
 system.time( pwilcox( 21000, 211, 211) )
user  system elapsed
  18.110   0.640  18.762
 system.time( a - qwilcox( 0.43, 200, 400) )
 ^C
 Timing stopped at: 14.39 1.43 18.794
 RAM:  1.4GB at interrupt time
 
 system.time( dwilcox(800, 800, 80) )
user  system elapsed
   0.140   0.000   0.144
 system.time( dwilcox( (1:2800), 80, 80) )
user  system elapsed
   0.010   0.000   0.007
 system.time( dwilcox( (1:2800), 160, 160) )
user  system elapsed
   0.020   0.000   0.016
 system.time( dwilcox( (1:28000), 210, 210) )
user  system elapsed
   0.050   0.000   0.05
 RAM:  1MB
 system.time( pwilcox( 21000, 211, 211) )
user  system elapsed
   0.020   0.000   0.025
 system.time( a - qwilcox( 0.43, 200, 400) )
user  system elapsed
   0.040   0.000   0.07
 RAM:  1MB
 
 There is no more need for
 wilcox_free at [dpq]wilcox in src/library/stats/distn.R
 (every other call after the first one with the same m,n
 will just read the results from the array so it will be
 really fast) and for
 #define WILCOX_MAX 50 in src/nmath/nmath.h
 
 p.s. modified files are in the attachment
 
 have fun,
 

So is this an improvement or not?

-- 
Ivo Ugrina  http://web.math.hr/~iugrina 
Teaching/Research Assistant at Department of Mathematics
University of Zagreb, Croatia

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Improvement of SignRank functions

2007-12-15 Thread Ivo Ugrina
Martin Maechler wrote:
 do you have evidence for your belief?
 i.e. a set of  system.time(.) calls where you see the
 difference?


system.time(dsignrank(17511, 400))
user  system elapsed
   1.010   0.120   1.145
system.time(dsignrank((0:17511), 400))
user  system elapsed
1.250.131.40
system.time(dsignrank((0:17511), 500))
user  system elapsed
   2.040   0.220   2.296
system.time(psignrank((0:17511), 600))
user  system elapsed
  20.670   0.580  21.403
system.time(qsignrank(0.56, 300))
user  system elapsed
   0.700   0.050   0.753
==
system.time(dsignrank(17511, 400))
user  system elapsed
   0.070   0.000   0.078
system.time(dsignrank((0:17511), 400))
user  system elapsed
   0.100   0.000   0.104
system.time(dsignrank((0:17511), 500))
user  system elapsed
   0.160   0.000   0.164
system.time(psignrank((0:17511), 600))
user  system elapsed
  16.330   0.370  16.729
system.time(qsignrank(0.56, 300))
user  system elapsed
   0.020   0.010   0.029



system.time(dsignrank((0:2), 600))
user  system elapsed
   3.470   0.280   3.745
RAM: ~130MB
==
system.time(dsignrank((0:2), 600))
user  system elapsed
   0.250   0.010   0.26
RAM: ~1MB



 BTW: If you had a smart idea to *not* use a static 'w' and still
  be memory efficient,
  that could lead to make that code thread-safe, but I am
  not at all sure this is possible without using
  thread-library C code.

I'll look into it.


With respect,
-- 
Ivo Ugrina
ICQ: 47508335 | www.iugrina.com
---
baza matematickih pojmova
http://baza.iugrina.com
---
anime, manga, Japan fanzin
http://yoshi.iugrina.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Improvement of SignRank functions

2007-12-15 Thread Ivo Ugrina
Hi Martin,

Martin Maechler wrote:
 that's quite convincing; thank you!
 and I can verify part of it on my computer.
:D

 I think I'd just commit your signrank.c
 (with a few cosmetic changes) to the sources, right?
Right!

There is no need for SIGNRANK_MAX in src/nmath/nmath.h anymore.

 Thank you for your contribution!
You're welcome.

With respect,
-- 
Ivo Ugrina
ICQ: 47508335 | www.iugrina.com
---
baza matematickih pojmova
http://baza.iugrina.com
---
anime, manga, Japan fanzin
http://yoshi.iugrina.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Improvement of SignRank functions

2007-12-14 Thread Ivo Ugrina

I took some time and liberty and tried to improve
existing implementation of SignRank functions
in R. (dsignrank, ...)

As I have seen they've been based on csignrank.
So I modified csignrank and, I believe,
improved calculation time and memory efficiency.

The idea is basically the same. I use the same recursion
as original author used with one slight modification.
I am generating Wilcoxon SignRank density from the
beginning (for n=2,3,...) with the help of recursion formula.

What is changed?
There is no need for SINGRANK_MAX in src/nmath/nmath.h anymore.
Only functions:
static void w_free()
void signrank_free()
static void w_init_maybe(int n)
static double csignrank(int k, int n)
in src/nmath/signrank.c have been altered.
There was no change to dsignrank, psignrank, ...

I've tried to make as little changes as possible.
So, to compile with new functions only src/nmath/signrank.c
needs to be changed with the one given in attachment.

I hope it really is an improvement.

With respect,
--
Ivo Ugrina
ICQ: 47508335 | www.iugrina.com
---
baza matematickih pojmova
http://baza.iugrina.com
---
anime, manga, Japan fanzin
http://yoshi.iugrina.com
/*
 *  Mathlib : A C Library of Special Functions
 *  Copyright (C) 1999-2007  The R Development Core Team
 *
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, a copy is available at
 *  http://www.r-project.org/Licenses/
 *
 *  SYNOPSIS
 *
 *#include Rmath.h
 *double dsignrank(double x, double n, int give_log)
 *double psignrank(double x, double n, int lower_tail, int log_p)
 *double qsignrank(double x, double n, int lower_tail, int log_p)
 *double rsignrank(double n)
 *
 *  DESCRIPTION
 *
 *dsignrankThe density of the Wilcoxon Signed Rank distribution.
 *psignrankThe distribution function of the Wilcoxon Signed Rank
 * distribution.
 *qsignrankThe quantile function of the Wilcoxon Signed Rank
 * distribution.
 *rsignrankRandom variates from the Wilcoxon Signed Rank
 * distribution.
 */

#include nmath.h
#include dpq.h

static double *w;
static int allocated_n;

static void
w_free()
{
if( !w) return;

free( (void *) w);
w = 0;
allocated_n = 0;
}

void signrank_free()
{
w_free();
}

static void
w_init_maybe(int n)
{
int u, c;

u = n * (n + 1) / 2;
c = (int) (u / 2);

if( w){
if( n != allocated_n){
w_free();
}
else return;
}

if( !w){
w = (double *) calloc(c + 1, sizeof(double));
#ifdef MATHLIB_STANDALONE
if (!w) MATHLIB_ERROR(%s, _(signrank allocation error));
#endif
allocated_n = n;
}
}

static double
csignrank(int k, int n)
{
int c, u, i, j, end;

#ifndef MATHLIB_STANDALONE
R_CheckUserInterrupt();
#endif

u = n * (n + 1) / 2;
c = (int) (u / 2);

if ((k  0) || (k  u))
return 0;
if (k  c)
k = u - k;

if ( n == 1)
return 1.0;
if ( w[0]==1.0 )
return w[k];

w[0]=w[1]=1.0;
for( j=2; jn+1; ++j){
end=j*(j+1)/2;
end= ( endc )?end:c;
for( i=end; i=j; --i){
w[i] += w[i-j];
}
}

return w[k];
}

double dsignrank(double x, double n, int give_log)
{
double d;

#ifdef IEEE_754
/* NaNs propagated correctly */
if (ISNAN(x) || ISNAN(n)) return(x + n);
#endif
n = floor(n + 0.5);
if (n = 0)
ML_ERR_return_NAN;

if (fabs(x - floor(x + 0.5))  1e-7)
return(R_D__0);
x = floor(x + 0.5);
if ((x  0) || (x  (n * (n + 1) / 2)))
return(R_D__0);

w_init_maybe(n);
d = R_D_exp(log(csignrank(x, n)) - n * M_LN2);

return(d);
}

double psignrank(double x, double n, int lower_tail, int log_p)
{
int i;
double f, p;

#ifdef IEEE_754
if (ISNAN(x) || ISNAN(n))
return(x + n);
#endif
if (!R_FINITE(n)) ML_ERR_return_NAN;
n = floor(n + 0.5);
if (n = 0) ML_ERR_return_NAN;

x = floor(x + 1e-7);
if (x  0.0)
return(R_DT_0);
if (x = n * (n + 1) / 2)
return(R_DT_1);

w_init_maybe(n);
f = exp(- n * M_LN2);
p = 0;
if (x = (n * (n + 1) / 4)) {
for (i = 0; i = x; i++)
p += csignrank(i, n) * f;
}
else {
x = n * (n + 1) / 2 - x;
for (i = 0; i  x; i++)
p += csignrank(i, n) * f;
lower_tail