from:"p . dalgaard"

Re: [Rd] system.time provides inaccurate sys.child (PR#14210)

2010-02-21 Thread p . dalgaard

Henrik Bengtsson wrote:
> FYI,
> 
> you're much more likely to get a response/see actions on this if you
> report issues using the most recent stable version (R v2.10.1) and/or
> even the developers version (R v2.11.0).  Your current version is,
> as you see, more than 2 years old.  It is likely that the threshold to
> compare the code of your version with the latest one etc is too large
> for someone to be bothered.
> 
> /Henrik

It was fixed in r-devel the same day, though. The message threading is just
a bit messed up.

-- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] (PR#14210) incoming/14210 system.time provides inaccurate

2010-02-10 Thread P . Dalgaard

Manuel López-Ibáñez wrote:
> Patch against current trunk attached. It is a one-liner, so I do not
> believe anyone can claim copyright over it.

Fixed for r-devel (r51115).

> Cheers,
>
> Manuel.
>
> BTW, bugs.r-project.org is painfully slow. I cannot login, I cannot post
> messages, I cannot attach files. And it doesn't handle accents in my name.

Well, it will die from other causes at the latest on March 1, anyway...
Hopefully Simon Urbanek can pick up the pieces and put a more modern bug
tracker in its place.

(Part of the reason is that Jitterbug is horribly old and unmaintained;
another part is that U.Cph. appears to be intent on committing IT
suicide in the name of rampant corporativism. It is by design that only
a select group of people can login, though. It is expecting followups by
mail, for some reason.)


Re: [Rd] [R] Suppressing scientific notation on plot axis tick labels (PR#14203)

2010-02-02 Thread P . Dalgaard

murd...@stats.uwo.ca wrote:
> On 02/02/2010 6:20 AM, Dimitri Shvorob wrote:
>> Ruben Roa has kindly suggested using 'scipen' option - cf.
>>
>>> fixed notation will be preferred unless it is more than ‘scipen’ digits
>>> wider.
>> However,
>>
>> options(scipen = 50)
>> x  = c(1e7, 2e7)
>> barplot(x)
>>
>> still does not produce the desired result.
>
> This is strange.  I see what you describe the first time through, but
> if I print the option I get the non-scientific labels on the second plot:
>
> options(scipen = 50)
> x  = c(1e7, 2e7)
> barplot(x)
> options("scipen")
> barplot(x)
>
> Looks like some sort of caching bug to me.  I don't think I'll have time
> to track this down; this is a crazy week.  I see the same thing in
> R-devel as in 2.10.1.

Same thing with, e.g.

x <- c(1e7, 2e7)
options(scipen = 3)
barplot(x)
x
barplot(x)
options(scipen = 0)
barplot(x)
x
barplot(x)
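
For reference, a minimal sketch of what 'scipen' is documented to do, using format() rather than barplot() so that no graphics device is involved. It only illustrates the penalty itself and an explicit-label workaround; it does not reproduce the caching behaviour seen above, and the variable names are made up for the illustration.

```r
# Sketch: how the 'scipen' penalty affects number formatting.
old <- options(scipen = 0)          # default penalty
sci <- format(1e7)                  # scientific form is narrower, so it is used
options(scipen = 50)                # heavily penalise scientific notation
fixed <- format(1e7)                # fixed notation is now preferred
options(old)                        # restore the previous setting

# Workaround that does not depend on 'scipen': format the labels explicitly.
x <- c(1e7, 2e7)
labels <- format(x, scientific = FALSE)

print(c(sci = sci, fixed = fixed))
print(labels)
```

Such labels could presumably be passed to axis(2, at = x, labels = labels) after barplot(x, axes = FALSE), which sidesteps the axis label formatting altogether.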


> Duncan Murdoch
>
> Version:
>   platform = i386-pc-mingw32
>   arch = i386
>   os = mingw32
>   system = i386, mingw32
>   status =
>   major = 2
>   minor = 10.1
>   year = 2009
>   month = 12
>   day = 14
>   svn rev = 50720
>   language = R
>   version.string = R version 2.10.1 (2009-12-14)
>
> Windows XP (build 2600) Service Pack 3
>
> Locale:
> LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY=English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252
>
> Search Path:
>   .GlobalEnv, package:stats, package:graphics, package:grDevices,
> package:utils, package:datasets, package:methods, Autoloads, package:base

Re: [Rd] poisson.test from stats package does not pass the conf.level (PR#14197)

2010-01-27 Thread p . dalgaard

m...@niaid.nih.gov wrote:
> Hi,
> 
> The poisson.test function from stats package does not pass the conf.level
> parameter for the two-sample test. Here is an example:
>
> poisson.test(c(2,4),c(20,14),conf.level=.95)$conf.int
> poisson.test(c(2,4),c(20,14),conf.level=.9)$conf.int
>
>
> Here is the solution, change:
>
> RVAL <- binom.test(x, sum(x), r * T[1]/(r * T[1] + T[2]),
>                    alternative = alternative)
>
> to:
>
> RVAL <- binom.test(x, sum(x), r * T[1]/(r * T[1] + T[2]),
>                    alternative = alternative, conf.level=conf.level)


Now fixed in 2.10.1 patched and R-devel. Thanks.
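
A quick way to check the fix, sketched with the reporter's own numbers: in a version that passes conf.level through, the 90% interval should carry its own level and be narrower than the 95% one. The variable names are only for illustration.

```r
# Sketch: does conf.level reach the two-sample comparison?
ci95 <- poisson.test(c(2, 4), c(20, 14), conf.level = 0.95)$conf.int
ci90 <- poisson.test(c(2, 4), c(20, 14), conf.level = 0.90)$conf.int

# With the fix in place the level is recorded on the interval ...
print(attr(ci90, "conf.level"))
# ... and the 90% interval is narrower than the 95% one.
print(diff(ci90) < diff(ci95))
```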


Re: [Rd] `mgp[1:3]' are of differing sign (PR#14130)

2009-12-12 Thread p . dalgaard

Peter Ehlers wrote:
> 
> cornell.p.gonsch...@iem.fh-friedberg.de wrote:
>> Full_Name: Cornell Gonschior
>> Version: 2.10.0
>> OS: Linux
>> Submission from: (NULL) (212.201.28.40)
>>
>>
>> Hi,
>>
>> in the introduction to R, you can find the following sentence in the
>> par() chapter:
>> "Use tck=0.01 and mgp=c(1,-1.5,0) for internal tick marks."
>> I thought that's nice, because I wanted to have tick marks and tick
>> labels inside and the axis title outside.
>>
>> But:
>>> plot(z, las=1, tck=0.01, mgp=c(1,-1.5,0))
>> Warning messages:
>> 1: In plot.window(...) : `mgp[1:3]' are of differing sign
>> 2: In plot.xy(xy, type, ...) : `mgp[1:3]' are of differing sign
>> 3: In axis(side = side, at = at, labels = labels, ...) :
>>   `mgp[1:3]' are of differing sign
>> 4: In axis(side = side, at = at, labels = labels, ...) :
>>   `mgp[1:3]' are of differing sign
>> 5: In box(...) : `mgp[1:3]' are of differing sign
>> 6: In title(...) : `mgp[1:3]' are of differing sign
>>
>>> par(las=1, tck=0.01, mgp=c(1,-1,0))
>> Warning message:
>> In par(las = 1, tck = 0.01, mgp = c(1, -1, 0)) :
>>   `mgp[1:3]' are of differing sign
>>
>> Was there a recent change? I couldn't find anything useful searching
>> the web.
>>
>> Regards,
>> Cornell
> 
> Well, it's only a warning, making you aware of a possibly
> unintended par setting. Warnings are good things but if you
> don't want to see them, they can be suppressed.
> 
> Certainly not a bug.

Hmm, then again, I tend to agree with Cornell that there are a bit too
many cases where `mgp[1:3]' would sensibly have differing sign, compared
to cases where it is a mistake. In addition to internal tick marks and 
labels, there are also cases where the whole axis is shifted into the 
plot area. I'd more likely use axis(pos=...) for that, but still.


Re: [Rd] Internal error in 'ls' for pathological environments (PR#14035)

2009-11-01 Thread p . dalgaard

macra...@alum.mit.edu wrote:
> nchar(with(list(2),ls())) gives an internal error. This is of course
> a peculiar call (no names in the list), but the error is not caught
> cleanly.
> 
> It is not clear from the documentation whether with(list(2)...) is
> allowable; if it is not, it should presumably give an error. If it is, then
> ls shouldn't have problems with the resulting environment.
>
>> qq <- with(list(2),ls()) # An incorrect call (no names in list)
>> nchar(qq)
> Error in nchar(qq) : 'getEncChar' must be called on a CHARSXP  # ls returned a bad object
>> qq
> [1]Error: 'getEncChar' must be called on a CHARSXP
>> qq[1]
> [1]Error: 'getEncChar' must be called on a CHARSXP
>> qq[2]
> [1] NA
> 
> Apparently related:
> 
>> with(list(a=1,2),ls())
> Error in ls() : 'getEncChar' must be called on a CHARSXP

Thanks, yes, this looks wrong.

Also, closer to the root cause:

 > eval(quote(ls()),list(a=1,2))
Error in ls() : 'getEncChar' must be called on a CHARSXP

 > e <- evalq(environment(),list(2))
 > ls(e)
[1]Error: 'getEncChar' must be called on a CHARSXP

It is not quite clear that it should be allowed to have unnamed elements 
in lists used by eval (which is what with() uses internally). I suppose 
it should, since the intended semantics are unaffected in most cases. 
(I.e., there is nothing really wrong with eval(quote(a+b), 
list(a=1,b=2,3,4)), and people may have been using such code unwittingly 
all over.)

However, it IS a bug that we are creating ill-formed environments. The 
culprit seems to be that NewEnvironment (memory.c) is getting called in 
violation of its assumption that

" This definition allows
   the namelist argument to be shorter than the valuelist; in this
   case the remaining values must be named already.  (This is useful
   in cases where the entire valuelist is already named--namelist can
   then be R_NilValue.)
"

Removing the assumption from NewEnvironment looks like an efficiency 
sink, so I would suggest that we fix do_eval (eval.c) instead, 
effectively doing l <- l[names(l) != ""].  So in

 case LISTSXP:
     env = NewEnvironment(R_NilValue, duplicate(CADR(args)), encl);
     PROTECT(env);
     break;

we need to replace the duplicate() call with something that skips the 
unnamed elements.

The only side effect I can see from such a change is that the resulting
environments get a different length() than before. I'd say that if there
are coders who actually rely on the length of an invalid environment,
then they'd deserve what they'd get...
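
At the R level, the effect of skipping unnamed elements can be sketched as follows. The helper drop_unnamed() is hypothetical (it is not part of R); it mirrors l <- l[names(l) != ""] and additionally handles the case where the list has no names at all. The actual fix would of course live in the C code of do_eval.

```r
# Hypothetical helper: keep only the named elements of a list.
drop_unnamed <- function(l) {
  nms <- names(l)
  if (is.null(nms)) return(l[0])      # no names at all: nothing to keep
  l[!is.na(nms) & nms != ""]
}

# The intended semantics are unaffected: a and b are still found.
sum_ab <- eval(quote(a + b), drop_unnamed(list(a = 1, b = 2, 3, 4)))

# Only the named element becomes a variable in the evaluation frame.
visible <- eval(quote(ls()), drop_unnamed(list(a = 1, 2)))

# A list without any names yields an empty frame.
empty <- eval(quote(ls()), drop_unnamed(list(2)))

print(sum_ab)
print(visible)
print(empty)
```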


Re: [Rd] Inverting a square... (PR#13762)

2009-06-18 Thread P . Dalgaard

Refiling this.  The actual fix was slightly more complicated. Will soon
be committed to R-Patched (aka 2.9.1 beta).

-p

rvarad...@jhmi.edu wrote:
> Full_Name: Ravi Varadhan
> Version: 2.8.1
> OS: Windows
> Submission from: (NULL) (162.129.251.19)
>
>
> Inverting a matrix with solve(), but using LAPACK=TRUE, gives erroneous
> results:

Thanks, but there seems to be a much easier fix.

Inside coef.qr, we have

coef[qr$pivot, ] <-
    .Call("qr_coef_real", qr, y, PACKAGE = "base")[seq_len(p)]

which should be [seq_len(p),]

(otherwise, in the matrix case, the RHS will recycle only the 1st p
elements, i.e., the 1st column).
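
The difference between the two subscripts can be seen on a toy matrix. This is only a sketch: 'full' stands in for the value returned by the .Call(), and the pivot order is made up for the illustration.

```r
# Sketch: vector vs. matrix indexing of a (p+1) x ny result.
p <- 3; ny <- 2
full <- matrix(c(1, 2, 3, 99,
                 4, 5, 6, 99), nrow = p + 1, ncol = ny)  # stand-in for the .Call() value
pivot <- c(2L, 1L, 3L)                                    # illustrative pivot order

wrong <- matrix(NA_real_, p, ny)
wrong[pivot, ] <- full[seq_len(p)]     # first p elements only, recycled over the columns

right <- matrix(NA_real_, p, ny)
right[pivot, ] <- full[seq_len(p), ]   # first p rows of every column

print(wrong)   # both columns are copies of column 1
print(right)   # column 2 holds its own values
```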

>
> Here is an example:
>
>   hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
>   h5 <- hilbert(5)
>   hinv1 <- solve(qr(h5))
>   hinv2 <- solve(qr(h5, LAPACK=TRUE))
>   all.equal(hinv1, hinv2)  # They are not equal
>
> Here is a function that I wrote to correct this problem:
>
>   solve.lapack <- function(A, LAPACK=TRUE, tol=1.e-07) {
>     # A function to invert a matrix using "LAPACK" or "LINPACK"
>     if (nrow(A) != ncol(A)) stop("Matrix must be square")
>     qrA <- qr(A, LAPACK=LAPACK, tol=tol)
>     if (LAPACK) {
>       apply(diag(1, ncol(A)), 2, function(x) solve(qrA, x))
>     } else  solve(qrA)
>   }
>
>   hinv3 <- solve.lapack(h5)
>   all.equal(hinv1, hinv3)  # Now, they are equal

Re: [Rd] Inverting a square... (PR#13765)

2009-06-18 Thread P . Dalgaard

Drats! Jitterbug is up to its old PR renaming trick again... This
should have been a followup to PR#13762. Will refile.

p.dalga...@biostat.ku.dk wrote:
> rvarad...@jhmi.edu wrote:
>> Full_Name: Ravi Varadhan
>> Version: 2.8.1
>> OS: Windows
>> Submission from: (NULL) (162.129.251.19)
>>
>>
>> Inverting a matrix with solve(), but using LAPACK=TRUE, gives erroneous
>> results:
>
> Thanks, but there seems to be a much easier fix.
.



Re: [Rd] Inverting a square matrix using solve() with LAPACK=TRUE (PR#13765)

2009-06-18 Thread p . dalgaard

rvarad...@jhmi.edu wrote:
> Full_Name: Ravi Varadhan
> Version: 2.8.1
> OS: Windows
> Submission from: (NULL) (162.129.251.19)
> 
> 
> Inverting a matrix with solve(), but using LAPACK=TRUE, gives erroneous
> results:

Thanks, but there seems to be a much easier fix.

Inside coef.qr, we have

coef[qr$pivot, ] <-
.Call("qr_coef_real", qr, y, PACKAGE = "base")[seq_len(p)]

which should be [seq_len(p),]

(otherwise, in the matrix case, the RHS will recycle only the 1st p 
elements, i.e., the 1st column).

> 
> Here is an example:
> 
>  hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
>   h5 <- hilbert(5)
>   hinv1 <- solve(qr(h5))
>   hinv2 <- solve(qr(h5, LAPACK=TRUE)) 
>   all.equal(hinv1, hinv2)  # They are not equal
> 
> Here is a function that I wrote to correct this problem:
> 
>   solve.lapack <- function(A, LAPACK=TRUE, tol=1.e-07) {
>     # A function to invert a matrix using "LAPACK" or "LINPACK"
>     if (nrow(A) != ncol(A)) stop("Matrix must be square")
>     qrA <- qr(A, LAPACK=LAPACK, tol=tol)
>     if (LAPACK) {
>       apply(diag(1, ncol(A)), 2, function(x) solve(qrA, x))
>     } else  solve(qrA)
>   }
> 
> hinv3 <- solve.lapack(h5)
>   all.equal(hinv1, hinv3)  # Now, they are equal
> 

Re: [Rd] R build fails during make when configured with "--with-x=no" (PR#13666)

2009-04-20 Thread P . Dalgaard

j...@ku.edu wrote:

>
> If R is configured using the "--with-x=no" option, then the make fails with the
> following error:

> make[1]: *** [R] Error 1
> make[1]: Leaving directory `/home/jeet/Scratch/r-build/on-frontend/R-2.9.0/src'
> make: *** [R] Error 1
>
> The problem appears to be with the "src/modules/Makefile". Specifically, lines
> 26-29:
>
>   @for d in "$(R_MODULES)"; do \
>     (cd $${d} && $(MAKE) $@) || exit 1; \
>   done
>
> Here, R_MODULES is blank, resulting in the "cd" command transferring to the
> user's home directory, where, of course, no Makefile is found resulting in the
> error above.

(Even more "fun" would ensue if in fact there were a Makefile there...)


> Work-around appears to be to simply disable loop if R_MODULES is empty.

Shell script and Make portability is a pain in the derriere, but
offhand, those double quotes just look wrong:

viggo:~/>for i in "" ; do echo $i; done

viggo:~/>for i in  ; do echo $i; done
viggo:~/>for i in "foo bar" ; do echo $i; done
foo bar
viggo:~/>for i in foo bar ; do echo $i; done
foo
bar

Notice that the versions with quotes invariably do the Wrong Thing.



Re: [Rd] type.convert (PR#13646)

2009-04-10 Thread p . dalgaard

William Dunlap wrote:
> You may have to use
>   (unsigned int)(unsigned char)*s++
> instead of just
>   (unsigned int)*s++
> to avoid the sign extension.

Thanks again,

I probably won't be doing the change since I don't have a Windows build 
environment around, and I'm a bit superstitious about fixing bugs that I 
cannot see...

Let me just filter this information into the bug repository for now.
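
For the record, the sign extension in question can be modelled with plain integer arithmetic. This is only an arithmetic sketch in R, not the C code itself, and it assumes an 8-bit char, a 32-bit unsigned int and two's complement representation.

```r
# Sketch: the section sign "\247" (byte 0xa7) as unsigned vs. signed 8-bit value.
byte <- as.integer(as.raw(0xa7))                      # 167 when read as unsigned char
as_signed <- if (byte > 127L) byte - 256L else byte   # -89 when plain char is signed

# Converting the signed value directly to a 32-bit unsigned int sign-extends it;
# going through unsigned char first keeps it within 0..255.
direct    <- as_signed %% 2^32    # models (unsigned int)*s
via_uchar <- as_signed %% 256L    # models (unsigned int)(unsigned char)*s

print(c(byte = byte, signed = as_signed))
print(format(direct, scientific = FALSE))
print(via_uchar)
```

The point is that only the second form yields a value in the range that isspace() is defined for.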

-pd

> 
> Bill Dunlap
> TIBCO Software Inc - Spotfire Division
> wdunlap tibco.com  
> 
>> -Original Message-
>> From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk] 
>> Sent: Friday, April 10, 2009 1:41 PM
>> To: William Dunlap
>> Cc: r-devel@r-project.org
>> Subject: Re: [Rd] type.convert (PR#13646)
>>
>> William Dunlap wrote:
>>> I can reproduce the difference that Stefan saw, depending
>>> on whether or not I start Rgui with the flags
>>> --no-environ --no-Rconsole
>>> I think it boils down to the isBlankString() function.
>>> For the string "\247" it returns 1 when those flags are
>>> not present and 0 when they are.  isBlankString does use
>>> some locale-specific functions:
>>> Rboolean isBlankString(const char *s)
>>> {
>>> #ifdef SUPPORT_MBCS
>>>     if(mbcslocale) {
>>>         wchar_t wc; int used; mbstate_t mb_st;
>>>         mbs_init(&mb_st);
>>>         while( (used = Mbrtowc(&wc, s, MB_CUR_MAX, &mb_st)) ) {
>>>             if(!iswspace(wc)) return FALSE;
>>>             s += used;
>>>         }
>>>     } else
>>> #endif
>>>         while (*s)
>>>             if (!isspace((int)*s++)) return FALSE;
>>>     return TRUE;
>>> }
>>>
>>> I was using R 2.8.1, downloaded precompiled from CRAN, on Windows
>>> XP SP3. The outputs of sessionInfo() and Sys.getenv() are the same
>>> in both sessions.  'Process Explorer' shows that the 2 sessions
>>> have the same dll's opened.
>> Thanks for that analysis Bill!
>>
>> Stefan was in "German_Austria.1252" which I don't think is multibyte, so
>> only the else-clause should be relevant, pointing the finger rather
>> squarely at isspace(). Googling indicates that others have been caught
>> out by signed/unsigned char issues there. Should this possibly rather read
>>
>> if (!isspace((unsigned int)*s++)) return FALSE;
>>
>> ??
>>
>>>> sessionInfo()
>>> R version 2.8.1 (2008-12-22)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>>> attached base packages:
>>> [1] stats graphics  grDevices utils datasets  methods   base
>>> I did the test with a dll compiled from
>>> #include 
>>> #include 
>>>
>>> void test_isBlankString(char **s, int *res)
>>> {
>>>*res = isBlankString(*s) ;
>>> }
>>>
>>> and called by .C("test_isBlankString","\247",-1L)
>>>
>>> I don't see the difference while running a version of 2.9.0(devel)
>>> compiled locally on 11 March 2009 (from svn rev 48116).
>>>
>>> Bill Dunlap
>>> TIBCO Software Inc - Spotfire Division
>>> wdunlap tibco.com  
>>>
>>>> -Original Message-
>>>> From: r-devel-boun...@r-project.org
>>>> [mailto:r-devel-boun...@r-project.org] On Behalf Of Peter Dalgaard
>>>> Sent: Friday, April 10, 2009 2:03 AM
>>>> To: Raberger, Stefan
>>>> Cc: r-b...@r-project.org; r-de...@stat.math.ethz.ch
>>>> Subject: Re: [Rd] type.convert (PR#13646)
>>>>
>>>> Raberger, Stefan wrote:
>>>>> Hi Peter,
>>>>>
>>>>> each of the four PCs actually has the same locale setting:
>>>>>
>>>>>> Sys.setlocale("LC_CTYPE")
>>>>> [1] "German_Austria.1252"
>>>>>
>>>>> (all the other settings returned by invoking Sys.getlocale() are
>>>>> identical as well).
>>>>> Just to be sure (because it's displayed incorrectly in my browser on
>>>>> the bugtracking page): the character inside the type.convert function
>>>>> ought to be a "section"-sign (HTML Code § or § , in R "\247", and not
>>>>> a dot ".").
>>>>
>>>> I saw it correctly. It's "\302\247" in UTF8 locales, which is of course
>>>> the reason I suspected locale settings, but I can't seem to trigger the
>>>> NA behaviour.
>>>>
>>>> I'm at a loss here, but some ideas:
>>>>
>>>> In the cases where it returns NA, what type is it? (I.e.
>>>> storage.mode(type.convert()))
>>>>
>>>> What do you get from
>>>>
>>>>  > charToRaw("§")
>>>> [1] c2 a7
>>>>
>>>> (a7, presumably, but better check).
>>>>
>>>> -p
>>>>
>>>>> -Original Message-
>>>>> From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk]
>>>>> Sent: Thursday, 09 April 2009 19:26
>>>>> To: Raberger, Stefan
>>>>> Cc: r-de...@stat.math.ethz.ch; r-b...@r-project.org
>>>>> Subject: Re: [Rd] type.convert (PR#13646)
>>>>>
>>>>> s.raber...@innovest.at wrote:
>>>>>> Full_Name: Stefan Raberger
>>>>>> Version: 2.8.1
>>>>>> OS: Windows XP
>>>>>> Submission from: (NULL) (213.185.163.242)
>>>>>>
>>>>>>
>>>>>> Hi there,
>>>>>>
>>>>>> I recently noticed some strange behaviour of the command "type.convert",
>>>>>>

Re: [Rd] [R] RNG Cycle and Duplication (PR#12537)

2008-08-14 Thread p . dalgaard

Shengqiao Li wrote:
> Hello all,
>
> I am generating large samples of random numbers. The RNG help page 
> says: "All the supplied uniform generators return 32-bit integer 
> values that are converted to doubles, so they take at most 2^32 
> distinct values and long runs will return duplicated values." But I 
> find that the cycles are not the same as the 32-bit integer.
>
> My test indicated that the cycles for Knuth's methods were 2^30 while 
> Wichmann-Hill's cycle was larger than 2^32! No numbers were duplicated 
> in 10M numbers generated by runif using Wichmann-Hill. The other three 
> methods had cycle length of 2^32.
>
> So, anybody can explain this? And any improvement to the 
> implementation can be made to increase the cycle length like the 
> Wichmann-Hill method?
>
What test? These are not simple linear congruential generators. Just 
because you get the same value twice, it doesn't mean that the sequence 
is repeating. Perhaps you should read the entire help page rather than 
just the note.
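
To illustrate the distinction, here is a small sketch, assuming the 32-bit resolution stated on the help page. With about 2^32 possible values, a sample of a million draws is expected to contain on the order of a hundred repeated values (the usual birthday-problem estimate), yet the stream does not continue identically after a repeat, so the generator has not cycled. The seed and the variable names are arbitrary.

```r
# Sketch: repeated values are not the same thing as a repeating sequence.
set.seed(42, kind = "Mersenne-Twister")
n <- 1e6
x <- runif(n)

dup_idx  <- which(duplicated(x))      # positions whose value occurred earlier
expected <- n^2 / (2 * 2^32)          # birthday-problem estimate of the count

# Take one repeated value and compare what follows its two occurrences.
j <- dup_idx[dup_idx < n][1]          # later occurrence
i <- match(x[j], x)                   # first occurrence of the same value
continues_identically <- x[i + 1] == x[j + 1]

print(c(observed = length(dup_idx), expected = round(expected)))
print(continues_identically)
```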


Re: [Rd] Bug in R 2.7 for over long lines (crasher+proposed fix!) (PR#11284)

2008-04-26 Thread p . dalgaard

[EMAIL PROTECTED] wrote:
> OK, I am just sending it here too as it looks like [EMAIL PROTECTED]
> is not the right place:
>
I think it was seen there too, just that no one got around to replying. In
R-bugs, there's a filing system so that it won't be completely forgotten...

However, your mail seems to have gotten encoded in quoted-printable; you
might want to follow up with a cleaned version. (Just keep the
(PR#11281) in the header).
> On Fri, 2008-04-25 at 08:48 +0200, Soeren Sonnenburg wrote:
>
>> While trying to fix swig & R2.7 I actually discovered that there is a
>> bug in R 2.7 causing a crash (so R & swig might actually work):
>>
>> the bug is in ./src/main/gram.c  line 3038:
>>
>> } else { /* over-long line */
>> fixthis --> char *LongLine = (char *) malloc(nc);
>>     if(!LongLine)
>>         error(_("unable to allocate space for source line %d"), xxlineno);
>>     strncpy(LongLine, (char *)p0, nc);
>>  bug -->    LongLine[nc] = '\0';
>>     SET_STRING_ELT(source, lines++,
>>                    mkChar2((char *)LongLine));
>>     free(LongLine);
>>
>> note that LongLine is only nc chars long, so the LongLine[nc]='\0' might
>> be an out of bounds write. the fix would be to do
>>
>> char *LongLine = (char *) malloc(nc+1);
>>
>> in line 3034
>>
>> Please fix and thanks to dirk for the debian r-base-dbg package!
>
> Looking at the code again there seems to be another bug above this for
> the MAXLINESIZE test too:
>
> if (*p == '\n' || p == end - 1) {
>     nc = p - p0;
>     if (*p != '\n')
>         nc++;
>     if (nc <= MAXLINESIZE) {
>         strncpy((char *)SourceLine, (char *)p0, nc);
> bug2 -->    SourceLine[nc] = '\0';
>         SET_STRING_ELT(source, lines++,
>                        mkChar2((char *)SourceLine));
>     } else { /* over-long line */
>         char *LongLine = (char *) malloc(nc+1);
>         if(!LongLine)
>             error(_("unable to allocate space for source line %d"), xxlineno);
> bug1 -->    strncpy(LongLine, (char *)p0, nc);
>         LongLine[nc] = '\0';
>         SET_STRING_ELT(source, lines++,
>                        mkChar2((char *)LongLine));
>         free(LongLine);
>     }
>     p0 = p + 1;
> }
>
>
> So I guess the test would be for nc < MAXLINESIZE above or to change
> SourceLine to have MAXLINESIZE+1 size.
>
> Alternatively as the strncpy manpage suggests do this for all
> occurrences of strncpy
>
>    strncpy(buf, str, n);
>    if (n > 0)
>        buf[n - 1]= '\0';
>
> this could even be made a macro / helper function ...
>
> And another update: This does fix the R+swig crasher for me (tested)!
>
> Soeren

Re: [Rd] bad variable names when printing a data frame containing (PR#10732)

2008-02-09 Thread p . dalgaard

[EMAIL PROTECTED] wrote:
> library(glmpath)
> data(heart.data)
> # heart.data is a list, $y a vector, $x a matrix
> data <- data.frame(x=I(heart.data$x), y = heart.data$y)
>
>> data[1:2,]
>
>   x.1  x.2  x.3   x.4 x.5 x.6   x.7  x.8 x.9 y
> 1 160   12 5.73 23.11   1  49  25.3 97.2  52 1
> 2 144 0.01 4.41 28.61   0  55 28.87 2.06  63 1
>
>> dimnames(heart.data$x)[[2]]
>
> [1] "sbp"       "tobacco"   "ldl"       "adiposity" "famhist"   "typea"
> [7] "obesity"   "alcohol"   "age"
>
> Note that the printed variable names do not use the column names
> of the matrix.
>
> In contrast, in S-PLUS the names are used; the printout begins:
>    x.sbp x.tobacco  x.ldl x.adiposity x.famhist x.typea x.obesity x.alcohol
>
The reason seems to be that format.AsIs is losing dimnames. That could
be easily fixed -- unless I'm overlooking something?
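
A self-contained way to look at this without the glmpath data is sketched below; the small matrix is made up for the illustration. The column names are still present on the stored matrix, so it is the formatting step that decides whether they show up in print; what dimnames(fmt) returns will depend on the R version at hand.

```r
# Sketch: an AsIs matrix inside a data frame keeps its column names.
m <- matrix(c(160, 144, 12, 0.01), nrow = 2,
            dimnames = list(NULL, c("sbp", "tobacco")))
d <- data.frame(x = I(m), y = c(1, 1))

print(ncol(d))          # the matrix is kept as a single column 'x'
print(colnames(d$x))    # names are still there on the stored object

fmt <- format(d$x)      # dispatches to the AsIs method used when printing
print(dimnames(fmt))    # shows whether formatting preserved them
```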


Re: [Rd] xspline(..., draw=FALSE) fails if there is no open device (PR#10728)

2008-02-08 Thread p . dalgaard

[EMAIL PROTECTED] wrote:
> Full_Name: Jari Oksanen
> Version:  2.6.2 RC (2008-02-07 r44369)
> OS: Linux
> Submission from: (NULL) (130.231.102.145)
>
>
> Even if function xspline() is called with argument draw=FALSE, it requires a
> graphics device (that it won't use since it was draw=FALSE). I run into this
> because I intended to use xspline within a function (that does not yet draw:
> there is plot method for that), and the function failed when called in a virgin
> environment.
>
> Here is an example in a virgin environment just after starting R:
>
>> out <- xspline(c(0,1,0), c(1,0,1), draw=FALSE)
> Error in xspline(c(0, 1, 0), c(1, 0, 1), draw = FALSE) :
>   plot.new has not been called yet
>> str(out)
> Error in str(out) : object "out" not found
>
> This works:
>
>> plot(0)
>> out <- xspline(c(0,1,0), c(1,0,1), draw=FALSE)
>> str(out)
> List of 2
>  $ x: num [1:3] 0 1 0
>  $ y: num [1:3] 1 0 1
>
> This won't:
>
>> dev.off()
> null device
>   1
>> xspline(c(0,1,0), c(1,0,1), draw=FALSE)
> Error in xspline(c(0, 1, 0), c(1, 0, 1), draw = FALSE) :
>   plot.new has not been called yet
>
> R graphics internals are black magic to me. However, it seems that the error
> message comes from function GCheckState(DevDesc *dd) in graphics.c, which is
> called by do_xspline(SEXP call, SEXP op, SEXP args, SEXP env) in plot.c even
> when xspline was called with draw = FALSE (and even before getting the argument
> draw into do_xspline). It seems that graphics device is needed somewhere even
> with draw = FALSE, since moving the GCheckState() test after finding the value
> draw, and executing the test only if draw=TRUE gave NaN as the numeric output.
>
> If this is documented behaviour, the documentation escaped my attention and beg
> for pardon. It may be useful to add a comment on the help page saying that an
> open graphics device is needed even when unused with draw=FALSE.
>

I think the reason is that 2d splines are aspect ratio dependent.
There's this loop inside,

for (i = 0; i < nx; i++) {
    xx[i] = x[i];
    yy[i] = y[i];
    GConvert(&(xx[i]), &(yy[i]), USER, DEVICE, dd);
}

and that will not work without knowing how to convert to device
coordinates. The default for "border" may get you first, though.  That
seems to be documented incorrectly, by the way.
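
Inside a function, one way to satisfy the coordinate conversion without showing anything on screen might be to open a throwaway device first. The following is only a sketch, assuming that a PDF device with file = NULL (which writes no file) is acceptable for the purpose; the result then refers to that device's geometry.

```r
# Sketch: compute the spline outline without a visible plot.
pdf(NULL)                 # null PDF device: nothing is written to disk
plot.new()                # sets up the coordinate system xspline() needs
out <- xspline(c(0, 1, 0), c(1, 0, 1), draw = FALSE)
invisible(dev.off())      # close the throwaway device again

str(out)
```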

-p

> Cheers, Jari Oksanen
>
>  platform = i686-pc-linux-gnu
>  arch = i686
>  os = linux-gnu
>  system = i686, linux-gnu
>  status = RC
>  major = 2
>  minor = 6.2
>  year = 2008
>  month = 02
>  day = 07
>  svn rev = 44369
>  language = R
>  version.string = R version 2.6.2 RC (2008-02-07 r44369)
>
> Locale:
> LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=en_GB.UTF-8;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C
>
> Search Path:
>  .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils,
> package:datasets, package:methods, Autoloads, package:base

Re: [Rd] range( , na.rm = TRUE ) (PR#10508)

2007-12-11 Thread P . Dalgaard

Prof Brian Ripley wrote:
> I don't think that is the right fix.  All methods for min/max should
> now support na.rm=TRUE (but not finite=TRUE), so range.default should
> just call min and max with that argument.
>
> I'd need to verify those 'should's first 
>
OK, that's why I didn't touch the sources
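
For illustration, the suggestion of passing na.rm on to min and max can be sketched at user level as below. The helper range_via_minmax() is hypothetical and is not the change made to range.default; the comparison with range() simply shows what one would expect from a version in which the bug is fixed.

```r
# Sketch: let min() and max() handle the missing values themselves.
d <- as.Date(c("2007-11-06", NA))

# Hypothetical helper following the suggestion to pass na.rm on to min/max.
range_via_minmax <- function(x, na.rm = FALSE) {
  c(min(x, na.rm = na.rm), max(x, na.rm = na.rm))
}

r1 <- range_via_minmax(d, na.rm = TRUE)
r2 <- range(d, na.rm = TRUE)      # expected to agree in a fixed version

print(r1)
print(r2)
```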

-p

> Brian
>
> On Tue, 11 Dec 2007, Peter Dalgaard wrote:
>
>> (Drats! Jitterbug is playing tricks with the PR# again. Attempting to
>> refile so that we can kill PR#10509)
>>
>> Peter Dalgaard wrote:
>>> [EMAIL PROTECTED] wrote:
>>>
>>>> --- Start of forwarded message ---
>>>> Date: Tue, 13 Nov 2007 21:44:57 +0100
>>>> To: Steve Mongin <[EMAIL PROTECTED]>
>>>> Cc: [EMAIL PROTECTED]
>>>> Subject: Re: range( , na.rm = TRUE )
>>>> In-Reply-To: <[EMAIL PROTECTED]>
>>>> Reply-To: [EMAIL PROTECTED]
>>>> From: Kurt Hornik <[EMAIL PROTECTED]>
>>>> X-AntiVirus: checked by AntiVir MailGate (version: 2.1.3-2; AVE:
>>>> 7.6.0.34; VDF: 7.0.0.210; host: fsme.wu-wien.ac.at)
>>>> X-Virus-Scanned: ClamAV 0.90.3/4768/Tue Nov 13 18:25:08 2007 on
>>>> pocken.wu-wien.ac.at
>>>> X-Virus-Status: Clean
>>>>
>>>>> Steve Mongin writes:
>>>>>
>>>>> Dear CRAN:
>>>>> I am running 'R' on Linux as follows:
>>>>>
>>>>>> version
>>>>>                _
>>>>>   platform       i686-redhat-linux-gnu
>>>>>   arch           i686
>>>>>   os             linux-gnu
>>>>>   system         i686, linux-gnu
>>>>>   status
>>>>>   major          2
>>>>>   minor          6.0
>>>>>   year           2007
>>>>>   month          10
>>>>>   day            03
>>>>>   svn rev        43063
>>>>>   language       R
>>>>>   version.string R version 2.6.0 (2007-10-03)
>>>>>
>>>>> I have a question about the behavior of "range()" with missing dates.
>>>>>
>>>>> With the previous version (2.4?) , the command:
>>>>>
>>>>>> range( as.Date( c( "2007-11-06", NA ) ), na.rm = TRUE )
>>>>>
>>>>> yielded:
>>>>>
>>>>>> [1] "2007-11-06" "2007-11-06"
>>>>>
>>>>> Now I get:
>>>>>
>>>>>> [1] NA NA
>>>>>
>>>>> Is this a bug?
>>>>>
>>>>> Yes, I see in the "What's New" page:
>>>>>
>>>>>   "The Math2 and Summary groups (round, signif, all, any, max, min,
>>>>>    sum, prod, range) are now primitive."
>>>>>
>>>>> Is the "primitive" characteristic supposed to behave as above with
>>>>> missing dates?
>>>>>
>>>>> Thanks for any help that you can provide.
>>>>
>>>> This is really a question for r-devel or r-bugs, I think, but not for
>>>> the CRAN maintainers.
>>>>
>>>> I would think it is a bug.  Perhaps simply file a bug report?
>>>
>>> Again? ;-)
>>>
>>> The bug is here:
>>>
>>>> range.default
>>> function (..., na.rm = FALSE, finite = FALSE)
>>> {
>>>     x <- c(..., recursive = TRUE)
>>>     if (is.numeric(x)) {
>>>         if (finite)
>>>             x <- x[is.finite(x)]
>>>         else if (na.rm)
>>>             x <- x[!is.na(x)]
>>>     }
>>>     c(min(x), max(x))
>>> }
>>>
>>> Objects of class Date are not considered numeric, so we end up taking
>>> min and max without removing NA.
>>>
>>> One solution could be
>>>
>>> if (is.numeric(x) || inherits(x, "Date") ){}


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] range( , na.rm = TRUE ) (PR#10508)

2007-12-11 Thread P . Dalgaard

(Drats! Jitterbug is playing tricks with the PR# again. Attempting to
refile so that we can kill PR#10509)

Peter Dalgaard wrote:
> [EMAIL PROTECTED] wrote:
>
>> --- Start of forwarded message ---
>> Date: Tue, 13 Nov 2007 21:44:57 +0100
>> To: Steve Mongin <[EMAIL PROTECTED]>
>> Cc: [EMAIL PROTECTED]
>> Subject: Re: range( , na.rm = TRUE )
>> In-Reply-To: <[EMAIL PROTECTED]>
>> Reply-To: [EMAIL PROTECTED]
>> From: Kurt Hornik <[EMAIL PROTECTED]>
>> X-AntiVirus: checked by AntiVir MailGate (version: 2.1.3-2; AVE: 7.6.0.34; VDF: 7.0.0.210; host: fsme.wu-wien.ac.at)
>> X-Virus-Scanned: ClamAV 0.90.3/4768/Tue Nov 13 18:25:08 2007 on pocken.wu-wien.ac.at
>> X-Virus-Status: Clean
>>
>>> Steve Mongin writes:
>>>
>>> Dear CRAN:
>>> I am running 'R' on Linux as follows:
>>>
>>>> version
>>>                  _
>>>   platform       i686-redhat-linux-gnu
>>>   arch           i686
>>>   os             linux-gnu
>>>   system         i686, linux-gnu
>>>   status
>>>   major          2
>>>   minor          6.0
>>>   year           2007
>>>   month          10
>>>   day            03
>>>   svn rev        43063
>>>   language       R
>>>   version.string R version 2.6.0 (2007-10-03)
>>>
>>> I have a question about the behavior of "range()" with missing dates.
>>>
>>> With the previous version (2.4?) , the command:
>>>
>>>> range( as.Date( c( "2007-11-06", NA ) ), na.rm = TRUE )
>>>
>>> yielded:
>>>
>>>> [1] "2007-11-06" "2007-11-06"
>>>
>>> Now I get:
>>>
>>>> [1] NA NA
>>>
>>> Is this a bug?
>>>
>>> Yes, I see in the "What's New" page:
>>>
>>>   "The Math2 and Summary groups (round, signif, all, any, max, min,
>>>    sum, prod, range) are now primitive."
>>>
>>> Is the "primitive" characteristic supposed to behave as above with
>>> missing dates?
>>>
>>> Thanks for any help that you can provide.
>>
>> This is really a question for r-devel or r-bugs, I think, but not for
>> the CRAN maintainers.
>>
>> I would think it is a bug.  Perhaps simply file a bug report?
>>
> Again? ;-)
>
> The bug is here:
>
>> range.default
> function (..., na.rm = FALSE, finite = FALSE)
> {
>     x <- c(..., recursive = TRUE)
>     if (is.numeric(x)) {
>         if (finite)
>             x <- x[is.finite(x)]
>         else if (na.rm)
>             x <- x[!is.na(x)]
>     }
>     c(min(x), max(x))
> }
>
> Objects of class Date are not considered numeric, so we end up taking
> min and max without removing NA.
>
> One solution could be
>
> if (is.numeric(x) || inherits(x, "Date") ){}
>


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [EMAIL PROTECTED]: Re: range( , na.rm = (PR#10509)

2007-12-11 Thread P . Dalgaard

[EMAIL PROTECTED] wrote:
> --- Start of forwarded message ---
> Date: Tue, 13 Nov 2007 21:44:57 +0100
> To: Steve Mongin <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED]
> Subject: Re: range( , na.rm = TRUE )
> In-Reply-To: <[EMAIL PROTECTED]>
> Reply-To: [EMAIL PROTECTED]
> From: Kurt Hornik <[EMAIL PROTECTED]>
> X-AntiVirus: checked by AntiVir MailGate (version: 2.1.3-2; AVE: 7.6.0.34; VDF: 7.0.0.210; host: fsme.wu-wien.ac.at)
> X-Virus-Scanned: ClamAV 0.90.3/4768/Tue Nov 13 18:25:08 2007 on pocken.wu-wien.ac.at
> X-Virus-Status: Clean
>
>> Steve Mongin writes:
>>
>> Dear CRAN:
>> I am running 'R' on Linux as follows:
>>
>>> version
>>                  _
>>   platform       i686-redhat-linux-gnu
>>   arch           i686
>>   os             linux-gnu
>>   system         i686, linux-gnu
>>   status
>>   major          2
>>   minor          6.0
>>   year           2007
>>   month          10
>>   day            03
>>   svn rev        43063
>>   language       R
>>   version.string R version 2.6.0 (2007-10-03)
>>
>> I have a question about the behavior of "range()" with missing dates.
>>
>> With the previous version (2.4?) , the command:
>>
>>> range( as.Date( c( "2007-11-06", NA ) ), na.rm = TRUE )
>>
>> yielded:
>>
>>> [1] "2007-11-06" "2007-11-06"
>>
>> Now I get:
>>
>>> [1] NA NA
>>
>> Is this a bug?
>>
>> Yes, I see in the "What's New" page:
>>
>>   "The Math2 and Summary groups (round, signif, all, any, max, min,
>>    sum, prod, range) are now primitive."
>>
>> Is the "primitive" characteristic supposed to behave as above with
>> missing dates?
>>
>> Thanks for any help that you can provide.
>
> This is really a question for r-devel or r-bugs, I think, but not for
> the CRAN maintainers.
>
> I would think it is a bug.  Perhaps simply file a bug report?
>
Again? ;-)

The bug is here:

> range.default
function (..., na.rm = FALSE, finite = FALSE)
{
    x <- c(..., recursive = TRUE)
    if (is.numeric(x)) {
        if (finite)
            x <- x[is.finite(x)]
        else if (na.rm)
            x <- x[!is.na(x)]
    }
    c(min(x), max(x))
}


Objects of class Date are not considered numeric, so we end up taking
min and max without removing NA.

One solution could be

if (is.numeric(x) || inherits(x, "Date") ){}
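
To make that concrete, here is a rough sketch of the function body shown above with only the condition changed. range_patched is a name invented for this illustration, and this is not necessarily the fix that ends up in the sources.

```r
# Copy of the range.default body quoted above with the proposed condition.
# range_patched is a hypothetical name used for illustration only.
range_patched <- function(..., na.rm = FALSE, finite = FALSE) {
  x <- c(..., recursive = TRUE)
  if (is.numeric(x) || inherits(x, "Date")) {
    if (finite)
      x <- x[is.finite(x)]
    else if (na.rm)
      x <- x[!is.na(x)]      # "[" keeps the Date class
  }
  c(min(x), max(x))
}

d <- as.Date(c("2007-11-06", NA))
range_patched(d, na.rm = TRUE)   # the NA is dropped before min/max are taken
range_patched(d)                 # NA NA, as na.rm defaults to FALSE
```

Note that this only special-cases Date; other classed objects that are not numeric would still fall through.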


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] daylight saving / time zone issues with as.POSIXlt/as.POSIXct (PR#10393)

2007-11-01 Thread p . dalgaard

[EMAIL PROTECTED] wrote:
> Running under Windows XP 64 bit, as.POSIXlt()/as.POSIXct() seem
> to think that US time zones (EST5EDT, MST7MDT) switched from daylight
> savings back to standard time on Oct 28, 2007, whereas the switch
> is actually on Sun Nov 04, 2007.
>
>
Not Our Problem. (This sort of thing never is. We are wholly dependent
on the OS for this information). Check out

http://support.microsoft.com/kb/933360
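
A quick way to see what the underlying time zone rules say is to look at the isdst field around the switch date. The sketch below assumes the Olson name "America/Denver" as a stand-in for the Mountain zone in the report; what you get depends entirely on the rules available on the machine.

```r
# Noon on days around the 2007 US switch (Sun Nov 4, 2007).
# "America/Denver" is an assumed zone name, not one used in the report.
days <- as.POSIXct(c("2007-10-28 12:00:00",
                     "2007-11-03 12:00:00",
                     "2007-11-04 12:00:00"), tz = "America/Denver")
isdst <- as.POSIXlt(days)$isdst   # 1 = daylight time, 0 = standard time
data.frame(day = format(days, "%Y-%m-%d %Z"), isdst = isdst)
```

With current rules one would expect daylight time on the first two days and standard time on the third; a system still using the pre-2007 US rules would already report standard time on Oct 28.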

> Examples:
>
>  > Sys.timezone()
> [1] "Mountain Daylight Time"
>  > as.POSIXct("2007-10-30 12:38:47")
> [1] "2007-10-30 12:38:47 Mountain Daylight Time"
>  > # *** Should report 2007-10-30 14:38:47 EDT:
>  > as.POSIXlt(as.POSIXct("2007-10-30 12:38:47"), "EST5EDT")
> [1] "2007-10-30 13:38:47 EST"
>  > Sys.time()
> [1] "2007-11-01 09:22:28 Mountain Daylight Time"
>
>  > # Bad behavior is manifested in different ways with TZ="MST7MDT"
>  > Sys.setenv(TZ="MST7MDT")
>  > # *** Should report "12:38:47 MDT"
>  > as.POSIXct("2007-10-30 12:38:47")
> [1] "2007-10-30 12:38:47 MST"
>  > as.POSIXlt(as.POSIXct("2007-10-30 12:38:47"), "EST5EDT")
> [1] "2007-10-30 14:38:47 EST"
>  > # *** Should report "2007-11-01 09:23:09 MDT"
>  > Sys.time()
> [1] "2007-11-01 08:23:09 MST"
>  >
>  > sessionInfo()
> R version 2.6.0 Patched (2007-10-11 r43143)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>  >
>
>
> Furthermore, with the timezone "Mountain Daylight Time"
> (which is the default I get when I start R), the switch
> appears to be on Nov 5 in 2006, whereas it actually was
> on Oct 29 in 2006.
>
>  > # New R session
>  > Sys.timezone()
> [1] "Mountain Daylight Time"
>  > # *** wrong switch in 2006 ***
>  > as.POSIXct("2006-10-30 12:38:47")+(-4:7)*(24*3600)
>   [1] "2006-10-26 12:38:47 Mountain Daylight Time"
>   [2] "2006-10-27 12:38:47 Mountain Daylight Time"
>   [3] "2006-10-28 12:38:47 Mountain Daylight Time"
>   [4] "2006-10-29 12:38:47 Mountain Daylight Time"
>   [5] "2006-10-30 12:38:47 Mountain Daylight Time"
>   [6] "2006-10-31 12:38:47 Mountain Daylight Time"
>   [7] "2006-11-01 12:38:47 Mountain Daylight Time"
>   [8] "2006-11-02 12:38:47 Mountain Daylight Time"
>   [9] "2006-11-03 12:38:47 Mountain Daylight Time"
> [10] "2006-11-04 12:38:47 Mountain Daylight Time"
> [11] "2006-11-05 11:38:47 Mountain Standard Time"
> [12] "2006-11-06 11:38:47 Mountain Standard Time"
>  > as.POSIXct("2007-10-30 12:38:47")+(-4:7)*(24*3600)
>   [1] "2007-10-26 12:38:47 Mountain Daylight Time"
>   [2] "2007-10-27 12:38:47 Mountain Daylight Time"
>   [3] "2007-10-28 12:38:47 Mountain Daylight Time"
>   [4] "2007-10-29 12:38:47 Mountain Daylight Time"
>   [5] "2007-10-30 12:38:47 Mountain Daylight Time"
>   [6] "2007-10-31 12:38:47 Mountain Daylight Time"
>   [7] "2007-11-01 12:38:47 Mountain Daylight Time"
>   [8] "2007-11-02 12:38:47 Mountain Daylight Time"
>   [9] "2007-11-03 12:38:47 Mountain Daylight Time"
> [10] "2007-11-04 11:38:47 Mountain Standard Time"
> [11] "2007-11-05 11:38:47 Mountain Standard Time"
> [12] "2007-11-06 11:38:47 Mountain Standard Time"
>  > Sys.setenv(TZ="MST7MDT")
>  > Sys.timezone()
> [1] "MST"
>  > as.POSIXct("2006-10-30 12:38:47")+(-4:7)*(24*3600)
>   [1] "2006-10-26 13:38:47 MDT" "2006-10-27 13:38:47 MDT"
>   [3] "2006-10-28 13:38:47 MDT" "2006-10-29 12:38:47 MST"
>   [5] "2006-10-30 12:38:47 MST" "2006-10-31 12:38:47 MST"
>   [7] "2006-11-01 12:38:47 MST" "2006-11-02 12:38:47 MST"
>   [9] "2006-11-03 12:38:47 MST" "2006-11-04 12:38:47 MST"
> [11] "2006-11-05 12:38:47 MST" "2006-11-06 12:38:47 MST"
>  > # *** wrong switch in 2007 ***
>  > as.POSIXct("2007-10-30 12:38:47")+(-4:7)*(24*3600)
>   [1] "2007-10-26 13:38:47 MDT" "2007-10-27 13:38:47 MDT"
>   [3] "2007-10-28 12:38:47 MST" "2007-10-29 12:38:47 MST"
>   [5] "2007-10-30 12:38:47 MST" "2007-10-31 12:38:47 MST"
>   [7] "2007-11-01 12:38:47 MST" "2007-11-02 12:38:47 MST"
>   [9] "2007-11-03 12:38:47 MST" "2007-11-04 12:38:47 MST"
> [11] "2007-11-05 12:38:47 MST" "2007-11-06 12:38:47 MST"
>  >
>
> I see this behavior on all the Windows systems I have tried:
> Windows XP 64 bit, Windows XP 32 bit Pro, Windows XP home,
> Windows 2000, with a variety of R versions.  The systems
> have all relevant Windows updates applied (unless some were
> inadvertently missed) and the systems otherwise appear to
> behave correctly with respect to times and timezones.
>
> I do not see this problem on Ubuntu Linux systems.
>
> -- Tony Plate
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+

Re: [Rd] (PR#10379) Re: x11(....) kills R without DISPLAY

2007-10-29 Thread p . dalgaard

Hin-Tak Leung wrote:
> Peter Dalgaard wrote:
> 
>> You need x11() with a valid display to trigger the bug:
>>
>> [EMAIL PROTECTED] BUILD]$ ssh -Y 192.168.1.10
>> [EMAIL PROTECTED]'s password:
>> Last login: Sat Oct 27 02:40:16 2007 from 192.168.1.11
>> [EMAIL PROTECTED] ~]$ echo $DISPLAY
>> localhost:10.0
>> [EMAIL PROTECTED] ~]$ DISPLAY= R -q
>>  > x11("localhost:10.0")
>> Error: Couldn't find per display information
>> [EMAIL PROTECTED] ~]$ uname -a
>> Linux janus 2.6.22.9-91.fc7 #1 SMP Thu Sep 27 20:47:39 EDT 2007
>> x86_64 x86_64 x86_64 GNU/Linux
>> [EMAIL PROTECTED] ~]$ cat /etc/issue
>> Fedora release 7 (Moonshine)
>> Kernel \r on an \m
>
> Agh, sorry. Yes, x11() (with or without $DISPLAY set) doesn't
> die catastrophically, x11("validinfo") does.
>
> HTL
The culprit would seem to be this bit of devX11.c

    xtdpy = XtOpenDisplay(app_con, NULL, "r_x11", "R_x11",
                          NULL, 0, &zero, NULL);
    toplevel = XtAppCreateShell(NULL, "R_x11",

The 2nd arg to XtOpenDisplay is listed as display_string, so passing a
NULL here seems like trouble when the default ways of finding the
display do not work.

Looks like a fix is to insert p instead of NULL. (Tested rudimentarily.)



-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] (PR#10379) Re: x11(....) kills R without DISPLAY

2007-10-26 Thread p . dalgaard

Hin-Tak Leung wrote:
> Peter Dalgaard wrote:
>> [EMAIL PROTECTED] wrote:
>>> Full_Name: Christian Brechbuehler
>>> Version: 2.4.1, 2.5.1, OS: Ubuntu GNU/Linux
>>> Submission from: (NULL) (24.61.47.236)
>>>
>>>
> 
>>> Example (start R without DISPLAY from bash):
>>>   % DISPLAY= R
>>>   > x11("localhost:11.0")# this is my valid DISPLAY
>>>   Error: Couldn't find per display information
>>>   %
>
>> I see this on Fedora 7 too. I suspect that the earlier report was
>> thought to be Mac specific.
>
> I was experimenting with xvfb last week and didn't see the catastrophic
> problem like that, so I tried again. Is it possible that this has
> already been fixed in R 2.6.0 ? (I am on fedora 7, x86_64 as well).
>
> --
> $ export -n DISPLAY
>
> $ R
> R version 2.6.0 (2007-10-03)
> ...
> > x11()
> Error in X11(display, width, height, pointsize, if (is.null(gamma)) 1
> else gamma,  :
>   unable to start device X11
> In addition: Warning message:
> In x11() : unable to open connection to X11 display ''
> > q()
> Save workspace image? [y/n/c]: n
> $ export DISPLAY=
> $ R
> R version 2.6.0 (2007-10-03)
> ...
> > x11()
> Error in X11(display, width, height, pointsize, if (is.null(gamma)) 1
> else gamma,  :
>   unable to start device X11
> In addition: Warning message:
> In x11() : unable to open connection to X11 display ''
> > q()
> Save workspace image? [y/n/c]: n
> --
>
You need x11() with a valid display to trigger the bug:

[EMAIL PROTECTED] BUILD]$ ssh -Y 192.168.1.10
[EMAIL PROTECTED]'s password:
Last login: Sat Oct 27 02:40:16 2007 from 192.168.1.11
[EMAIL PROTECTED] ~]$ echo $DISPLAY
localhost:10.0
[EMAIL PROTECTED] ~]$ DISPLAY= R -q
 > x11("localhost:10.0")
Error: Couldn't find per display information
[EMAIL PROTECTED] ~]$ uname -a
Linux janus 2.6.22.9-91.fc7 #1 SMP Thu Sep 27 20:47:39 EDT 2007 x86_64
x86_64 x86_64 GNU/Linux
[EMAIL PROTECTED] ~]$ cat /etc/issue
Fedora release 7 (Moonshine)
Kernel \r on an \m


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Incorrect matrix of spearman correlations .... in 64-bit (PR#9570)

2007-03-16 Thread p . dalgaard

[EMAIL PROTECTED] wrote:
> Full_Name: Vladimir Obolonkin
> Version: tested in 2.0 to 2.4.1
> OS: linux, win, mac
> Submission from: (NULL) (202.14.96.194)
>
> {{ Subject shortened manually -- to pass anti-spam filters
>
>    Original-subject: Incorrect matrix of spearman correlations working \
>      large (24000 by 425 and 78 by 425 data frames)  in 64-bit Linux machines;\
>      the same code gives correct results in 32-bits Win and Mac (PR#9568)
> }}
>
> cc_s<-cor(phenos,vec,method="spearman",use="pairwise.complete.obs")
>
> this is a line copied from real script producing different results in 64
> bit/Linux/R (wrong) and Mac&Win (correct)
>
> The script was implemented on 4 machines with Linux 64, Suse 9-10, in 5 variants
> of R -- versions from 2.3 to 2.4.1, compiled with few different settings of
> optimization and 2 versions of compilers. In all cases the results of spearman
> correlation were identical, but wrong.
>
> The same script was started up to 10 times in Win32 on Intel and Linux32 on Mac
> with the Rs from 2.3 to 2.4.1 -- in this set of cases the results were identical
> and correct.
>
> 'phenos' and 'vec' are data frames of 425x78 and 425x24128 respectively, all
> numeric variables. The 'phenos' has moderate number of NAs in some columns.
>
> The problem disappeared when trying to reduce the size of matrices (by selection
> of rows and/or columns) and/or when simulating the data with random generators.
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>   
I don't see anything to reproduce here, so how are we supposed to get a 
handle on the issue?

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] obscure error with subsetting as.list() of a function then (PR#9504)

2007-02-08 Thread p . dalgaard

[EMAIL PROTECTED] writes:

> Hello. I was writing some code that computes on the language and came across
> this. I can work around it, but thought you might like to know about it.
>
>> f <- function(x) { NULL }
>> a <- as.list(f)[[1]]
>> a # ie print(a)
> Error: argument "a" is missing, with no default
>
> Note it says *argument* "a", which is strange. In fact, and unsurprisingly, the bug lies
> with the object itself, not with print():
>
>> typeof(a)
> Error in typeof(a) : argument "a" is missing, with no default
>> deparse(a)
> Error in deparse(a) : argument "a" is missing, with no default
>
> However, this does work:
>> as.list(f)[[1]]
>
> It prints nothing, which is correct, and there is no error. So it seems the bug lies with
> assigning a name to as.list(f)[[1]] as above, then trying to work with that new object.


It's not a bug that things work in ways that confuse users when they pry
into things they were not expected to pry into... Do you have a good
reason to call this a bug?

What you're seeing is R's "missing argument object", via the default
value of the formal argument x. A slightly cleaner way to get your
result is

> formals(f)
$x


> a <-formals(f)$x
> a
Error: argument "a" is missing, with no default

Technically, the missing argument object is a zero-length variable
name:

> mode(formals(f)$x)
[1] "name"
> as.character(formals(f)$x)
[1] ""


Except for direct meddling with the formals(f), the only way to assign
the missing argument object is via parameter passing - any other
attempt to access it gives an error. So the common case is that the
object is indeed a function argument.
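
For the record, a small sketch of the points above, using the same f as in the report. The comparison with quote(expr = ) is an addition for illustration (it is one common way to write the empty symbol), and the tryCatch wrapper is only there to capture the error message.

```r
f <- function(x) { NULL }

# The "default" of x is the empty symbol, the same object that
# quote(expr = ) produces.
is_empty_symbol <- identical(formals(f)$x, quote(expr = ))

# It is a name whose printed form is the empty string.
arg_mode <- mode(formals(f)$x)
arg_text <- as.character(formals(f)$x)

# Binding it to a variable is allowed, but evaluating that variable
# raises the "argument is missing" error seen in the report.
a <- formals(f)$x
msg <- tryCatch(a, error = function(e) conditionMessage(e))
msg
```

The error names "a" because the evaluator treats any variable bound to this object as if it were a missing function argument.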



> Regards,
> [EMAIL PROTECTED]
>
>
> --please do not edit the information below--
>
> Version:
>  platform = i386-pc-mingw32
>  arch = i386
>  os = mingw32
>  system = i386, mingw32
>  status =
>  major = 2
>  minor = 4.1
>  year = 2006
>  month = 12
>  day = 18
>  svn rev = 40228
>  language = R
>  version.string = R version 2.4.1 (2006-12-18)
>
> Windows XP Professional (build 2600) Service Pack 2.0
>
> Locale:
> LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
>
> Search Path:
>  .GlobalEnv, file:c:/schupl/R/myRLib/.RData, package:stats, package:graphics, package:grDevices, package:utils, package:datasets, package:methods, Autoloads, package:base
> ---
>
> This e-mail may contain confidential and/or privileged infor...{{dropped}}
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R windows crash (PR#9426)

2006-12-21 Thread P . Dalgaard

Prof Brian Ripley wrote:
> On Thu, 21 Dec 2006, Peter Dalgaard wrote:
>
> [...]
>
>
>> This seems reproducible on  Linux, except that it goes into an infinite
>> loop. The lm call seems to be the real culprit:
>>
>>> testfun <- function(aa=aa) return(aa)
>>> testfun()
>> Error in testfun() : recursive default argument reference
>>> testfun <- function(aa=aa) lm(x~y,data=aa)
>>> testfun()
>> (*poof*)
>>
>
> The difference is in argument evaluation between closures and internal
> functions (c() in my example, return() in yours).
>
Er? I'd rather say that the issue is in the semantics of missing():

> f <- function(x) missing(x)
> testfun <- function(aa=aa) f(aa)
> testfun()
Error: segfault from C stack overflow

which is a bit nasty. AFAICS the thing is that the logic for detection
of recursive arguments works by forcing promises (if you at some point
need the result of the same promise you are forcing, you know that
something is wrong), but missing() tries hard not to force promises. We
already have the following anomaly,

> g <- function(v) missing(v)
> f <- function(v) g(v)
> f()
[1] TRUE
> f <- function(v=!h, h=!v) g(v)
> f()
[1] FALSE
> f <- function(v=!h, h) g(v)
> f()
[1] FALSE

so the fix could be to realize that we cannot detect missingness in a
perfectly reliable way and just pretend that arguments are always
non-missing when they have a default.
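
A minimal sketch of the two mechanisms discussed above: missingness being passed along a call chain without forcing anything, and the recursive default being caught only once the promise is forced. The tryCatch wrapper is just there to capture the message; the exact wording of the error may differ between R versions.

```r
# missing() looks through the unevaluated promise: v was never supplied
# to f, so g still sees it as missing.
g <- function(v) missing(v)
f <- function(v) g(v)
f()

# Forcing a self-referential default is detected and reported as an error.
testfun <- function(aa = aa) aa
msg <- tryCatch(testfun(), error = function(e) conditionMessage(e))
msg
```

The segfault case above is deliberately not reproduced here, since it combines the two: a recursive default that missing() tries to inspect without forcing.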

[r-bugs reinserted as cc:]

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] rm() deletes 'c' if c('a','b') is the argument (PR#9399)

2006-11-29 Thread p . dalgaard

Steven McKinney wrote:
> Same behaviour seen on Apple Mac OSX 10.4.8 platform:
>
>> sessionInfo()
> R version 2.4.0 Patched (2006-10-31 r39758)
> powerpc-apple-darwin8.8.0
>
> locale:
> en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>
> attached base packages:
> [1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"  "base"
>
> other attached packages:
>     XML
> "1.2-0"
>> ls()
> [1] "getMonograph" "last.warning" "myfun"
>> a <- 1
>> b <- 2
>> c <- letters
>> a
> [1] 1
>> b
> [1] 2
>> c
>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r"
> "s" "t" "u" "v" "w" "x" "y" "z"
>> rm(c('a', 'b'))
>> a
> Error: object "a" not found
>> b
> Error: object "b" not found
>> c
> .Primitive("c")
>> ls()
> [1] "getMonograph" "last.warning" "myfun"
>> a <- 1
>> b <- 2
>> d <- letters
>> ls()
> [1] "a"            "b"            "d"            "getMonograph" "last.warning" "myfun"
>> rm(c('a', 'b'))
> Warning message:
> remove: variable "c" was not found
>> ls()
> [1] "d"            "getMonograph" "last.warning" "myfun"
>
> Steven McKinney
>
> Statistician
> Molecular Oncology and Breast Cancer Program
> British Columbia Cancer Research Centre
>
> email: [EMAIL PROTECTED]
>
> tel: 604-675-8000 x7561
>
> BCCRC
> Molecular Oncology
> 675 West 10th Ave, Floor 4
> Vancouver B.C. 
> V5Z 1L3
> Canada
>
>
>
>
> -Original Message-
> From: [EMAIL PROTECTED] on behalf of [EMAIL PROTECTED]
> Sent: Wed 11/29/2006 10:35 AM
> To: r-devel@stat.math.ethz.ch
> Cc: [EMAIL PROTECTED]
> Subject: [Rd] rm() deletes 'c' if c('a','b') is the argument (PR#9399)
>  
> Full_Name: Lixin Han
> Version: 2.4.0
> OS: Windows 2000
> Submission from: (NULL) (155.94.110.222)
>
>
> A character vector c('a','b') is supplied to rm().  As a result, 'c' is deleted
> unintentionally.
>
>> a <- 1:5
>> b <- 'abc'
>> c <- letters
>> ls()
> [1] "a" "b" "c"
>> rm(c('a','b'))
>> ls()
> character(0)
>
The reason is that
 > x <- function(...) sapply(match.call(expand.dots = FALSE)$..., as.character)
 > x(c(a,b))
     [,1]
[1,] "c"
[2,] "a"
[3,] "b"

Which in turn happens because
 > as.character(quote(c(a,b)))
[1] "c" "a" "b"

I don't know if it really qualifies as a bug, but it's not documented 
that as.character() is used and I suppose we could be more careful with 
the argument checking.
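
In the meantime, the documented way to pass a character vector is the list= argument. A small sketch, done in a scratch environment (e is just an illustrative name) so that nothing in the workspace is touched:

```r
# A call deparses element by element, function name first.
as.character(quote(c(a, b)))

# Removing by character vector: use list=, not the unnamed argument.
e <- new.env()
assign("a", 1:5, envir = e)
assign("b", "abc", envir = e)
assign("c", letters, envir = e)
rm(list = c("a", "b"), envir = e)
remaining <- ls(e)
remaining
```

With list= the vector is evaluated in the usual way, so only "a" and "b" go and the variable c is left alone.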

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread p . dalgaard

"Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:

> Peter,
>
> I ran the memory limit function you mention below and both versions provide
> the same result:
>
> >
> > memory.limit(size=4095)
> NULL
> > memory.limit(NA)
> [1] 4293918720
> >
> I do have 4GB ram on my PC. As a more reproducible form of the test, I
> have attached output that uses a randomly generated dataset after fixing the
> seed. Same result as last time: works with 2.3.0 and not 2.4.0. I guess the
> one caveat here is that I just increased the dataset size until I got the
> memory issue with at least one of the R versions. It's okay. No need to
> spend more time on this. I really don't mind using the previous version.
> Like you mentioned, probably just a function of the new version requiring
> more memory.


Hmm, you might want to take a final look at the Windows FAQ 2.9. I am
still not quite convinced you're really getting more than the default
1.5 GB.

Also, how much can you increase the problem size on 2.3.0 before it
breaks? If you can only go to say 39 or 40 variables, then there's
probably not much we can do. If it is orders of magnitude, then we may
have a real bug (or not: sometimes we fix bugs resulting from things
not being duplicated when they should have been, the fixed code then
uses more memory than the unfixed code.)

> Thanks,
> Derek
>
>
>
> On 06 Nov 2006 21:42:04 +0100, Peter Dalgaard <[EMAIL PROTECTED]>
> wrote:
> >
> > "Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:
> >
> > > Thanks for the replies. Point taken regarding submission protocol. I have
> > > included a text file attachment that shows the R output with version 2.3.0 and
> > > 2.4.0. A label distinguishing the version is included in the comments.
> > >
> > > A quick background on the attached example. The dataset has 650,000 records
> > > and 32 variables. the response is dichotomous (0/1) and i ran a logistic
> > > model (i previously mentioned multinomial, but decided to start simple for
> > > the example). Covariates in the model may be continuous or categorical, but
> > > all are numeric. You'll notice that the code is the same for both versions;
> > > however, there is a memory error with the 2.3.0 version. i ran this several
> > > times and in different orders to make sure it was not some sort of hardware
> > > issue.
> > >
> > > If there is some sort of additional output that would be helpful, I can
> > > provide as well. Or, if there is nothing I can do, that is fine also.
> >
> > I don't think it was ever possible to request 4GB on XP. The version
> > difference might be caused by different response to invalid input in
> > memory.limit(). What does memory.limit(NA) tell you after the call to
> > memory.limit(4095) in the two versions?
> >
> > If that is not the reason: What is the *real* restriction of memory on
> > your system? Do you actually have 4GB in your system (RAM+swap)?
> >
> > Your design matrix is on the order of 160 MB, so shouldn't be a
> > problem with a GB-sized workspace. However, three copies of it will
> > brush against 512 MB, and it's not unlikely to have that many copies
> > around.
> >
> >
> >
> > > -Derek
> > >
> > >
> > > On 11/6/06, Kasper Daniel Hansen < [EMAIL PROTECTED]> wrote:
> > > >
> > > > It would be helpful to produce a script that reproduces the error on
> > > > your system. And include details on the size of your data set and
> > > > what you are doing with it. It is unclear what function is actually
> > > > causing the error and such. Really, in order to do something about it
> > > > you need to show how to actually obtain the error.
> > > >
> > > > To my knowledge nothing _major_ has happened with the memory
> > > > consumption, but of course R could use slightly more memory for
> > > > specific purposes.
> > > >
> > > > But chances are that this is not really memory related but more
> > > > related to the functions you are using - perhaps a bug or perhaps a
> > > > user error.
> > > >
> > > > Kasper
> > > >
> > > > On Nov 6, 2006, at 10:20 AM, Derek Stephen Elmerick wrote:
> > > >
> > > > > thanks for the friendly reply. i think my description was fairly clear: i
> > > > > import a large dataset and run a model. using the same dataset, the
> > > > > process worked previously and it doesn't work now. if the new version of R
> > > > > requires more memory and this compromises some basic data analyses, i would
> > > > > label this as a bug. if this memory issue was mentioned in the
> > > > > documentation, then i apologize. this email was clearly not well received,
> > > > > so if there is a more appropriate place to post these sort of questions,
> > > > > that would be helpful.
> > > > >
> > > > > -derek
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard
> > > > > < [EMAIL PROTECTED]>
> > > > > wrote:
> > > > >>
> > > > >> [EMAIL PROTECT

Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread p . dalgaard

"Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:

> Thanks for the replies. Point taken regarding submission protocol. I have
> included a text file attachment that shows the R output with version 2.3.0 and
> 2.4.0. A label distinguishing the version is included in the comments.
>
> A quick background on the attached example. The dataset has 650,000 records
> and 32 variables. the response is dichotomous (0/1) and i ran a logistic
> model (i previously mentioned multinomial, but decided to start simple for
> the example). Covariates in the model may be continuous or categorical, but
> all are numeric. You'll notice that the code is the same for both versions;
> however, there is a memory error with the 2.3.0 version. i ran this several
> times and in different orders to make sure it was not some sort of hardware
> issue.
>
> If there is some sort of additional output that would be helpful, I can
> provide as well. Or, if there is nothing I can do, that is fine also.

I don't think it was ever possible to request 4GB on XP. The version
difference might be caused by different response to invalid input in
memory.limit(). What does memory.limit(NA) tell you after the call to
memory.limit(4095) in the two versions?

If that is not the reason: What is the *real* restriction of memory on
your system? Do you actually have 4GB in your system (RAM+swap)?

Your design matrix is on the order of 160 MB, so shouldn't be a
problem with a GB-sized workspace. However, three copies of it will
brush against 512 MB, and it's not unlikely to have that many copies
around.
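
A back-of-the-envelope check of those figures, as a sketch in R. It
assumes that all 32 variables are stored as doubles (8 bytes each) and
that the design matrix has one column per variable; an intercept or
expanded factors would make it somewhat larger.

```r
# Rough size of a 650,000 x 32 matrix of doubles (8 bytes per cell).
n_rows <- 650000
n_cols <- 32
bytes  <- n_rows * n_cols * 8

mb <- bytes / 1e6          # decimal megabytes
mb                         # 166.4

# Three simultaneous copies, as can easily happen inside a model fit.
3 * mb                     # 499.2, i.e. close to 512 MB
```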


> -Derek
>
>
> On 11/6/06, Kasper Daniel Hansen <[EMAIL PROTECTED]> wrote:
> >
> > It would be helpful to produce a script that reproduces the error on
> > your system. And include details on the size of your data set and
> > what you are doing with it. It is unclear what function is actually
> > causing the error and such. Really, in order to do something about it
> > you need to show how to actually obtain the error.
> >
> > To my knowledge nothing _major_ has happened with the memory
> > consumption, but of course R could use slightly more memory for
> > specific purposes.
> >
> > But chances are that this is not really memory related but more
> > related to the functions your are using - perhaps a bug or perhaps a
> > user error.
> >
> > Kasper
> >
> > On Nov 6, 2006, at 10:20 AM, Derek Stephen Elmerick wrote:
> >
> > > thanks for the friendly reply. i think my description was fairly
> > > clear: i
> > > import a large dataset and run a model. using the same dataset, the
> > > process worked previously and it doesn't work now. if the new
> > > version of R
> > > requires more memory and this compromises some basic data analyses,
> > > i would
> > > label this as a bug. if this memory issue was mentioned in the
> > > documentation, then i apologize. this email was clearly not well
> > > received,
> > > so if there is a more appropriate place to post these sort of
> > > questions,
> > > that would be helpful.
> > >
> > > -derek
> > >
> > >
> > >
> > >
> > > On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard
> > > <[EMAIL PROTECTED]>
> > > wrote:
> > >>
> > >> [EMAIL PROTECTED] writes:
> > >>
> > >>> Full_Name: Derek Elmerick
> > >>> Version: 2.4.0
> > >>> OS: Windows XP
> > >>> Submission from: (NULL) ( 38.117.162.243)
> > >>>
> > >>>
> > >>>
> > >>> hello -
> > >>>
> > >>> i have some code that i run regularly using R version 2.3.x . the
> > >>> final
> > >> step of
> > >>> the code is to build a multinomial logit model. the dataset is
> > >>> large;
> > >> however, i
> > >>> have not had issues in the past. i just installed the 2.4.0
> > >>> version of R
> > >> and now
> > >>> have memory allocation issues. to verify, i ran the code again
> > >>> against
> > >> the 2.3
> > >>> version and no problems. since i have set the memory limit to the
> > >>> max
> > >> size, i
> > >>> have no alternative but to downgrade to the 2.3 version. thoughts?
> > >>
> > >> And what do you expect the maintainers to do about it? ( I.e. why are
> > >> you filing a bug report.)
> > >>
> > >> You give absolutely no handle on what the cause of the problem might
> > >> be, or even to reproduce it. It may be a bug, or maybe just R
> > >> requiring more memory to run than previously.
> > >>
> > >> --
> > >>   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
> > >> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> > >> (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
> > >> 35327918
> > >> ~~ - ([EMAIL PROTECTED])  FAX: (+45)
> > >> 35327907
> > >>
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
>
>
>
> > ##
> > ### R 2.4.0
> > ##
> >
> > rm(list=ls(all=TRUE))
> > memory.limit(siz

Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread p . dalgaard

"Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:

> thanks for the friendly reply. i think my description was fairly clear: i
> import a large dataset and run a model. using the same dataset, the
> process worked previously and it doesn't work now. if the new version of R
> requires more memory and this compromises some basic data analyses, i would
> label this as a bug. if this memory issue was mentioned in the
> documentation, then i apologize. this email was clearly not well received,
> so if there is a more appropriate place to post these sort of questions,
> that would be helpful.

We have mailing lists.

[EMAIL PROTECTED]

would be appropriate.

You're still not giving sufficient information for anyone to come up
with a sensible reply, though.

Apologies if the tone came out a bit sharp, but allow me to
remind you that the first line of the report form you used reads:

"Before submitting a bug report, please read Chapter `R Bugs' of `The R
FAQ'. It describes what a bug is and how to report a bug."

You do have to realize that from the point of view of improving R, it
simply is not very informative that there is a user who has a problem
which (just?) fitted in available memory in a previous version, but
doesn't anymore.

>
> On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard <[EMAIL PROTECTED]>
> wrote:
> >
> > [EMAIL PROTECTED] writes:
> >
> > > Full_Name: Derek Elmerick
> > > Version: 2.4.0
> > > OS: Windows XP
> > > Submission from: (NULL) (38.117.162.243)
> > >
> > >
> > >
> > > hello -
> > >
> > > i have some code that i run regularly using R version 2.3.x . the final
> > step of
> > > the code is to build a multinomial logit model. the dataset is large;
> > however, i
> > > have not had issues in the past. i just installed the 2.4.0 version of R
> > and now
> > > have memory allocation issues. to verify, i ran the code again against
> > the 2.3
> > > version and no problems. since i have set the memory limit to the max
> > size, i
> > > have no alternative but to downgrade to the 2.3 version. thoughts?
> >
> > And what do you expect the maintainers to do about it? ( I.e. why are
> > you filing a bug report.)
> >
> > You give absolutely no handle on what the cause of the problem might
> > be, or even to reproduce it. It may be a bug, or maybe just R
> > requiring more memory to run than previously.
> >
> > --
> >   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
> > c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> > (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
> > 35327918
> > ~~ - ([EMAIL PROTECTED])  FAX: (+45)
> > 35327907
> >

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] eval(match.call()) (PR#9339)

2006-11-03 Thread p . dalgaard

Bill Dunlap <[EMAIL PROTECTED]> writes:

> On Fri, 3 Nov 2006 [EMAIL PROTECTED] wrote:
>
> > > > On Fri, 2006-11-03 at 21:15 +0100, Peter Dalgaard wrote:
> > > > > > x <- quote(match.call())
> > > > > > eval(x)
> > > > > *** buffer overflow detected ***: /usr/lib/R/bin/exec/R terminated
> > > > > /lib/libc.so.6(__chk_fail+0x41)[0x1f1161]
> > > > > /lib/libc.so.6[0x1f0617]
> >
> > > > > does look like something that just Should Not Happen...
>
>
> I think valgrind shows the problem is in deparse.c:
> 245 strncpy(data, CHAR(STRING_ELT(svec, 0)), 10);
> 246 if (strlen(CHAR(STRING_ELT(svec, 0))) > 10) strcat(data, "...");
> You need to put a '\0' into data[10] after that strncpy
> so strcat can find the end of the string when the length
> of the copied string is >=10.  It currently runs into
> uninitialized memory at the end of ".Primitive".
>
> (This is in a copy of R source from June 2006.)

Now fixed in 2.4.0 Patched and the development version.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] eval(match.call()) (PR#9339)

2006-11-03 Thread p . dalgaard

Marc Schwartz <[EMAIL PROTECTED]> writes:

> On Fri, 2006-11-03 at 21:15 +0100, Peter Dalgaard wrote:
> > [EMAIL PROTECTED] writes:
> >
> > > Full_Name: Justin Harrington
> > > Version: 2.4.0
> > > OS: Fedora Core 6
> > > Submission from: (NULL) (142.103.121.203)
> > >
> > >
> > > When I type the (albeit stupid) command
> > >
> > > eval(match.call())
> > >
> > > R crashes with the following messages (truncated):
> > >
> > > *** buffer overflow detected ***: /usr/lib/R/bin/exec/R terminated
> >
> > Yes, don't do that then ;-)
>
> Indeed...  ;-)
>
> > Part of the puzzle is that
> >
> > > match.call()
> > match.call()
> >
> > which looks like something with potential for infinite recursion, but
> > that doesn't seem to be issue since
> >
> > > f <- function(call = sys.call(sys.parent()))call
> > > f()
> > f()
> > > eval(f())
> > f()
> >
> > does not exhibit the same crash. And indeed
> >
> > > x <- quote(match.call())
> > > eval(x)
> > *** buffer overflow detected ***: /usr/lib/R/bin/exec/R terminated
> > ======= Backtrace: =========
> > /lib/libc.so.6(__chk_fail+0x41)[0x1f1161]
> > /lib/libc.so.6[0x1f0617]
> >
> > does look like something that just Should Not Happen...
>
> Peter, are you on FC6?
>
> On FC5, I cannot replicate your crash:
>
> > x <- quote(match.call())
> > eval(x)
> Error in match.call(definition, call, expand.dots) :
> '.Primitive...' is not a function
>
> ?

Yes, I'm on FC6 since yum had updated the 1229 packages this morning.

I see the crash with the FC6 RPM but not with a self-compiled R-patched.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] 'make check' fails on d-p-q-r-tests (PR#9326)

2006-10-31 Thread p . dalgaard

Peter Kleiweg <[EMAIL PROTECTED]> writes:

> Peter Dalgaard wrote on the 31st day of October of the year 2006:
>
> > Peter Kleiweg <[EMAIL PROTECTED]> writes:
> >
> > > Peter Dalgaard wrote on the 31st day of October of the year 2006:
> > >
> > > > [EMAIL PROTECTED] writes:
> > > >
> > > > > 'make check' fails on d-p-q-r-tests:
> > > > ...
> > > > > --please do not edit the information below--
> > > > >
> > > > > Version:
> > > > >  platform = i686-pc-linux-gnu
> > > > >  arch = i686
> > > > >  os = linux-gnu
> > > > >  system = i686, linux-gnu
> > > > >  status =
> > > > >  major = 2
> > > > >  minor = 4.0
> > > > >  year = 2006
> > > > >  month = 10
> > > > >  day = 03
> > > > >  svn rev = 39566
> > > > >  language = R
> > > > >  version.string = R version 2.4.0 (2006-10-03)
> > > > >
> > > > > Locale:
> > > > > [EMAIL PROTECTED];LC_NUMERIC=C;[EMAIL PROTECTED];LC_COLLATE=C;[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED]euro;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;[EMAIL PROTECTED]o;LC_IDENTIFICATION=C
> > > >
> > > >
> > > > You need to be more specific (yes, it is unfortunate that we cannot
> > > > extract all details about Linuxen from the Version: listing). Which
> > > > distribution, did you compile yourself or use a binary, and if the
> > > > former: did you set any special compiler flags?
> > >
> > > config.log:
> >
> > 11000 lines or so deleted, and you still didn't answer the questions
> >
> > (well, I can tell that it is SUSE, and it must be an old version since
> > you are using the ancient 2.95.3 compilers...)
>
> Let me see...
>
> > Which distribution
>
> SuSE. It's right there in config.log.
> And there is nothing ancient about 2.95.3 compilers.
>
> > did you compile yourself or use a binary
>
> I sent you config.log. That should give you a clue.
>
> > did you set any special compiler flags?
>
> Did you read the first few lines of config.log?
>
>
> Any questions I missed?

Why should we care? It's your problem and your job to make it easier
for maintainers to track down problems. Copying config.log to a public
mailing list could be considered thoughtless, the above is plainly
insulting.

Maybe someone else will help you, but I have definitely lost all
interest. For your information, the problem does not occur in SUSE 9.3
and 10.0.


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] 'make check' fails on d-p-q-r-tests (PR#9326)

2006-10-31 Thread p . dalgaard

Peter Kleiweg <[EMAIL PROTECTED]> writes:

> Peter Dalgaard wrote on the 31st day of October of the year 2006:
>
> > [EMAIL PROTECTED] writes:
> >
> > > 'make check' fails on d-p-q-r-tests:
> > ...
> > > --please do not edit the information below--
> > >
> > > Version:
> > >  platform = i686-pc-linux-gnu
> > >  arch = i686
> > >  os = linux-gnu
> > >  system = i686, linux-gnu
> > >  status =
> > >  major = 2
> > >  minor = 4.0
> > >  year = 2006
> > >  month = 10
> > >  day = 03
> > >  svn rev = 39566
> > >  language = R
> > >  version.string = R version 2.4.0 (2006-10-03)
> > >
> > > Locale:
> > > [EMAIL PROTECTED];LC_NUMERIC=C;[EMAIL PROTECTED];LC_COLLATE=C;[EMAIL PROTECTED];[EMAIL PROTECTED];[EMAIL PROTECTED]o;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;[EMAIL PROTECTED];LC_IDENTIFICATION=C
> >
> >
> > You need to be more specific (yes, it is unfortunate that we cannot
> > extract all details about Linuxen from the Version: listing). Which
> > distribution, did you compile yourself or use a binary, and if the
> > former: did you set any special compiler flags?
>
> config.log:

11000 lines or so deleted, and you still didn't answer the questions

(well, I can tell that it is SUSE, and it must be an old version since
you are using the ancient 2.95.3 compilers...)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] bug in rect (PR#9307)

2006-10-20 Thread p . dalgaard

Duncan Murdoch <[EMAIL PROTECTED]> writes:

> Are you fixing rect?

Done now (-patched and -devel).

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] Installation, permissions of /usr/local/lib/R (PR#9054)

2006-07-02 Thread p . dalgaard

Thibaut Jombart <[EMAIL PROTECTED]> writes:

> [EMAIL PROTECTED] wrote:
> 
> >I said
> >./configure --with-readline=no --with-x=no
> >make
> >make install
> >
> >and everything works except that /usr/local/lib/R/etc/ldpaths was not 
> >readable
> >as normal user.

> You don't say if you installed R as root or as a normal user.
> Did you try:
> chown your-user-name /usr/local/lib/R/ -R
> ?
> 
> Or, if you intend to make R available to several users, create a R users 
> group and then type the previous command replacing "your-user-name" by 
> "R-users-group-name". Maybe this would help.

More likely, the umask setting was too restrictive during make
install. AFAIR, "umask 022" (i.e. no write permission for anyone
except user, but read and execute allowed) is needed to enable
non-root users to run R. Some systems set it differently, and we're
not overriding that since it could be a policy issue. It's not a bug.
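
As a sketch of the arithmetic behind the umask: the mode a new file or
directory ends up with is the requested mode with the umask bits
cleared. The requested modes 777 and 666 below are the usual defaults
for directories and plain files on Unix-alikes; they are an assumption
made for this example, not something taken from R's install scripts,
and apply_umask() is a hypothetical helper.

```r
# Permission bits left after applying a umask: requested & ~umask.
apply_umask <- function(requested, umask) {
  req  <- strtoi(requested, base = 8L)
  mask <- strtoi(umask, base = 8L)
  format(as.octmode(bitwAnd(req, bitwNot(mask))))
}

apply_umask("777", "022")  # "755": others can read and search
apply_umask("666", "022")  # "644": others can read
apply_umask("666", "077")  # "600": only the owner can read

# R can report the umask of the current process.
Sys.umask(NA)
```

With a restrictive umask such as 077 in effect during "make install",
files like etc/ldpaths would come out unreadable for other users, which
would match the symptom reported.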

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] a new way to crash R? (PR#8981)

2006-06-26 Thread p . dalgaard

Duncan Murdoch <[EMAIL PROTECTED]> writes:

> On 6/26/2006 7:31 AM, [EMAIL PROTECTED] wrote:
> > This works for me on both Linux and Windows.
> > 
> > Please check your memory usage: it does need about 900Kb of VM, and as you 
> > have less RAM than that installed you need to set --max-mem-size=1G or 
> > some such.
> > 
> > (It is quite likely that this is a Windows memory allocation failure: that 
> > has been reported before but not tracked down.)
> 
> I can make it reproducible now.  It happens relatively quickly when I 
> set max-mem-size=100M.  In R-devel, there's a call to malloc at line 
> 1952 of memory.c, and as R is running out of memory, that returns a -1 
> instead of a zero.  This causes a seg fault a few lines later.
> 
> The malloc code is quite complicated, so I can't see exactly why we're 
> getting the -1.

Hmm, some specs (www.opengroup.org) for malloc have the following

RETURN VALUE

Upon successful completion with size not equal to 0, malloc()
shall return a pointer to the allocated space. If size is 0,
either a null pointer or a unique pointer that can be successfully
passed to free() shall be returned. Otherwise, it shall return a
null pointer and set errno to indicate the error.

so the only standards-conforming interpretation of a -1 return is that
the request was for 0 bytes, and "-1" is the representation of the
unique pointer which can be passed to free(). Then again, when was a
Windows lib ever standards-conforming?

BTW, malloc never returns zero, but possibly NULL. 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] Numerical error in R (win32) (PR#8909)

2006-05-30 Thread p . dalgaard

"ltp" <[EMAIL PROTECTED]> writes:

> Hi
> 
>   Thanks for the quick reply. However, I am not satisfied, as
> > round(3.1500, 1)
> [1] 3.1
> > round(3.7500, 1)
> [1] 3.8
> 
>   I think the problem is really more of an error in the rounding off
> algorithm than finite precision.

It isn't, and adding zeros doesn't change anything. The issue is that
3.15 has a nonterminating binary expansion, just like 1/7 has a
nonterminating decimal one. Please do read the references that have
already been provided to you.
 
3.15 (dec) == 11.0010011001100110011... (bin)
3.75 (dec) == 11.11 (bin)
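
These expansions can be checked from within R. What follows is only a
small sketch: frac_bits() is a hypothetical helper written for this
example, and it produces the binary digits of the fractional part by
repeated doubling, which is exact in floating point arithmetic.

```r
# Binary digits of the fractional part of x, by repeated doubling.
frac_bits <- function(x, n = 19) {
  f <- x - floor(x)
  bits <- integer(n)
  for (i in seq_len(n)) {
    f <- 2 * f
    bits[i] <- as.integer(f >= 1)
    f <- f - bits[i]
  }
  paste(bits, collapse = "")
}

frac_bits(3.15)       # "0010011001100110011", first 19 bits of the pattern
frac_bits(3.75, 4)    # "1100": 3.75 is exactly 11.11 in binary

# Print the stored double with more digits than usual, and repeat the
# comparison from the earlier reply.
sprintf("%.20f", 3.15)
(3.15 - 3.1) < 0.05   # TRUE
```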

> Thanks
> Teckpor
> 
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter
> Dalgaard
> Sent: Monday, May 29, 2006 17:49
> To: [EMAIL PROTECTED]
> Cc: r-devel@stat.math.ethz.ch; [EMAIL PROTECTED]
> Subject: Re: [Rd] Numerical error in R (win32) (PR#8909)
> 
> [EMAIL PROTECTED] writes:
> 
> > Hi
> > I had observed the following problem in R (also C, Matlab, and
> Python).
> > sprintf('%1.2g\n', 3.15)
> > give 3.1 instead of 3.2 whereas an input of 3.75 gives 3.8.
> > Java's System.out.printf is ok though.  
> >  
> > > round(3.75,1)
> > [1] 3.8
> > > round(3.15,1)
> > [1] 3.1
> >  
> > Similar outcome with sprintf in R.
> > 
> > 
> > However, the right answer should be 3.2
> 
> According to what? Remember that we're dealing with finite precision binary
> arithmetic here:
> 
> >  (3.15 - 3.1)<.05
> [1] TRUE
> >  abs(3.15 - 3.2)>.05
> [1] TRUE
> 
> See also FAQ 7.31.
>   
> > Regards
> > Teckpor
> > 
> 
> -- 
>O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
> ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907
> 
> 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] pbinom( ) function (PR#8700)

2006-03-22 Thread p . dalgaard

Duncan Murdoch <[EMAIL PROTECTED]> writes:

> On 3/22/2006 3:52 AM, [EMAIL PROTECTED] wrote:
> >> "cspark" == cspark  <[EMAIL PROTECTED]>
> >> on Wed, 22 Mar 2006 05:52:13 +0100 (CET) writes:
> > 
> > cspark> Full_Name: Chanseok Park Version: R 2.2.1 OS: RedHat
> > cspark> EL4 Submission from: (NULL) (130.127.112.89)
> > 
> > 
> > 
> > cspark> pbinom(any negative value, size, prob) should be
> > cspark> zero.  But I got the following results.  I mean, if
> > cspark> a negative value is close to zero, then pbinom()
> > cspark> calculate pbinom(0, size, prob). 
> > 
> > >> pbinom( -2.220446e-22, 3,.1)
> > [1] 0.729
> > >> pbinom( -2.220446e-8, 3,.1)
> > [1] 0.729
> > >> pbinom( -2.220446e-7, 3,.1)
> > [1] 0
> > 
> > Yes, all the [dp]* functions which are discrete with mass on the
> > integers only, do *round* their 'x' to integers.
> > 
> > I could well argue that the current behavior is *not* a bug,
> > since we do treat "x close to integer" as integer, and hence 
> >pbinom(eps, size, prob)  with  eps "very close to 0" should give
> >pbinom(0,   size, prob)
> > as it now does.
> > 
> > However, for esthetical reasons, 
> > I agree that we should test for "< 0" first (and give 0 then) and only
> > round otherwise.  I'll change this for R-devel (i.e. R 2.3.0 in
> > about a month).
> > 
> > cspark> dbinom() also behaves similarly.
> > 
> > yes, similarly, but differently.
> > I have changed it (for R-devel) as well, to behave the same as
> > others d*() , e.g., dpois(), dnbinom() do.
> 
> Martin, your description makes it sound as though dbinom(0.3, size, 
> prob) would give the same answer as dbinom(0, size, prob), whereas it 
> actually gives 0 with a warning, as documented in ?dbinom.  The d* 
> functions only round near-integers to integers, where it looks as though 
> near means within 1E-7.  The p* functions round near integers to 
> integers, and truncate others to the integer below.

Well, the p-functions are constant on the intervals between
integers... (Or, did you refer to the lack of a warning? One point
could be that cumulative distribution functions extend naturally to
non-integers, whereas densities don't really extend, since they are
defined with respect to counting measure on the integers.)
 
> I suppose the reason for this behaviour is to protect against rounding 
> error giving nonsense results; I'm not sure that's a great idea, but if 
> we do it, should we really be handling 0 differently?

Most of these round-near-integer issues were spurred by real
programming problems. It is somewhat hard to come up with a problem
that leads you to generate a binomial variate value with "floating
point noise", but I'm quite sure that we'll be reminded if we try to change
it... (One potential issue is back-calculation to counts from relative
frequencies).
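
Two small illustrations of the above, as a sketch. The first shows
that the binomial CDF is a step function, so a non-integer argument
gives the value at the integer below. The second shows the kind of
floating point noise that back-calculating a count from a relative
frequency can produce; the numbers are chosen purely for illustration.

```r
# The CDF is constant between integers.
pbinom(0,   size = 3, prob = 0.1)   # 0.729
pbinom(0.3, size = 3, prob = 0.1)   # same value
pbinom(0.9, size = 3, prob = 0.1)   # same value

# Back-calculating a count from a relative frequency.
n    <- 10
freq <- 0.1 + 0.2        # a relative frequency carrying rounding noise
k    <- freq * n         # meant to be 3
k == 3                   # FALSE
abs(k - 3) < 1e-7        # TRUE: a "near-integer" in the sense used above
dbinom(k, n, 0.5) == dbinom(3, n, 0.5)   # TRUE
```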


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] sub returns garbage (PR#8687)

2006-03-17 Thread p . dalgaard

Peter Dalgaard <[EMAIL PROTECTED]> writes:

> [EMAIL PROTECTED] writes:
> 
> > Full_Name: Todd Bailey
> > Version: 2.1
> 
> Er, 2.2.1
> 
> > OS: Mac OS-X 10.4.3
> > Submission from: (NULL) (87.112.79.124)
> > 
> > 
> > sub returns garbage in some strings when replacing something with nothing 
> > and
> > fixed=TRUE.  For example:
> > 
> > > a=c('hello','hello'); sub('lo','',a,fixed=TRUE)
> > [1] "hel" "hel\0\0"
> > > a=c('hello','hello'); sub('lo','',a,fixed=FALSE)
> > [1] "hel" "hel"
> 
> Confirmed on Linux & Windows. We've seen this symptom a few times
> before 

...and fixed it, apparently! (Thanks Marc.) 

I claim XP braindamage -- I'm not Linuxifying my new ThinkPad until
FC5 is out.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] pbinom with size argument 0 (PR#8560)

2006-02-04 Thread p . dalgaard

P Ehlers <[EMAIL PROTECTED]> writes:

> I prefer a (consistent) NaN. What happens to our notion of a
> Binomial RV as a sequence of Bernoulli RVs if we permit n=0?
> I have never seen (nor contemplated, I confess) the definition
> of a Bernoulli RV as anything other than some dichotomous-outcome
> one-trial random experiment. 

What's the problem ??

An n=0 binomial is the sum of an empty set of Bernoulli RV's, and the
sum over an empty set is identically 0.

> Not n trials, where n might equal zero,
> but _one_ trial. I can't see what would be gained by permitting a
> zero-trial experiment. If we assign probability 1 to each outcome,
> we have a problem with the sum of the probabilities.

Consistency is what you gain. E.g. 

 binom(.,n=n1+n2,p) == binom(.,n=n1,p) * binom(.,n=n2,p)

where * denotes convolution. This will also hold for n1=0 or n2=0 if
the binomial in that case is defined as a one-point distribution at
zero. Same thing as any(logical(0)) etc., really.
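
The convolution identity can be checked numerically. This sketch
assumes a version of R in which dbinom() accepts size = 0 and treats
it as a point mass at zero, which is the behaviour argued for here;
conv() is a hypothetical helper written for this example.

```r
# Convolution of two probability vectors, both indexed from 0.
conv <- function(a, b) {
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a)) {
    idx <- i + seq_along(b) - 1
    out[idx] <- out[idx] + a[i] * b
  }
  out
}

p  <- 0.3
n1 <- 0
n2 <- 4

lhs <- dbinom(0:(n1 + n2), n1 + n2, p)
rhs <- conv(dbinom(0:n1, n1, p), dbinom(0:n2, n2, p))
all.equal(lhs, rhs)      # TRUE

# The same empty-set conventions elsewhere in R:
sum(numeric(0))          # 0
any(logical(0))          # FALSE
all(logical(0))          # TRUE
```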

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] Multiplication (PR#8466)

2006-01-06 Thread p . dalgaard

Thomas Lumley <[EMAIL PROTECTED]> writes:

> On Fri, 6 Jan 2006, [EMAIL PROTECTED] wrote:
> 
> > hi - in version 2.1 the command
> >
> > >-2^2
> >
> > gives
> >
> > -4
> >
> > as the answer.  (-2)^2 is evaluated correctly.
> 
> So is -2^2.  The precedence of ^ is higher than that of unary minus. It 
> may be surprising, but it *is* documented and has been in S for a long 
> time.

Pretty much standard too, for languages that have an exponentiation
operator. AFAICS Fortran, Perl, SAS all have ** at higher precedence
than unary minus (or equal, but evaluate right to left). Stata seems
like it might be the exception.
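
For the record, the precedence can be seen directly in the parse tree.
A short sketch using only base R:

```r
-2^2                 # -4: parsed as -(2^2)
(-2)^2               #  4

# The parse tree has unary minus at the root, applied to 2^2.
e <- quote(-2^2)
e[[1]]               # `-`
e[[2]]               # 2^2

# ^ also groups from right to left.
2^3^2                # 512, i.e. 2^(3^2)
```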

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


[Rd] update.formula gotcha (PR#8462)

2006-01-04 Thread p . dalgaard


(Reported by Søren Højsgaard)

Looks like update.formula is stripping off parentheses in cases where
it shouldn't:

> update.formula (Reaction ~ Days + (Days | Subject), . ~ . + I(Days^2))
Reaction ~ Days + Days | Subject + I(Days^2)

Notice that the right hand side is interpreted with the bar at the root
of the parse tree, as in
(Days + Days) | (Subject + I(Days^2)):

> f <- update.formula (Reaction ~ Days + (Days | Subject), . ~ . + I(Days^2))
> f[[3]]
Days + Days | Subject + I(Days^2)
> f[[3]][[1]]
`|`
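
The difference the parentheses make can be seen without calling
update.formula at all. This sketch simply types in the two formulas and
compares their parse trees; the names f_bad and f_good are made up for
the example.

```r
# Without parentheses, `|` binds more loosely than `+` ...
f_bad <- Reaction ~ Days + Days | Subject + I(Days^2)
f_bad[[3]][[1]]               # `|`

# ... whereas the intended formula has `+` at the root and keeps
# the random-effects term as a parenthesised sub-expression.
f_good <- Reaction ~ Days + (Days | Subject) + I(Days^2)
f_good[[3]][[1]]              # `+`
f_good[[3]][[2]][[3]]         # (Days | Subject)
f_good[[3]][[2]][[3]][[1]]    # `(`
```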

This confuses lmer() rather badly:

library(lme4)
example(lmer)
update(fm1,formula = . ~ . + I(Days^2))

   -->

Error in x[[2]] : object is not subsettable
Error in model.matrix(eval(substitute(~T, list(T = x[[2]]))), frm) :
unable to find the argument 'object' in selecting a method for function 
'model.matrix'


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] shared-mime-info (PR#8278)

2005-11-04 Thread p . dalgaard

Vaidotas Zemlys <[EMAIL PROTECTED]> writes:

> Hi,
> 
> On 04 Nov 2005 13:51:56 +0100, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
> 
> > One further thought about this:
> >
> > On SUSE,
> >
> > rpm -qif /usr/share/mime/
> >
> > points at
> >
> > http://www.freedesktop.org/wiki/Software_2fshared_2dmime_2dinfo
> >
> > So I guess that the proper tree to bark at is the upstream
> > maintainers of
> >
> > http://freedesktop.org/~jrb/shared-mime-info-*.tar.gz
> >
> > Instructions there say to submit new XML files to
> >
> > https://bugs.freedesktop.org/buglist.cgi?product=shared-mime-info&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED
> >
> > It would likely be a good idea to send them first to R-devel for review.
> >
> 
> I already barked at upstream. The upstream barked back. The result is here:
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=1782

Aha... This is pretty weird, in light of the prescription on the website:

<<
Shared MIME database package

The core database and the update-mime-database program for extending
it are available from the [WWW]software pages.

If you have added types that should go in the common freedesktop.org
base list of types, you should create an enhancement request on
[WWW]the MIME bugtracker with your new XML file.
>>

If the procedure is different, perhaps we should ask them what it is?
I don't think we have a real problem with maintaining a "freedesktop"
subdir somewhere in the sources since it appears to cover quite a wide
range of systems, but we don't seem to know what to do with it.

The procedure appears to be different between Linuxen: On SUSE, I get

viggo:~/>rpm -qf /usr/share/mime/text/x-texinfo.xml
shared-mime-info-0.15.cvs20050321-3

whereas FC4 has

[EMAIL PROTECTED] ~]$ rpm -qf /usr/share/mime/text/x-texinfo.xml
file /usr/share/mime/text/x-texinfo.xml is not owned by any package

(and likewise 60-odd other .xml files). So it seems that SUSE collects
all this stuff in a single RPM and FC4 lets it be handled by the
post-install mechanism (on each package or by "exploding"
freedesktop.org.xml ??)
 
> There you can find xml file for R scripts. I've made it from some
> example. It is really only a proof of a concept. But it would not be
> very difficult to produce xml files for mimetypes of all R related
> files. We must only decide which R related files would benefit from
> having mimetypes.
> 
> My proposal is
> 1. R source code, R scripts. Files with extensions .R, .r and others
> (.q?, .s?, .S?). Mimetypes text/x-R, text/x-Rsrc

My inclination would be to stick with .R, possibly adding .r to guard
against Windows case-folding issues, but .r used to be Ratfor files.
.q/.s/.S are used by some people supporting both R and S-PLUS, but I
don't think they care how such files are displayed by Nautilus and
Konqueror... 

> 2. R documentation files. File extension .Rd. Mimetype text/x-Rd

OK, modulo case-folding.

> 3. RData files. File extension .RData, files which at beginning have
> RDX2. Mimetype application/x-RData.

Why the RDX2 bit?? We do have .RDA from Windows, too.


> 4. Rhistory files. File extension .Rhistory. Mimetype text/x-Rhistory

OK.

> 5. R transcript files from ESS/Emacs. File extension .Rt. Mimetype
> text/x-Rtranscript

.Rout, please. Also .Rout.save and .Rout.fail. (And it's not just
ESS that creates them).

Also

6. Rprofile files: .Rprofile or Rprofile.

> The relevant xml code could be pushed upstream to end up in
> freedesktop.org.xml, or it could be distributed with R linux package,
> and installed into relevant subdirectory of /usr/share/mime. With a
> bit more work the result could be, that people using for example
> Nautilus (graphical Gnome browser) could see R related files displayed
> with R logo, and clicking them could result in various appropriate
> actions. For example, for .RData an R process could be invoked and
> the relevant .RData file could be loaded.

Some fun potential with gedit/Kate plugins too (ESS for the 21st
century anyone?)

> I could write and test the xml code. But first we have to agree on
> which files benefit from having mimetypes and how the mimetypes should
> be named. Feel free to suggest.
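
For concreteness, here is a minimal sketch of what such a file might
look like, assuming the usual shared-mime-info package layout. The type
names follow the proposal above and are not agreed on; the comment
strings are made up for illustration. Only glob patterns are shown, no
magic matching, since whether the RDX2 test is wanted is still an open
question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: type names are the ones proposed in this thread, not settled. -->
<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">

  <!-- R source code and scripts -->
  <mime-type type="text/x-Rsrc">
    <comment>R source code</comment>
    <glob pattern="*.R"/>
  </mime-type>

  <!-- R documentation files -->
  <mime-type type="text/x-Rd">
    <comment>R documentation file</comment>
    <glob pattern="*.Rd"/>
  </mime-type>

  <!-- Saved workspaces -->
  <mime-type type="application/x-RData">
    <comment>R saved workspace</comment>
    <glob pattern="*.RData"/>
  </mime-type>

  <!-- Transcripts, using the .Rout family of extensions -->
  <mime-type type="text/x-Rtranscript">
    <comment>R transcript</comment>
    <glob pattern="*.Rout"/>
    <glob pattern="*.Rout.save"/>
    <glob pattern="*.Rout.fail"/>
  </mime-type>

</mime-info>
```

A file like this would presumably go in the packages subdirectory of
/usr/share/mime, followed by a run of update-mime-database on that
directory.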


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [Rd] Assigning a zero length vector to a list (PR#8157)

2005-09-26 Thread p . dalgaard

Duncan Murdoch <[EMAIL PROTECTED]> writes:

> After foo<-list(), foo$bar is NULL, so we can simplify this.
> 
> Here's a simpler version:
> 
> # These work, which is a bit of a surprise, but there is some 
> inconsistency:   one x becomes a list, the other is numeric:
>  > x <- NULL
>  > x[[1]] <- 1:10
>  > x
> [[1]]
>   [1]  1  2  3  4  5  6  7  8  9 10
> 
>  > x <- NULL
>  > x[[1]] <- 1
>  > x
> [1] 1
> 
> 
> # This generates the same bug as the above:
>  > x <- NULL
>  > x[[1]] <- numeric(0)
>  > x
> [1] 4.250083e-314
> 
> It looks like we're trying to be too clever with handling assignments to 
> components of NULL.  Wouldn't it make more sense for those to generate 
> an error?

Once upon a time, we had pairlists, and NULL was the empty list. This
looks like it might be a relic. If so, it likely also predates
consistent handling of zero-length vectors, so something is getting
confused. I think it would be reasonable to expect similar results to
this: 

> x<-list()
> x[[1]] <- numeric(0)
> x
[[1]]
numeric(0)

 

S-PLUS also tries to handle NULL as a zero length list, with some
anomalies:

> x <- NULL
> x[[1]] <- numeric(0)
> x
$value:
numeric(0)


> x <- list()
> x[[1]] <- numeric(0)
> x
[[1]]:
numeric(0)


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

