Re: [R] How to do ANOVA with fractional values and overcome the error: Error in `storage.mode<-`(`*tmp*`, value = "double") : invalid to change the storage mode of a factor

2007-09-06 Thread Prof Brian Ripley
Your data file has commas as the decimal point. Use read.csv2 for such 
files.

What happened was that PercentError was read as a factor, and you can't do 
ANOVA on factors.  The warning

> In addition: Warning message:
> using type="numeric" with a factor response will be ignored in:
> model.response(mf, "numeric")

told you and us this quite explicitly.  If you get an error, also look at 
the warnings which may well (as here) tell you what precipitated the 
error.
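
For example (a minimal sketch, reusing the file from the quoted call below 
with the path shortened; read.csv2() already defaults to sep = ";" and 
dec = ","):

   dat <- read.csv2("Anova3LongAuditoryFemaleError.csv")
   str(dat$PercentError)   # numeric now, so aov() will accept it as a response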

On Thu, 6 Sep 2007, Emre Sevinc wrote:

> I have exported a CSV file from my EXCEL worksheet and its last column 
> contained decimal values:
>
> Subject;Group;Side;Difference;PercentError
> M3;1;1;;
> M5;1;1;375;18,75
> M8;1;1;250;14,58
> M10;1;1;500;12,50
> M12;1;1;375;25,00
> .
> .
> .
>
>
> When I tried to do an ANOVA test on it, R complained by giving an error:
>
>> Anova3LongAuditoryFemaleError.data <- read.csv("C:\\Documents\ and\ 
>> Settings\\Administrator\\My 
>> Documents\\CogSci\\tez\\Anova3LongAuditoryFemaleError.csv", header = TRUE, 
>> sep = ";")
>
>> Anova3LongAuditoryFemaleError.aov = aov(PercentError ~ (Group * Side), data 
>> = Anova3LongAuditoryFemaleError.data)
>
> Error in `storage.mode<-`(`*tmp*`, value = "double") :
>invalid to change the storage mode of a factor
> In addition: Warning message:
> using type="numeric" with a factor response will be ignored in:
> model.response(mf, "numeric")
>
> What must I do in order to run the ANOVA test on these fractional data?
>
> Regards.
>
> --
> Emre Sevinc
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rggobi compilation error: display.c

2007-09-05 Thread Prof Brian Ripley

On Wed, 5 Sep 2007, Yuelin Li wrote:


On a ubuntu linux computer (Feisty, i386), I compile R and additional
packages from source.  The compiler is gcc 4.1.2.

The problem is, I can run "sudo R" and successfully compile all
packages (e.g., MASS, lattice) except rggobi.  The error seems to be
in display.c.  My ggobi is in /usr/local/, which R can find.  I don't
think this is a dependency issue because I used install.packages(...,
dependencies=TRUE).  The script complains about not finding
"/usr/local/lib/R/library/rggobi/libs/*", but that directory is there,
with one file "rggobi.so" (possibly from an earlier successful
compilation).


I don't think the file was there at the time, before the previous 
installation was restored.



Any suggestions on what I am doing wrong?  Many thanks in advance.


Here is the error:


display.c:37: error: too many arguments to function 'klass->createWithVars'


That is a symptom of installing rggobi_2.1.6 against ggobi 2.1.4: 
unfortunately rggobi's configure did not check the ggobi version.
It looks like you have an earlier rggobi installed, probably the one 
appropriate to your ggobi version.
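
A quick way to check which ggobi rggobi's configure will pick up (a sketch; 
it assumes pkg-config can see your ggobi, as the log below suggests):

   system("pkg-config --modversion ggobi")   # compare with the rggobi version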



Yuelin.


- R output ---

install.packages("rggobi", repos = "http://lib.stat.cmu.edu/R/CRAN";, 
dependencies = TRUE, clean = TRUE)

trying URL 'http://lib.stat.cmu.edu/R/CRAN/src/contrib/rggobi_2.1.6.tar.gz'
Content type 'application/x-gzip' length 424483 bytes
opened URL
==
downloaded 414Kb

* Installing *source* package 'rggobi' ...
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for GGOBI... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/lib/R/include -g 
-DUSE_EXT_PTR=1 -D_R_=1 -I/usr/local/include/ggobi -I/usr/include/gtk-2.0 
-I/usr/include/libxml2 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 
-I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 
-I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12   
-I/usr/local/include-fpic  -g -O2 -c brush.c -o brush.o

[... snipped ...]

gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/lib/R/include -g 
-DUSE_EXT_PTR=1 -D_R_=1 -I/usr/local/include/ggobi -I/usr/include/gtk-2.0 
-I/usr/include/libxml2 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 
-I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 
-I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12   
-I/usr/local/include-fpic  -g -O2 -c dataset.c -o dataset.o
gcc -std=gnu99 -I/usr/local/lib/R/include -I/usr/local/lib/R/include -g 
-DUSE_EXT_PTR=1 -D_R_=1 -I/usr/local/include/ggobi -I/usr/include/gtk-2.0 
-I/usr/include/libxml2 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 
-I/usr/include/cairo -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 
-I/usr/lib/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/libpng12   
-I/usr/local/include-fpic  -g -O2 -c display.c -o display.o
display.c: In function 'RS_GGOBI_createDisplay':
display.c:37: warning: passing argument 3 of 'klass->createWithVars' makes 
pointer from integer without a cast
display.c:37: warning: passing argument 4 of 'klass->createWithVars' from 
incompatible pointer type
display.c:37: warning: passing argument 5 of 'klass->createWithVars' from 
incompatible pointer type
display.c:37: error: too many arguments to function 'klass->createWithVars'
display.c:39: warning: passing argument 4 of 'klass->create' from 
incompatible pointer type
display.c:39: error: too many arguments to function 'klass->create'
make: *** [display.o] Error 1
chmod: cannot access `/usr/local/lib/R/library/rggobi/libs/*': No such file or 
directory
ERROR: compilation failed for package 'rggobi'
** Removing '/usr/local/lib/R/library/rggobi'
** Restoring previous '/usr/local/lib/R/library/rggobi'

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Choosing the optimum lag order of ARIMA model

2007-09-05 Thread Prof Brian Ripley
On Wed, 5 Sep 2007, Megh Dal wrote:

> Hi Leeds, Thanx for this reply. Actually I did not want to know whether 
> any differencing is needed or not. My question was: what is the 
> difference between these two models:
>
>  arima(data, c(2,1,2))
>
>  and
>
>  arima(diff(data), c(2,0,2))
>
> If I am correct then those two models are the same. Therefore I should get 
> the same results in both cases. Am I doing something wrong?

They are not the same.  Please do study the help page, and in particular 
the 'include.mean' argument.  One is a model for n observations and 
the other for n-1 observations, and how that affects the issue is 
discussed on the help page.  With the right options you will get similar 
but not identical results.

> arima(x, c(2,1,2), method="ML")

Coefficients:
         ar1      ar2      ma1     ma2
      0.0786  -0.3561  -0.0869  0.1272
s.e.  0.6135   0.4296   0.6564  0.4549

sigma^2 estimated as 0.01368:  log likelihood = 46.46,  aic = -82.92

> arima(diff(x), c(2,0,2), method="ML", include.mean=FALSE)

Coefficients:
         ar1      ar2      ma1     ma2
      0.0786  -0.3561  -0.0869  0.1272
s.e.  0.6135   0.4296   0.6564  0.4549

sigma^2 estimated as 0.01329:  log likelihood = 47.38,  aic = -82.76


And did you have permission to copy private (and impolite) messages from 
Mr Leeds to this list?  If you did, please say so in your own posting for 
the record.  Since I don't have such permission I have deleted them from 
this reply.

Professor Ripley

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] capture.out(system())?

2007-09-05 Thread Prof Brian Ripley
On Wed, 5 Sep 2007, Gustaf Rydevik wrote:

> On 9/4/07, Werner Wernersen <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> I am trying to capture the console output of a program I
>> call via system() but that always returns only
>> character(0).
>>
>> For example:
>> capture.output(system("pdflatex out.tex") )
>>
>> will yield:
>> character(0)
>>
>> and the output still written to the R console.
>>
>> Is there a command for intercepting this output?
>>
>> Thank you!
>>   Werner
>>
>
> ?sink()

That is used by capture.output() to capture R output, but this question is 
about output that never goes near R.

The answer is in ?system, but might depend on the unstated OS.  Arguments 
'intern' and 'show.output.on.console' are relevant.
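
Something along these lines should work (a sketch; 'intern' is portable, 
'show.output.on.console' applies only on Windows):

   out <- system("pdflatex out.tex", intern = TRUE)  # stdout as a character vector
   ## on Windows, show.output.on.console = FALSE stops the echoing to the console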

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bootstrap confidence intervals with previously existing bootstrap sample

2007-09-04 Thread Prof Brian Ripley
On Tue, 4 Sep 2007, [EMAIL PROTECTED] wrote:

> Dear R users,
>
> I am new to R. I would like to calculate bootstrap confidence intervals
> using the BCa method for a parameter of interest. My situation is this: I
> already have a set of 1000 bootstrap replicates created from my original
> data set. I have already calculated the statistic of interest for each
> bootstrap replicate, and have also calculated the mean for this statistic
> across all the replicates. Now I would like to calculate Bca confidence
> intervals for this statistic. Is there a way to import my
> previously-calculated set of 1000 statistics into R, and then calculate
> bootstrap confidence intervals around the mean from this imported data?
>
> I have found the code for boot.ci in the manual for the boot package, but
> it looks like it requires that I first use the "boot" function, and then
> apply the output to "boot.ci". Because my bootstrap samples already exist,
> I don't want to use "boot", but just want to import the 1000 values I have
> already calculated, and then get R to calculate the mean and Bca confidence
> intervals based on these values. Is this possible?

Yes, it is possible but you will have to study the internal structure of 
an object of class "boot" (which is documented on the help page) and mimic 
it.  You haven't told us which type of bootstrap you used, which is one of 
the details you need to supply.

It might be slightly easier to work with function bcanon in package 
bootstrap, which you would need to edit to suit your purposes.

I don't know why you have picked on the BCa method: my experience is that 
if you need to correct the basic method you often need far more than 1000 
samples to get reliable results.

> Hopefully this makes sense. Thanks so much for any help or advice,
>
> Christy Dolph
>
> Graduate Student
> Water Resources Science
> University of Minnesota-Twin Cities

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Efficient sampling from a discrete distribution in R

2007-09-03 Thread Prof Brian Ripley
On Mon, 3 Sep 2007, Issac Trotts wrote:

> Hello r-help,
>
> As far as I've seen, there is no function in R dedicated to sampling
> from a discrete distribution with a specified mass function.  The
> standard library doesn't come with anything called rdiscrete or rpmf,
> and I can't find any such thing on the cheat sheet or in the
> Probability Distributions chapter of _An Introduction to R_.  Googling
> also didn't bring back anything.  So, here's my first attempt at a
> solution.  I'm hoping someone here knows of a more efficient way.

It's called sample().

There are much more efficient algorithms than the one you used, and 
sample() sometimes uses one of them (Walker's alias method): see any good 
book on simulation (including my 'Stochastic Simulation', 1987).
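
For example (a sketch of the direct equivalent of the function below):

   rdiscrete2 <- function(size, pmf)
       sample(seq_along(pmf), size, replace = TRUE, prob = pmf)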

> # Sample from a discrete distribution with given probability mass function
> rdiscrete = function(size, pmf) {
>  stopifnot(length(pmf) > 1)
>  cmf = cumsum(pmf)
>  icmf = function(p) {
>min(which(p < cmf))
>  }
>  ps = runif(size)
>  sapply(ps, icmf)
> }
>
> test.rdiscrete = function(N = 1) {
>  err.tol = 6.0 / sqrt(N)
>  xs = rdiscrete(N, c(0.5, 0.5))
>  err = abs(sum(xs == 1) / N - 0.5)
>  stopifnot(err < err.tol)
>  list(e = err, xs = xs)
> }
>
> Thanks,
> Issac
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Different behavior of mtext

2007-09-03 Thread Prof Brian Ripley

On Mon, 3 Sep 2007, Sébastien wrote:

Ok, the problem is clear now. I did not get that 'user coordinates' was 
referring to par("usr") when I read the help of mtext. If I may ask you some 
additional questions:
- you mentioned a missing unit() call; at which point should it be done in 
my code examples?


Before it is used.  The problem is that I believe more than one package 
has a unit() function.


- could you give me some advice or helpful links on how to set up a grid 
viewport? - and finally, probably a stupid question: is a grid viewport 
automatically set up when a plotting function is called?


If you want to mix grid and base graphics, you need package gridBase, but 
really I would not advise a beginner to be using grid directly (that is, 
not via lattice or ggplot*).
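
In base graphics alone, one way to get the effect you describe in the PS below 
(a sketch only, reusing 'mylegend' and the plot from the quoted code) is to 
work in user coordinates, which is what par("usr") reports:

   usr <- par("usr")                         # current user coordinates
   w   <- max(strwidth(mylegend))            # widest legend line, in user units
   at  <- usr[1] + (usr[2] - usr[1] - w)/2   # left edge that centres the block
   for (i in 1:4)
       mtext(mylegend[i], side = 1, line = 3 + i, at = at, adj = 0)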




Sebastien

PS: To answer your final question, my goal is to center a block of legend 
text on the device but to align the text to the left of this block.


Prof Brian Ripley a écrit :

On Sun, 2 Sep 2007, Sébastien wrote:


Dear R Users,

I am quite surprised to see that mtext gives different results when it
is used with 'pairs' and with 'plot'. In the two following codes, it 
seems that the 'at' argument in mtext doesn't consider the same unit 
system.


It is stated to be in 'user coordinates'.  Your code does not work because 
unit() is missing.  If you mean the one from package grid, "npc" is not 
user coordinates (and refers to a grid viewport which you have not set up 
and coincidentally is the same as the initial user coordinate system to 
which pairs() has reverted).


Try par("usr") after your pairs() and plot() calls to see the difference.
Plotting a 2x2 array of plots _is_ different from plotting one, so this 
should be as expected.


Since centring is the default for 'adj', it is unclear what you are trying 
to achieve here.



I would appreciate your comments on this issue.

Sebastien

# Pairs

mydata<-data.frame(x=1:10,y=1:10)

par(cex.main=1, cex.axis=1, cex.lab=1, lwd=1,
   mar=c(5 + 5,4,4,2)+0.1)

pairs(mydata,oma=c(5 + 5,4,4,2))

mylegend<-c("mylegend A","mylegend B","mylegend C","mylegend test")
mylegend.width = strwidth(mylegend[which.max(nchar(mylegend))], "figure")

for (i in 1:4) {
mtext(text=mylegend[i],
   side = 1,
   line = 3+i,
   at = unit((1-mylegend.width)/2,"npc"),# centers the
legend at the bottom
   adj=0,
   padj=0)}

# plot

mydata<-data.frame(x=1:10,y=1:10)

par(cex.main=1, cex.axis=1, cex.lab=1, lwd=1,
   mar=c(5 + 5,4,4,2)+0.1)

plot(mydata,oma=c(5 + 5,4,4,2))

mylegend<-c("mylegend A","mylegend B","mylegend C","mylegend test")
mylegend.width = strwidth(mylegend[which.max(nchar(mylegend))], "figure")

for (i in 1:4) {
mtext(text=mylegend[i],
   side = 1,
   line = 3+i,
   at = unit((1-mylegend.width)/2,"npc"),# should
center the legend at the bottom but doesn't do it !
   adj=0,
   padj=0)}






--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Different behavior of mtext

2007-09-03 Thread Prof Brian Ripley

On Sun, 2 Sep 2007, Sébastien wrote:


Dear R Users,

I am quite surprised to see that mtext gives different results when it
is used with 'pairs' and with 'plot'. In the two following codes, it
seems that the 'at' argument in mtext doesn't consider the same unit system.


It is stated to be in 'user coordinates'.  Your code does not work because 
unit() is missing.  If you mean the one from package grid, "npc" is not 
user coordinates (and refers to a grid viewport which you have not set up 
and coincidentally is the same as the initial user coordinate system to 
which pairs() has reverted).


Try par("usr") after your pairs() and plot() calls to see the difference.
Plotting a 2x2 array of plots _is_ different from plotting one, so this 
should be as expected.


Since centring is the default for 'adj', it is unclear what you are trying 
to achieve here.
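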



I would appreciate your comments on this issue.

Sebastien

# Pairs

mydata<-data.frame(x=1:10,y=1:10)

par(cex.main=1, cex.axis=1, cex.lab=1, lwd=1,
   mar=c(5 + 5,4,4,2)+0.1)

pairs(mydata,oma=c(5 + 5,4,4,2))

mylegend<-c("mylegend A","mylegend B","mylegend C","mylegend test")
mylegend.width = strwidth(mylegend[which.max(nchar(mylegend))], "figure")

for (i in 1:4) {
mtext(text=mylegend[i],
   side = 1,
   line = 3+i,
   at = unit((1-mylegend.width)/2,"npc"),# centers the
legend at the bottom
   adj=0,
   padj=0)}

# plot

mydata<-data.frame(x=1:10,y=1:10)

par(cex.main=1, cex.axis=1, cex.lab=1, lwd=1,
   mar=c(5 + 5,4,4,2)+0.1)

plot(mydata,oma=c(5 + 5,4,4,2))

mylegend<-c("mylegend A","mylegend B","mylegend C","mylegend test")
mylegend.width = strwidth(mylegend[which.max(nchar(mylegend))], "figure")

for (i in 1:4) {
mtext(text=mylegend[i],
   side = 1,
   line = 3+i,
   at = unit((1-mylegend.width)/2,"npc"),# should
center the legend at the bottom but doesn't do it !
   adj=0,
   padj=0)}


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sin(pi)?

2007-09-03 Thread Prof Brian Ripley
On Mon, 3 Sep 2007, Nguyen Dinh Nguyen wrote:

> Dear all,
> I found something strange when calculating sin of pi value

What exactly?  Comments below on two guesses as to what.

> sin(pi)
> [1] 1.224606e-16

That is non-zero due to using finite-precision arithmetic.  The number 
stored as pi is not exactly the mathematics quantity, and so 
sin(representation of pi) should be non-zero (although there is also 
rounding error in calculating what it is).

Note that sin() is computed by your C runtime, so the exact result will 
depend on your OS, compiler and possibly CPU.
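
A quick check that this is at the level of rounding error (a sketch; the exact 
value is platform-dependent, as noted above):

   abs(sin(pi)) <= 2 * .Machine$double.eps   # TRUE here
   .Machine$double.eps                       # about 2.22e-16 for IEEE-754 doubles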

> pi
> [1] 3.141593

That is the printout of pi to the default 7 significant digits.  R knows 
pi to higher accuracy:

> print(pi, digits=16)
[1] 3.141592653589793
> sin(3.141592653589793)
[1] 1.224606e-16

but note that printing to 16 digits and reading back in might not have 
given the same number, but happens to for pi at least on my system:

> 3.141592653589793 == pi
[1] TRUE


> sin(3.141593)
> [1] -3.464102e-07
>
> Any help and comment should be appreciated.
> Regards
> Nguyen
>
> 
> Nguyen Dinh Nguyen
> Garvan Institute of Medical Research
> Sydney, Australia
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About "=" in command line in windows.

2007-08-31 Thread Prof Brian Ripley
On Fri, 31 Aug 2007, Vladimir Eremeev wrote:

> It seems, I don't understand something, or there is a bug in R.

A limitation in command-line parsing which is Windows-specific.

Don't use -e for complex expressions, as the quoting is getting removed by 
your shell.  In Windows both the shell (and it matters which shell) and 
the (compiled C) executable parse the command line, and that leads to 
surprises as in your (1).  At that point the rule about NAME=VALUE on the 
command line meaning 'set environment variable NAME' comes into play.

We could try harder (and maybe one day we will), but this really is 
'quoting hell' and there is no hardship in using alternatives such as

% cat > test.R
mean(x=1:3)
^D
% rscript test.R

and

% echo "mean(x=1:3)" | rscript -


>
> I have made some experiments after my yesterday post about using "=" with -e
> switch to the Rscript.
>
> Now, I've found:
>
> (1)
> C:\users\wl\trainings\r>rscript --verbose -e "mean(x=1:3)"
> running
>  'C:\Program Files\R\bin\Rterm.exe --slave --no-restore -e mean(x=1:3)'
>
> Error in -args : invalid argument to unary operator
> Execution halted
>
> (2)
> C:\users\wl\trainings\r>Rterm --slave --no-restore -e "mean(x=1:3)"
>
> Nothing is printed on the console, but the window appears, saying "R for
> Windows terminal front-end has encountered a problem and needs to close. We
> are sorry for the inconvenience."
>
> (3)
> C:\users\wl\trainings\r>rscript --verbose -e "mean(1:3)"
> running
>  'C:\Program Files\R\bin\Rterm.exe --slave --no-restore -e mean(1:3)'
>
> [1] 2
>
> (4)
> C:\users\wl\trainings\r>Rterm.exe --slave --no-restore -e "mean(1:3)"
> [1] 2
>
> (5)
> C:\users\wl\trainings\r>Rterm.exe --slave --no-restore -e 'mean(1:3)'
> [1] "mean(1:3)"
>
> Points (1) and (2) don't seem normal to me, however, I don't see, what I am
> doing wrong.
> I use windowsXP Pro, my colleague uses windows 2000 and reports the same
> problems.
> My sessionInfo():
>
>> sessionInfo()
> R version 2.5.1 Patched (2007-08-19 r42614)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=Russian_Russia.1251;LC_CTYPE=Russian_Russia.1251;LC_MONETARY=Russian_Russia.1251;LC_NUMERIC=C;LC_TIME=Russian_Russia.1251
>
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"
> [7] "base"
>
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Incomplete Gamma function

2007-08-31 Thread Prof Brian Ripley
On Fri, 31 Aug 2007, Robin Hankin wrote:

> Hi Kris
>
>
> lgamma() gives the log of the gamma function.

Yes, but he used Igamma.  According to ?pgamma,

  'pgamma' is closely related to the incomplete gamma function.  As
  defined by Abramowitz and Stegun 6.5.1

  P(a,x) = 1/Gamma(a) integral_0^x t^(a-1) exp(-t) dt

  P(a, x) is 'pgamma(x, a)'.  Other authors (for example Karl
  Pearson in his 1922 tables) omit the normalizing factor, defining
  the incomplete gamma function as 'pgamma(x, a) * gamma(a)'.

and that seems to be what Igamma is following.  GSL on the other hand has 
the other tail, so

> a <- 9
> x <- 11.1
> pgamma(x, a, lower=FALSE)*gamma(a)
[1] 9000.501
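
For comparison (a sketch, reusing a and x from above): the lower-tail, 
Pearson-style version reproduces the value Igamma() returned below.

> pgamma(x, a) * gamma(a)                      # 31319.5, as Igamma(9, 11.1) gave
> pgamma(x, a, lower.tail = FALSE) * gamma(a)  # 9000.501, the GSL/Mathematica tail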

>
> You need gamma_inc() of the gsl package, a wrapper for the
> GSL library:
>
> > gamma_inc(9,11.1)
> [1] 9000.501
> >

As the above shows, you don't *need* this, but you do need the GSL 
documentation to find out what R package gsl does.  Why it differs from 
the usual references is something for you to explain.  Wikipedia
http://en.wikipedia.org/wiki/Incomplete_gamma_function
distinguishes them, as does MathWorld.

I suggest you add a clarification to the gsl package as to what the 
'incomplete gamma function' means there.


> On 31 Aug 2007, at 00:29, [EMAIL PROTECTED] wrote:
>
>> Hello
>>
>> I am trying to evaluate an Incomplete gamma function
>> in R. Library Zipfr gives the Igamma function. From
>> Mathematica, I have:
>>
>> "Gamma[a, z] is the incomplete gamma function."
>>
>> In[16]: Gamma[9,11.1]
>> Out[16]: 9000.5
>>
>> Trying the same in R, I get
>>
>>> Igamma(9,11.1)
>> [1] 31319.5
>> OR
>>> Igamma(11.1,9)
>> [1] 1300998
>>
>> I know I have to understand the theory and the math
>> behind it rather than just ask for help, but while I
>> am trying to do that (and only taking baby steps, I
>> must admit), I was hoping someone could help me out.
>>
>> Regard
>>
>> Kris.
>>
>>
>>
>>
>> __
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Robin Hankin
> Uncertainty Analyst
> National Oceanography Centre, Southampton
> European Way, Southampton SO14 3ZH, UK
>  tel  023-8059-7743
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and Windows Vista

2007-08-31 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, Jan Budczies wrote:

>
> Hello group,
>
> it is reported (R for Windows FAQ) that R runs under Windows Vista.
> However, does someone here have experience with R under Vista 64
> and large (>3 or 4 GB) memory?

Yes, the person who wrote the FAQ entry does.

Note that the distributed Windows binary of R is a 32-bit executable, so 
the maximum memory it can address is 4GB (and it can do that in Vista 64 
on a 4GB RAM machine, unlike any 32-bit version of Windows).

If you want to use R on Vista 64, I suggest you use a current R-devel 
snapshot, as some changes have been made based on this experience.


> Greeting - Jan Budczies
>
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Choosing the optimum lag order of ARIMA model

2007-08-31 Thread Prof Brian Ripley
On Fri, 31 Aug 2007, Megh Dal wrote:

> Dear all R users,
>
> I am really struggling to determine the most appropriate lag order of an 
> ARIMA model. My understanding is that, since for an MA[q] model the 
> autocorrelation coefficients vanish after lag q, this gives the MA order of 
> the ARIMA model, and since for an AR[p] model the partial autocorrelations 
> vanish after lag p, this helps to determine the AR order. The most 
> appropriate model chosen by this argument should then give the minimum AIC.

The last part is fallacious.  Also, you are applying your rules to 
selecting the orders in ARMA models, and they apply only to pure MA or AR 
models.

The R test file src/library/stats/tests/ts-tests.R has an example of model 
selection by AIC.
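
A rough sketch of that kind of search over a small illustrative grid of 
(p, q) for your differenced series (try() just skips fits that fail, as 
arima(data1, c(2,1,2)) does below):

   best <- Inf
   for (p in 0:2) for (q in 0:2) {
       fit <- try(arima(data1, order = c(p, 1, q)), silent = TRUE)
       if (!inherits(fit, "try-error") && fit$aic < best) {
           best <- fit$aic
           best.order <- c(p, 1, q)
       }
   }
   best.order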

>
>  Now I considered following data :
>
>  2.1948 2.2275 2.2669 2.2839 1.9481 2.1319 2.0238 2.3109 2.5727 2.5176
> 2.5728 2.6828 2.8221 2.879 2.8828 2.9955 2.9906 2.9861 3.0452 3.068
> 2.9569 3.0256 3.0977 2.985 2.9572 3.0877 3.1009 3.1149 2.8886 2.9631
> 3.0325 2.9175 2.7231 2.7905 2.8493 2.8208 2.8156 2.9115 2.701 2.6928
> 2.7881 2.723 2.7266 2.9494 3.113 3.0566 3.0358 3.05 3.0724 3.1365
> 3.1083 3.0257 3.2211 3.4269 3.327 3.1205 2.9997 3.0201 3.0803 3.2059
> 3.1997 3.038 3.1613 3.2802 3.2194
>
>  ACF for 1st diff series:
>  Autocorrelations of series 'diff(data1)', by lag
>      0      1      2      3      4      5      6      7      8      9     10
>  1.000 -0.022 -0.258 -0.016  0.066  0.034  0.035 -0.001 -0.089  0.028  0.222
>     11     12     13     14     15     16     17     18
> -0.132 -0.184 -0.038  0.048 -0.026 -0.041 -0.067  0.059
>
>PACF for 1st diff series:
>  Partial autocorrelations of series 'diff(data1)', by lag
>      1      2      3      4      5      6      7      8      9     10     11
> -0.022 -0.258 -0.031 -0.002  0.026  0.057  0.021 -0.069  0.029  0.194 -0.124
>     12     13     14     15     16     17     18
> -0.100 -0.111 -0.043 -0.078 -0.056 -0.085  0.086
>
>  On basis of that I choose ARIMA[2,1,2] for the original data
>
>  But I got error while doing that :
>
>  > arima(data1, c(2,1,2))
> Error in arima(data1, c(2, 1, 2)) : non-stationary AR part from CSS
>
>  And AIC for other combination of lags are:
>  > arima(data1, c(2,1,1))$aic
> [1] -84.83648
>> arima(data1, c(1,1,2))$aic
> [1] -84.35737
>> arima(data1, c(1,1,1))$aic
> [1] -83.79392
>
> Hence, on the basis of the AIC criterion, if I choose the ARIMA[2,1,1] model, 
> then the first rule that I stated earlier is not supported.
>
> Am I doing anything wrong? Can anyone give me a suggestion on what 
> the "universal" rule is for choosing the best lag?
>
>  Regards,
>
>
>
>
>
>
>
>
> -
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Q: how to interrupt long calculation?

2007-08-30 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, D. R. Evans wrote:

> Prof Brian Ripley said the following at 08/30/2007 11:00 AM :
>> On Thu, 30 Aug 2007, D. R. Evans wrote:
>>
>>> Paul Smith said the following at 08/29/2007 04:32 PM :
>>>
>>>> The instance of R running will be immediately killed and then you can
>>>> start R again.
>>> But then I would lose all the work. There must be some way to merely
>>> interrupt the current calculation. Mustn't there?
>>
>> Only if it is long-running in R code, when ctrl-C or equivalent (Esc in
>> Rgui) works. If it is long-running in C or Fortran code, there is not.
>>
>
> It's inside loess()... so isn't that R code?

No, it is mainly Fortran, called from C called from R.

> I can sit hitting ctrl-C all day (well, it seems like it), but the code
> does not get interrupted :-(
>
>> Assuming a Unix-alike, sending SIGUSR1 will save the current workspace and
>> quit.  Even that is a little dangerous as the workspace need not be in a
>> consistent state.
>>
>
> That's helpful, thank you; at least it means I stand a chance of being able
> to interrupt the code and recover.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Q: how to interrupt long calculation?

2007-08-30 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, D. R. Evans wrote:

> Paul Smith said the following at 08/29/2007 04:32 PM :
>
>> The instance of R running will be immediately killed and then you can
>> start R again.
>
> But then I would lose all the work. There must be some way to merely
> interrupt the current calculation. Mustn't there?

Only if it is long-running in R code, when ctrl-C or equivalent (Esc in 
Rgui) works. If it is long-running in C or Fortran code, there is not.

Assuming a Unix-alike, sending SIGUSR1 will save the current workspace and 
quit.  Even that is a little dangerous as the workspace need not be in a 
consistent state.

People who have been bitten will learn safer programming practices, for 
example to call save.image() at suitable checkpoint times.
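
For example (a sketch; the loop and 'mydata' are hypothetical):

   fits <- vector("list", length(mydata))
   for (i in seq_along(mydata)) {
       fits[[i]] <- loess(y ~ x, data = mydata[[i]])  # the long-running step
       if (i %% 10 == 0) save.image()                 # checkpoint every 10 fits
   }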

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Behaviour of very large numbers

2007-08-30 Thread Prof Brian Ripley
On Thu, 30 Aug 2007, willem vervoort wrote:

> Dear all,
> I am struggling to understand this.
>
> What happens when you raise a negative value to a power and the result
> is a very large number?

Where are the 'very large numbers' here?  R can cope with much larger 
numbers (over 10^300).

> B
> [1] 47.73092
>
>> -51^B
> [1] -3.190824e+81

Yes, that is -(51^B).

> # seems fine
> # now this:
>> x <- seq(-51,-49,length=100)
>
>> x^B
>  [1] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 
> 
>> is.numeric(x^B)
> [1] TRUE
>> is.real(x^B)
> [1] TRUE
>> is.infinite(x^B)
>  [1] FALSE FALSE FALSE FALSE FALSE
>
> I am lost, I checked the R mailing help, but could not find anything
> directly. I loaded package Brobdingnag and tried:
> as.brob(x^B)
>  [1] NAexp(NaN) NAexp(NaN) NAexp(NaN) NAexp(NaN) NAexp(NaN)
>> as.brob(x)^B
>
> I guess I must be misunderstanding something fundamental.

You are.  A negative number to a non-integer power is undefined in the 
real number system.

Look at (x+0i)^B.
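
For example (a sketch; B is the value printed above):

   B <- 47.73092
   (-51)^B       # NaN: negative base, non-integer power
   -51^B         # -(51^B), since unary minus is applied after ^
   (-51 + 0i)^B  # defined (and finite) in the complex plane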

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Another issue with the Matrix package *under R-devel*

2007-08-30 Thread Prof Brian Ripley
I suspect you are not using a re-installed Matrix after re-building R.
I can reproduce the problem using a version of Matrix I installed under 
2.5.1, but not with one installed under R-devel this week.

Since R-devel is 'Under development' you may need to reinstall packages 
when it changes.  This is particularly prevalent with S4-using packages, 
as the methods code tends to capture the value of objects as they existed 
when a package was installed.  In this case log() was changed quite a few 
weeks ago, but Matrix needs to be reinstalled after other changes last 
weekend.

For the record, the packages I know need to be reinstalled under a recent 
R-devel are Matrix, distr, kernlab, kinship and matlab (but there may be 
others).
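
For example (a sketch; the package list is just the one given above):

   install.packages(c("Matrix", "distr", "kernlab", "kinship", "matlab"))
   ## update.packages(checkBuilt = TRUE) also helps when moving up from a
   ## released R, but it does not detect packages built under an earlier
   ## R-devel snapshot carrying the same version number.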

And please use the appropriately named R-devel list for questions about 
R-devel.


On Thu, 30 Aug 2007, Tony Chiang wrote:

> Hi all,
>
> I am encountering a strange issue with the Matrix package. I have just built
> R-devel from source on my macbook pro, and I wonder if others can reproduce
> this problem. I will give example code to go along:
>
> Starting a fresh R session:
>
> R version 2.6.0 Under development (unstable) (2007-08-30 r42697)
> Copyright (C) 2007 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
>
> ...
>
>> log(2)
> [1] 0.6931472
>> library(Matrix)
> Loading required package: lattice
>> log(2)
> Error in log(2) :
>  could not find symbol "base" in environment of the generic function
>
> There seems to be something wrong here and I cannot figure out what it is.
> Am I doing something wrong or is it an issue with Matrix (which is what I
> have narrowed it down to). I think that it might be a namespace collision or
> something, but I am certainly not sure.
>
> Here is my sessionInfo() output:
>> sessionInfo()
> R version 2.6.0 Under development (unstable) (2007-08-30 r42697)
> i386-apple-darwin8.10.1
>
> locale:
> C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] Matrix_0.99875-2 lattice_0.16-3
>
> loaded via a namespace (and not attached):
> [1] grid_2.6.0
>
> Thanks.
>
> Tony
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xeon processor and ATLAS

2007-08-29 Thread Prof Brian Ripley
On Tue, 28 Aug 2007, hui xie wrote:

> hi everyone:
>
> I have a Dell Server that has a Xeon processor, and I would like to use 
> the best ATLAS posted on the R website. I find that R has ATLAS for 
> core2duo and P4. I am not sure which one of these two is best suited for 
> a Xeon processor, or whether neither of them is good and I should 
> stick with the default one that was installed originally?

And your OS is?

There are many different 'Xeon' processors with very different 
capabilities.  You really ought to build ATLAS for yourself if numerical 
linear algebra performance matters to you (and it makes little difference 
to most people: I think Uwe Ligges quoted 10% for testing all CRAN 
packages).


> Your advice is very much appreciated!
>
> Best,
>
> Hui
>
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Please do!


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sql query over local tables

2007-08-29 Thread Prof Brian Ripley
On Tue, 28 Aug 2007, Jorge Cornejo Donoso wrote:

> Hi, I have two tables with IDs in each one.

And what is a 'table'?  If these are data frames, see ?merge.  If they are 
tables (which are arrays in R), then still use merge() if they can be 
converted to data frames.
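
For example (a sketch with made-up data frames):

   df1 <- data.frame(ID = 1:3, x = c("a", "b", "c"))
   df2 <- data.frame(ID = 2:4, y = c(10, 20, 30))
   merge(df1, df2, by = "ID")              # inner join on ID
   merge(df1, df2, by = "ID", all = TRUE)  # full outer join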

> I want to make a join (as in SQL) by the ID. Is there any way to use the RODBC
> package (or another one) on local tables (not Access, MySQL, SQL, etc.) and
> make the join?

No, they use the DBMS to do the hard work.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rmpi and x86

2007-08-28 Thread Prof Brian Ripley
On Tue, 28 Aug 2007, Martin Morgan wrote:

> Edna --
>
> I'll keep this on the list, so that others will learn, and others will
> correct me when I give bad advice!
>
>> relocation R_X86_64_32 against `lam_mpi_comm_world' can not be used
>> when making a shared object; recompile with -fPIC
>
> This likely means that your lam was not built with the --enable-shared
> configure option, as documented in the installation guide. (It might
> also mean that R was not configure with --enable-R-shlib; the message
> is opaque to me).

The line above is

/usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../lib64/libmpi.a(infoset.o):

so it means lam needs to be rebuilt either with --enable-shared or just 
with -fPIC added to CFLAGS.  (Users migrating from i386 Linux to x86_64 
Linux got quite used to this quirk.)

> I believe others on the list have only had success with specific
> versions of LAMMPI, so that would be the next place to look (after
> sorting out the shared library issue)
>
> Martin
>
> "Edna Bell" <[EMAIL PROTECTED]> writes:
>
>> Here is what happens:
>> Note: lam-7.1.4
>>
>> linux-tw9c:/home/bell/Desktop/R-2.5.1/bin # ./R CMD INSTALL --clean
>> Rmpi_0.5-3.tar.gz
>> * Installing to library '/home/bell/Desktop/R-2.5.1/library'
>> * Installing *source* package 'Rmpi' ...
>> checking for gcc... gcc
>> checking for C compiler default output... a.out
>> checking whether the C compiler works... yes
>> checking whether we are cross compiling... no
>> checking for suffix of executables...
>> checking for suffix of object files... o
>> checking whether we are using the GNU C compiler... yes
>> checking whether gcc accepts -g... yes
>> checking for gcc option to accept ANSI C... none needed
>> checking how to run the C preprocessor... gcc -E
>> checking for egrep... grep -E
>> checking for ANSI C header files... yes
>> checking for sys/types.h... yes
>> checking for sys/stat.h... yes
>> checking for stdlib.h... yes
>> checking for string.h... yes
>> checking for memory.h... yes
>> checking for strings.h... yes
>> checking for inttypes.h... yes
>> checking for stdint.h... yes
>> checking for unistd.h... yes
>> checking mpi.h usability... yes
>> checking mpi.h presence... yes
>> checking for mpi.h... yes
>> Try to find libmpi or libmpich ...
>> checking for main in -lmpi... yes
>> Try to find liblam ...
>> checking for main in -llam... yes
>> checking for openpty in -lutil... yes
>> checking for main in -lpthread... yes
>> configure: creating ./config.status
>> config.status: creating src/Makevars
>> ** libs
>> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
>> -I/home/hodgesse/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
>> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
>> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
>> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
>> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
>> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
>> -fpic  -g -O2 -c conversion.c -o conversion.o
>> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
>> -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
>> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
>> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
>> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
>> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
>> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
>> -fpic  -g -O2 -c internal.c -o internal.o
>> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
>> -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
>> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
>> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
>> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
>> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
>> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
>> -fpic  -g -O2 -c RegQuery.c -o RegQuery.o
>> gcc -std=gnu99 -I/home/bell/Desktop/R-2.5.1/include
>> -I/home/bell/Desktop/R-2.5.1/include -DPACKAGE_NAME=\"\"
>> -DPACKAGE_TARNAME=\"\" -DPACKAGE_VERSION=\"\" -DPACKAGE_STRING=\"\"
>> -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1
>> -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1
>> -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1
>> -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1   -DMPI2 -I/usr/local/include
>> -fpic  -g -O2 -c Rmpi.c -o Rmpi.o
>> gcc -std=gnu99 -shared -L/usr/local/lib64 -o Rmpi.so conversion.o
>> internal.o RegQuery.o Rmpi.o -lmpi -llam -lutil -lpthread
>> /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../x86_64-suse-linux/bin/ld:
>> /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../lib64/libmpi.a(infoset.o):
>> relocation R_X86_64_32 against `lam_mpi_comm_world' can not be used
>> when making a shared object; recompile with -fPIC
>> /usr/lib64/gcc/x86_64-suse-linux/4.1.0/../../../../lib64/libmpi.a:
>> could not read symbols: Bad

Re: [R] Forcing coefficients in lm object

2007-08-28 Thread Prof Brian Ripley

It is fit$coefficients, not fit$coef.


From the help page:


name: A literal character string or a name (possibly backtick
  quoted).  For extraction, this is normally (see under
  Environments) partially matched to the 'names' of the object.

Note the qualifier 'for extraction', so you assigned a new element with 
name 'coef', and predict.lm used fit$coefficients.
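
So, continuing the example below (a sketch):

   fit$coefficients[2] <- 100
   predict(fit, dat.new)   # now uses the modified coefficient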



On Tue, 28 Aug 2007, [EMAIL PROTECTED] wrote:


Dear all,

I would like to use predict.lm() with an existing lm object but with new arbitrary 
coefficients. I modify 'fit$coef' (see example below) "by hand" but the actual 
model in 'fit' used for prediction does not seem to be altered (although fit$coef is!).

Can anyone please help me do this properly?

Thanks in advance,

Jérémie




dat <- data.frame(y=c(0,25,32,15), x=as.factor(c(1,1,2,2)))
fit <- lm(y ~ x, data=dat)
fit


Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)   x2
  12.5 11.0


fit$coef[[2]] <- 100
dat.new <- data.frame(x=as.factor(c(1,2,1,2)))
predict.lm(fit, dat.new)

   1    2    3    4 
12.5 23.5 12.5 23.5

fit


Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)   x2
  12.5 11.0


fit$coef

(Intercept)  x2
  12.5   100.0






Jérémie Lebrec
Dept. of Medical Statistics and Bioinformatics
Leiden University Medical Center
Postzone S-05-P
P.O. Box 9600
2300 RC Leiden
The Netherlands
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Limiting size of pairs plots

2007-08-28 Thread Prof Brian Ripley

From ?pairs


 The graphical parameter 'oma' will be set by 'pairs.default'
 unless supplied as an argument.

so try

pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species", pch = 21,
  bg = c("red", "green3", "blue")[unclass(iris$Species)],
  oma = c(8,3,5,3))



On Tue, 28 Aug 2007, Sébastien wrote:


Dear R-users,

I would like to add a legend at the bottom of pairs plots (it's my first
use of this function). With the plot function, I usually add some
additional space at the bottom when I define the size of the graphical
device (using mar); grid functions then allows me to draw my legend as I
want.
Unfortunately, this technique does not seem to work with the pairs
function as the generated plots use all the available space on the
device (see below). I guess I am missing a key argument... my attempts
to modify the oma, mar, usr arguments were unsuccessful, and I could not
find any helpful threads on the archives.

As usual, any advice would be greatly appreciated

Sebastien


pdf(file="C:/test.pdf", width=6, height= 6 + 0.2*6)

par(mar=c(5 + 6,4,4,2)+0.1)

pairs(iris[1:4], main = "Anderson's Iris Data -- 3 species", pch = 21,
bg = c("red", "green3", "blue")[unclass(iris$Species)])

dev.off()

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK    Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Prof Brian Ripley

On Mon, 27 Aug 2007, Ptit_Bleu wrote:



Hi,

I recently discovered the R program and I thought it could be useful to me.
I have to analyse data saved as .Px file (x between 0 and 8 - .P0 files have
18 lines at the beginning that I have to skip). New files are generated
everyday.


relrfichiers<-dir(chemin, pattern=".P")

does not do that, though.  Better to use

dir(chemin, pattern="\\.P[0-8]$", full.names=TRUE)

or

Sys.glob(file.path(chemin, "*.P[0-8]"))



This is my strategy :

In order to analyse the data, I first want to copy the new data in a
database in MySQL (which already contains the previous data).
So the first task is to compare the list of the files in the directory
(object : rfichiers) to the list of the files already saved (object :
tfichiers). The list containing the new files is then given by
nfichiers<-setdiff(rfichiers, tfichiers).

It sounds easy ...
... but it doesn't work !!!

Up to now, I am able to connect to MySQL and, if the file "tfichiers.r"
doesn't exist, I can copy data files to the MySQL database.
But if "tfichiers.r" already exists and there is no new file to save, it
ignores the condition if (nfichiers!="0") and saves all the files in the
directory to the database.


What did you intend there?  It is not a test of no difference, but a test 
that each element of the difference is not "0", and furthermore if() 
expects a test of length one, not the length of nfichiers.  I suspect you 
intended to test length(nfichiers) > 0.


It often helps to print (or use str on) the objects you create.  Try this 
on


nfichiers
nfichiers!="0"


Is it a problem with the way I save tfichiers or is it a problem with the
condition if (nfichiers!="0") ?


Saving in R save format with extension .r is going to confuse others. 
Extension .rda is conventional for save format (and I doubt you need an 
ascii save).



Could you please give me some advice to correct my script (written with
Tinn-R)?

I thank you in advance for your help.
Have a nice week,
Ptit Bleu.

PS : Ptit Bleu means something like "Full Newbie" in French. So thanks for
being patient :-)
PPS : I hope you understand my French English

--


# Connect to the MySQL database 'database'

library(DBI)
library(RMySQL)
drv<-dbDriver("MySQL")
con<-dbConnect(drv, username="user", password="password", dbname="database",
host="localhost")


# Create the objects containing the lists of files ('rel' stands for relative
# path)
# - in the directory : object rfichiers
# - already processed : object tfichiers
# - new since the last connection : object nfichiers
# chemin is the directory where the data are stored
# RWork is the R working directory
# sep='' to avoid adding a space after Mydata/

setwd("D:/RWork")
chemin<-"d:/Mydata/"
relrfichiers<-dir(chemin, pattern=".P")
rfichiers<-paste(chemin,relrfichiers, sep='')
if (file.exists("tfichiers.r"))
 {
   tfichiers<-load("tfichiers.r")
   nfichiers<-setdiff(rfichiers,tfichiers)
 } else {
   nfichiers<-rfichiers
 }


# p0fichiers : files with the .P0 extension (files containing info lines
# that must not be loaded)
# pxfichiers : files with the extensions P1, ..., P8 (no info lines at the
# start)

if (nfichiers!="0")
{
 p0fichiers<-nfichiers[grep(".P0", nfichiers)]
 pxfichiers<-setdiff(nfichiers, p0fichiers)


# Merge the day and time columns so that variations can be plotted as a
# function of time
# Each file listed in the object p0fichiers is loaded, skipping the first
# 18 lines,
# and the object jourheure receives the merge of the day column (V1) and
# the time column (V2)
# The object jourheure is copied back into the first data column
# The second column (containing the times), now superfluous, is then removed
# The object donnees is copied into the MySQL database Mydata
# Note : R understands the day/month/year format - MySQL : year/month/day
# -> stored as CHAR in MySQL

 for (i in 1:length(p0fichiers))
   {
 donnees<-read.table(p0fichiers[i], quote="\"", sep=";", dec=",",
skip=18)
 jourheure<-paste(donnees$V1, donnees$V2, sep=" ")
 donnees[1]<-jourheure
 donnees<-donnees[,-2]
#  assignTable(con, "Datatable", donnees, append=TRUE) - does not work
 dbWriteTable(con, "Datatable", donnees, append=TRUE)
 rm(donnees, jourheure)
   }


# Same again for the files with extension .Px, loading every line (skip=0)
# Possible improvement: write a function taking p0fichiers or pxfichiers as an argument

 for (i in 1:length(pxfichiers))
   {
 donnees<-read.table(pxfichiers[i], quote="\"", sep=";", dec=",",
skip=0)
 jourheure<-paste(donnees$V1, donnees$V2, sep=" ")
 donnees[1]<-jourheure
 donnees<-donnees[,-2]
#   assignTable(con, "Datatable", donnees, append=TRUE) - does not work
 dbWriteTable(con, "Datatable", donnees, append=TRUE)
 rm(donnees, jourheure)
   }
}

tfichiers<-rfichiers
save(rfichiers, file="tfichiers.r", ascii=TRUE)
rm(

Re: [R] R-2.5.1 RedHat EL5 compilation failed

2007-08-26 Thread Prof Brian Ripley
Well, the INSTALL file said

   The main source of information on installation is the `R Installation
   and Administration Manual', an HTML copy of which is available as file
   `doc/html/R-admin.html'.  Please read that before installing R.  But
   if you are impatient, read on but please refer to the manual to
   resolve any problems.  (If you obtained R using Subversion, the manual
   is at doc/manual/R-admin.texi.)

and this _is_ discussed there.  Hint: is readline-devel installed?

On Sun, 26 Aug 2007, Wang Chengbin wrote:

> I can't get R-2.5.1 compiled under RedHat EL5 with gcc 4.1.1. Configure
> failed at the following:
>
> checking readline/history.h usability... no
> checking readline/history.h presence... no
> checking for readline/history.h... no
> checking readline/readline.h usability... no
> checking readline/readline.h presence... no
> checking for readline/readline.h... no
> checking for rl_callback_read_char in -lreadline... no
> checking for main in -lncurses... no
> checking for main in -ltermcap... no
> checking for main in -ltermlib... no
> checking for rl_callback_read_char in -lreadline... no
> checking for history_truncate_file... no
> configure: error: --with-readline=yes (default) and headers/libs are not
> available
>
> Thanks.
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to provide argument when opening RGui from an external application

2007-08-26 Thread Prof Brian Ripley

On Sun, 26 Aug 2007, Duncan Murdoch wrote:


On 26/08/2007 7:14 AM, Sébastien wrote:

Thanks for your reply.
When you say "look into Rscript.exe", do you have a specific document in 
mind ? I tried to google it but could not find much... I forgot to mention 
in my first email that I am working under the Windows XP environment.


You could try ?Rscript within R, or Rscript --help from the command line 
(assuming you have R's bin directory on your path).


Or read 'An Introduction to R'.



Duncan Murdoch



Prof Brian Ripley wrote:
Look into Rscript.exe (on Windows), which is a flexible way to run 
scripts.  Neither using a GUI nor using source() are recommended.


On Fri, 24 Aug 2007, Sébastien wrote:


Dear R-users,

I have written a small application (in Visual Basic) that automatically
generates some R scripts. I would like to execute these scripts when my
application is being closed.
My problem is that I don't know how to pass the
'source(c:/.../myscript.r)' instruction when I programmatically start
RGui. Tinn-R is capable of doing such things, so I guess there must be a
way to pass arguments to RGui.

Any advice or link to relevant references would be greatly appreciated.

Sebastien


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Character position command

2007-08-25 Thread Prof Brian Ripley
On Sat, 25 Aug 2007, Mitchell Hoffman wrote:

> This is a very simple question, so I apologize I couldn't find it online:
>
> I want to shorten the string 'apples.pears' to 'apples'.
>
> string='apples.pears'
> string1=substr(string,0,x)
>
> For x above, I would like to have a command like charAt(string,"."), i.e.
> the position of the period in the word, but I can't seem to find a charAt
> command in R.

See ?regexpr

But a simpler solution is sub("\\..*", "", string).
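For example, with the string above:

string <- "apples.pears"
regexpr(".", string, fixed=TRUE)                        # 7, the position of the period
substr(string, 1, regexpr(".", string, fixed=TRUE) - 1) # "apples"
sub("\\..*", "", string)                                # "apples"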


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to provide argument when opening RGui from an external application

2007-08-25 Thread Prof Brian Ripley
Look into Rscript.exe (on Windows), which is a flexible way to run 
scripts.  Neither using a GUI nor using source() are recommended.


On Fri, 24 Aug 2007, Sébastien wrote:


Dear R-users,

I have written a small application (in Visual Basic) that automatically
generates some R scripts. I would like to execute these scripts when my
application is being closed.
My problem is that I don't know how to pass the
'source(c:/.../myscript.r)' instruction when I programmatically start
RGui. Tinn-R is capable of doing such things, so I guess there must be a
way to pass arguments to RGui.

Any advice or link to relevant references would be greatly appreciated.

Sebastien


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How can i inhibit this work "Please select a CRAN mirror for use in this session "?

2007-08-25 Thread Prof Brian Ripley
On Sat, 25 Aug 2007, zhijie zhang wrote:

> Dear Rusers,
>  When i start R, there always the following work to do first, how should i
> cancel it?
> *--- Please select a CRAN mirror for use in this session ---*
>  I don't know why it does so, maybe i have done something unintentionally.

You certainly have.  Try starting R with --vanilla and it should go away.
If so, see ?Startup and look at the various files it mentions.  Do you 
have a .Rprofile?  Have you changed Rprofile.site ... ?

The message comes from contrib.url(): most likely you have a call to 
update.packages() in a startup file.
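If the aim is to keep such a call but avoid the interactive question, one option 
(a sketch; pick whichever mirror you prefer) is to set a default repository in the 
startup file:

## e.g. in ~/.Rprofile or Rprofile.site
local({
    r <- getOption("repos")
    r["CRAN"] <- "http://cran.r-project.org"
    options(repos = r)
})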

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error building R 2-5.2.1 on Sun Solaris 8

2007-08-23 Thread Prof Brian Ripley
What is 'R 2-5.2.1'?  AFAIK there is no such version.

I can tell you the most likely issue: is your Perl pre 5.6.1 (very old 
indeed)?

The current R-patched (2.5.1 patched) requires Perl 5.6.1, and we do 
suggest that you install that rather than 2.5.1.

(Interestingly, all versions of R since 2.4.0 have needed Perl >= 5.6.1 
but this was not reported until after R 2.5.1 was released, and this is 
the third instance in a month or so.  Perl 5.8, already over 5 years 
old, will be required for future versions of R.)


On Thu, 23 Aug 2007, Mike Box wrote:

>
> As shown below, the build process fails with only vague messages,
> leaving me clueless as to how to resolve.
>
> Thanks, in advance, for any help that you may offer.
>
> Mike
>
> --
>
> # ./configure --prefix=/SOURCES/R-2.5.1 --with-iconv=no
> ...
> ...
> ...
> R is now configured for sparc-sun-solaris2.8
>
> Source directory: .
> Installation directory: /SOURCES/R-2.5.1
>
> C compiler: gcc -std=gnu99 -g -O2
> Fortran 77 compiler: g77 -g -O2
>
> C++ compiler: g++ -g -O2
> Fortran 90/95 compiler: f95 -g
> Obj-C compiler: -g -O2
>
> Interfaces supported: X11
> External libraries: readline
> Additional capabilities: NLS
> Options enabled: shared BLAS, R profiling, Java
>
> Recommended packages: yes
>
> configure: WARNING: you cannot build info or HTML versions of the R manuals
>
> # make
> ...
> ...
> ...
> * Installing *source* package 'MASS' ...
> ** libs
> gcc -std=gnu99 -I/reserve/R-2.5.1/include -I/reserve/R-2.5.1/include
> -I/usr/local/include -fPIC -g -O2 -c MASS.c -o MASS.o
> gcc -std=gnu99 -I/reserve/R-2.5.1/include -I/reserve/R-2.5.1/include
> -I/usr/local/include -fPIC -g -O2 -c lqs.c -o lqs.o
> gcc -std=gnu99 -G -L/usr/local/lib -o MASS.so MASS.o lqs.o
> ** R
> ** data
> ** moving datasets to lazyload DB
> ** inst
> ** preparing package for lazy loading
> ** help
> Can't use an undefined value as filehandle reference at
> /reserve/R-2.5.1/share/perl/R/Rdconv.pm line 78.
 Building/Updating help pages for package 'MASS'
> Formats: text html latex example
> ERROR: building help failed for package 'MASS'
> ** Removing '/reserve/R-2.5.1/library/MASS'
> ** Removing '/reserve/R-2.5.1/library/class'
> ** Removing '/reserve/R-2.5.1/library/nnet'
> ** Removing '/reserve/R-2.5.1/library/spatial'
> *** Error code 1
> make: Fatal error: Command failed for target `VR.ts'
> Current working directory /reserve/R-2.5.1/src/library/Recommended
> *** Error code 1
> make: Fatal error: Command failed for target `recommended-packages'
> Current working directory /reserve/R-2.5.1/src/library/Recommended
> *** Error code 1
> make: Fatal error: Command failed for target `stamp-recommended'
> #
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FAQ 7.x when 7 does not exist. Useability question

2007-08-23 Thread Prof Brian Ripley
On Thu, 23 Aug 2007, John Kane wrote:

> The FAQ Section 7 is a very useful place for new users
> to find out any number of R idiosycracies.  However
> there is no numbering on the FAQ Table of Content or
> on the Sections Tables of Contents.

Hmm, doc/FAQ does have a numbered table of contents and numbered sections 
and doc/manual/R-FAQ.html does have numbered sections and my browser's 
search finds 7.10 straight away.


> An R-help list reply of "Read FAQ 7.10" in response to
> a question about converting a factor to numeric is  a
> bit cryptic. The only time 7.10 appears is after the
> searcher has found the entry.

It would help if you told us what you are searching that did not contain 
'7.10'.

> Would it be a good idea to actually number the entries
> for the FAQ Table of Contents and the Table of
> Contents for the Sections?

I think we do.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Estimate Intercept in ARIMA model

2007-08-23 Thread Prof Brian Ripley
This is described on the help page!

include.mean: Should the ARIMA model include a mean term? The default
   is 'TRUE' for undifferenced series, 'FALSE' for differenced
   ones (where a mean would not affect the fit nor predictions).

  Further, if 'include.mean' is true, this formula applies to X-m
  rather than X.  For ARIMA models with differencing, the
  differenced series follows a zero-mean ARMA model.

You can add an intercept to your own xreg: you don't need a package to 
help you, but you do need to study the documentation.

On Thu, 23 Aug 2007, doublelin15 wrote:

> Hi, All,
>   This is my program
>
> ts1.sim <- arima.sim(list(order = c(1,1,0), ar = c(0.7)), n = 200)
> ts2.sim <- arima.sim(list(order = c(1,1,0), ar = c(0.5)), n = 200)
> tdata<-ts(c(ts1.sim[-1],ts2.sim[-1]))
> tre<-c(rep(0,200),rep(1,200))
> gender<-rbinom(400,1,.5)
> x<-matrix(0,2,400)
> x[1,]<-tre
> x[2,]<-gender
> fit <- arima(tdata, c(1, 1, 0), method = "CSS",xreg=t(x))

Use

arima(tdata, c(1, 1, 0), method = "CSS", xreg=cbind(intercept=1, t(x)))



>   I try to fit an ARIMA model and include some other independent
> variable in this model, but why the outcome does not have the
> intercept estimate value? and if the model is arima(tdata, c(p, 0, q)
> there will have this value, why have this difference?
>
>   And if i want to analyse an interrupted time series, what can i do?
> I find urca can help you to find the interrupted point, but if I
> already know this point, and want to compare the mean, level and
> slope, any package can help me to do this?
>
>
> Thanks for your attention!
>
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting lda results

2007-08-22 Thread Prof Brian Ripley
Read ?plot.lda, which tells you the ... arguments are (for dimen=1, the 
only option for two groups) passed to ldahist, so then read its help page.

I don't know what you want (and your example is not reproducible): I would 
expect you to get a single plot with two panels (figures), but there are 
options to have a single panel.  (Reading 'An Introduction to R' may help 
you to use standard terminology that others will be able to follow.)
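For instance (untested, as the data are not available; this assumes Bdisperser is 
the grouping column of BirdTrain.mx):

plot(BirdTrain.lda)                    # one plot, one panel (figure) per group
ld1 <- predict(BirdTrain.lda)$x[, 1]   # scores on the single linear discriminant
ldahist(ld1, g = BirdTrain.mx$Bdisperser, type = "density", sep = FALSE)  # one panel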

On Wed, 22 Aug 2007, Silvia Lomascolo wrote:

>
> Hi all,
> I am trying to plot the results of a discriminant analysis done with
> lda(MASS) but my groups appear in two different plots (in the same graphics
> device) and I want to combine them in one plot. My code looks like:
>
> BirdTrain.lda <- lda(Bdisperser~., data=BirdTrain.mx)
> predict(BirdTrain.lda)
> plot(BirdTrain.lda)
>
> I have two types of Bdisperser, so I only get one linear discriminant
> function. Can anyone please tell me how to combine the data in one plot?
>
> I work with R 2.4.1 using Windows.

But the version of MASS is what is relevant, and it would have been in 
the sessionInfo() output the R posting guide asked you for.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do i print a "main title" on a win.graph with several plots?

2007-08-22 Thread Prof Brian Ripley
?title, look at the 'outer' argument.

You can see further discussion of the outer margins in 'An Introduction to 
R'.
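Roughly, using the code quoted below (untested):

op <- par(mfrow = c(1,2), oma = c(0, 0, 3, 0))   # reserve 3 lines of outer margin at the top
boxplot(lg_value~labo, main="Test 1 at day 1", ylab="log(x)",
        xlab="different lab's", data=dataset, ylim=c(-0.05,5))
boxplot(lg_value~labo, main="Test 2 at day 1", ylab="log(x)",
        xlab="different lab's", data=dataset, ylim=c(-0.05,5))
title("At day 1", outer = TRUE)
par(op)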

I don't know why you are using win.graph(): it is a deprecated form of 
windows() with many of the arguments taking unchangeable defaults.

On Wed, 22 Aug 2007, Tom Willems wrote:

> Good Mornig All,
>
> How R you today? ;o)
>
> I have lots of questions, but i l start with the simplest one,
> to which i am shy to say, i did not find the answer.
>
> It is the following:
>
> When i make a summary plot like for example plot( summary(glm)),
> i get one window, one main title, and 4 graph's in that window.
>
> Now i do know how to get several graphs in one window,
> but i don't manage getting one main title in the top middle of the
> window.
> I can give every plot a different main and subtitle, but i can't put a
> title in the "win.graph()" box.
>
> this is what i do:
>
> win.graph();   op <- par(mfrow = c(1,2))
> boxplot(lg_value~labo, main="Test 1 at day 1",ylab="log(x) ",
> xlab="different lab's", data=dataset,ylim=c(-0.05,5))
> boxplot(lg_value~labo, main="Test 2 at day 1",ylab="log(x)",
> xlab="different lab's", data=dataset,ylim=c(-0.05,5))
> par(op);
>
> Then i get one graph window, two plots each with their own main title.
> What i'd like to have is, one main title saying " At day 1", and then two
> plots with the different tests.
> How can i do this, pls?
>
> Kind regards,
> Tom W.
>
>
> Disclaimer: click here
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] open/execute/call/run an external file

2007-08-21 Thread Prof Brian Ripley
On Tue, 21 Aug 2007, STEPHEN M POWERS wrote:

> I'm trying to figure out how to trigger a process from within R. I have 
> an executable file that runs a Fortran model, but ideally, would like to 
> run it from R. Note that I'm not talking about importing the function at 
> all, passing variables, or anything complicated like that. I basically 
> just want a script that "double-clicks" on a particular file and 
> opens/runs it for me.
>
> The idea here is that the executable Fortran file, when double clicked, 
> simply draws all necessary inputs from text files within the same 
> directory and I have no need to change this. So I've used R to summarize 
> some raw data and format these required text input files in the way the 
> Fortran executable requires, and also have scripts to interpret the 
> Fortran text file outputs and summarize/plot them in R. The problem is I 
> must run the first part of the R script to send data from R to the 
> model, then double click the Fortran executable, then run the second 
> part of the R script to get the model outputs into R, in three separate 
> steps. Given that I may be doing this hundereds of times, I'd prefer to 
> do it all in one step.
>
> Any thoughts?---steve

It is described in the relevant manual: Writing R Extensions.
?system, and if this is Windows also ?shell and ?shell.exec.
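A rough sketch of the one-step version (the directory and file names here are 
invented):

setwd("C:/model")                    # directory holding the executable and its input files
system("model.exe", wait = TRUE)     # on Windows, shell("model.exe") also works
out <- read.table("results.txt", header = TRUE)   # read the Fortran output back in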

> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

When you do, you will see that we asked for your OS which is relevant 
here.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R on a flash drive

2007-08-21 Thread Prof Brian Ripley
On Wed, 22 Aug 2007, Williams Scott wrote:

> I often run R via a Ceedo virtualisation on a USB drive
> (http://www.ceedo.com/) with XP. It costs a few dollars to do it this way,
> but is a very low stress installation and has worked flawlessly, albeit

It is not necessary though, as R does not need 'virtualisation'.  For 
Windows this is covered in the rw-FAQ Q2.6.

> a little slower (barely noticeable).

Perhaps the overhead of Ceedo?  Once R starts up (which does take longer 
on a slow drive) I found no time-able difference in 2.1.x (all the files 
frequently used from disc are cached on startup).

It would be nice to give the R developers the credit for writing R in such 
a way that it works well from slow media, instead of it being credited to 
an unnecessary commercial product.

> Very handy if you are often working
> on various machines without administrator rights (as I do in clinic) -
> just plug in your USB and go directly back to your project. It then
> removes any trace of you (so they say) when you log out. And you can use
> it for other software (within limits though) you might want to carry
> around.

Many sites would not allow programs to be run from a USB drive or make it 
a breach of usage conditions to do so.

>
> Hope that helps.
>
> Scott
>
> 
> Scott Williams MD
> Peter MacCallum Cancer Centre
> Melbourne
> Australia
>
> -Original Message-
> From: John Kane [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, 21 August 2007 12:28 AM
> To: John Kane; Erin Hodgess; r-help@stat.math.ethz.ch
> Subject: Re: [R] R on a flash drive
>
> Oops meant to send this to the list.
> --- John Kane <[EMAIL PROTECTED]> wrote:
>
>>
>> --- Erin Hodgess <[EMAIL PROTECTED]> wrote:
>>
>>> Dear R People:
>>>
>>> Has anyone run R from a flash drive, please?
>>>
>>> If so, how did it work, please?
>>
>> Yes I run R, occasionally, on a USB with no problem
>> on
>> WindowsXP. It works well, albeit a bit more slowly
>> than from the hard drive which is as you would
>> expect.
>>
>> The last time I upgraded the USB (to 2.5.0 ?) I
>> simply
>> downloaded R and installed it on the USB drive
>> rather
>> than the C: drive and then installed all my usual
>> optional packages using the normal Rgui interface.
>>
>> I usually have R, Tinn-R and portable versions of
>> OpenOoffice.org, and Firefox installed on the USB.
>>
>>
>>   Get news delivered with the All new Yahoo!
>> Mail.  Enjoy RSS feeds right on your Mail page.
>> Start today at
>> http://mrd.mail.yahoo.com/try_beta?.intl=ca
>>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems installing updated version of vars package

2007-08-21 Thread Prof Brian Ripley
On Tue, 21 Aug 2007, sj wrote:

> All,
>
> I was looking online and noticed that the vars package (by Bernhard Pfaff)
> was recently updated (update date listed Aug 6, 2007). The updated package
> has some features that I would find very useful. I have used the update
> packages function and vars was one of the packages identified as needing an
> update. I was able to update it and it appeared to work; however, when I load
> the package it does not seem to be the most recent version. Has anyone else
> had similar problems? Or does anyone have any suggestions?
>
> System Info:
>
> R 2.4.1

You need to update your R (as the posting guide asked you to before 
posting) or install the package from sources.

The binary builds are not updated for obsolete versions of R: the builds 
for 2.4.x stopped on 28 June.
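In other words, one of (sketch):

## after upgrading R
update.packages()
## or, staying with R 2.4.1, install the current source package
## (may need the Rtools toolset if the package contains compiled code)
install.packages("vars", type = "source")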

> Windows XP
> install mirror: USA 3 (UCLA I think)
>
> thanks,
>
> Spencer
>
>   [[alternative HTML version deleted]]
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

PLEASE do!


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] runing .r file from C#

2007-08-21 Thread Prof Brian Ripley
On Tue, 21 Aug 2007, Alex MD wrote:

> Hi,
>
> I know that the general subject "calling R from C" has been discused but I
> have been reading the manuals and also scouting the lists and I can not seam
> to find
> a working solution for my problem.

It's a C# issue.

>  I want to call a R script ( let's call it "test.r" ) from within C# code.
>  After reading about this topic I am trying to do this :
>
> System.Diagnostics.Process proc = new System.Diagnostics.Process();
> proc.StartInfo.FileName = "E:/R/R-2.5.1 /bin/Rterm.exe";
> proc.StartInfo.Arguments = " <'test.r' --no-save";
> proc.StartInfo.UseShellExecute = false;
> proc.StartInfo.RedirectStandardOutput = false;
> proc.Start();
>
>
> but when Rterm starts it shows parameter 
> When I try to do the same from a command line shell it DOES work just fine :
> " Rterm.exe 
> Do you have any idea how to make it not to ignore the input file? Or is
> there other way to just execute a .r file from C# code?

You need a shell for redirection (< > |) to work, and 'system' commands in 
Windows do not usually use one (as in C, C++, R, Perl): you seem to have 
turned off using a shell in C#.  However, I think you should be using 
RScript.exe, where this is not an issue.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prediction interval for multiple future observations

2007-08-21 Thread Prof Brian Ripley
On Mon, 20 Aug 2007, Vlad Skvortsov wrote:

> Hi!
>
> '?predict.lm' says that the prediction intervals returned by predict()
> are for single observation only. Is there a way to specify the desired
> number of observations to construct the interval for?

What it says in full is

  The prediction intervals are for a single observation at each case
  in 'newdata' (or by default, the data used for the fit) with error
  variance(s) 'pred.var'.

I think you misunderstand: predict.lm returns a prediction interval for 
each row of 'newdata'.  The comment in part means that those intervals are 
to be considered individually, and not as a joint prediction region for 
all the future observations.  If you want, say, a prediction interval for 
the average of 10 independent observations at a case, use 'pred.var' to 
specify the error variance.
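For example (a sketch with invented object names), for the mean of 10 independent 
future observations at each case the error variance is sigma^2/10:

fit <- lm(y ~ x, data = dat)
s2  <- summary(fit)$sigma^2
predict(fit, newdata = data.frame(x = c(1, 2)),
        interval = "prediction", pred.var = s2/10)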

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Prerequisite for running RWeka

2007-08-19 Thread Prof Brian Ripley
On Sun, 19 Aug 2007, [EMAIL PROTECTED] wrote:

> Hi -
>
> I have a question on RWeka. I installed the package and try to run using 
> some examples available in the package. However, it stalls my machine 
> for a while. I'm wondering if I need weka (which is a Java implementation) 
> installed before using RWeka? Thank you.

No, and if it did it should be listed in the DESCRIPTION file (and given 
that the maintainer of RWeka wrote the rules, it would be).

> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

PLEASE do!  We don't know your OS, version of R, version of RWeka, version 
of Java, what 'stalls' means, what the maintainer said when you asked him 
(as required by the posting guide) ... in short anything we need to even 
guess at the problem.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can't find "as.family" function

2007-08-19 Thread Prof Brian Ripley

On Sun, 19 Aug 2007, Mario Alfonso Morales Rivera wrote:


Hi R users,

I want to use dglm Package.
I run the examples and it give me an error:

Error en dglm(lot1 ~ log(u), ~1, data = clotting, family = Gamma) :
   no se pudo encontrar la función "as.family"

dglm can't find "as.family" function

why ?


Because it does not exist in R (nor in the current version of package 
dglm).


PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


Please do as we ask.  We asked for the output of sessionInfo(), and one 
guess is that your versions are not current and this is a problem that has 
already been solved: another is that you have attached a package that 
conflicts with dglm -- the information we asked for would have helped in 
both cases.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recommended combo of apps for new user?

2007-08-18 Thread Prof Brian Ripley
Some additional comments on the DBMS front.

(a) SPSS is not a DBMS, so it is not clear that you need this. But if you 
do and are storing valuable data in a DBMS a lot of further questions come 
into play, like how you are going to do backups.  I'd say PostgreSQL was 
really only for professional-level administrators.  My sysadmins recommend 
MySQL for most people.  We do also run PostgreSQL and they find it a lot 
trickier to maintain.

'dozens of columns and thousands of rows' is not big.  A data frame with 
50 columns and 5000 rows would only take 2Mb to store, and R will easily 
handle 100x with 4GB of RAM (and if you have less, get 4GB).  So storing 
data in .rda (R's save() format) is most likely viable.  R's indexing etc 
operations make it good at data manipulation, and using a DBMS will 
involve learning SQL, a non-trivial cost.
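A quick illustration of the sizes involved (approximate figures):

d <- as.data.frame(matrix(rnorm(5000 * 50), nrow = 5000, ncol = 50))
as.numeric(object.size(d)) / 2^20   # about 1.9 (megabytes) in memory
save(d, file = "mydata.rda")        # compressed save() format on disk
load("mydata.rda")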

(b) You have a choice of interfaces to a DBMS, RODBC and the DBI+ family, 
e.g. DBI+RMySQL and DBI+RSQLite.  I'm biased, but I find RODBC more 
intuitive, and many people have reported it to be faster.  If all you want 
is non-permanent storage for manipulation of large data sets, consider 
also SQLiteDF.
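For comparison, the same query through the two interfaces might look like this (a 
sketch: the DSN, database, table and login details are invented):

library(RODBC)
ch <- odbcConnect("mydsn")
d1 <- sqlQuery(ch, "SELECT * FROM plots WHERE year = 2006")
odbcClose(ch)

library(RMySQL)
con <- dbConnect(MySQL(), username="me", password="secret", dbname="ecology")
d2 <- dbGetQuery(con, "SELECT * FROM plots WHERE year = 2006")
dbDisconnect(con)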

On Sat, 18 Aug 2007, Duncan Murdoch wrote:

> Martin Brown wrote:
>> [i sent this message earlier but apparently should have sent it plain
>> text, as follows..]
>>
>> Hi there,
>>
>> I would like some advice, not so much about how to use R, but about
>> software that I need to complement R.  I've rooted around in the FAQ's
>> and done a few searches on this mailing list but haven't quite found
>> the perspective I need.
>>
>> I am an experienced data analyst in my field (forest ecology and
>> ecological monitoring) but new to R. I am a long time user of SPSS and
>> have gotten pretty handy with it.  However, I am frustrated with SPSS
>> for several reasons:  There's the cost (I'm a freelancer; I pay for my
>> software myself);  the Windows dependence (I use Kubuntu as my usual
>> OS now, and switching back and forth is a pain); the horrible
>> inefficiency when I do certain types of file manipulations; and the
>> inability to do the kind of publication-quality graphs I want... I've
>> usually ended up using a commercial graphing program (another source
>> of expense and limitation).
>>
>> I'd like to switch to using R on Kubuntu, for all those reasons.  In
>> addition I think the mathematical formality that R encourages might be
>> good for me.
>>
>> However, reviewing the FAQ's on the R project web site makes me
>> realize that I've been using SPSS as three kinds of software really:
>> a DBMS; a statistical analysis package; and a graphing package.  It
>> looks like moving to R might involve learning three kinds of software,
>> not just one.  I wonder:
>>
>> 1) What open-source DBMS works most seamlessly with R?  I have seen
>> MySQL recommended but wonder if there are alternatives.  I sometimes
>> need to handle big data files.  In fact a lot of my work involves
>> exploratory and descriptive analyses of rather large and messy
>> databases from ecological monitoring, rather than statistical tests
>> per se.  In SPSS the data files I have been generating have dozens of
>> columns and thousands of rows, often with value and variable labels
>> helpful for documenting my work.

See above.

>
> I think you won't find much difference in the R interface between MySQL,
> PostgreSQL, or SQLite.  The choice should be made based on the qualities
> of the database (and I don't know enough about the differences to give a
> recommendation.)
>> 2) For the purpose of creating publication-quality graphs, do R users
>> typically need to go outside of the R system? If so, what open-source
>> programs would you all recommend?
>>
> R is great for this, but you might need to go outside for some
> specialized stuff (e.g. medical imaging).
>
>> 3) Any other software I need to learn that would make my work in R
>> more productive? (for example, a code editor).
>
> A lot of people are happy with ESS mode in Emacs.
>
> Duncan Murdoch
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem using "rank"

2007-08-17 Thread Prof Brian Ripley
On Fri, 17 Aug 2007, Jiong Zhang, PhD wrote:

> Hi All,
>
> I had 12766 elements in a column, 12566 are values and 200 are "NA"s. I 
> used the following line to get the ranks:
>
> total_list$MB.rank <- rank(-total_list$MB,ties.method="min",na.last=NA)
>
> but I got an error message:
>
> Error in `$<-.data.frame`(`*tmp*`, "BCRP_PW_F.rank", value = c(3949, 6182,  :
>replacement has 12199 rows, data has 12766
>
> What shall I do to keep the "NA"s as "NA"s?  thanks a lot.

If all else fails try reading the help!  You have to select the right 
option for na.last, and yours is not it.  I suspect you want 
na.last="keep", but only you know what you mean.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] installation of the gsl package on Suse 10.1

2007-08-17 Thread Prof Brian Ripley
On Fri, 17 Aug 2007, luca laghi wrote:

> I am trying to install the gsl  package.
> I had gsl installed with YaSt in /usr/lib.
> when I launch R as superuser and launch install.packages, it says in
> cannot find Gnu Scientific Library. How can I make it find them?

Please tell us the exact messages you get.

At a quess, you need gsl-devel (or gsl-dev or some such) as well as gsl.

> Thank you,
> Luca Laghi
>
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

PLEASE do!

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Date format on x-axis

2007-08-17 Thread Prof Brian Ripley
If you want to use English, you need to set your session to use 
English.


PLEASE do read the posting guide 

http://www.R-project.org/posting-guide.html

We need to know your OS and locale, and you did not follow the guide.

On a Unix-alike probably Sys.setlocale("LC_TIME","en_US") or 
Sys.setlocale("LC_TIME", "en_US.utf8") is needed.  On Windows 
Sys.setlocale("LC_TIME", "en").



On Fri, 17 Aug 2007, [EMAIL PROTECTED] wrote:


Dear R users,

Plotting question from a R beginner...

When I try to plot a response through time, for example:

Date<-c("2006-08-17", "2006-08-18", "2006-08-19", "2006-08-20")
response<-c(4,4,8,12)
as.Date(Date)


I presume that was Date <- as.Date(Date)


plot(Date,response)


The dates on the graphic appear in spanish. This I guess is the default
way of plotting because my windows is in spanish, but I need a "aug 17"
instead of "ago 17" (agosto is the spanish for august)...
I've tried,

format(Date, "%m %d")

And although it does change the way Date is listed, well it's still
plotted in spanish...
I've also searched through par() settings, but xaxp,xaxs, xaxt, xpd and
xlog do not solve my problem...

Could anyone help me solve this format question?

Thanks a million in advance,

Greetings,
Iñaki Etxebeste Larrañaga
M.Sc. Biologist
Producción Vegetal y Recursos Forestales
ETSIIAA Universidad de Valladolid
Avda. Madrid,57
34071 Palencia (Spain)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about sm.options & sm.survival

2007-08-17 Thread Prof Brian Ripley
On Thu, 16 Aug 2007, Rachel Jia wrote:

> Hi, there:
>
> It's my first time to post question in this forum, so thanks for your
> tolerance if my question is too naive. I am using a nonparametric smoothing
> procedure in sm package to generate smoothed survival curves for continuous
> covariate. I want to truncate the survival curve and only display the part
> with covariate value between 0 and 7. The following is the code I wrote:
>
> sm.options(list(xlab="log_BSI_min3_to_base", xlim=c(0,7), ylab="Median
> Progression Prob"))
> sm.survival(min3.base.prog.cen[,3],min3.base.prog.cen[,2],min3.base.prog.cen[,1],h=sd(min3.base.prog.cen[,3]),status.code=1
> )
>
> But the xlim option does not work. Can anyone help me with this problem?

The help page suggests that you need to use xlim as an inline option (part 
of ...). Following the help page example

> sm.survival(x, y, status, h=2, xlim=c(0,4))

works.  So I think you need to follow the help page exactly.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Trim trailng space from data.frame factor variables

2007-08-16 Thread Prof Brian Ripley
On Thu, 16 Aug 2007, Marc Schwartz wrote:

> The easiest way might be to modify the lapply() call as follows:
>
> d[] <- lapply(d, function(x) if (is.factor(x)) factor(sub(" +$", "", x)) else 
> x)
>
>> str(d)
> 'data.frame':   60 obs. of  3 variables:
> $ x: Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ y: num  7.01 8.33 5.48 6.51 5.61 ...
> $ f: Factor w/ 3 levels "lev1","lev2",..: 1 1 1 1 1 1 1 1 1 1 ...
>
>
> This way the coercion back to a factor takes place within the loop as
> needed.
>
> Note that I also meant to type sub() and not grep() below. The default
> behavior for both is to return a character vector (if 'value = TRUE' in
> grep()). There is not an argument to override that behavior.

I would have thought the thing to do was to apply sub() to the levels:

chfactor <- function(x) { levels(x) <- sub(" +$", "", levels(x)); x }

d[] <- lapply(d, function(x) if (is.factor(x)) chfactor(x) else x)

This has the advantage of not losing the order of the levels.  It will 
merge levels if they only differ in the number of trailing spaces, which 
is probably what you want.
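For instance, with the 'f' from the example below:

f <- gl(3, 20, labels = paste("lev", 1:3, "   ", sep = ""))
levels(chfactor(f))   # "lev1" "lev2" "lev3", in the original order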


> HTH,
>
> Marc
>
>
> On Thu, 2007-08-16 at 19:19 +0300, Lauri Nikkinen wrote:
>> Thanks Marc! What would be the easiest way to coerce char-variables
>> back to factor-variables? Is there a way to prevent the coercion in
>> d[] <- lapply(d, function(x) if ( is.factor(x)) sub(" +$", "", x) else
>> x) ?
>>
>>
>>
>> -Lauri
>>
>>
>>
>> 2007/8/16, Marc Schwartz <[EMAIL PROTECTED]>:
>> On Thu, 2007-08-16 at 17:54 +0300, Lauri Nikkinen wrote:
>>> Hi folks,
>>>
>>> I would like to trim the trailing spaces in my factor
>> variables using lapply
>>> (described in this post by Marc Schwartz:
>>> http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22826.html)
>> but the code is
>>> not functioning (in this example there is only one factor
>> with trailing
>>> spaces):
>>
>> Aye... as I noted in that post, it was untested... my error.
>>
>> The problem is that by using ifelse() as I did, the test for
>> the column
>> being a factor returns a single result, not one result per
>> element.
>> Hence, the appropriate conditional code is only performed on
>> the first
>> element in each column, rather than being vectorized on the
>> entire
>> column.
>>
>>> y1 <- rnorm(20) + 6.8
>>> y2 <- rnorm(20) + (1:20* 1.7 + 1)
>>> y3 <- rnorm(20) + (1:20*6.7 + 3.7)
>>> y <- c(y1,y2,y3)
>>> x <- gl(5,12)
>>> f <- gl(3,20, labels=paste("lev", 1:3, "   ", sep=""))
>>> d <- data.frame (x=x,y=y, f=f)
>>> str(d)
>>>
>>> d[] <- lapply(d, function(x) ifelse(is.factor(x), sub(" +$",
>> "", x), x))
>>> str(d)
>>>
>>> How should I modify this?
>>
>> Try this instead:
>>
>> d[] <- lapply(d, function(x) if (is.factor(x)) sub(" +$", "",
>> x) else x)
>>
>>> str(d)
>> 'data.frame':   60 obs. of  3 variables:
>> $ x: chr  "1" "1" "1" "1" ...
>> $ y: num  6.70 4.42 8.03 4.90 6.98 ...
>> $ f: chr  "lev1" "lev1" "lev1" "lev1" ...
>>
>> Note that by using grep(), the factors are coerced to
>> character vectors
>> as expected. You would need to coerce back to factors if you
>> need them
>> as such.
>>
>> HTH,
>>
>> Marc Schwartz

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Polynomial fitting

2007-08-16 Thread Prof Brian Ripley
It is easier to use poly(raw=TRUE), and better to use poly() with 
orthogonal polynomials.

The original poster shows signs of having read neither the help for 
predict.lm nor the posting guide, and so almost certainly misused the 
predict method.
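Something along these lines (a sketch, degree 3 for illustration, with x and y as in 
the original data):

fit  <- lm(y ~ poly(x, 3))              # orthogonal polynomials, preferred for fitting
fit2 <- lm(y ~ poly(x, 3, raw = TRUE))  # raw powers, coefficients comparable to polyfit's
x0 <- seq(min(x), max(x), length.out = 50)
predict(fit, newdata = data.frame(x = x0))  # newdata must be a data frame using the original name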


On Thu, 16 Aug 2007, Jon Minton wrote:

> Remember that polynomials of the form
>
> y = b1*x + b2*x^2 + ... + bm*x^m
>
> fit the linear regression equation form
>
> Y = beta_1*x_1 + beta_2*x_2 + ... + beta_m*x_m
>
> If one sets (from the 1st to the 2nd equation)
> x -> x_1
> x^2 -> x_2
> x^3 -> x_3
> etc.
>
> In R this is easy, just use the identity operator I() when specifying the
> equation.
> e.g. for a 4th-order polynomial:
>
> model <- lm(Y ~ x + I(x^2) + I(x^3) + I(x^4))
>
> hth, Jon
>
> ***
>
> I'm looking some way to do in R a polynomial fit, say like polyfit
> function of Octave/MATLAB.
>
> For who don't know, c = polyfit(x,y,m) finds the coefficients of a
> polynomial p(x) of degree m that fits the data, p(x[i]) to y[i], in a
> least squares sense. The result c is a vector of length m+1 containing
> the polynomial coefficients in descending powers:
> p(x) = c[1]*x^n + c[2]*x^(n-1) + ... + c[n]*x + c[n+1]
>
> For prediction, one can then use function polyval like the following:
>
> y0 = polyval( polyfit( x, y, degree ), x0 )
>
> y0 are the prediction values at points x0 using the given polynomial.
>
> In R, we know there is lm for 1-degree polynomial:
> lm( y ~ x ) == polyfit( x, y, 1 )
>
> and for prediction I can just create a function like:
> lsqfit <- function( model, xx ) return( xx * coefficients(model)[2] +
> coefficients(model)[1] );
> and then: y0 <- lsqfit(x0)
> (I've tried with predict.lm( model, newdata=x0 ) but obtain a bad result)
>
> For a degree greater than 1, say m,  what can I use.??
> I've tried with
>   lm( y ~ poly(x, degree=m) )
> I've also looked at glm, nlm, approx, ... but with these I can't
> specify the polynomial degree.
>
> Thank you so much!
>
> Sincerely,
>
> -- Marco

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] an easy way to construct this special matirx

2007-08-16 Thread Prof Brian Ripley
?toeplitz
?lower.tri

since it is the lower triangle of a Toeplitz matrix (or drop the top row)

r <- 0.95
R <- toeplitz(r^(0:4))
R[upper.tri(R)] <- 0
R[-1,]


On Thu, 16 Aug 2007, [EMAIL PROTECTED] wrote:

> Hi,
> Sorry if this is a repost. I searched but found no results.
> I am wondering if it is an easy way to construct the following matrix:
>
> r  1 0 00
> r^2   r 1 00
> r^3   r^2  r 10
> r^4   r^3  r^2  r1
>
> where r could be any number. Thanks.
> Wen

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] binomial simulation

2007-08-16 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Moshe Olshansky wrote:

> Thank you - I wasn't aware of this function.
> One can even use lchoose which allows really huge
> arguments (more than 2^1000)!

Using dbinom() for binomial probabilities would be even better, 
and that has a log=TRUE argument to return results on natural log scale.

> dbinom(k,N,p,log=TRUE) + dbinom(m,k,q,log=TRUE)
[1] -92.52584
> log(choose(N,k)*p^k*(1-p)^(N-k)*choose(k,m)*q^m*(1-q)^(k-m))
[1] -92.52584

>
> --- "Lucke, Joseph F" <[EMAIL PROTECTED]>
> wrote:
>
>> C is an R function for setting contrasts in a
>> factor.  Hence the funky
>> error message.
>> ?C
>>
>> Use choose() for your C(N,k)
>> ?choose
>>
>> choose(200,2)
>> 19900
>>
>> choose(200,100)
>>  9.054851e+58
>>
>> N=200; k=100; m=50; p=.6; q=.95
>>
> choose(N,k)*p^k*(1-p)^(N-k)*choose(k,m)*q^m*(1-q)^(k-m)
>> 6.554505e-41
>>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf
>> Of Moshe Olshansky
>> Sent: Wednesday, August 15, 2007 2:06 AM
>> To: sigalit mangut-leiba; r-help
>> Subject: Re: [R] binomial simulation
>>
>> No wonder that you are getting overflow, since
>> gamma(N+1) = n! and 200! > (200/e)^200 > 10^370.
>> There exists another way to compute C(N,k). Let me
>> know if you need this
>> and I will explain to you how this can be done.
>> But do you really need to compute the individual
>> probabilities? May be
>> you need something else and there is no need to
>> compute the individual
>> probabilities?
>>
>> Regards,
>>
>> Moshe.
>>
>> --- sigalit mangut-leiba <[EMAIL PROTECTED]> wrote:
>>
>>> Thank you,
>>> I'm trying to run the joint probabilty:
>>>
>>> C(N,k)*p^k*(1-p)^(N-k)*C(k,m)*q^m*(1-q)^(k-m)
>>>
>>> and get the error: Error in C(N, k) : object not
>> interpretable as a
>>> factor
>>>
>>> so I tried the long way:
>>>
>>> gamma(N+1)/(gamma(k+1)*(gamma(N-k)))
>>>
>>> and the same with k, and got the error:
>>>
>>> 1: value out of range in 'gammafn' in: gamma(N +
>> 1)
>>> 2: value out of range in 'gammafn' in: gamma(N -
>> k) 
>>>
>>> Do you know why it's not working?
>>>
>>> Thanks again,
>>>
>>> Sigalit.
>>>
>>>
>>>
>>> On 8/14/07, Moshe Olshansky
>> <[EMAIL PROTECTED]>
>>> wrote:

 As I understand this,
 P(T+ | D-)=1-P(T+ | D+)=0.05
 is the probability not to detect desease for a
>>> person
 at ICU who has the desease. Correct?

 What I asked was whether it is possible to
>>> mistakenly
 detect the desease for a person who does not
>> have
>>> it?

 Assuming that this is impossible the formula is
>>> below:

 If there are N patients, each has a probability
>> p
>>> to
 have the desease (p=0.6 in your case) and q is
>> the probability to
 detect the desease for a person who
>>> has
 it (q = 0.95 for ICU and q = 0.8 for a regular
>>> unit),
 then

 P(k have the desease AND m are detected) = P(k
>> have the desease)*P(m
>>
 are detected / k have
>>> the
 desease) =
 C(N,k)*p^k*(1-p)^(N-k)*C(k,m)*q^m*(1-q)^(k-m)
 where C(a,b) is the Binomial coefficient "a
>> above
>>> b" -
 the number of ways to choose b items out of a
>>> (when
 the order does not matter). You of course must
>>> assume
 that N >= k >= m >= 0 (otherwise the probability
>>> is
 0).

 To generate such pairs (k infected and m
>> detected)
>>> you
 can do the following:

 k <- rbinom(N,1,p)
 m <- rbinom(k,1,q)

 Regards,

 Moshe.

 --- sigalit mangut-leiba <[EMAIL PROTECTED]>
>>> wrote:

> Hi,
> The probability of false detection is: P(T+ |
>> D-)=1-P(T+ |
> D+)=0.05.
> and I want to find the joint probability
> P(T+,D+)=P(T+|D+)*P(D+)
> Thank you for your reply,
> Sigalit.
>
>
> On 8/13/07, Moshe Olshansky
>>> <[EMAIL PROTECTED]>
> wrote:
>>
>> Hi Sigalit,
>>
>> Do you want to find the probability P(T+ = t
>>> AND
> D+ =
>> d) for all the combinations of t and d (for
>>> ICU
> and
>> Reg.)?
>> Is the probability of false detection (when
>>> there
> is
>> no disease) always 0?
>>
>> Regards,
>>
>> Moshe.
>>
>> --- sigalit mangut-leiba <[EMAIL PROTECTED]>
> wrote:
>>
>>> hello,
>>> I asked about this simulation a few days
>>> ago,
> but
>>> still i can't get what i
>>> need.
>>> I have 2 units: icu and regular. from icu
>> I
>>> want
> to
>>> take 200 observations
>>> from binomial distribution, when
>> probability
>>> for
>>> disease is: p=0.6.
>>> from regular I want to take 300
>> observation
>>> with
> the
>>> same probability: p=0.6
>>> .
>>> the distribution to detect disease when
>>> disease
>>> occurred- *for someone from
>>> icu* - is: p(T+ | D+)=0.95.
>>> the distribution to detect disease when
>>> disease
>>> occurred- *for someone from
>>> reg.unit* - is: p(T+ | D+)=0.8.
>>> I want to compute the joint d

Re: [R] getting lapply() to work for a new class

2007-08-15 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Pijus Virketis wrote:

> Thank you.
>
> When I tried to set as.list() in baseenv(), I learned that its bindings
> are locked.

Of course.  Did you not see my comment about 'to protect code against 
redefining functions'?

> Does this mean that the thing to do is just to write my own
> "lapply", which does the coercion using my "private" as.list(), and then
> invokes the base lapply()?

I believe 'the thing to do' is to call your as.list explicitly.  After 
all, the first 'l' in lapply means 'list', so it is 'natural' to call 
it on a list.

And please do NOT edit other people's messages without indication: the R 
posting guide covers that and it is a copyright violation.

>
> -P
>
> -Original Message-

Not so: an EDITED version of my message.

> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, August 15, 2007 5:18 PM
>
>> As far as I can tell, lapply() needs the class to be coercible to a
>> list. Even after I define as.list() and as.vector(x, mode="list")
>> methods, though, I still get an "Error in as.vector(x, "list") :
>> cannot coerce to vector". What am I doing wrong?
>
> Not considering namespaces.  Setting an S4 method for as.list() creates
> an object called as.list in your workspace, but the lapply function uses
> the as.list in the base namespace.  That's the whole point of
> namespaces: to protect code against redefining functions.
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting lapply() to work for a new class

2007-08-15 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Pijus Virketis wrote:

> I would like to get lapply() to work in the natural way on a class I've
> defined.

What you have not said is that this is an S4 class.

> As far as I can tell, lapply() needs the class to be coercible
> to a list. Even after I define as.list() and as.vector(x, mode="list")
> methods, though, I still get an "Error in as.vector(x, "list") : cannot
> coerce to vector". What am I doing wrong?

Not considering namespaces.  Setting an S4 method for as.list() creates an 
object called as.list in your workspace, but the lapply function uses the 
as.list in the base namespace.  That's the whole point of namespaces: to 
protect code against redefining functions.

This works as documented for S3 methods (since as.list is S3 generic): it 
is a 'feature' of S4 methods that deserves to be much more widely 
understood.


> # dummy class
> setClass("test", representation(test="list"))
>
> # set up as.list()
> test.as.list <- function(x) x@test
> setMethod("as.list", signature(x="test"), test.as.list)
>
> # set up as.vector(x, mode="list")
> test.as.vector <- function(x, mode) x@test
> setMethod("as.vector", signature(x="test", mode="character"),
> test.as.vector)
>
> obj <- new("test", test=list(1, 2, 3))
>
> # this produces "Error in as.vector(x, "list") : cannot coerce to
> vector" on R 2.4.1
> lapply(obj, print)
>
> # these work
> lapply(as.list(obj), print)
> lapply(as.vector(obj, "list"), print)
>
> Thank you,
>
> Pijus

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in building R

2007-08-15 Thread Prof Brian Ripley
On Wed, 15 Aug 2007, Giovanni Petris wrote:

>
> Hello,
>
> I am upgrading to the current R 2.5.1 under Sun Solaris 8.

Actually, 2.5.1 is not current: '2.5.1 patched' aka R-patched is and this 
has already been addressed there.

>  I call the configure script with the --without-readline flag, and it 
> works fine. Then, when I invoke make, I get this kind of error messages:
>
>
> make[2]: Entering directory `/usr/local/R/R-2.5.1-inst/src/library'
> >>> Building/Updating help pages for package 'base'
> Formats: text html latex example
> Can't use an undefined value as filehandle reference at 
> /usr/local/R/R-2.5.1-inst/share/perl/R/Rdconv.pm line 78.
> >>> Building/Updating help pages for package 'tools'
> Formats: text html latex example
> Can't use an undefined value as filehandle reference at 
> /usr/local/R/R-2.5.1-inst/share/perl/R/Rdconv.pm line 78.
> >>> Building/Updating help pages for package 'utils'
> Formats: text html latex example
> Can't use an undefined value as filehandle reference at 
> /usr/local/R/R-2.5.1-inst/share/perl/R/Rdconv.pm line 78.
>
>
> (I don't know if this has to do with perl, but I have version 5.005_03)

It does.  My memory is that version of Perl predates Solaris 8 (it comes 
from the 1990's).  You need Perl >= 5.6.1, and I would suggest installing 
Perl 5.8.x (which is already 6 years old) as the next version of R will 
require it.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] installation of packages

2007-08-14 Thread Prof Brian Ripley
Please see the discussion in the rw-FAQ.

On Wed, 15 Aug 2007, [EMAIL PROTECTED] wrote:

> Dear All,
>
> Have just installed v2.5.1 on Windows XP. Works fine but I had quite a few
> packages loaded for 2.5.0 (from contributed) and was wondering how I can
> get 2.5.1 to recognise them without having to reinstall them all.
>
> Is this possible or do I have to reinstall all the packages again?
>
> I required 2.5.1 for lme4 and matrix.
>
> Many thanks in advance.
>
> 
>
> Regards
>
> Robin Dobos,
> Livestock Research Officer (Livestock Production Systems),
> Beef Industry Centre of Excellence,
> NSW Department of Primary Industries,
> Armidale, NSW, Australia, 2351
>
> ph:  +61 2 6770 1824
> fax:  +61 2 6770 1830
> mobile: 0431 391 885
> email: [EMAIL PROTECTED]
>
> If we knew what it was we were doing,
> it would not be called research, would it?
>
> Albert Einstein
>
>
> This message is intended for the addressee named and may con...{{dropped}}
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mann-Whitney U

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Natalie O'Toole wrote:

> Hi,
>
> Could someone please tell me how to perform a Mann-Whitney U test on a
> dataset with 2 groups where one group has more data values than another?
>
> I have split up my 2 groups into 2 columns in my .txt file i'm using with
> R. Here is the code i have so far...
>
> group1 <- c(LeafArea2)
> group2 <- c(LeafArea1)
> wilcox.test(group1, group2)
>
> This code works for datasets with the same number of data values in each
> column, but not when there is a different number of data values in one
> column than another column of data.

There is an example of that scenario on the help page for wilcox.test, so 
it does 'work'.  What exactly went wrong for you?
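
For what it's worth, a minimal sketch (made-up numbers, not your data) 
showing that unequal group sizes are accepted:

g1 <- c(2.1, 3.5, 4.2, 5.0, 2.8)   # 5 values
g2 <- c(3.0, 4.1, 2.2)             # 3 values: a different length is fine
wilcox.test(g1, g2)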

> Is the solution that i have to have a null value in the data column with
> the fewer data values?
>
> I'm testing for significant differences between the 2 groups, and the
> result i'm getting in R with the uneven values is different from what i'm
> getting in SPSS.

We need a worked example.  As the help page says, definitions do differ. 
If you can provide a reproducible example in R and the output from SPSS we 
may be able to tell you how to relate that to what you see in R.

[...]

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

As it says, we really need such code (and the output you get) to be able 
to help you.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] weights in GAMs (package mgcv)

2007-08-14 Thread Prof Brian Ripley
Let's simplify to a linear model.  If your covariates have uncertainties, 
most likely a linear regression is not appropriate.  This sounds like an 
'errors in measurements' model, as covered in

@Book{Fuller.87,
   author   = "Fuller, Wayne A.",
   title= "Measurement Error Models",
   publisher= "John Wiley and Sons",
   address =  "New York",
   year = "1987",
   ISBN = "0-471-86187-1",
}

in which there is a true covariate that enters the model, but it is only 
observed with measurement error (or similar scenarios).

This is hard enough for linear models, without thinking about non-normal 
models or extensions beyond linear predictors.  The GLM (including GAM) 
estimation process assumes various things, including that the covariates 
that enter into the model are fixed (possibly by conditioning on them) and 
known.

On Tue, 14 Aug 2007, Julian Burgos wrote:

> Dear list,
>
> I'm using the 'mgcv' package to fit some GAMs. Some of my covariates are
> derived quantities and have an associated standard error, so I would
> like to incorporate this uncertainty into the GAM estimation process.
> Ideally, during the estimation process less importance would be given to
> observations whose covariates have high standard errors.
>
> The gam() function in the 'mgcv' package has a 'weights' argument.
> According to the package documentation, this can be used to provide
> prior weights to the data. This argument (as far as I understand) takes
> a vector of the same length of the data with numeric values higher than
> zero. So it seems that I should combine the standard errors of all
> covariates into a single vector and use it as weights. But it is not
> obvious to me how to do this, given that the covariates have different
> units and ranges of values.

Actually this is just taken from glm(), and case weights are part of the 
definition of a GLM.  In so far as I understand your scenario, you do not 
have a GLM.

> Is there any way to provide weights to the covariates directly (for
> example providing a matrix of n x m values, where n=number of covariates
> and m=number of observations)?
>
> Thanks,
>
> Julian
>
> Julian M. Burgos
>
> Fisheries Acoustics Research Lab
> School of Aquatic and Fishery Science
> University of Washington
>
> 1122 NE Boat Street
> Seattle, WA  98105
>
> Phone: 206-221-6864
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

2007-08-14 Thread Prof Brian Ripley

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:


Dear Professor Ripley,

Thank you very much for your response. I sent the problem, as I didn't 
have any more ideas where to search for the reason. I didn't say this is 
an R bug, knowing the responses to such mails.-)


But I succeeded in developing a tiny example, that reproduces the bug 
(wherever it is).


Thank you, that was helpful: much easier to follow than the previous code.

...


library(RODBC)
.con <- odbcConnectAccess("./test2.mdb")
(.d <- try(sqlQuery(.con, "select * from Tab1")))

 F1 NO F2
1  1  1  1
2  2  2  2
3  0 NA  1
4  1  0  0

(.d <- try(sqlQuery(.con, "select F1 , NO , F2 from Tab1")))

  F1 Expr1001 F2
1  1        0  1
2  2        0  2
3  0        0  1
4  1        0  0

close(.con)


So the problem occurs if the column names are specified within the query.
Is the query "select F1 , NO , F2 from Tab1" invalid?


I believe so. 'NO' is an SQL92 and ODBC reserved word, at least according 
to http://www.bairdgroup.com/reservedwords.cfm


See also http://support.microsoft.com/default.aspx?scid=kb;en-us;286335
which says

  For existing objects with names that contain reserved words, you can
  avoid errors by surrounding the object name with brackets ([ ]).

and lists 'NO' as a reserved word.  RODBC quotes all column names it uses 
to be sure (and knows about most non-standard quoting mechanisms from the 
ODBC driver in use).  But this was a query you generated and so you need 
to do the quoting.
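
A minimal sketch of such a quoted query, reusing the connection and table 
from the example above:

sqlQuery(.con, "select F1, [NO], F2 from Tab1")  # brackets protect the reserved word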


Regarding the memory issue, I _knew_ that there must be a reason for 
running out of memory space. Sorry for not being more specific. My 
question then is:


Is there a way to 'reset' the environment without quitting R and 
restarting it?


Sorry, no.  You cannot move objects in memory.

But why '447Mb' is coming up is still unexplained, and suggests that the 
machine has a peculiar amount of memory or some flag has been used.





Thank you for your help.

Kind regards,
Maciej


-Original Message-
From: Prof Brian Ripley [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 14 August 2007 11:51
To: Maciej Hoffman-Wecker
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Import of Access data via RODBC changes column name ("NO" to 
"Expr1014") and the content of the column

On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:



Dear all,

I have some problems with importing data from an Access data base via
RODBC to R. The data base contains several tables, which all are
imported consecutively. One table has a column with column name "NO".
If I run the code attached on the bottom of the mail I get no
complain, but the column name (name of the respective vector of the
data.frame) is "Expr1014" instead of "NO". Additionally the original
column (type
"text") containes "0"s and missings, but the imported column contains
"0"s only (type "int"). If I change the column name in the Access data
base to "NOx", the import works fine with the right name and the same
data.

Previously I generated a tiny Access data base which reproduced the
problem. To be on the safe side I installed the latest version (2.5.1)
and now the example works fine, but within my production process the
error still remains. An import into excel via ODBC works fine.

So there is no way to figure it out whether this is a bug or a
feature.-)


It's most likely an ODBC issue, but you have not provided a reproducible 
example.


The second problem I have is that when I rerun "rm(list = ls(all =
T)); gc()" and the import several times I get the following error:

Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
memory In addition: Warning messages:
1: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time)
2: Reached total allocation of 447Mb: see help(memory.size) in:
odbcQuery(channel, query, rows_at_time)
3: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel)
4: Reached total allocation of 447Mb: see help(memory.size) in:
odbcTables(channel)

which is surprising to me, as the first two statements should delete
all


How do you _know_ what they 'should' do?  That only deletes all objects in the 
workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?"Memory-limits" for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, buy 
some more.



objects and recover the memory. Is this only a matter of memory? Is
there any logging that reduces the memory? Or is this issue connected to
the upper problem?

I added the code on the bottom - maybe there is some kind of misuse I
lost sight of. Any hints are appreciated.

Kind regards,
Maciej


version

  _
platform   i386-

Re: [R] glm(family=binomial) and lmer

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Chris O'Brien wrote:

> Dear R users,
>
> I've notice that there are two ways to conduct a binomial GLM with binomial
> counts using R.  The first way is outlined by Michael Crawley in his
> "Statistical Computing book" (p 520-521):

and in the places he got it from (it is not his original work).

These are not the only two ways, and they are not the same analyses as the 
saturated models differ.  The usual way to use weights is

y <- dead/batch
model3 <- glm(y ~ log(dose), binomial, weights=batch)
summary(model3)

and internally glm converts models with a two-column response to this 
form, for it is in this form the binomial fits into the GLM framework.

See the White Book or MASS (even the 1994 edition).


> >dose=c(1,3,10,30,100)
> >dead = c(2,10,40,96,98)
> >batch=c(100,90,98,100,100)
> >response = cbind(dead,batch-dead)
> >model1=glm(y~log(dose),binomial)
> >summary(model1)
>
> Which returns (in part):
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -4.5318 0.4381  -10.35   <2e-16 ***
> log(dose) 1.9644 0.1750   11.22   <2e-16 ***
> Null deviance: 408.353  on 4  degrees of freedom
> Residual deviance:  10.828  on 3  degrees of freedom
> AIC: 32.287
>
> Another way to do the same analysis is to reformulate the data, and use GLM
> with weights:
>
> >y1=c(rep(0,5),rep(1,5))
> >dose1=rep(dose,2)
> >number = c(batch-dead,dead)
> >data1=as.data.frame(cbind (y1,dose,number))
> >model2=glm(y1~log(dose1),binomial,weights=number,data=data1)
> >summary(model2)
>
> Which returns:
>
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -4.5318 0.4381  -10.35   <2e-16 ***
> log(dose1)    1.9644 0.1750   11.22   <2e-16 ***
> (Dispersion parameter for binomial family taken to be 1)
> Null deviance: 676.48  on 9  degrees of freedom
> Residual deviance: 278.95  on 8  degrees of freedom
> AIC: 282.95
>
> Number of Fisher Scoring iterations: 6
>
> These two methods are similar in the parameter estimates and standard
> errors, however the deviances, their d.f., and AIC differ.  I take the
> first method to be the correct one.

This form has ten observations of groups with weights 2,98,10,80 

> However, I'm really interested in conducting a GLM binomial mixed model,
> and I am unable to figure out how to use the first method with the lmer
> function from the lme4 library, e.g.
>
> >model3=lmer(y~log(dose)+time|ID)# the above example data doesn't have
> the random effect, but my own data set does.
>
>   Does anyone have any suggestions?
>
> thanks,
> chris
>
> Thanks,
> Chris O'Brien
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] {grid} plain units with non NULL data arguments

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Wolfram Fischer wrote:

> In help(unit) I read:
>
> The 'data' argument must be a list when the 'unit.length()'
> is greater than 1.  For example, 'unit(rep(1, 3), c("npc",
> "strwidth", "inches"), data=list(NULL, "my string", NULL))'.
>
> In the newest R-versions it is not anymore allowed to let strings
> in the data-argument for plain units, otherwise one gets the
> following error:
>Non-NULL value supplied for plain unit
>
> I have some labels. Between them I wanted to set a distance of 1.5 lines.
> (I wanted to use that for a grid.layout for a legend:
> The space is for the symbols.)
>
>labels <- c( ':', 'a', 'bb', 'ccc', '', 'e' )
>n <- length( labels )
>s <- as.list( c( labels[1], rep( labels[-1], each=2 ) ) )
>u <- unit( data=s, x=c( 1, rep( c( 1.5, 1 ), n-1 ) ),
>units=c( 'strwidth', rep( c( 'lines', 'strwidth' ), n-1 ) ) )
>
> How can I insert the NULL values into the list ``s''?
>
> To fill every second element of s with NULL, I tried:
>s[ 2 * ( 1 : length( labels[-1] ) ) ] <- NULL
> But this deletes every second element.

A value of list(NULL) is correct for inserting NULLs into lists.
(More generally to substitute in a list you need a list value.)

> The following would work:
>s[ 2 * ( 1 : length( labels[-1] ) ) ] <- NA
> But unit() does not accept NAs.

More to the point, it does not accept logical vectors as NULL values.
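
A small sketch of that substitution, using the objects from your example 
(labels, n, s):

s[2 * seq_len(n - 1)] <- list(NULL)   # a list value inserts real NULL elements
length(s)                             # unchanged: the elements are kept, now NULL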

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] graph dimensions default

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Simon Pickett wrote:

> Yes,
>
> Thankyou, that does the trick nicely. I thought that kind of thing could
> be specified using par() but I guess not.

As I said, size is not a property of the plot.
And par() applies to the current device, not future ones.

>
> Thanks again.
>
>
>
>> On Tue, 14 Aug 2007, Simon Pickett wrote:
>>
>>> Hi,
>>>
>>> I would like to (if possible) set the default width and height for
>>> graphs
>>> at the start of each session and have each new graphic device overwrite
>>> the previous one.
>>
>> Hmm.  It is graphics devices that have dimensions, and plots that
>> overwrite other plots on a device, so your intentions are pretty unclear.
>> (If you resize a device window the plot dimensions change so they are not
>> intrinsic to the plot.)
>>
>> If you want the default behaviour to be like normal but with, say, a wider
>> onscreen device window you can have (on Windows, which you didn't say)
>>
>> mywindows <- function(...) windows(width=10, height=6, ...)
>> options(device="mywindows")
>>
>> in your ~/.Rprofile .  Otherwise, please try again to tell us what you
>> do want.
>>
>>
>>>
>>> I only know how to do this using windows(width=,height=...) which opens
>>> up
>>> a new plotting device every time, so I end up with lots of graphs all
>>> over
>>> the place until I get the one I want!
>>>
>>> Thanks in advance,
>>>
>>> Simon
>>>
>>>
>>> Simon Pickett
>>> PhD student
>>> Centre For Ecology and Conservation
>>> Tremough Campus
>>> University of Exeter in Cornwall
>>> TR109EZ
>>> Tel 01326371852
>>>
>>> http://www.uec.ac.uk/biology/research/phd-students/simon_pickett.shtml
>>>
>>> __
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> --
>> Brian D. Ripley,  [EMAIL PROTECTED]
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UKFax:  +44 1865 272595
>>
>
>
> Simon Pickett
> PhD student
> Centre For Ecology and Conservation
> Tremough Campus
> University of Exeter in Cornwall
> TR109EZ
> Tel 01326371852
>
> http://www.uec.ac.uk/biology/research/phd-students/simon_pickett.shtml
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear Regression with slope equals 0

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, [EMAIL PROTECTED] wrote:

>
> Hi there, am trying to run a linear regression with a slope of 0.
>
> I have a dataset as follows
>
> t d
> 1 303
> 2 302
> 3 304
> 4 306
> 5 307
> 6 303
>
> I would like to test the significance that these points would lie on a
> horizontal straight line.
>
> The standard regression lm(d~t) doesn't seem to allow the slope to be set.

lm(d ~ 1) does, though, to zero.

More generally you can use offset(), e.g. lm(d ~ offset(7*t)) forces a 
slope of 7.
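
A minimal sketch with the data shown, comparing the zero-slope fit to the 
unconstrained one:

t <- 1:6
d <- c(303, 302, 304, 306, 307, 303)
fit0 <- lm(d ~ 1)    # slope fixed at 0 (intercept only)
fit1 <- lm(d ~ t)    # slope estimated
anova(fit0, fit1)    # F-test of the zero-slope restriction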

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] graph dimensions default

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Simon Pickett wrote:

> Hi,
>
> I would like to (if possible) set the default width and height for graphs
> at the start of each session and have each new graphic device overwrite
> the previous one.

Hmm.  It is graphics devices that have dimensions, and plots that 
overwrite other plots on a device, so your intentions are pretty unclear. 
(If you resize a device window the plot dimensions change so they are not 
intrinsic to the plot.)

If you want the default behaviour to be like normal but with, say, a wider 
onscreen device window you can have (on Windows, which you didn't say)

mywindows <- function(...) windows(width=10, height=6, ...)
options(device="mywindows")

in your ~/.Rprofile .  Otherwise, please try again to tell us what you 
do want.


>
> I only know how to do this using windows(width=,height=...) which opens up
> a new plotting device every time, so I end up with lots of graphs all over
> the place until I get the one I want!
>
> Thanks in advance,
>
> Simon
>
>
> Simon Pickett
> PhD student
> Centre For Ecology and Conservation
> Tremough Campus
> University of Exeter in Cornwall
> TR109EZ
> Tel 01326371852
>
> http://www.uec.ac.uk/biology/research/phd-students/simon_pickett.shtml
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Import of Access data via RODBC changes column name ("NO" to "Expr1014") and the content of the column

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Maciej Hoffman-Wecker wrote:

>
> Dear all,
>
> I have some problems with importing data from an Access data base via
> RODBC to R. The data base contains several tables, which all are
> imported consecutively. One table has a column with column name "NO". If
> I run the code attached on the bottom of the mail I get no complain, but
> the column name (name of the respective vector of the data.frame) is
> "Expr1014" instead of "NO". Additionally the original column (type
> "text") containes "0"s and missings, but the imported column contains
> "0"s only (type "int"). If I change the column name in the Access data
> base to "NOx", the import works fine with the right name and the same
> data.
>
> Previously I generated a tiny Access data base which reproduced the
> problem. To be on the safe side I installed the latest version (2.5.1)
> and now the example works fine, but within my production process the
> error still remains. An import into excel via ODBC works fine.
>
> So there is no way to figure it out whether this is a bug or a
> feature.-)

It's most likely an ODBC issue, but you have not provided a reproducible 
example.

> The second problem I have is that when I rerun "rm(list = ls(all = T));
> gc()" and the import several times I get the following error:
>
> Error in odbcTables(channel) : Calloc could not allocate (263168 of 1)
> memory
> In addition: Warning messages:
> 1: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 2: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcQuery(channel, query, rows_at_time)
> 3: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
> 4: Reached total allocation of 447Mb: see help(memory.size) in:
> odbcTables(channel)
>
> which is surprising to me, as the first two statements should delete all

How do you _know_ what they 'should' do?  That only deletes all objects in 
the workspace, not all objects in R, and not all memory blocks used by R.

Please do read ?"Memory-limits" for the possible reasons.

Where did '447Mb' come from?  If this machine has less than 2Gb of RAM, 
buy some more.


> objects and recover the memory. Is this only a matter of memory? Is
> there any logging that reduces the memory? Or is this issue connected to
> the upper problem?
>
> I added the code on the bottom - maybe there is some kind of misuse I
> lost sight of. Any hints are appreciated.
>
> Kind regards,
> Maciej
>
>> version
>   _
> platform   i386-pc-mingw32
> arch   i386
> os mingw32
> system i386, mingw32
> status
> major  2
> minor  5.1
> year   2007
> month  06
> day27
> svn rev42083
> language   R
> version.string R version 2.5.1 (2007-06-27)
>
>
> ## code
>
> get.table <- function(name, db, drop = NULL){
>  .con <- try(odbcConnectAccess(db), silent = T)
>  if(!inherits(.con, "RODBC")) return(.con)
>  ## exclude memo columns
>  .t <- try(sqlColumns(.con, name))
>  if(inherits(.t, "try-error")){close(.con); return(.t)}
>  .t <- .t[.t$"COLUMN_SIZE" < 255, "COLUMN_NAME"]
>  .t <- paste(.t, collapse = ",")
>  ## get table
>  .t <- paste("select", .t, "from", name)
>  .d <- try(sqlQuery(.con, .t), silent = T)
>  if(inherits(.d, "try-error")){close(.con); return(.d)}
>  .con <- try(close(.con), silent = T)
>  if(inherits(.con, "try-error")) return(.con)
>  .d <- .d[!names(.d) %in% drop]
>  return(.d)
> }
>
> get.alltables <- function(db){
>  .con <- try(odbcConnectAccess(db), silent = T)
>  if(!inherits(.con, "RODBC")) return(.con)
>  .tbls <- try(sqlTables(.con)[["TABLE_NAME"]])
>  if(inherits(.tbls, "try-error")){close(.con); return(.tbls)}
>  .con <- try(close(.con), silent = T)
>  if(inherits(.con, "try-error")) return(.con)
>  .tbls <- .tbls[-grep("^MSys", .tbls)]
>  .d <- lapply(seq(along = .tbls), function(.i){
>.d <-
>  try(get.table(.tbls[.i], db = db))
>return(invisible(.d))
>  })
>  names(.d) <- .tbls
>  .ok <- !sapply(.d, inherits, "try-error")
>  return(list(notdone = .d[!.ok], data = .d[.ok]))
> }
>
> library(RODBC)
>
> alldata <- get.alltables(db = "./myaccessdb.MDB")
>
> ## code end
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html

Re: [R] cov.unscaled in gls object

2007-08-14 Thread Prof Brian Ripley
This is what the vcov() generic is for.  You are asking for internal 
details from a different class ("summary.lm").

On Tue, 14 Aug 2007, Sven Garbade wrote:

> Hi list,
>
> can I extract the cov.unscaled ("the unscaled covariance matrix") from a
> gls fit (package nlme), like with summary.lm? Background: In a fixed
> effect meta analysis regression the standard errors of the coefficients
> can be computed as sqrt(diag(cov.unscaled)) where cov.unscaled is
> (X'WX). I try do do this with a gls-fit.

I don't think so: the 'unscaled' is a clue.  The vcov method is

> stats:::vcov.lm
function (object, ...)
{
 so <- summary.lm(object, corr = FALSE)
 so$sigma^2 * so$cov.unscaled
}
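
So for a gls fit the analogous sketch is just (assuming 'fit' is your gls 
object and that nlme's vcov method for gls is used):

sqrt(diag(vcov(fit)))   # standard errors of the coefficients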

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] invert 160000x160000 matrix

2007-08-14 Thread Prof Brian Ripley
On Tue, 14 Aug 2007, Patnaik, Tirthankar  wrote:

> A variety of tricks would need to be used to invert a matrix of this 
> size. If there are any other properties of the matrix that you know 
> (symmetric, positive definite, etc, sparse) then they could be useful 
> too. You could partition the matrix first, then use an in-place inverse 
> technique for each block to individually calculate the blocks-inverses, 
> then combine to get the inverse of the initial matrix. Again, if the 
> implementation is actually solving an Ax-B = 0 system of equations, then 
> there are specific methods for these too, like an LU decomp, for 
> instance. You might also want to check out some texts for this, like the 
> Numerical Recipes.

> How's the matrix stored right now?

Well, not in R as a matrix: see ?"Memory-limits".  It is about 12x larger 
than the largest possible matrix in R.

>
> Best,
> -Tir
>
> Tirthankar Patnaik
> India Strategy
> Citigroup Investment Research
> +91-22-6631 9887
>
>> -Original Message-
>> From: [EMAIL PROTECTED]
>> [mailto:[EMAIL PROTECTED] On Behalf Of Moshe Olshansky
>> Sent: Tuesday, August 14, 2007 6:40 AM
>> To: Paul Gilbert; Jiao Yang
>> Cc: r-help@stat.math.ethz.ch
>> Subject: Re: [R] invert 160000x160000 matrix
>>
>> While inverting the matrix may be a problem, if you need to
>> solve an equation A*x = b you do not need to invert A, there
>> exist iterative methods which do need A or inv(A) - all you
>> need to provide is a function that computes A*x for an
>> arbitrary vector x.
>> For such a large matrix this may be slow but possible.
>>
>> --- Paul Gilbert <[EMAIL PROTECTED]>
>> wrote:
>>
>>> I don't think you can define a matrix this large in R, even if you
>>> have the memory. Then, of course, inverting it there may be other
>>> programs that have limitations.
>>>
>>> Paul
>>>
>>> Jiao Yang wrote:

 Can R invert a 160000x160000 matrix with all
>>> positive numbers?  Thanks a lot!

>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question regarding is.factor()

2007-08-13 Thread Prof Brian Ripley
typeof() for 'types'.

However, "factor" is not a type but a class, so class() is probably what 
you want.
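
A quick sketch (the data frame here is made up):

DF <- data.frame(x = c(0.7, 1.3), g = factor(c("a", "b")))
class(DF$g)         # "factor"
typeof(DF$g)        # "integer" -- the underlying storage
sapply(DF, class)   # classes of all columns at once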

On Mon, 13 Aug 2007, Jabez Wilson wrote:

> Dear all, please help with what must be a straightforward question which 
> I can't answer.

But 'An Introduction to R' could.

>  I add a column of my dataframe as factor of an existing column e.g.
>
>  df[,5] <- factor(df[,2])
>
>  and can test that it is by is.factor()
>
>  but if I did not know in advance what "types" the columns were, is 
> there a function to tell me what they are.
>
>  i.e. instead of is.factor(), is.matrix(), is.list(), a function more 
> like what.is()
>
>
> -
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A clean way to initialize class slot of type "numeric" vector

2007-08-12 Thread Prof Brian Ripley
Well, c() is NULL, so R did as you asked it to.  See ?integer: an integer 
vector of length 0 can be gotten by integer(0) (and other ways).

If you want integers, why have a slot which is numeric?

> setClass("foo", representation(members="integer"))
[1] "foo"
> new("foo")
An object of class "foo"
Slot "members":
integer(0)

is the natural and simpler way to do this.


On Mon, 13 Aug 2007, [EMAIL PROTECTED] wrote:

> Hi,
>
> I have a class definition like this:
>
> setClass("foo", representation(members="numeric"),
>   prototype(members=c()))
>
> I intend my class to have members, a slot whose value should be a vector 
> of integer. When I initialize this class, I don't have any member yet. 
> So my member is blank. But if I run the above definition into R, it will 
> complain that my slot members is assigned to NULL which does not extend 
> class "numeric". So how can I fix this? Is there any clean way to do 
> this? This is quite a common situation but I can't seem to find a way 
> out. Any help would be really appreciated. Thank you.
>
> - adschai
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert factor to numeric vector of labels

2007-08-12 Thread Prof Brian Ripley

See the FAQ Q7.10 (and please study the posting guide)
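
In short, a sketch ('f' stands for one of your factor columns):

f <- factor(c("0.71", "1.34", "2.61"))
as.numeric(levels(f))[f]      # the FAQ 7.10 recipe
as.numeric(as.character(f))   # equivalent, slightly less efficient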

On Sun, 12 Aug 2007, Falk Lieder wrote:


Hi,

I have imported a data file to R. Unfortunately R has interpreted some
numeric variables as factors. Therefore I want to reconvert these to numeric
vectors whose values are the factor levels' labels. I tried
as.numeric(),
but it returns a vector of factor levels (i.e. 1,2,3,...) instead of labels
(i.e. 0.71, 1.34, 2.61,…).
What can I do instead?

Best wishes, Falk

[[alternative HTML version deleted]]



PLEASE do read the posting guide http://www.R-project.org/posting-guide.html


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Connecting to database on startup

2007-08-11 Thread Prof Brian Ripley
On Sat, 11 Aug 2007, Ruddy M wrote:

> Hello,
> Q/ Is it possible to create a DBMS connection automatically on startup of R? 
> (Making sure of course that the db server has been started...)
> I am running MySQL on Mac OS X 10.4.2 with R2.4.1.
>
> I have tried to write a function using the RMySQL commands (below) and place 
> them in .First of .RProfile:
>
> drv <- dbDriver("MySQL")
> dbcon <- dbConnect(drv, {other parameters present in my.cnf file} 
> dbname="mydbName")
>
> DOES create a connection when entered into my R console individually but NOT 
> when I place them in a function, i.e.,
>
> condb <- function() {
>   drv <- dbDriver("MySQL")
>   dbcon <- dbConnect(drv, dbname="mydbName")
>   dbGetInfo(db)
>   }
>
> When the function is called, the dbGetInfo(dbcon) does return connection 
> info but no connection object is present.

What do you think the return value of this function is?

You need to return dbcon, not the value of dbGetInfo().  Perhaps you meant to print the latter?  If so, you need an 
explicit print() statement.
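
A minimal sketch of the corrected function (connection details omitted as 
in your message):

condb <- function() {
   drv <- dbDriver("MySQL")
   dbcon <- dbConnect(drv, dbname="mydbName")
   print(dbGetInfo(dbcon))   # show the info explicitly
   dbcon                     # return the connection object
}
dbcon <- condb()             # e.g. in .First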


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-excel

2007-08-10 Thread Prof Brian Ripley

On Fri, 10 Aug 2007, Peter Wickham wrote:


I am running R 2.5.1 using Mac OSX 10.4.10. "xlsReadWrite" is a Windows
binary. Instead, install and load packages: (1) "gtools":(2) "gdata". These
are both Windows and Mac binaries. "gdata" depends on "gtools", so be sure
to load "gtools" first or set the installation depends parameters. Then you


The R default *is* to install dependencies in R >= 2.5.0.


can use "read.xls".  Thus, in Mac: "data<-read.xls("/Users/your
name/Documents/data.xls",sheet=1). For Windows, substitute the appropriate
filepath and file name in the first argument of "read.xls": e.g.,
"data<-read.xls("A:/filename.xls",sheet-1)". Thanks to correspondents for


You mean sheet=1 

There are other platforms, and the usage of gdata::read.xls is common to 
all platforms.



their advice; but I hope that this may alleviate some of the frustration
(referred to in the R Import/Export Manual) associated with dealing with


That is described in the 'R Data Import/Export Manual' (sic).

It *increases* the frustration of those who WTFM to see it and its 
contents misdescribed in this way.  Further, people who search the list 
archives are liable to make use of buggy posts like this one, so it seems 
necessary to put the corrections and frustration on the record.


Please just point people to the appropriate manual



EXCEL files in R.

Erika Frigo wrote:



Good morning to everybody,
I have a problem : how can I import excel files in R???

thank you very much


Dr.sa. Erika Frigo
Università degli Studi di Milano
Facoltà di Medicina Veterinaria
Dipartimento di Scienze e Tecnologie Veterinarie per la Sicurezza
Alimentare (VSA)

Via Grasselli, 7
20137 Milano
Tel. 02/50318515
Fax 02/50318501
[[alternative HTML version deleted]]


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kde2d error message

2007-08-10 Thread Prof Brian Ripley
If X or Y contains missing values, _you_ supplied missing values as the 
'lims' argument and it will be those missing values that are reported.

I do not see how you expect to be able to do density estimation with 
missing values: they are unknown and so no part of the answer is known. If 
you are prepared to omit them, you can do so but my software (if this is 
indeed kde2d from package MASS, uncredited) does not make such arbitrary 
choices for you.
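
If omitting them is acceptable, a sketch using your X and Y:

ok <- complete.cases(X, Y)
fit <- kde2d(X[ok], Y[ok], n = 100, lims = c(range(X[ok]), range(Y[ok])))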

On Fri, 10 Aug 2007, Jennifer Dillon wrote:

> Hello!
>
> I am trying to do a smooth with the kde2d function,

That is not what the only kde2d function I know of does.

> and I'm getting an error message about NAs.  Does anyone have any 
> suggestions?  Does this function not do well with NAs in general?
>
> fit <- kde2d(X, Y, n=100,lims=c(range(X),range(Y)))
>
> Error in if (from == to || length.out < 2) by <- 1 :
>missing value where TRUE/FALSE needed
>
>
> Thanks in advance!!
>
> Jen
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

PLEASE do as we ask.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cleaning up the memory

2007-08-10 Thread Prof Brian Ripley

On Fri, 10 Aug 2007, Monica Pisica wrote:



Thanks! I will look into ...

I have 4 GB RAM, and i was monitoring the memory with Windows task 
manager so i was looking how R "gets" more and more memory allocation 
from less than 100Mb to  1500Mb .


Then you are almost certainly fragmenting the address space.

We still don't know your OS and whether you have enabled the /3GB switch 
(if relevant to that version of Windows).   Most versions of Windows have 
a 2Gb address space, but some can be as high as 4Gb (Vista 64 which I use 
is one: the details are in the rw-FAQ for the latest versions of R, e.g. 
R-patched and R-devel).  That factor of 2 can make a big difference.


My initial tables are between 30 to 80 Mb and the resulting tables that 
incorporate the initial tables plus PCA and kmeans results are inbetween 
50 to 200MB or thereabouts!


And yes, i don't really care about memory allocation in detail - what i 
want is to free that memory after every cycle ;-)


Although, after i didn't do anything in R and it was idle for more than 
30 min. the memory allocation according to Task manager dropped to 15 Mb 
. which is good - but i cannot wait inbetween cycles half an hour 
though .


Calling gc() will reduce the memory allocation, but that is not the point.
You can have 15Mb allocated and still not have a 50Mb hole in the address 
space (although that would be extremely unlucky; not having several 200Mb 
holes is quite likely).




Again thanks,

Monica

> Date: Fri, 10 Aug 2007 18:28:07 +0100
> From: [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> CC: r-help@stat.math.ethz.ch
> Subject: Re: [R] Cleaning up the memory
>
> On Fri, 10 Aug 2007, Monica Pisica wrote:
>
> >
> > Hi,
> >
> > I have 4 huge tables on which i want to do a PCA analysis and a kmean
> > clustering. If i run each table individually i have no problems, but if
> > i want to run it in a for loop i exceed the memory alocation after the
> > second table, even if i save the results as a csv table and i clean up
> > all the big objects with rm command. To me it seems that even if i don't
> > have the objects anymore, the memory these objects used to occupy is not
> > cleared. Is there any way to clear up the memory as well? I don't want
> > to close R and start it up again. Also i am running R under Windows.
>
> See ?gc, which does the clearing.
>
> However, unless you study the memory allocation in detail (which you
> cannot do from R code), you don't actually know that this is the problem.
> More likely is that you have fragmentation of your 32-bit address space:
> see ?"Memory-limits".
>
> Without any idea what memory you have and what 'huge' means, we can only
> make wild guesses. It might be worth raising the memory limit (the
> --max-mem-size flag).
>
> >
> > thanks,
> >
> > Monica
> > _
> > [[trailing spam removed]]
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> --
> Brian D. Ripley, [EMAIL PROTECTED]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

_________________________________________________________________
Messenger Café - open for fun 24/7. Hot games, cool activities served 
daily. Visit now. http://cafemessenger.com?ocid=TXT_TAGLM_AugWLtagline


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cleaning up the memory

2007-08-10 Thread Prof Brian Ripley
On Fri, 10 Aug 2007, Monica Pisica wrote:

>
> Hi,
>
> I have 4 huge tables on which i want to do a PCA analysis and a kmean 
> clustering. If i run each table individually i have no problems, but if 
> i want to run it in a for loop i exceed the memory alocation after the 
> second table, even if i save the results as a csv table and i clean up 
> all the big objects with rm command. To me it seems that even if i don't 
> have the objects anymore, the memory these objects used to occupy is not 
> cleared. Is there any way to clear up the memory as well? I don't want 
> to close R and start it up again. Also i am running R under Windows.

See ?gc, which does the clearing.

However, unless you study the memory allocation in detail (which you 
cannot do from R code), you don't actually know that this is the problem. 
More likely is that you have fragmentation of your 32-bit address space: 
see ?"Memory-limits".

Without any idea what memory you have and what 'huge' means, we can only 
make wild guesses.  It might be worth raising the memory limit (the 
--max-mem-size flag).

>
> thanks,
>
> Monica
> _
> [[trailing spam removed]]
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reading xcms files

2007-08-10 Thread Prof Brian Ripley
On Fri, 10 Aug 2007, Roberto Olivares Hernandez wrote:

> Hi,
>
> I am using xcms library to read mass spectrum data. I generate objects 
> from CDF files using the command line
>
>>  SME10 <- xcmsRaw("SME_10.CDF")
>
> I have 50 CDF files with different name and I don't want to repeat the 
> command for each one. Is there any option to read all the files and 
> generate a corresponding object name?

Something like

for(f in Sys.glob("*.CDF")) assign(sub("\\.CDF$", "", f), xcmsRaw(f))

(untested, of course).

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-10 Thread Prof Brian Ripley
I don't understand why one would run a 64-bit version of R on a 2GB 
server, especially if one were worried about object size.  You can run 
32-bit versions of R on x86_64 Linux (see the R-admin manual for a 
comprehensive discussion), and most other 64-bit OSes default to 32-bit 
executables.

Since most OSes limit 32-bit executables to around 3GB of address space, 
there starts to become a case for 64-bit executables at 4GB RAM but not 
much case at 2GB.

It was my intention when providing the infrastructure for it that Linux 
binary distributions on x86_64 would provide both 32-bit and 64-bit 
executables, but that has not happened.  It would be possible to install 
ix86 builds on x86_64 if -m32 was part of the ix86 compiler specification 
and the dependency checks would notice they needed 32-bit libraries. 
(I've had trouble with the latter on FC5: an X11 update removed all my 
32-bit X11 RPMs.)

On Fri, 10 Aug 2007, Michael Cassin wrote:

> Thanks for all the comments,
>
> The artificial dataset is as representative of my 440MB file as I could 
> design.
>
> I did my best to reduce the complexity of my problem to minimal
> reproducible code as suggested in the posting guidelines.  Having
> searched the archives, I was happy to find that the topic had been
> covered, where Prof Ripley suggested that the I/O manuals gave some
> advice.  However, I was unable to get anywhere with the I/O manuals
> advice.
>
> I spent 6 hours preparing my post to R-help. Sorry not to have read
> the 'R-Internals' manual.  I just wanted to know if I could use scan()
> more efficiently.
>
> My hurdle seems nothing to do with efficiently calling scan() .  I
> suspect the same is true for the originator of this memory experiment
> thread. It is the overhead of storing short strings, as Charles
> identified and Brian explained.  I appreciate the investigation and
> clarification you both have made.
>
> 56B overhead for a 2 character string seems extreme to me, but I'm not
> complaining. I really like R, and being free, accept that
> it-is-what-it-is.

Well, there are only about 5 2-char strings in an 8-bit locale, so 
this does seem a case for using factors (as has been pointed out several 
times).

And BTW, it is not 56B overhead, but 56B total for up to 7 chars.

> In my case pre-processing is not an option, it is not a one off
> problem with a particular file. In my application, R is run in batch
> mode as part of a tool chain for arbitrary csv files.  Having found
> cases where memory usage was as high as 20x file size, and allowing
> for a copy of the the loaded dataset, I'll just need to document that
> it is possible that files as small as 1/40th of system memory may
> consume it all.  That rules out some important datasets (US Census, UK
> Office of National Statistics files, etc) for 2GB servers.
>
> Regards, Mike
>
>
> On 8/9/07, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>> On Thu, 9 Aug 2007, Charles C. Berry wrote:
>>
>>> On Thu, 9 Aug 2007, Michael Cassin wrote:
>>>
>>>> I really appreciate the advice and this database solution will be useful to
>>>> me for other problems, but in this case I  need to address the specific
>>>> problem of scan and read.* using so much memory.
>>>>
>>>> Is this expected behaviour?
>>
>> Yes, and documented in the 'R Internals' manual.  That is basic reading
>> for people wishing to comment on efficiency issues in R.
>>
>>>> Can the memory usage be explained, and can it be
>>>> made more efficient?  For what it's worth, I'd be glad to try to help if the
>>>> code for scan is considered to be worth reviewing.
>>>
>>> Mike,
>>>
>>> This does not seem to be an issue with scan() per se.
>>>
>>> Notice the difference in size of big2, big3, and bigThree here:
>>>
>>>> big2 <- rep(letters,length=1e6)
>>>> object.size(big2)/1e6
>>> [1] 4.000856
>>>> big3 <- paste(big2,big2,sep='')
>>>> object.size(big3)/1e6
>>> [1] 36.2
>>
>> On a 32-bit computer every R object has an overhead of 24 or 28 bytes.
>> Character strings are R objects, but in some functions such as rep (and
>> scan for up to 10,000 distinct strings) the objects can be shared.  More
>> string objects will be shared in 2.6.0 (but factors are designed to be
>> efficient at storing character vectors with few values).
>>
>> On a 64-bit computer the overhead is usually double.  So I would expect
>> just over 56 bytes/string for distinct short strings (and that is what
>> big3 gives).
>

Re: [R] S4 based package giving strange error at install time, but not at check time

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Rajarshi Guha wrote:

> Hi, I have a S4 based package package that was loading fine on R
> 2.5.0 on both OS X and
> Linux. I was checking the package against 2.5.1 and doing R CMD check
> does not give any warnings. So I next built the package and installed
> it. Though the package installed fine I noticed the following message:
>
> Loading required package: methods
> Error in loadNamespace(package, c(which.lib.loc, lib.loc),
> keep.source = keep.source) :
> in 'fingerprint' methods specified for export, but none
> defined: fold, euc.vector, distance, random.fingerprint,
> as.character, length, show
> During startup - Warning message:
> package fingerprint in options("defaultPackages") was not found
   ^^^

Do you have this package in your startup files or the environment variable 
R_DEFAULT_PACKAGES?  R CMD check should not look there: whatever you are 
quoting above seems to.

> However, I can load the package in R with no errors being reported and
> it seems that the functions are working fine.
>
> Looking at the sources I see that my NAMESPACES file contains the
> following:
>
> importFrom("methods")

That should specify what to import, or be import("methods").  See 
'Writing R Extensions'.
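
A minimal sketch of the two forms (the importFrom names are only
examples of what an S4 package typically needs from methods):

import("methods")
## or, naming the imports explicitly:
importFrom("methods", "setGeneric", "setMethod", "new")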

> exportClasses("fingerprint")
> exportMethods("fold", "euc.vector", "distance", "random.fingerprint",
> "as.character", "length", "show")
> export("fp.sim.matrix", "fp.to.matrix", "fp.factor.matrix",
> "fp.read.to.matrix", "fp.read", "moe.lf", "bci.lf", "cdk.lf")
>
> and all the exported methods are defined. As an example consider the
> 'fold' method. It's defined as
>
> setGeneric("fold", function(fp) standardGeneric("fold"))
> setMethod("fold", "fingerprint",
>   function(fp) {
> ## code for the function snipped
>   })
>
> Since the method has been defined I can't see why I should see the
> error during install time, but nothing when the package is checked.
>
> Any pointers would be appreciated.
>
> ---
> Rajarshi Guha  <[EMAIL PROTECTED]>
> GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04  06F7 1BB9 E634 9B87 56EE
> ---
> Bus error -- driver executed.
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RMySQL loading error

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Clara Anton wrote:

> Hi,
>
> I am having problems loading RMySQL.
>
> I am using MySQL 5.0,  R version 2.5.1, and RMySQL with Windows XP.

More exact versions would be helpful.

> When I try to load rMySQL I get the following error:
>
> > require(RMySQL)
> Loading required package: RMySQL
> Error in dyn.load(x, as.logical(local), as.logical(now)) :
>unable to load shared library
> 'C:/PROGRA~1/R/R-25~1.1/library/RMySQL/libs/RMySQL.dll':
>  LoadLibrary failure:  Invalid access to memory location.
>
>
> I did not get any errors while installing MySQL or RMySQL. It seems that
> there are other people with similar problems, although I could not find
> any hint on how to try to solve the problem.

It is there, unfortunately along with a lot of uninformed speculation.

> Any help, hint or advice would be greatly appreciated.

The most likely solution is to update (or downdate) your MySQL.  You 
possibly got RMySQL from the CRAN Extras site, and if so this is covered 
in the ReadMe there:

   The build of RMySQL_0.6-0 is known to work with MySQL 5.0.21 and 5.0.45,
   and known not to work (it crashes on startup) with 5.0.41.

Usually the message is the one you show, but I have seen R crash.  The 
issue is the MySQL client DLL: that from 5.0.21 or 5.0.45 works in 5.0.41.

All the reports of problems I have seen are for MySQL versions strictly 
between 5.0.21 and 5.0.45.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Memory Experimentation: Rule of Thumb = 10-15 Times the Memory

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Charles C. Berry wrote:

> On Thu, 9 Aug 2007, Michael Cassin wrote:
>
>> I really appreciate the advice and this database solution will be useful to
>> me for other problems, but in this case I  need to address the specific
>> problem of scan and read.* using so much memory.
>>
>> Is this expected behaviour?

Yes, and documented in the 'R Internals' manual.  That is basic reading 
for people wishing to comment on efficiency issues in R.

>> Can the memory usage be explained, and can it be
>> made more efficient?  For what it's worth, I'd be glad to try to help if the
>> code for scan is considered to be worth reviewing.
>
> Mike,
>
> This does not seem to be an issue with scan() per se.
>
> Notice the difference in size of big2, big3, and bigThree here:
>
>> big2 <- rep(letters,length=1e6)
>> object.size(big2)/1e6
> [1] 4.000856
>> big3 <- paste(big2,big2,sep='')
>> object.size(big3)/1e6
> [1] 36.2

On a 32-bit computer every R object has an overhead of 24 or 28 bytes. 
Character strings are R objects, but in some functions such as rep (and 
scan for up to 10,000 distinct strings) the objects can be shared.  More 
string objects will be shared in 2.6.0 (but factors are designed to be 
efficient at storing character vectors with few values).

On a 64-bit computer the overhead is usually double.  So I would expect 
just over 56 bytes/string for distinct short strings (and that is what 
big3 gives).

But 56Mb is really not very much (tiny on a 64-bit computer), and 1 
million items is a lot.

[...]


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ARIMA fitting

2007-08-09 Thread Prof Brian Ripley

On Tue, 7 Aug 2007, [EMAIL PROTECTED] wrote:


Hello,
I'm trying to fit an ARIMA process, using STATS package, arima function.
Can I expect, that fitted model with any parameters is stationary, causal
and invertible?


Please read ?arima: it answers all your questions, and points out that the 
answer depends on the arguments passed to arima().
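
A small sketch of checking this for a fitted model (transform.pars and
the roots check are only illustrations; ?arima has the details):

set.seed(1)
x <- arima.sim(model = list(ar = 0.6, ma = 0.3), n = 200)
fit <- arima(x, order = c(1, 0, 1), transform.pars = TRUE)
abs(polyroot(c(1, -coef(fit)["ar1"])))   ## all > 1: stationary AR part
abs(polyroot(c(1,  coef(fit)["ma1"])))   ## all > 1: invertible MA part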


The posting guide did ask you to do this *before* posting: please study it 
more carefully.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help on R performance using aov function

2007-08-09 Thread Prof Brian Ripley
aov() will handle multiple responses and that would be considerably more 
efficient than running separate fits as you seem to be doing.


Your code is nigh unreadable: please use your spacebar and remove the 
redundant semicolons: `Writing R Extensions' shows you how to tidy up 
your code to make it presentable.  But I think anova_[[1]] is really 
coef(summary(aov_)), which is a lot more intelligible.
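
A sketch of a multi-response fit (p1, p2, p3 stand for some of your
numeric parameter columns in A):

fit <- aov(cbind(p1, p2, p3) ~ wafer, data = A)
summary(fit)    ## one ANOVA table per response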

On Thu, 9 Aug 2007, Francoise PFIFFELMANN wrote:


Hi,
I’m trying to replace some SAS statistical functions by R (batch calling).
But I’ve seen that calling R in a batch mode (under Unix) takes about 2or 3
times more than SAS software. So it’s a great problem of performance for me.
Here is an extract of the calculation:

stoutput<-file("res_oneWayAnova.dat","w");
cat("Param|F|Prob",file=stoutput,"\n");
for (i in 1:n) {
p<-list_param[[i]]
aov_<-aov(A[,p]~ A[,"wafer"],data=A);
anova_<-summary(aov_);
if (!is.na(anova_[[1]][1,5]) & anova_[[1]][1,5]<=0.0001)
res_aov<-cbind(p,anova_[[1]][1,4],"<0.0001") else
res_aov<-cbind(p,anova_[[1]][1,4],anova_[[1]][1,5]);
cat(res_aov, file=stoutput, append = TRUE,sep = "|","\n");
};
close(stoutput);


A is a data.frame of about (400 lines and 1800 parameters).
I’m a new user of R and I don’t know if it’s a problem in my code or if
there are some tips that I can use to optimise my treatment.

Thanks a lot for your help.

Françoise Pfiffelmann
Engineering Data Analysis Group
--
Crolles2 Alliance
860 rue Jean Monnet
38920 Crolles, France
Tel: +33 438 92 29 84
Email: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tcltk error on Linux

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Mark W Kimpel wrote:

> I am having trouble getting tcltk package to load on openSuse 10.2
> running R-devel. I have specifically put my /usr/share/tcl directory in
> my PATH, but R doesn't seem to see it. I also have installed tk on my
> system. Any ideas on what the problem is?

Whether Tcl/Tk would be available was determined when you installed R.  The 
relevant information was in the configure output and log, which we don't 
have.
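
You can check what your build supports from R itself:

capabilities("tcltk")   ## FALSE means Tcl/Tk support was not compiled in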

You are not running a released version of R: please don't use the 
development version unless you are familiar with the build process and 
know how to debug such things yourself.  The rule is that questions about 
development versions of R should not be asked here but on R-devel (and not 
to R-core which I have deleted from the recipients).

I suggest reinstalling R (preferably R-patched) and if tcltk still is not 
available sending the relevant configure information to the R-devel list.

> Also, note that I have some warning messages on starting up R, not sure
> what they mean or if they are pertinent.

Those are coming from a Bioconductor package: again you must be using 
development versions with R-devel and those are not stable (last time I 
looked even Biobase would not install, and the packages change daily).

If you have all those packages in your startup, please don't -- there will 
be a considerable performance hit so only load them when you need them.

>
> Thanks, Mark
>
> Warning messages:
> 1: In .updateMethodsInTable(fdef, where, attach) :
>   Couldn't find methods table for "conditional", package "Category" may
> be out of date
> 2: In .updateMethodsInTable(fdef, where, attach) :
>   Methods list for generic "conditional" not found
> > require(tcltk)
> Loading required package: tcltk
> Error in firstlib(which.lib.loc, package) :
>   Tcl/Tk support is not available on this system
> > sessionInfo()
> R version 2.6.0 Under development (unstable) (2007-08-01 r42387)
> i686-pc-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] splines   tools stats graphics  grDevices utils datasets
> [8] methods   base
>
> other attached packages:
>  [1] affycoretools_1.9.3annaffy_1.9.1  xtable_1.5-0
>  [4] gcrma_2.9.1matchprobes_1.9.10 biomaRt_1.11.4
>  [7] RCurl_0.8-1XML_1.9-0  GOstats_2.3.8
> [10] Category_2.3.19genefilter_1.15.9  survival_2.32
> [13] KEGG_1.17.0RBGL_1.13.3annotate_1.15.3
> [16] AnnotationDbi_0.0.88   RSQLite_0.6-0  DBI_0.2-3
> [19] GO_1.17.0  limma_2.11.9   affy_1.15.7
> [22] preprocessCore_0.99.12 affyio_1.5.6   Biobase_1.15.23
> [25] graph_1.15.10
>
> loaded via a namespace (and not attached):
> [1] cluster_1.11.7  rcompgen_0.1-15
> >
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R memory usage

2007-08-09 Thread Prof Brian Ripley
See

?gc
?"Memory-limits"

On Wed, 8 Aug 2007, Jun Ding wrote:

> Hi All,
>
> I have two questions in terms of the memory usage in R
> (sorry if the questions are naive, I am not familiar
> with this at all).
>
> 1) I am running R in a linux cluster. By reading the R
> helps, it seems there are no default upper limits for
> vsize or nsize. Is this right? Is there an upper limit
> for whole memory usage? How can I know the default in
> my specific linux environment? And can I increase the
> default?

See ?"Memory-limits", but that is principally a Linux question.

>
> 2) I use R to read in several big files (~200Mb each),
> and then I run:
>
> gc()
>
> I get:
>
>              used  (Mb) gc trigger   (Mb)  max used   (Mb)
> Ncells   23083130 616.4   51411332 1372.9  51411332 1372.9
> Vcells  106644603 813.7  240815267 1837.3 227550003 1736.1
>
> What do columns of "used", "gc trigger" and "max used"
> mean? It seems to me I have used 616Mb of Ncells and
> 813.7Mb of Vcells. Comparing with the numbers of "max
> used", I still should have enough memory. But when I
> try
>
> object.size(area.results)   ## area.results is a big
> data.frame
>
> I get an error message:
>
> Error: cannot allocate vector of size 32768 Kb
>
> Why is that? Looks like I am running out of memory. Is
> there a way to solve this problem?
>
> Thank you very much!
>
> Jun


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading time/date string

2007-08-09 Thread Prof Brian Ripley
On Thu, 9 Aug 2007, Matthew Walker wrote:

> Thanks Mark, that was very helpful.  I'm now so close!
>
> Can anyone tell me how to extract the "value" from an instance of a
> "difftime" class?  I can see the value, but how can I place it in a
> dataframe?

as.numeric(time_delta)

Hint: you want the number, not the value (which is a classed object).
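
For example, continuing your code:

time_delta <- difftime(time2, time1, units = "secs")
data.frame(time1, time2, delta = as.numeric(time_delta))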

>
> > time_string1 <- "10:17:07 02 Aug 2007"
> > time_string2 <- "13:17:40 02 Aug 2007"
> >
> > time1 <- strptime(time_string1, format="%H:%M:%S %d %b %Y")
> > time2 <- strptime(time_string2, format="%H:%M:%S %d %b %Y")
> >
> > time_delta <- difftime(time2,time1, unit="sec")
> > time_delta
> Time difference of 10833 secs # <--- I'd like this value just here!
> >
> > data.frame(time1, time2, time_delta)
> Error in as.data.frame.default(x[[i]], optional = TRUE) :
>cannot coerce class "difftime" into a data.frame
>
>
>
> Thanks again,
>
> Matthew
>
>
> Mark W Kimpel wrote:
>> Look at some of these functions...
>>
>> DateTimeClasses(base)   Date-Time Classes
>> as.POSIXct(base)Date-time Conversion Functions
>> cut.POSIXt(base)Convert a Date or Date-Time Object to a Factor
>> format.Date(base)   Date Conversion Functions to and from Character
>>
>> Mark
>> ---
>>
>> Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
>> Indiana University School of Medicine
>>
>> 15032 Hunter Court, Westfield, IN  46074
>>
>> (317) 490-5129 Work, & Mobile & VoiceMail
>> (317) 663-0513 Home (no voice mail please)
>>
>> **
>>
>> Matthew Walker wrote:
>>> Hello everyone,
>>>
>>> Can anyone tell me what function I should use to read time/date
>>> strings and turn them into a form such that I can easily calculate
>>> the difference of two?  The strings I've got look like "10:17:07 02
>>> Aug 2007".  If I could calculate the number of seconds between them
>>> I'd be very happy!
>>>
>>> Cheers,
>>>
>>> Matthew
>>>
>>> __
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> .
>>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error: Cannot Coerce POSIXt to POSIXct when building package

2007-08-08 Thread Prof Brian Ripley
On Wed, 8 Aug 2007, Praveen Kanakamedala wrote:

> A newbie here - please forgive me if this is a basic question.  We have an
> in house package built in R 2.2.1 (yes we're a little behind the times at
> our firm)and would like to rebuild it using R 2.5.1.  However, when I try
> and build the package from source, I keep getting this error:
>
> Error in as(slotVal, slotClass, strict = FALSE) :
>no method or default for coercing "POSIXt" to "POSIXct"
> Error : unable to load R code in package 'Mango'
> Error: package/namespace load failed for 'Mango'
>
>
> I tried defining a new method "as.POSIXct" in the package to coerce POSIXt
> to POSIXct and then added the as.POSIXct method to the "NAMSPACE" file.  The
> build still doesn't work (I get the same error message). Any idea what I am
> doing wrong? The coercion statement looks like this and works in R GUI:

How did you get this?  There should be no objects of class 'POSIXt' alone, 
and I get e.g.

> now <- Sys.time()
> as(now, "POSIXct")
Error in asMethod(object) : explicit coercion of old-style class (POSIXt, 
POSIXct) is not defined

That can be fixed (see ?as), but you seem to have a malformed object in 
one of your slots.
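
If you really do need the coercion, a sketch of the ?as / setAs() route
(untested; more likely the object in that slot is malformed and should
simply be rebuilt with as.POSIXct()):

setAs("POSIXt", "POSIXct", function(from) as.POSIXct(from))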

As often applies,

> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



> #from is a vector of dates in the format "%d-%b-%Y")
> from <- as.POSIXct(strptime(from, format = "%d%b%Y"), tz = "GMT")
>
> Here is my environment info:
>
> R version 2.5.1 (2007-06-27)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
> Kingdom.1252;LC_MONETARY=English_United
> Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
>
> attached base packages:
> [1] "tcltk" "stats" "graphics"  "grDevices" "utils" "datasets"
> "methods"   "base"
>
> other attached packages:
>   fSeries      nnet      mgcv   fBasics fCalendar   fEcofin   spatial      MASS
>  "251.70"  "7.2-34"  "1.3-25"  "251.70"  "251.70"  "251.70"  "7.2-34"  "7.2-34"
> I would sincerely appreciate any help.
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Changing font in boxplots

2007-08-08 Thread Prof Brian Ripley
On Wed, 8 Aug 2007, G Iossa, School Biological Sciences wrote:

> Hi John,
>
> Thanks so much for such a quick reply.
> I have tried to set all to Times font running
>
> par(font.lab=6) (not 4, maybe this is a local setting on my machine?)

'6' is a setting specific to certain devices on Windows.  You should 
really be using font families (which are quite new and so not used in 
many of the introductions).

par(family="serif")

will change the default for all the text on subsequent plots to be in 
a serif font, which on the windows() device is (by default) Times.

The R posting guide does ask you to tell us your OS, so that points like 
this do not have to be guessed at.

> but now the boxplot shown has the x and y labels in Times New Roman and the
> x and y axis still in Arial. Any idea why R is not setting those in Times?

Because you did not ask it to.  The font of axis annotation is set by 
font.axis, not font.lab (which controls title()'s xlab and ylab and 
nothing in axis()).  See ?axis and ?par, both of which make this clear.

John Kane has claimed that what inline pars are used by boxplot() is 'not 
clear from ?boxplot', but the lack of clarity is his, not in the 
documentation. ?boxplot refers you to ?bxp, and that spells out exactly 
which inline pars are used.
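
A sketch re-using the call from your message (on the windows() device
"serif" maps to Times by default):

op <- par(family = "serif")
boxplot(mass ~ family, data = mydata, ylab = "mass %", xlab = "family",
        las = 1, cex.axis = 1)
par(op)   ## restore the previous settings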

>
> Thanks a lot for your advice,
> Graziella
>
> --On 08 August 2007 09:16 -0400 John Kane <[EMAIL PROTECTED]> wrote:
>
>> I don't know if boxplot will accept a font argument.m
>> From ?boxplot it is not clear.
>> You may need to set the par() command before the
>> boxplot
>>
>> Example:
>> par(font.lab=4)
>> boxplot(mass ~ family, data=mydata, ylab="mass %",
>> xlab="family",las=1, cex.axis=1)
>>
>> --- "G Iossa, School Biological Sciences"
>> <[EMAIL PROTECTED]> wrote:
>>
>>> Hi all,
>>>
>>> I am very new to R and this might be a simple
>>> question but I have looked
>>> everywhere you suggest before writing to you.
>>>
>>> I am trying to change font type from san-serif to a
>>> serif (Times New
>>> Romans) on all labels and axis of my boxplot. I have
>>> used this function in
>>> other plots before, e.g.:
>>>
>>> plot(residuals~lnlifespan, data=mydata, pch=psymb,
>>> font=6, xlab="ln
>>> reproductive lifespan", ylab="residuals ln mass",
>>> font.lab=6, cex=1.5,
>>> cex.axis=1.5, cex.lab=1.5)
>>>
>>> and found that font.lab or font.axis=6 gives Times
>>> font. However, when I
>>> try for boxplot:
>>>
>>> boxplot(mass ~ family, data=mydata, ylab="mass %",
>>> xlab="family",
>>> font.axis=6,  font=6, par(las=1), cex.axis=1)
>>>
>>> it does not work (R does not give any warning
>>> messages). I have also tried
>>> family="Times" but without success. Any idea of why
>>> is not doing it and
>>> what I can do to get Times font on my boxplot?
>>> I run R on Windows.
>>>
>>> Thanks a lot,
>>> Graziella
>>>
>>>
>> *
>>> Dr. Graziella Iossa
>>>
>>> Mammal Research Unit
>>> School Biological Sciences
>>> University of Bristol
>>> Woodland Road
>>> Bristol BS8 1UG, UK
>>>
>>> E-mail: [EMAIL PROTECTED]
>>> Tel 0044 (0)117 9288918
>>> Fax 0044 (0)117 3317985
>>> http://www.bio.bris.ac.uk/research/mammal/index.html
>>> http://www.bio.bris.ac.uk/people/Iossa.htm

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Find out the workspace name

2007-08-08 Thread Prof Brian Ripley
On Wed, 8 Aug 2007, ONKELINX, Thierry wrote:

> ?getwd()

and ?setWindowTitle, which even has this as the first example.

help.search("window title") gets you there.


>> [mailto:[EMAIL PROTECTED] Namens Luis Ridao Cruz
>>
>> Sometimes there might be several R sessions open at the same
>> time. In Windows no name appears in the R main bar (just R
>> Console)
>>
>> Is it possible to know the name of the workspace.
>> I ussually write it on the script I am working on but I wish
>> to know without having to search in a text file.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to include bar values in a barplot?

2007-08-08 Thread Prof Brian Ripley
Please see

?format
?round

Note that text() is said to expect a character vector, so why did you 
supply a numeric vector?

   labels: a character vector or expression specifying the _text_ to be
   written.  An attempt is made to coerce other language objects
   (names and calls) to expressions, and vectors and other
   classed objects to character vectors by 'as.character'. If
   'labels' is longer than 'x' and 'y', the coordinates are
   recycled to the length of 'labels'.

and try as.character(vals) for yourself.

> Is there any way to round up those numbers?

See library(fortunes); fortune("Yoda")


On Wed, 8 Aug 2007, Donatas G. wrote:

> On Wednesday 08 August 2007 00:40:56 Donatas G. wrote:
>> On Tuesday 07 August 2007 22:09:52 Donatas G. wrote:
>>> How do I include bar values in a barplot (or other R graphics, where this
>>> could be applicable)?
>>>
>>> To make sure I am clear I am attaching a barplot created with
>>> OpenOffice.org which has barplot values written on top of each barplot.
>>
>> After more than two hours search I finally found a solution:
>> http://tolstoy.newcastle.edu.au/R/help/06/05/27286.html
>
> Hey, the solution happens to be only partiall... If the values are not real
> numbers, and have a lot of digits after the dot, the graph might become
> unreadable...
>
> see this
>
> vals <-
> c(1,1.1236886,4.77554676,5.3345245,1,1.1236886,4.77554676,5.3345245,5.5345245,5.4345245,1.1236886,4.77554676,5.3345245,1.1236886,4.77554676,5.3345245)
> names(vals) <- LETTERS[1:16]
> mp <- barplot(vals, ylim = c(0, 6))
> text(mp, vals, labels = vals, pos = 3)
>
> Is there any way to round up those numbers?
>
> I tried using
> options(digits=2)
> , and it does change the display of a table, but it does not influence the
> barplot...

Well, it does not affect as.character, nor should it.
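
A sketch re-using your vals/mp: round (or format) the labels yourself
before handing them to text():

text(mp, vals, labels = format(round(vals, 2)), pos = 3)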


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interaction factor and numeric variable versus separate regressions

2007-08-07 Thread Prof Brian Ripley
These are not the same model.  You want x*f, and then you will find
the differences in intercepts and slopes from group 1 as the coefficients.

Remember too that the combined model pools error variances and the 
separate model has separate error variance for each group.

To understand model formulae, study Bill Venables' exposition in chapter 6 
of MASS.
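
A sketch using your data frame 'd' from below:

fit <- lm(y ~ x * f, data = d)
coef(fit)   ## flev2/flev3 and x:flev2/x:flev3 are differences from lev1
## compare with the separate fits:
lapply(split(d, d$f), function(dd) coef(lm(y ~ x, data = dd)))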

On Tue, 7 Aug 2007, Sven Garbade wrote:

> Dear list members,
>
> I have problems to interpret the coefficients from a lm model involving
> the interaction of a numeric and factor variable compared to separate lm
> models for each level of the factor variable.
>
> ## data:
> y1 <- rnorm(20) + 6.8
> y2 <- rnorm(20) + (1:20*1.7 + 1)
> y3 <- rnorm(20) + (1:20*6.7 + 3.7)
> y <- c(y1,y2,y3)
> x <- rep(1:20,3)
> f <- gl(3,20, labels=paste("lev", 1:3, sep=""))
> d <- data.frame(x=x,y=y, f=f)
>
> ## plot
> # xyplot(y~x|f)
>
> ## lm model with interaction
> summary(lm(y~x:f, data=d))
>
> Call:
> lm(formula = y ~ x:f, data = d)
>
> Residuals:
>Min  1Q  Median  3Q Max
> -2.8109 -0.8302  0.2542  0.6737  3.5383
>
> Coefficients:
>Estimate Std. Error t value Pr(>|t|)
> (Intercept)  3.687990.41045   8.985 1.91e-12 ***
> x:flev1  0.208850.04145   5.039 5.21e-06 ***
> x:flev2  1.496700.04145  36.109  < 2e-16 ***
> x:flev3  6.708150.04145 161.838  < 2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 1.53 on 56 degrees of freedom
> Multiple R-Squared: 0.9984,   Adjusted R-squared: 0.9984
> F-statistic: 1.191e+04 on 3 and 56 DF,  p-value: < 2.2e-16
>
> ## separate lm fits
> lapply(by(d, d$f, function(x) lm(y ~ x, data=x)), coef)
> $lev1
> (Intercept)   x
> 6.77022860 -0.01667528
>
> $lev2
> (Intercept)   x
>   1.0190781.691982
>
> $lev3
> (Intercept)   x
>   3.2746566.738396
>
>
> Can anybody give me a hint why the coefficients for the slopes
> (especially for lev1) are so different and how the coefficients from the
> lm model with interaction are related to the separate fits?
>
> Thanks, Sven
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cannot add lines to plot

2007-08-07 Thread Prof Brian Ripley
On Tue, 7 Aug 2007, Zeno Adams wrote:

>
> Hello,
>
> I want to plot a time series and add lines to the plot later on.
> However, this seems to work only as long as I plot the series against
> the default index. As soon as I plot against an object
> of class chron or POSIXt (i.e. I want to add a date/time axis), the
> lines do not appear anymore. The command to add the lines is executed
> without an error message.
>
> (THIS DOES NOT ADD THE LINES)
> plot(datum2[(3653):(3653+i)],dlindus[(3653):(3653+i)], col
> =hcl(h=60,c=35,l=60), ylim=c(-8,8), type = "l", xlab=(""),
> ylab=("Return"), main = ("Industry"))
> lines(gvarindus, type="l", lwd=2)
> lines(quantindustlow, col ="black", type = "l",lty=3)
> lines(quantindusthigh, col ="black", type = "l",lty=3)
>
> (THIS ADDS THE LINES, but then I dont have an date axis)
> plot(dlindus[(3653):(3653+i)], col =hcl(h=60,c=35,l=60), ylim=c(-8,8),
> type = "l", xlab=(""), ylab=("Return"), main = ("Industry"))
> lines(gvarindus, type="l", lwd=2)
> lines(quantindustlow, col ="black", type = "l",lty=3)
> lines(quantindusthigh, col ="black", type = "l",lty=3)
>
> This sounds like a fairly simple problem, but I cannot find any answer
> in the R-help archives.

Look at the help for lines: the standard call is lines(x, y, col ="black", 
type = "l",lty=3) and you have omitted x.  See ?xy.coords for what 
happens then.
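
A sketch re-using your objects (this assumes gvarindus and the two
quantile series line up with the same dates):

xx <- datum2[3653:(3653 + i)]
lines(xx, gvarindus, lwd = 2)
lines(xx, quantindustlow,  lty = 3)
lines(xx, quantindusthigh, lty = 3)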

I think the reason you did not find this in the archives is that this is a 
rare misreading (or non-reading) of the help pages.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Y-intercept Value

2007-08-06 Thread Prof Brian Ripley
?offset : you can specify a different intercept for each case, or a common 
one.

Or you could just use lm(y - 3 ~ 0 + x), but offset() works better for 
prediction.
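
A small sketch ('icept' is just an invented column holding the fixed
intercept, which keeps predict() working on new data):

d <- data.frame(x = 1:10, y = 3 + 2 * (1:10) + rnorm(10), icept = 3)
fit <- lm(y ~ 0 + x + offset(icept), data = d)
coef(fit)                                         ## slope only
predict(fit, newdata = data.frame(x = 11:12, icept = 3))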

On Mon, 6 Aug 2007, Benjamin Zuckerberg wrote:

>
> Hello everyone,
>
> Quick question...is there a way of specifying a y-intercept value
> within a lm statement.  For example, if I wanted to specify the
> regression to pass through the origin I would enter lm(y~0+x).  But
> can I specify an actual term such as 1,2,3,4, etc. as an intercept
> value?  Thank you!
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data analysis

2007-08-06 Thread Prof Brian Ripley
On Tue, 7 Aug 2007, [EMAIL PROTECTED] wrote:

> On 06-Aug-07 19:26:59, lamack lamack wrote:
>> Dear all, I have a factorial design where the
>> response is an ordered categorical response.
>>
>> treatment (two levels: 1 and 2)
>> time (four levels: 30, 60,90 and 120)
>> ordered response (0,1,2,3)
>>
>> could someone suggest a correct analysis or some references?
>
> For your data below, I would be inclined to start from here,
> which gives the counts for the different responses:
>
>
>                    Response
>
>  Trt  Time |     0     1     2     3 | Total
> -----------+-------------------------+------
>  Tr1    30 |           1           3 |     4
>         60 |           2     1     1 |     4
>         90 |           3     1       |     4
>        120 |           3     1       |     4
> -----------+-------------------------+------
>  Tr2    30 |           2           2 |     4
>         60 |           3     1       |     4
>         90 |           3           1 |     4
>        120 |     1     2     1       |     4
> =============================================
>  Tr1       |     0     9     3     4 |    16
> -----------+-------------------------+------
>  Tr2       |     1    10     2     3 |    16
> =============================================
>
> This suggests that, if anything is happening there at all,
> it is a tendency for high response to occur at shorter times,
> and low response at longer times, with little if any difference
> between the treatments.
>
> To approach this formally, I would consider adopting a
> "re-randomisation" approach, re-allocating the outcomes at
> random in such a way as to preserve the marginal totals,
> and evaluating a statistic T, defined in terms of the counts
> and such as to be sensitive to the kind of effect you seek.
>
> Then situate the value of T obtained from the above counts
> within the distribution of T obtained by this re-randomisation.
>
> There must be, somewhere in R, routines which can perform this
> kind of constrained re-randomisation, but I'm not sufficiently
> familiar with that area of R to know for sure about them.

?r2dtable  for 2D tables.  But there is a classic way to do this without 
using randomization and holding the time*treatment marginals fixed: 
log-linear models.

> I hope other readers who know about this area in R can come
> up with suggestions!

However, that approach is not taking into account that the response is 
ordered. First make sure the variables are factors: here in data frame 
'dat'.

dat <- read.table("...", header=TRUE, colClasses="factor")
library(MASS)
summary(polr(response ~ time*treatment, data = dat))

suggests there is nothing very significant here, and dropping the 
interaction

> summary(polr(response ~ time+treatment, data = dat))

Re-fitting to get Hessian

Call:
polr(formula = response ~ time + treatment, data = dat)

Coefficients:
 Value Std. Error   t value
time60 -1.7030709  1.0323027 -1.649779
time90 -2.1833059  1.0959290 -1.992196
time120-2.7900588  1.1703586 -2.383935
treatment2 -0.8168075  0.7663541 -1.065836

shows a marginal effect of time:

> stepAIC(polr(response ~ time*treatment, data = dat))

selects a model with just 'time' as an explanatory variable.

> anova(polr(response ~ time, dat), polr(response ~ 1, dat))
Likelihood ratio tests of ordinal regression models

Response: response
  Model Resid. df Resid. Dev   Test    Df  LR stat.    Pr(Chi)
1     1        29   66.58130
2  time        26   59.68091 1 vs 2     3  6.900383 0.07514162

again suggests that the effect of time is marginal.

References: obviously this is covered in MASS (see the R FAQ).

>
> best wishes,
> Ted.
>
>> subject treatment time response
>>       1         1   30        3
>>       2         1   30        3
>>       3         1   30        1
>>       4         1   30        3
>>       5         1   60        3
>>       6         1   60        1
>>       7         1   60        1
>>       8         1   60        2
>>       9         1   90        2
>>      10         1   90        1
>>      11         1   90        1
>>      12         1   90        1
>>      13         1  120        2
>>      14         1  120        1
>>      15         1  120        1
>>      16         1  120        1
>>      17         2   30        3
>>      18         2   30        3
>>      19         2   30        1
>>      20         2   30        1
>>      21         2   60        1
>>      22         2   60        2
>>      23         2   60        1
>>      24         2   60        1
>>      25         2   90        1
>>      26         2   90        1
>>      27         2   90        1
>>      28         2   90        3
>>      29         2  120        1
>>      30         2  120        2
>>      31         2  120        0
>>      32         2  120        1

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

_

Re: [R] Function for trim blanks from a string(s)?

2007-08-06 Thread Prof Brian Ripley
I am sure Marc knows that ?sub has examples of trimming trailing space and 
whitespace in various styles.

On Mon, 6 Aug 2007, Marc Schwartz wrote:

> On Mon, 2007-08-06 at 12:15 -0700, adiamond wrote:
>> I feel like an idiot posting this because every language I've ever seen has a
>> string function that trims blanks off strings (off the front or back or
>> both).

Some very common languages do not, though.  It is an exercise in Kernighan 
& Ritchie (the original C reference), and an FAQ entry for Perl.

>> Ideally, it would process whole data frames/matrices etc but I don't
>> even see one that processes a single string.  But I've searched and I don't
>> even see that.  There's a strtrim function but it does something completely
>> different.
>
> If you want to do this while initially importing the data into R using
> one of the read.table() family of functions, see the 'strip.white'
> argument in ?read.table, which would do an entire data frame in one
> call.
>
> Otherwise, the easiest way to do it would be to use sub() or gsub()
> along the lines of the following:
>
> # Strip leading space
> sub("^ +", "", YourTextVector)
>
>
> # Strip trailing space
> sub(" +$", "", YourTextVector)
>
>
> # Strip both
> gsub("(^ +)|( +$)", "", YourTextVector)
>
>
>
>
> Examples of use:
>
>> sub("^ +", "", "   Leading Space")
> [1] "Leading Space"
>
>
>> sub(" +$", "", "Trailing Space")
> [1] "Trailing Space"
>
>
>> gsub("(^ +)|( +$)", "", "Leading and Trailing Space")
> [1] "Leading and Trailing Space"
>
>
> See ?sub which also has ?gsub
>
> Note that the above will only strip spaces, not all white space.
>
> You can then use the appropriate call in one of the *apply() family of
> functions to loop over columns/rows as may be appropriate.

Well, arrays are vectors and so can be done by

A[] <- sub(., A)

and data frames with character columns by

A[] <- lapply(A, function(x) sub(., x))
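
For example, wrapping the pattern above (a sketch, not a library
function):

trim <- function(x) gsub("(^ +)|( +$)", "", x)
A <- data.frame(a = c("  x ", "y  "), b = c(" z", "w "), stringsAsFactors = FALSE)
A[] <- lapply(A, trim)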

> HTH,
>
> Marc Schwartz
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Q: calling par() in .First()

2007-08-06 Thread Prof Brian Ripley
On Mon, 6 Aug 2007, Greg Snow wrote:

> Be aware that the effects of calls to par usually only last for the
> duration of the graphics device, not the R session.

They always apply to the current device only (and will create a current 
device if possible).

> If you put a call to par in your startup script, then it will open a 
> graphics device and set the option, but if you close that graphics 
> device and do another plot then a new graphics device will be started 
> with the default parameters rather than what you set in the startup 
> script.
>
> You can set some of the options (including background color) when
> starting a graphics device, that may be the better option.

You can also set a hook (see ?setHook) on plot.new (see its help page), 
which could be used to set par(bg=).  A hook on package grDevices would 
have avoided the reported error messages.  (Calling graphics::par in 
startup code works in R-devel but not in 2.5.x, but using hooks works in 
any fairly recent R.)
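
A sketch of the hook approach (see ?setHook and ?plot.new for the exact
timing of when the hook is run):

setHook("plot.new", function() par(bg = "lightyellow"))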

> There was some discussion a while back on having global options for some
> of the graphics defaults, but I don't think anything has been
> implemented yet.

I don't believe there was agreement that was desirable.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] warnings()

2007-08-06 Thread Prof Brian Ripley
Possible routes:

1) Use options(warn=2) and traceback().

2) Search the *package* sources.  This is from package GRASS, I believe.
(Not all messages come from packages: they can come from R itself or from 
compiled code linked into a package.)
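
Route 1 as a minimal sketch:

options(warn = 2)   ## promote warnings to errors
## ... re-run the code that gives "Too many open raster files" ...
traceback()         ## shows the call stack that raised it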


On Mon, 6 Aug 2007, javier garcia-pintado wrote:

> Hi,
> Is there a way to know which library is giving a warning?
> Specifically, I'm getting a set of warnings:
>
> "Too many open raster files"
>
> Thanks and best wishes,
>
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lda and maximum likelihood

2007-08-06 Thread Prof Brian Ripley
On Mon, 6 Aug 2007, [EMAIL PROTECTED] wrote:

> I am trying to compare several methods for classify data into groups.
> In that purpose I 'd like to developp model comparison and selection
> using AIC.
>
> In the lda function of the MASS library, the maximum likelihood of the
> function is not given in the output and the script is not available.

The source _is_ available: it is part of the R tarball, and in the VR 
bundle on CRAN.

> Do anyone know how to extract or compute the maximum likelihood used in
> the lda function?

It does not maximize a likelihood: what it does do is described in the 
book for which this is support software.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question regarding QT device

2007-08-05 Thread Prof Brian Ripley
grDevices::deviceIsInteractive is only in the unreleased R-devel version 
of R: which version are you using?

Please do study the R posting guide: we do ask for basic information for a 
good reason, and do ask for questions on packages (especially unreleased 
packages) to be sent to the maintainer.


On Sun, 5 Aug 2007, Saptarshi Guha wrote:

> Hi,
>   After a few modifications in the makefiles, I successfully compiled
> the Qt device (written by Deepayan Sirkar) for OS X 10.4.9 on a
> Powerbook.
>   However when loading into R
>
>   If i remove this line from zzz.R in qtutils/R
>
> grDevices::deviceIsInteractive("QT")
>
>   and then install
>   >library(qtutils)
>
>   loads fine and the QT() calls returns a QT window, however, if i
> switch to another application and then switch back to the R GUI, the
> menubar has disappeared.
>
>   If I do not remove the line
>
> grDevices::deviceIsInteractive("QT")
>
>   the following error appears an qtutils does not load
>   Error : 'deviceIsInteractive' is not an exported object from
> 'namespace:grDevices'
>   Error : .onLoad failed in 'loadNamespace' for 'qtutils'
>   Error: package/namespace load failed for 'qtutils'
>
>   Could anyone provide some pointers to get that deviceIsInteractive
> to work?
>
>   Thanks for your time
>   Saptarshi
>
> Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sorting data for multiple regressions

2007-08-03 Thread Prof Brian Ripley
Well, R has a by() function that does what you want, and its help page 
contains an example of doing regression by group.

(There are other ways.)
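
A sketch of that example adapted to your description (all the column
names here are invented):

fits <- by(dat, list(dat$site, dat$weekday, dat$hour),
           function(d) lm(response ~ predictor, data = d))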

On Fri, 3 Aug 2007, Paul Young wrote:

> So I am trying to perform a robust regression (fastmcd in the robust
> package) on a dataset and I want to perform individual regressions based

fastmcd does not do regression ... or I would have adapted the ?by 
example to show you.

> on the groups within the data.  We have over 300 sites and we want to
> perform a regression based on the day of week and the hour for every
> site.  I was wondering if anyone knows of a "'by' command similar to the
> one used in SAS that automatically groups the data for the regressions.
> If not, does anyone have any tips on how to split the data into smaller
> sets and then perform the regression on each set.  I am new to R, so I
> don't know all of the common work arounds and such.  At the moment the
> only method I can think of is to split the data using condition
> statements and manually running the regression on each set.  Thanks or
> your help
>
> -Paul


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Prof Brian Ripley
I've since seen your followup; a more detailed explanation may help.
The path through the code for your argument list does not go where you 
quoted, and there is a reason for it.

Generally when you extract in R and ask for an non-existent index you get 
NA or NULL as the result (and no warning), e.g.

> y <- list(x=1, y=2)
> y[["z"]]
NULL

Because data frames 'must' have (column) names, they are a partial 
exception and when the result is a data frame you get an error if it would 
contain undefined columns.

But in the case of foo[, "FileName"], the result is a single column and so 
will not have a name: there seems no reason to be different from

> foo[["FileName"]]
NULL
> foo$FileName
NULL

which similarly select a single column.  At one time they were different 
in R, for no documented reason.
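
For example:

foo <- data.frame(Filename = c("a", "b"))
foo[, "FileName"]    ## NULL, like foo[["FileName"]] and foo$FileName
foo["FileName"]      ## error: the result would be a data frame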


On Fri, 3 Aug 2007, Prof Brian Ripley wrote:

> You are reading the wrong part of the code for your argument list:
>
>>  foo["FileName"]
> Error in `[.data.frame`(foo, "FileName") : undefined columns selected
>
> [.data.frame is one of the most complex functions in R, and does many 
> different things depending on which arguments are supplied.
>
>
> On Fri, 3 Aug 2007, Steven McKinney wrote:
>
>> Hi all,
>> 
>> What are current methods people use in R to identify
>> mis-spelled column names when selecting columns
>> from a data frame?
>> 
>> Alice Johnson recently tackled this issue
>> (see [BioC] posting below).
>> 
>> Due to a mis-spelled column name ("FileName"
>> instead of "Filename") which produced no warning,
>> Alice spent a fair amount of time tracking down
>> this bug.  With my fumbling fingers I'll be tracking
>> down such a bug soon too.
>> 
>> Is there any options() setting, or debug technique
>> that will flag data frame column extractions that
>> reference a non-existent column?  It seems to me
>> that the "[.data.frame" extractor used to throw an
>> error if given a mis-spelled variable name, and I
>> still see lines of code in "[.data.frame" such as
>> 
>> if (any(is.na(cols)))
>>stop("undefined columns selected")
>> 
>> 
>> 
>> In R 2.5.1 a NULL is silently returned.
>> 
>>> foo <- data.frame(Filename = c("a", "b"))
>>> foo[, "FileName"]
>> NULL
>> 
>> Has something changed so that the code lines
>> if (any(is.na(cols)))
>>stop("undefined columns selected")
>> in "[.data.frame" no longer work properly (if
>> I am understanding the intention properly)?
>> 
>> If not, could  "[.data.frame" check an
>> options() variable setting (say
>> warn.undefined.colnames) and throw a warning
>> if a non-existent column name is referenced?
>> 
>> 
>> 
>> 
>>> sessionInfo()
>> R version 2.5.1 (2007-06-27)
>> powerpc-apple-darwin8.9.1
>> 
>> locale:
>> en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>> 
>> attached base packages:
>> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods" 
>> "base"
>> 
>> other attached packages:
>> plotrix lme4   Matrix  lattice
>> "2.2-3"  "0.99875-4" "0.999375-0" "0.16-2"
>>> 
>> 
>> 
>> 
>> Steven McKinney
>> 
>> Statistician
>> Molecular Oncology and Breast Cancer Program
>> British Columbia Cancer Research Centre
>> 
>> email: smckinney +at+ bccrc +dot+ ca
>> 
>> tel: 604-675-8000 x7561
>> 
>> BCCRC
>> Molecular Oncology
>> 675 West 10th Ave, Floor 4
>> Vancouver B.C.
>> V5Z 1L3
>> Canada
>> 
>> 
>> 
>> 
>> -Original Message-
>> From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
>> Sent: Wed 8/1/2007 7:20 PM
>> To: [EMAIL PROTECTED]
>> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
>> 
>> For interest sake, I have found out why I wasn't getting my expected
>> results when using read.AnnotatedDataFrame
>> Turns out the error was made in the ReadAffy command, where I specified
>> the filenames to be read from my AnnotatedDataFrame object.  There was a
>> typo error with a capital N ($FileName) rather than lowercase n
>> ($Filename) as in my target file..whoops.  However this meant the
>> filename argument was ignored without th

Re: [R] FW: Selecting undefined column of a data frame (was [BioC] read.phenoData vs read.AnnotatedDataFrame)

2007-08-03 Thread Prof Brian Ripley
You are reading the wrong part of the code for your argument list:

>  foo["FileName"]
Error in `[.data.frame`(foo, "FileName") : undefined columns selected

[.data.frame is one of the most complex functions in R, and does many 
different things depending on which arguments are supplied.


On Fri, 3 Aug 2007, Steven McKinney wrote:

> Hi all,
>
> What are current methods people use in R to identify
> mis-spelled column names when selecting columns
> from a data frame?
>
> Alice Johnson recently tackled this issue
> (see [BioC] posting below).
>
> Due to a mis-spelled column name ("FileName"
> instead of "Filename") which produced no warning,
> Alice spent a fair amount of time tracking down
> this bug.  With my fumbling fingers I'll be tracking
> down such a bug soon too.
>
> Is there any options() setting, or debug technique
> that will flag data frame column extractions that
> reference a non-existent column?  It seems to me
> that the "[.data.frame" extractor used to throw an
> error if given a mis-spelled variable name, and I
> still see lines of code in "[.data.frame" such as
>
> if (any(is.na(cols)))
>stop("undefined columns selected")
>
>
>
> In R 2.5.1 a NULL is silently returned.
>
>> foo <- data.frame(Filename = c("a", "b"))
>> foo[, "FileName"]
> NULL
>
> Has something changed so that the code lines
> if (any(is.na(cols)))
>stop("undefined columns selected")
> in "[.data.frame" no longer work properly (if
> I am understanding the intention properly)?
>
> If not, could  "[.data.frame" check an
> options() variable setting (say
> warn.undefined.colnames) and throw a warning
> if a non-existent column name is referenced?
>
>
>
>
>> sessionInfo()
> R version 2.5.1 (2007-06-27)
> powerpc-apple-darwin8.9.1
>
> locale:
> en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods"   
> "base"
>
> other attached packages:
> plotrix lme4   Matrix  lattice
> "2.2-3"  "0.99875-4" "0.999375-0" "0.16-2"
>>
>
>
>
> Steven McKinney
>
> Statistician
> Molecular Oncology and Breast Cancer Program
> British Columbia Cancer Research Centre
>
> email: smckinney +at+ bccrc +dot+ ca
>
> tel: 604-675-8000 x7561
>
> BCCRC
> Molecular Oncology
> 675 West 10th Ave, Floor 4
> Vancouver B.C.
> V5Z 1L3
> Canada
>
>
>
>
> -Original Message-
> From: [EMAIL PROTECTED] on behalf of Johnstone, Alice
> Sent: Wed 8/1/2007 7:20 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
>
> For interest sake, I have found out why I wasn't getting my expected
> results when using read.AnnotatedDataFrame
> Turns out the error was made in the ReadAffy command, where I specified
> the filenames to be read from my AnnotatedDataFrame object.  There was a
> typo error with a capital N ($FileName) rather than lowercase n
> ($Filename) as in my target file..whoops.  However this meant the
> filename argument was ignored without the error message(!) and instead
> of using the information in the AnnotatedDataFrame object (which
> included filenames, but not alphabetically) it read the .cel files in
> alphabetical order from the working directory - hence each file was
> given the wrong label (taken from the order of the AnnotatedDataFrame
> object) and my comparisons were mixed up without it being obvious why
> or where.
> Our solution: coerce the filenames with as.character() so that the
> assignment of file to target is correct (after correcting $Filename),
> now that we are using read.AnnotatedDataFrame rather than
> read.phenoData.
>
> Data<-ReadAffy(filenames=as.character(pData(pd)$Filename),phenoData=pd)
>
> Hurrah!
>
> It may be beneficial to others if, when the filename argument isn't
> specified, ReadAffy read the filenames from the phenoData object when
> they are included there.
>
> Thanks!
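
(A one-line guard of the sort that would have caught the typo earlier;
pd is the AnnotatedDataFrame used above, and this is only a sketch:

   ## stop early if the expected phenoData column is missing or mis-spelled
   stopifnot("Filename" %in% names(pData(pd)))

run before the ReadAffy() call.)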
>
> -Original Message-
> From: Martin Morgan [mailto:[EMAIL PROTECTED]
> Sent: Thursday, 26 July 2007 11:49 a.m.
> To: Johnstone, Alice
> Cc: [EMAIL PROTECTED]
> Subject: Re: [BioC] read.phenoData vs read.AnnotatedDataFrame
>
> Hi Alice --
>
> "Johnstone, Alice" <[EMAIL PROTECTED]> writes:
>
>> Using R2.5.0 and Bioconductor I have been following code to analysis
>> Affymetrix expression data: 2 treatments vs control.  The original
>> code was run last year and used the read.phenoData command, however
>> with the newer version I get the error message Warning messages:
>> read.phenoData is deprecated, use read.AnnotatedDataFrame instead The
>> phenoData class is deprecated, use AnnotatedDataFrame (with
>> ExpressionSet) instead
>>
>> I use the read.AnnotatedDataFrame command, but when it comes to the
>> end of the analysis the comparison of the treatment to the controls
>> gets mixed up compared to what you get using the original
>> read.phenoData ie it looks like the 3 groups get labelled wrong and so
>
>> the comparisons are different (but they can still be matched up).
>> My questions are,
>> 

Re: [R] How to properly finalize external pointers?

2007-08-03 Thread Prof Brian Ripley

On Fri, 3 Aug 2007, Duncan Murdoch wrote:


> On 8/3/2007 9:19 AM, Jens Oehlschlägel wrote:
>
>> Dear R .Call() insiders,
>>
>> Can someone enlighten me how to properly finalize external pointers in C code
>> (R-2.5.1 win)? What is the relation between R_ClearExternalPtr and the
>> finalizer set in R_RegisterCFinalizer?
>>
>> I succeeded registering a finalizer that works when an R object containing an
>> external pointer is garbage collected. However, I have some difficulties
>> figuring out how to do that in an explicit closing function.
>>
>> I observed that
>> - calling R_ClearExternalPtr does not trigger the finalizer and is dangerous
>> because it removes the pointer before the finalizer needs it at
>> garbage-collection-time (no finalization = memory leak)
>> - calling the finalizer directly ensures finalization but now the finalizer is
>> called twice (once again at garbage collection time, and I did not find
>> documentation how to unregister the finalizer)
>> - It works to delete the SEXP external pointer object but only if not calling
>> R_ClearExternalPtr (but why then do we need it?) Furthermore it is unfortunate
>> to delay freeing the external pointer's memory if I know at run time that it
>> can be done immediately.

>> Shouldn't R_ClearExternalPtr call the finalizer and then unregister it?
>> (this would also work when removing the SEXP external pointer object is
>> difficult because it was handed over to the closing function directly
>> as a parameter)
>
> I think we want R_ClearExternalPtr to work even if the finalizer would
> fail (e.g. to clean up when there was an error when trying to build the
> external object).
>
> So I'd suggest that when you want to get rid of an external object
> immediately, you call the finalizer explicitly, then call
> R_ClearExternalPtr.  The documentation doesn't address the question of
> whether this will clear the registered finalizer so I don't know if
> you'll get a second call to the finalizer during garbage collection, but
> even if you do, isn't it easy enough to do nothing when you see the null
> ptr, as you do below?


You will get a further finalizer call at GC, and I know of no way to 
unregister finalizers. So make sure the finalizer does nothing the second 
time.
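
An R-level sketch of that pattern, using reg.finalizer() on an
environment rather than the C API under discussion (the object and flag
names are made up):

   finalize_once <- function(e) {
       if (isTRUE(e$open)) {
           cat("releasing resource\n")  # real code would free the C-level memory here
           e$open <- FALSE
       }
       ## a second invocation (e.g. at garbage collection) is a silent no-op
   }

   r <- new.env()
   r$open <- TRUE
   reg.finalizer(r, finalize_once)

   finalize_once(r)   # explicit, immediate clean-up
   rm(r); gc()        # the registered finalizer runs again here, but does nothing

The same idea carries over to the C level: have the finalizer test for
and then record a 'closed' state (or a NULL pointer), so that the
explicit call and the later GC-time call cannot free anything twice.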


The way connections are handled in R-devel provides an example (although 
it is work in progress).


Another possibility is to call the GC yourself if you know there are a lot 
of objects to clear up.



> By the way, questions about programming at this level are better asked
> in the R-devel group.


Indeed.

[...]

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot to postscript orientation

2007-08-03 Thread Prof Brian Ripley
Do you have the Orientation menu set to 'Auto'?
The effect described looks like what happens when 'Rotate media' is
selected, which it should not be.

The files look fine to me in GSView 4.8 on Windows and other viewers on 
Linux.  I agree with Uwe that it is a viewer issue (most reported
postscript/PDF problems are).
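
For reference, a sketch of the kind of call that combines the two
suggestions in this thread, horizontal = FALSE plus paper = "special"
(the file name and dimensions are only illustrative):

   postscript("contour.ps", width = 10, height = 7,
              horizontal = FALSE, paper = "special", onefile = FALSE)
   filled.contour(volcano)   # any landscape-shaped plot
   dev.off()

With paper = "special" the page is created with exactly the given width
and height, so there is no separate paper orientation for the viewer to
apply.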

On Fri, 3 Aug 2007, John Kane wrote:

>
> I seem to see the same problem that Miruna gets just
> to confirm that it is not just her set-up.
>
> I'm using GSview4.8 if that helps
>
> --- Uwe Ligges <[EMAIL PROTECTED]>
> wrote:
>
>
>>
>>
>> Miruna Petrescu-Prahova wrote:
>>>  Hi
>>>
>>>  I am trying to save some plots in a postscript file. When I
>>> generate the plots in the main window, they appear correctly - their
>>> orientation is landscape (i.e., horizontal). However, when I open the
>>> .ps file with GSview, the whole page appears vertically, and the plot
>>> appears horizontally, which means that the plot is only partially
>>> visible (example here
>>> https://webfiles.uci.edu/mpetresc/postscript.files/default.ps ). I
>>> searched the R-help mailing list archive and found 2 suggestions:
>>> setting the width and height and setting horizontal = FALSE. I have
>>> tried setting the width and height but it makes no difference. I have
>>> also tried using "horizontal = FALSE". This rotates and elongates the
>>> plot, but it is still displayed horizontally on a vertical page, and
>>> so only partially visible (example here
>>> https://webfiles.uci.edu/mpetresc/postscript.files/horiz.false.ps).
>>> I am not sure what is wrong. Plots are created with "filled.contour".
>>
>>
>> I guess this is a misconfiguration of your GSview.
>> The plots are fine
>> for me. Anyway, you might also want to set the
>> argument
>> paper="special" in the postscript() call.
>>
>> Uwe Ligges
>>
>>
>>>  Thanks
>>>  Miruna
>>>
>>>
>>> 
>>> Miruna Petrescu-Prahova
>>> Department of Sociology
>>> University of California, Irvine
>>> [EMAIL PROTECTED]
>>>
>>> __
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained,
>> reproducible code.
>>
>> __
>> R-help@stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained,
>> reproducible code.
>>
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

