date:20070724

[R] Redirecting print output

2007-07-24 Thread Stan Hopkins

I see a rich set of graphic device functions to redirect that output.  Are 
there commands to redirect text as well.  I have a set of functions that 
execute many linear regression tests serially and I want to capture this in a 
file for printing.

Thanks,

Stan Hopkins
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] OT(slightly) - Tracking extended projects

2007-07-24 Thread D L McArthur

James MacDonald jmacdon at med.umich.edu writes:

 
 Hi all,
 
 Most of the analyses I do are short little once-and-done type things that are 
easily encapsulated in a .Rnw
 file. However, I sometimes end up with projects that take an extended amount 
of time. Usually these
 projects are not easily encapsulated in an .Rnw file, so I have been using a 
single .R file with lots of comments.
 
 The problem with this approach is keeping track of what you have done and 
what the results were. Once the .R
 file gets to be a certain size, the comments aren't as useful, and I find it 
easy to get lost. I have to assume
 that others have encountered this problem and hopefully have come up with 
something more elegant.
 
 Any suggestions?
 
 Best,
 
 Jim
 
One possible choice is the intuitively structured approach of Projects in 
Tinn-R (see http://www.sciviews.org/Tinn-R/).
 -- D L McArthur  dmca at ucla.edu

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] function optimization: reducing the computing time

2007-07-24 Thread Matthieu Dubois

Dear useRs,

I have written a function that implements a Bayesian method to  
compare a patient's score on two tasks with that of a small control  
group, as described in Crawford, J. and Garthwaite, P. (2007).  
Comparison of a single case to a control or normative sample in  
neuropsychology: Development of a bayesian approach. Cognitive  
Neuropsychology, 24(4):343–372.

The function (see below) return the expected results, but the time  
needed to compute is quite long (at least for a test that may be  
routinely used). There is certainly room for  improvement. It would  
really be helpful if some experts of you may have  a  look ...

Thanks a lot.
Regards,

Matthieu


FUNCTION
--
The function takes the performance on two tasks  and estimate the  
rarity (the p-value) of the difference between the patient's two  
scores, in comparison to the difference i the  controls subjects. A  
standardized and an unstandardized version are provided (controlled  
by the parameter standardized: T vs. F). Also, for congruency with  
the original publication, both the raw data  and  summary statistics  
could be used for the control group.

##
# Bayesian (un)Standardized Difference Test
##

#from Crawford and Garthwaite (2007) Cognitive Neuropsychology
# implemented by Matthieu Dubois, Matthieu.Duboisatpsp.ucl.ac.be

#PACKAGE MCMCpack REQUIRED

# patient: a vector with the two scores; controls: matrix/data.frame  
with the raw scores (one column per  task)
# mean.c, sd.c, r, n: possibility to enter summaries statistics  
(mean, standard deviation, correlation, group size)
# n.simul: number of simulations
# two-sided (Boolean): two-sided (T) vs. one-sided (F) Bayesian  
Credible interval
# standardized (Boolean): standardized (T) vs. unstandardized (F) test
# values are: $p.value (one_tailed), $confidence.interval

crawford.BSDT - function(patient, controls, mean.c=0, sd.c=0 , r=0,  
n=0, na.rm=F, n.simul=10, two.sided=T, standardized=T)
{
library(MCMCpack)

#if no summaries are entered, they are computed
if(missing(n))
{   
if(!is.data.frame(controls)) controls - as.data.frame(controls)
n - dim(controls)[1]
mean.c - mean(controls, na.rm=na.rm)
sd.c - sd(controls, na.rm=na.rm)

na.method - ifelse(na.rm,complete.obs,all.obs)

r - cor(controls[,1], controls[,2], na.method)
}

#variance/covariance matrix
s.xx - (sd.c[1]^2) * (n-1)
s.yy - (sd.c[2]^2) * (n-1)
s.xy - sd.c[1] * sd.c[2] * r * (n-1)

A - matrix(c(s.xx, s.xy, s.xy, s.yy), ncol=2)

#estimation function
if(standardized)
{
estimation - function(patient, mean.c, n, A)
{
#estimation of a variance/covariance matrix (sigma)
sigma = riwish(n,A) #random obs. from an 
inverse-Wishart distribution

#estimation of the means (mu)
z - rnorm(2)
T - t(chol(sigma)) #Cholesky decomposition
mu - mean.c + T %*% z/sqrt(n)

#standardization
z.x - (patient[1]-mu[1]) / sqrt(sigma[1,1])
z.y - (patient[2]-mu[2]) / sqrt(sigma[2,2])
rho.xy - sigma[2.2] / sqrt(sigma[1,1]*sigma[2,2])

z.star - (z.x - z.y) / sqrt(2-2*rho.xy)

#conditional p-value
p - pnorm(z.star)
p
}
}
else
{
estimation - function(patient, mean.c, n, A)
{
#estimation of a variance/covariance matrix (sigma)
sigma = riwish(n,A) #random obs. from an 
inverse-Wishart distribution

#estimation of the means (mu)
z - rnorm(2)
T - t(chol(sigma)) #Cholesky decomposition
mu - mean.c + T %*% z/sqrt(n)

num - (patient[1]-mu[1]) - (patient[2] - mu[2])
denom - sqrt(sigma[1,1]+sigma[2,2]-(2*sigma[1,2]))

z.star - num/denom

#conditional p-value
p - pnorm(z.star)
p
}
}

#application
p - replicate(n.simul, estimation(patient, mean.c, n, A))

#outputs
pval - mean(p)
CI - if(two.sided) 100*quantile(p,c(0.025,0.975)) else 100*quantile 
(p,c(0.95))
output -

Re: [R] The 'REP' term in AMMI{agricolae}

2007-07-24 Thread joris . dewolf



The AMMI senso strictu part starts from the corrected data for genotype and
environment. In most cases where AMMI is applied (maybe also in the
agricolae package, I haven't checked), starts from the interaction effects
obtained through a general linear model anova.
It should be possible to replace the input by blups from your mixed model.
R news 7/1 (p14-19) gave some basic code for AMMI. There you should be able
to get ideas how to modify the code.


Joris








   
 CG Pettersson   
 [EMAIL PROTECTED] 
 e.slu.se  To 
 Sent by:  r-help@stat.math.ethz.ch
 [EMAIL PROTECTED]  cc 
 at.math.ethz.ch   
   Subject 
   [R] The 'REP' term in   
 23/07/2007 19:59  AMMI{agricolae} 
   
   
   
   
   
   




Dear all,

W2k, R 2.5.1

I am trying out the AMMI function in the agricolae package, to analyse the
dependence of environment for a certain cultivar. The function responds to
four basic variables:

ENV Environment
GEN Genotype
REP Replication
Y Response

My question concerns the 'REP' term. When I normally do an analysis of
variance, the replication would refer to repeated observations within the
each field trial. In the case I am analysing right now, I have five years
of observations for each 'ENV', in such a case it feels natural to use
year as the most important replication of the environment- especially as I
am interested in long term trends. This approach also seems to work
allright.

But the field trials are also structured in three main blocks, subdivided
into five 'lattice' blocks, a structure that is powerful in the analysis
of variance. (I use a random call in lme{nlme}).

Is it possible to use the block structure also in the AMMI analysis? If
that is possible, how should I code? I have tried to find out in the
documentation, but if it is stated there I do not understand it.

Thank you
/CG

--
CG Pettersson, PhD
Swedish University of Agricultural Sciences (SLU)
Dept. of Crop Production Ecology. Box 7043.
SE-750 07 Uppsala, Sweden
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Redirecting print output

2007-07-24 Thread Vladimir Eremeev




Stan Hopkins wrote:
 
 I see a rich set of graphic device functions to redirect that output.  Are
 there commands to redirect text as well.  I have a set of functions that
 execute many linear regression tests serially and I want to capture this
 in a file for printing.
 
 Thanks,
 
 Stan Hopkins
 

Yes, there are.
?sink

You could also run your functions from a batch mode:
R  your_script.R  output.txt
or
Rscript your_script.R  output.txt

This, however, will give you a single file, while sink() allows creation of
multiple files.

capture.output can store the output in an array of character strings.

-- 
View this message in context: 
http://www.nabble.com/Redirecting-print-output-tf4134131.html#a11758652
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Redirecting print output

2007-07-24 Thread Stephen Tucker

Here are two simple ways:

=== method1 ===
cat(line1,\n,file=output.txt)
cat(line2,\n,file=output.txt,append=TRUE)

=== method2 ===
sink(output.txt)
cat(line1,\n)
cat(line2,\n)
out - lm(y~x,data=data.frame(x=1:10,y=(1:10+rnorm(10,0,0.1
print(out)
sink()

And then there is 'Sweave'. Check out, for instance
http://www.stat.umn.edu/~charlie/Sweave/

You can embed R code, figures, and output from print methods into your latex
document.

ST
--- Stan Hopkins [EMAIL PROTECTED] wrote:

 I see a rich set of graphic device functions to redirect that output.  Are
 there commands to redirect text as well.  I have a set of functions that
 execute many linear regression tests serially and I want to capture this in
 a file for printing.
 
 Thanks,
 
 Stan Hopkins
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 



   
Ready
 for the edge of your seat? 
Check out tonight's top picks on Yahoo! TV.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] OT(slightly) - Tracking extended projects

2007-07-24 Thread Heinz Tuechler

At 22:36 23.07.2007 +, D L McArthur wrote:
James MacDonald jmacdon at med.umich.edu writes:

 
 Hi all,
 
 Most of the analyses I do are short little once-and-done type things
that are 
easily encapsulated in a .Rnw
 file. However, I sometimes end up with projects that take an extended
amount 
of time. Usually these
 projects are not easily encapsulated in an .Rnw file, so I have been
using a 
single .R file with lots of comments.
 
 The problem with this approach is keeping track of what you have done and 
what the results were. Once the .R
 file gets to be a certain size, the comments aren't as useful, and I
find it 
easy to get lost. I have to assume
 that others have encountered this problem and hopefully have come up with 
something more elegant.
 
 Any suggestions?
 
 Best,
 
 Jim
 
One possible choice is the intuitively structured approach of Projects in 
Tinn-R (see http://www.sciviews.org/Tinn-R/).
 -- D L McArthur  dmca at ucla.edu


Or, in case you use emacs and ESS you could find some hints at the ESS help
list ([EMAIL PROTECTED]). Look for [ESS] hide function bodies
and [ESS] outline-minor-mode.

Heinz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] geoR/likfit: variance explained by model

2007-07-24 Thread Ailsa Kerswell

I have been modelling spatial data using function likfit in package geoR.  Now 
that I have the spatial regression models, I would like to calculate the amount 
of variance explained by both the trend and the spatial parts of the model.  
Any advice on how to do this would be much appreciated.

Cheers,
Ailsa

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] persp and greek symbols in the axes labels

2007-07-24 Thread Stephen Tucker

I don't know why it doesn't work but I think people generally recommend that
you use wireframe() in lattice rather than persp(), because wireframe is more
customizable (the pdf document referred to in this post is pretty good):
http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html

Here's an example:

library(lattice)
library(reshape)
x - 1:5
y - 1:3
z - matrix(1:15,ncol=3,dimnames=list(NULL,y))
M - melt(data.frame(x,z,check.names=FALSE),id=1,variable=y)
wireframe(value~x*y,data=M,
  screen=list(z=45,x=-75),
  xlab=expression(kappa[lambda]),
  ylab=as.expression(substitute(paste(phi,=,true,sigma),
  list(true=5))),
  zlab = Z)

[you can play around with the 'screen' argument to rotate the view, analogous
to phi and theta in persp()]


--- Nathalie Peyrard [EMAIL PROTECTED] wrote:

 Hello,
 
 I am plotting a 3D function using persp and I would like to use greek 
 symbols in the axes labels.
 I  have found examples like  this one on the web:
 

plot(0,0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)))
 
 this works well with plot but not with persp:
 with the command
 
 persp(M,theta = -20,phi = 

0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)),zlab
 
 = Z)
 
 I get the labels as in toto.eps
 
 Any suggestion? Thanks!
 
 Nathalie
 
 -- 
 ~~   
 INRA  Toulouse - Unité de Biométrie et  Intelligence Artificielle 
 Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE 
 Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35
 Web :http://mia.toulouse.inra.fr/index.php?id=217
 ~~
 
 
  __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 



   
Ready
 for the edge of your seat? 
Check out tonight's top picks on Yahoo! TV.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] function optimization: reducing the computing time

2007-07-24 Thread Prof Brian Ripley

You need to make use of the profiling methods described in 'Writing R 
Exensions'. My machine is about 4x faster than yours: I get


Each sample represents 0.02 seconds.
Total run time: 62.08041 seconds.

Total seconds: time spent in function and callees.
Self seconds: time spent in function alone.

   %   total   %   self
 totalseconds selfsecondsname
100.00 62.08  0.00  0.00 system.time
 99.94 62.04  0.00  0.00 crawford.BSDT
 99.94 62.04  0.00  0.00 eval
 99.10 61.52  1.00  0.62 lapply
 99.10 61.52  0.00  0.00 sapply
 99.00 61.46  0.00  0.00 replicate
 98.61 61.22  2.26  1.40 FUN
 98.26 61.00  3.32  2.06 estimation
 83.92 52.10  0.26  0.16 riwish
 83.67 51.94  4.25  2.64 solve
 55.57 34.50  7.18  4.46 solve.default
 51.68 32.08  3.77  2.34 rwish
...

so 84% of the time is being spent in riwish.  Now given that A is fixed, 
you should be able to speed that up by precomputing the constant parts of 
the computation (and you can also precompute your 'T').



On Tue, 24 Jul 2007, Matthieu Dubois wrote:


Dear useRs,

I have written a function that implements a Bayesian method to
compare a patient's score on two tasks with that of a small control
group, as described in Crawford, J. and Garthwaite, P. (2007).
Comparison of a single case to a control or normative sample in
neuropsychology: Development of a bayesian approach. Cognitive
Neuropsychology, 24(4):343ÿÿ372.

The function (see below) return the expected results, but the time
needed to compute is quite long (at least for a test that may be
routinely used). There is certainly room for  improvement. It would
really be helpful if some experts of you may have  a  look ...

Thanks a lot.
Regards,

Matthieu


FUNCTION
--
The function takes the performance on two tasks  and estimate the
rarity (the p-value) of the difference between the patient's two
scores, in comparison to the difference i the  controls subjects. A
standardized and an unstandardized version are provided (controlled
by the parameter standardized: T vs. F). Also, for congruency with
the original publication, both the raw data  and  summary statistics
could be used for the control group.

##
# Bayesian (un)Standardized Difference Test
##

#from Crawford and Garthwaite (2007) Cognitive Neuropsychology
# implemented by Matthieu Dubois, Matthieu.Duboisatpsp.ucl.ac.be

#PACKAGE MCMCpack REQUIRED

# patient: a vector with the two scores; controls: matrix/data.frame
with the raw scores (one column per  task)
# mean.c, sd.c, r, n: possibility to enter summaries statistics
(mean, standard deviation, correlation, group size)
# n.simul: number of simulations
# two-sided (Boolean): two-sided (T) vs. one-sided (F) Bayesian
Credible interval
# standardized (Boolean): standardized (T) vs. unstandardized (F) test
# values are: $p.value (one_tailed), $confidence.interval

crawford.BSDT - function(patient, controls, mean.c=0, sd.c=0 , r=0,
n=0, na.rm=F, n.simul=10, two.sided=T, standardized=T)
{
library(MCMCpack)

#if no summaries are entered, they are computed
if(missing(n))
{
if(!is.data.frame(controls)) controls - as.data.frame(controls)
n - dim(controls)[1]
mean.c - mean(controls, na.rm=na.rm)
sd.c - sd(controls, na.rm=na.rm)

na.method - ifelse(na.rm,complete.obs,all.obs)

r - cor(controls[,1], controls[,2], na.method)
}

#variance/covariance matrix
s.xx - (sd.c[1]^2) * (n-1)
s.yy - (sd.c[2]^2) * (n-1)
s.xy - sd.c[1] * sd.c[2] * r * (n-1)

A - matrix(c(s.xx, s.xy, s.xy, s.yy), ncol=2)

#estimation function
if(standardized)
{
estimation - function(patient, mean.c, n, A)
{
#estimation of a variance/covariance matrix (sigma)
sigma = riwish(n,A) #random obs. from an 
inverse-Wishart distribution

#estimation of the means (mu)
z - rnorm(2)
T - t(chol(sigma)) #Cholesky decomposition
mu - mean.c + T %*% z/sqrt(n)

#standardization
z.x - (patient[1]-mu[1]) / sqrt(sigma[1,1])
z.y - (patient[2]-mu[2]) / sqrt(sigma[2,2])
rho.xy - sigma[2.2] / sqrt(sigma[1,1]*sigma[2,2])

z.star - (z.x - z.y) / sqrt(2-2*rho.xy)

#conditional p-value
p - pnorm(z.star)
p
}
}
else
{

Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Ted Harding

On 24-Jul-07 01:09:06, Andrew Clegg wrote:
 Hi folks,
 
 I've looked through the list archives and online resources, but I
 haven't really found an answer to this -- it's pretty basic, but I'm
 (very much) not a statistician, and I just want to check that my
 solution is statistically sound.
 
 Basically, I have a data file containing two columns of data, call it
 data.tsv:
 
 year  count
 1999  3
 2000  5
 2001  9
 2002  30
 2003  62
 2004  154
 2005  245
 2006  321
 
 These look exponential to me, so what I want to do is plot these
 points on a graph with linear axes, and add an exponential curve over
 the top. I also want to give an R-squared for the fit.
 
 The way I did it was like so:
 
 
# Read in the data, make a copy of it, and take logs
 data = read.table(data.tsv, header=TRUE)
 log.data = data
 log.data$count = log(log.data$count)
 
# Fit a model to the logs of the data
 model = lm(log.data$count ~ year, data = log.data)
 
# Plot the original data points on a graph
 plot(data)
 
# Draw in the exponents of the model's output
 lines(data$year, exp(fitted(model)))
 
 
 Is this the right way to do it? log-ing the data and then exp-ing the
 results seems like a bit of a long-winded way to achieve the desired
 effect. Is the R-squared given by summary(model) a valid measurement
 of the fit of the points to an exponential curve, and should I use
 multiple R-squared or adjusted R-squared?
 
 The R-squared I get from this method (0.98 multiple) seems a little
 high going by the deviation of the last data point from the curve --
 you'll see what I mean if you try it.

I just did. From the plot of log(count) against year, with the plot
of the linear fit of log(count)~year superimposed, I see indications
of a non-linear relationship.

The departures of the data from the fit follow a rather systematic
pattern. Initially the data increase more slowly than the fit,
and lie below it. Then they increase faster and corss over above it.
Then the data increase less fast than the fit, and the final data
point is below the fit.

There are not enough data to properly identify the non-linearity,
but the overall appearance of the data plot suggests to me that
you should be considering one of the growth curve models.

Many such models start of with an increasing rate of growth,
which then slows down, and typically levels off to an asymptote.
The apparent large discrepancy of your final data point could
be compatible with this kind of behaviour.

At this point, knowledge of what kind of thing is represented
by your count variable might be helpful. If, for instance,
it is the count of the numbers of individuals of a species in
an area, then independent knowledge of growth mechanisms may
help to narrow down the kind of model you should be tring to fit.

As to your question about Is this the right way to do it
(i.e. fitting an exponential curve by doing a linear fit of the
logarithm), generally speaking the answer is Yes. But of course
you need to be confident that exponential is the right curve
to be fitting in the first place. If it's the wrong type of
curve to be considering, then it's not the right way to do it!

Hoping this help[s,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 24-Jul-07   Time: 10:08:33
-- XFMail --

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Warning generated by Panda GateDefender Integra.

2007-07-24 Thread gatedefender

07/24/2007 11:10:07 [GMT+0100]
 For security reasons certain items found in an email with your address as the 
sender have not been accepted.

File name: attachment.zip

Filtered by: Malformed messages

Sender: r-help@stat.math.ethz.ch
Recipients: [EMAIL PROTECTED]
CC:

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker

I think your way is probably the easiest (shockingly). For instance, here are
some alternatives - I think in both cases you have to calculate the
coefficient of determination (R^2) manually. My understanding is that
multiple R^2 in your case is the usual R^2 because you only have one
predictor variable, and the adjusted R^2 considers the degrees of freedom and
penalizes for additional predictors. Which is better... depends? (Perhaps
more stats-savvy people can help you on that one. I'm a chemical engineer so
I unjustifiably claim ignorance).

## Data input
input -
Year   Count
19993
20005
20019
200230
200362
2004154
2005245
2006321

dat - read.table(textConnection(input),header=TRUE)
dat[,] - lapply(dat,function(x) x-x[1])
  # shifting in origin; will need to add back in later

## Nonlinear least squares
plot(dat)
out - nls(Count~b0*exp(b1*Year),data=dat,
   start=list(b0=1,b1=1))
lines(dat[,1],fitted(out),col=2)
out - nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial
   start=list(b0=0,b1=1,b2=1))
lines(dat[,1],fitted(out),col=3)

## Optim
f - function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2)
fitFun - function(b,x) cbind(1,x,x^2)%*%b
expFun - function(b,x) b[1]*exp(b[2]*x)

plot(dat)
out - optim(c(0,1,1),f,.dat=dat,.fun=fitFun)
lines(dat[,1],fitFun(out$par,dat[,1]),col=2)
out - optim(c(1,1),f,.dat=dat,.fun=expFun)
lines(dat[,1],expFun(out$par,dat[,1]),col=3)


--- Andrew Clegg [EMAIL PROTECTED] wrote:

 Hi folks,
 
 I've looked through the list archives and online resources, but I
 haven't really found an answer to this -- it's pretty basic, but I'm
 (very much) not a statistician, and I just want to check that my
 solution is statistically sound.
 
 Basically, I have a data file containing two columns of data, call it
 data.tsv:
 
 year  count
 1999  3
 2000  5
 2001  9
 2002  30
 2003  62
 2004  154
 2005  245
 2006  321
 
 These look exponential to me, so what I want to do is plot these
 points on a graph with linear axes, and add an exponential curve over
 the top. I also want to give an R-squared for the fit.
 
 The way I did it was like so:
 
 
 # Read in the data, make a copy of it, and take logs
 data = read.table(data.tsv, header=TRUE)
 log.data = data
 log.data$count = log(log.data$count)
 
 # Fit a model to the logs of the data
 model = lm(log.data$count ~ year, data = log.data)
 
 # Plot the original data points on a graph
 plot(data)
 
 # Draw in the exponents of the model's output
 lines(data$year, exp(fitted(model)))
 
 
 Is this the right way to do it? log-ing the data and then exp-ing the
 results seems like a bit of a long-winded way to achieve the desired
 effect. Is the R-squared given by summary(model) a valid measurement
 of the fit of the points to an exponential curve, and should I use
 multiple R-squared or adjusted R-squared?
 
 The R-squared I get from this method (0.98 multiple) seems a little
 high going by the deviation of the last data point from the curve --
 you'll see what I mean if you try it.
 
 Thanks in advance for any help!
 
 Yours gratefully,
 
 Andrew.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker

Well spoken. And since log transformations are nonlinear and 'compresses' the
data, it's not surprising to find that the fit doesn't look so nice while the
fit metrics tell you that a model does a good job.

--- [EMAIL PROTECTED] wrote:

 On 24-Jul-07 01:09:06, Andrew Clegg wrote:
  Hi folks,
  
  I've looked through the list archives and online resources, but I
  haven't really found an answer to this -- it's pretty basic, but I'm
  (very much) not a statistician, and I just want to check that my
  solution is statistically sound.
  
  Basically, I have a data file containing two columns of data, call it
  data.tsv:
  
  year  count
  1999  3
  2000  5
  2001  9
  2002  30
  2003  62
  2004  154
  2005  245
  2006  321
  
  These look exponential to me, so what I want to do is plot these
  points on a graph with linear axes, and add an exponential curve over
  the top. I also want to give an R-squared for the fit.
  
  The way I did it was like so:
  
  
 # Read in the data, make a copy of it, and take logs
  data = read.table(data.tsv, header=TRUE)
  log.data = data
  log.data$count = log(log.data$count)
  
 # Fit a model to the logs of the data
  model = lm(log.data$count ~ year, data = log.data)
  
 # Plot the original data points on a graph
  plot(data)
  
 # Draw in the exponents of the model's output
  lines(data$year, exp(fitted(model)))
  
  
  Is this the right way to do it? log-ing the data and then exp-ing the
  results seems like a bit of a long-winded way to achieve the desired
  effect. Is the R-squared given by summary(model) a valid measurement
  of the fit of the points to an exponential curve, and should I use
  multiple R-squared or adjusted R-squared?
  
  The R-squared I get from this method (0.98 multiple) seems a little
  high going by the deviation of the last data point from the curve --
  you'll see what I mean if you try it.
 
 I just did. From the plot of log(count) against year, with the plot
 of the linear fit of log(count)~year superimposed, I see indications
 of a non-linear relationship.
 
 The departures of the data from the fit follow a rather systematic
 pattern. Initially the data increase more slowly than the fit,
 and lie below it. Then they increase faster and corss over above it.
 Then the data increase less fast than the fit, and the final data
 point is below the fit.
 
 There are not enough data to properly identify the non-linearity,
 but the overall appearance of the data plot suggests to me that
 you should be considering one of the growth curve models.
 
 Many such models start of with an increasing rate of growth,
 which then slows down, and typically levels off to an asymptote.
 The apparent large discrepancy of your final data point could
 be compatible with this kind of behaviour.
 
 At this point, knowledge of what kind of thing is represented
 by your count variable might be helpful. If, for instance,
 it is the count of the numbers of individuals of a species in
 an area, then independent knowledge of growth mechanisms may
 help to narrow down the kind of model you should be tring to fit.
 
 As to your question about Is this the right way to do it
 (i.e. fitting an exponential curve by doing a linear fit of the
 logarithm), generally speaking the answer is Yes. But of course
 you need to be confident that exponential is the right curve
 to be fitting in the first place. If it's the wrong type of
 curve to be considering, then it's not the right way to do it!
 
 Hoping this help[s,
 Ted.
 
 
 E-Mail: (Ted Harding) [EMAIL PROTECTED]
 Fax-to-email: +44 (0)870 094 0861
 Date: 24-Jul-07   Time: 10:08:33
 -- XFMail --
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] values from a linear model

2007-07-24 Thread Manuele Pesenti

Dear R users,
how can I extrapolate values listed in the summary of an lm model but not 
directly available between object values such as the the standard errors of 
the calculated parameters?

for example I got a model:

mod - lm(Crd ~ 1 + Week, data=data)

and its summary:

 summary(mod)

Call:
lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE)

Residuals:
   Min 1Q Median 3QMax 
-4.299e-03 -1.653e-03  2.628e-05  1.291e-03  5.130e-03 

Coefficients:
 Estimate Std. Error  t value Pr(|t|)
(Intercept) 1.000e+01  3.962e-04 25238.73   2e-16 ***
Week5.038e-04  6.812e-0673.96   2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 0.001966 on 98 degrees of freedom
Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822 
F-statistic:  5469 on 1 and 98 DF,  p-value:  2.2e-16

I'm interested in values of Std. Error of coefficients...

thank you very much

Best regards
Manuele

-- 
Manuele Pesenti
[EMAIL PROTECTED]
[EMAIL PROTECTED]
http://mpesenti.polito.it

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] problems with character objects and calls to list() [solved]

2007-07-24 Thread Neil Shephard

Thanks to Patrick Burns and Mark Lyman for their suggestions in
solving my problem.

Patrick suggested creating the list directly (the solution I opted for)

On 7/23/07, Patrick Burns [EMAIL PROTECTED] wrote:
 Why not make the list directly:

 list.to.convert - vector('list', n)
 for(x in 1:n) list.to.convert[[x]] - seq((2*x)-1, 2*x)

 S Poetry may be of use to you.

Whilst Mark suggested the correct evaluation of the character object

   list(1:2, 3:4, 5:6)
  [[1]]
  [1] 1 2
 
  [[2]]
  [1] 3 4
 
  [[3]]
  [1] 5 6
 

  eval(parse(text=paste(list(,to.convert,),sep=)))
 [[1]]
 [1] 1 2

 [[2]]
 [1] 3 4

 [[3]]
 [1] 5 6

 [[4]]
 [1] 7 8

 [[5]]
 [1]  9 10

 [[6]]
 [1] 11 12


-- 
In mathematics you don't understand things. You just get used to
them.  - Johann von Neumann

Email - [EMAIL PROTECTED] / [EMAIL PROTECTED]
Website - http://slack.ser.man.ac.uk/
Photos - http://www.flickr.com/photos/slackline/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] values from a linear model

2007-07-24 Thread ONKELINX, Thierry

It's not very clear to me but I think summary(mod)$coef[, Std. Error]
is wat you need?

Cheers,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
[EMAIL PROTECTED]
www.inbo.be 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt
A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney

 

 -Oorspronkelijk bericht-
 Van: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] Namens Manuele Pesenti
 Verzonden: dinsdag 24 juli 2007 12:03
 Aan: r-help@stat.math.ethz.ch
 Onderwerp: [R] values from a linear model
 
 Dear R users,
 how can I extrapolate values listed in the summary of an lm 
 model but not directly available between object values such 
 as the the standard errors of the calculated parameters?
 
 for example I got a model:
 
 mod - lm(Crd ~ 1 + Week, data=data)
 
 and its summary:
 
  summary(mod)
 
 Call:
 lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE)
 
 Residuals:
Min 1Q Median 3QMax 
 -4.299e-03 -1.653e-03  2.628e-05  1.291e-03  5.130e-03 
 
 Coefficients:
  Estimate Std. Error  t value Pr(|t|)
 (Intercept) 1.000e+01  3.962e-04 25238.73   2e-16 ***
 Week5.038e-04  6.812e-0673.96   2e-16 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
 
 Residual standard error: 0.001966 on 98 degrees of freedom
 Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822 
 F-statistic:  5469 on 1 and 98 DF,  p-value:  2.2e-16
 
 I'm interested in values of Std. Error of coefficients...
 
 thank you very much
 
 Best regards
   Manuele
 
 --
 Manuele Pesenti
   [EMAIL PROTECTED]
   [EMAIL PROTECTED]
   http://mpesenti.polito.it
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] values from a linear model

2007-07-24 Thread Dimitris Rizopoulos

try this:

coef(summary(mod))[, Std. Error]


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Manuele Pesenti [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, July 24, 2007 12:02 PM
Subject: [R] values from a linear model


 Dear R users,
 how can I extrapolate values listed in the summary of an lm model 
 but not
 directly available between object values such as the the standard 
 errors of
 the calculated parameters?

 for example I got a model:

 mod - lm(Crd ~ 1 + Week, data=data)

 and its summary:

 summary(mod)

 Call:
 lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE)

 Residuals:
   Min 1Q Median 3QMax
 -4.299e-03 -1.653e-03  2.628e-05  1.291e-03  5.130e-03

 Coefficients:
 Estimate Std. Error  t value Pr(|t|)
 (Intercept) 1.000e+01  3.962e-04 25238.73   2e-16 ***
 Week5.038e-04  6.812e-0673.96   2e-16 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 Residual standard error: 0.001966 on 98 degrees of freedom
 Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822
 F-statistic:  5469 on 1 and 98 DF,  p-value:  2.2e-16

 I'm interested in values of Std. Error of coefficients...

 thank you very much

 Best regards
 Manuele

 -- 
 Manuele Pesenti
 [EMAIL PROTECTED]
 [EMAIL PROTECTED]
 http://mpesenti.polito.it

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] values from a linear model

2007-07-24 Thread Vladimir Eremeev




Manuele Pesenti wrote:
 
 Dear R users,
 how can I extrapolate values listed in the summary of an lm model but not 
 directly available between object values such as the the standard errors
 of 
 the calculated parameters?
 
 for example I got a model:
 
 mod - lm(Crd ~ 1 + Week, data=data)
 
 and its summary:
 
 summary(mod)
 
 Call:
 lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE)
 
 Residuals:
Min 1Q Median 3QMax 
 -4.299e-03 -1.653e-03  2.628e-05  1.291e-03  5.130e-03 
 
 Coefficients:
  Estimate Std. Error  t value Pr(|t|)
 (Intercept) 1.000e+01  3.962e-04 25238.73   2e-16 ***
 Week5.038e-04  6.812e-0673.96   2e-16 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
 
 Residual standard error: 0.001966 on 98 degrees of freedom
 Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822 
 F-statistic:  5469 on 1 and 98 DF,  p-value:  2.2e-16
 
 I'm interested in values of Std. Error of coefficients...
 
 thank you very much
 

If you want to assign these values to some other variables, try assigning
the result of the summary() to a variable and working with its components
(the result is a list, use $ or [[]] to get its members)

mod.sum-summary(mod)
then
coef(mod.sum)[,2]
or
mod.sum$coefficients[,2]

will give you those Std. Errors

-- 
View this message in context: 
http://www.nabble.com/values-from-a-linear-model-tf4134911.html#a11760459
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] extension of rnormp package

2007-07-24 Thread Elio Mineo

Il giorno mar, 24/07/2007 alle 06.39 +0100, Prof Brian Ripley ha
scritto:
 On Mon, 23 Jul 2007, Iwona Szyd?owska wrote:
 
  Hello,
 
  I would like to ask You, how to generate random numbers from an 
  exponential power family with a shape parameter p less than 1(p-0). I 
  found the rnormp package, which can generate numbers from this 
  distribution, but only for parameter less or equal 1.
 
 It seems you mean package 'normalp', and that the package author believes 
 that the exponential power distribution is only defined for p = 1 
 (although that is not on the help page). Other authors believe it is 
 defined by a relationship to the gamma for all p  0. So all you need to 
 do is to change the condition from p  1 to p = 0 in rnormp and friends.
 
Well, I know that an exponential power distribution is defined for p0,
(I think quite all the references I know consider p0), but for 0p1
the algorithms that I have implemented for the estimates of the
distribution parameters and for the regression parameters are really
instable (pratically are not usable at all). Then, I prefered for all
the functions of the normalp package consider only the case p=1.
All the best,
Angelo Mineo

 However, the algorithms used are not adequate for large or small p.  We 
 know that the distribution tends to uniform for p - Inf, but pnormp and 
 rnormp break down for quite modest values of p.  As p - 0 it tends to a 
 point distribution at 0, but you will see very large values far too often.
 So if you want p smaller than say 0.01 you will need to implement a 
 different algorithm.
 
 
  Regards,
 Iwona Szydlowska
  [[alternative HTML version deleted]]
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] nnet 10-fold cross-validation

2007-07-24 Thread David Meyer

Hi all,

there is tune() in the e1071 package for doing this in general, and, 
among others, a tune.nnet() wrapper (see ?tune):


  tmodel = tune.nnet(Species ~ ., data = iris, size = 1:5)
  summary(tmodel)

Parameter tuning of `nnet':

- sampling method: 10-fold cross validation

- best parameters:
  size
 1

- best performance: 0.0133

- Detailed performance results:
   size  error dispersion
11 0.0133 0.02810913
22 0.0267 0.04661373
33 0.0267 0.04661373
44 0.0200 0.04499657
55 0.0267 0.04661373

  plot(tmodel)
  tmodel$best.model
a 4-1-3 network with 11 weights
inputs: Sepal.Length Sepal.Width Petal.Length Petal.Width
output(s): Species
options were - softmax modelling

etc.

Best
David



On 7/23/07, S.O. Nyangoma [EMAIL PROTECTED] wrote:
   Hi
   It clear that to do a classification with svm under 10-fold cross
   validation one uses
  
   svm(Xm, newlabs, type = C-classification, kernel = linear,cross =
   10)
  
   What corresponds to the nnet?
   nnet(.,cross=10)?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Overlaying a single contour from a new data array in levelplot

2007-07-24 Thread Jenny Barnes

Dear R-Help community,

I am trying to overlay a single contour line over a correlation plot using 
levelplot in the lattice package. These are the two arrays:

1) a correlation plot over Africa - so each grid square is a different colour 
dependent on correlation - this is in an array: result_cor with dim[465,465]

2) a single contour line from a ***different data source*** - this is from data 
related to the p-values for the above correlation plot - I want to overlay only 
the 95% confidence contour. The p-values are stored in an array: 
result.p.values 
with same dimensions as above.

I have read about using panel.levelplot and panel.contourplot in the R-help 
mailing list but I don't know the right way to call two different data arrays, 
can anybody help me please? I appreciate your time and help with this question.

Many thanks,

Jenny 



~~
Jennifer Barnes
PhD student: long range drought prediction 
Climate Extremes Group
Department of Space and Climate Physics
University College London
Holmbury St Mary 
Dorking, Surrey, RH5 6NT
Tel: 01483 204149
Mob: 07916 139187
Web: http://climate.mssl.ucl.ac.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Andrew Clegg

Stephen, Ted -- thanks for your input. I'm glad to know I was barking
up the right-ish tree at least.

On 7/24/07, Ted Harding [EMAIL PROTECTED] wrote:

 There are not enough data to properly identify the non-linearity,
 but the overall appearance of the data plot suggests to me that
 you should be considering one of the growth curve models.

 Many such models start of with an increasing rate of growth,
 which then slows down, and typically levels off to an asymptote.
 The apparent large discrepancy of your final data point could
 be compatible with this kind of behaviour.

You may have hit the nail on the head there. At least I now know that
my method would be reasonable *if* I had a genuine exponential curve.
Bound to come in handy.

 At this point, knowledge of what kind of thing is represented
 by your count variable might be helpful. If, for instance,
 it is the count of the numbers of individuals of a species in
 an area, then independent knowledge of growth mechanisms may
 help to narrow down the kind of model you should be tring to fit.

It's the cumulative number of citations in the MEDLINE literature
database about a particular topic (natural language processing in
biomedicine). So indeed, it can't maintain an exponential growth rate
for long, and an initial spurt while the field is novel and trendy,
followed by a levelling-off, is just what we'd expect. There was a
review a year or so ago that showed a very good exponential fit *then*
but if I could show the last point was indicative of a slowdown, that
would be news at least.

Can anyone point me at a better modelling framework than lm(), in that case?

Thanks again,

Andrew.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] plotting gam models

2007-07-24 Thread Lucia Zarauz


Hi everybody,

I am working with gams and I have found some questions when plotting gams 
models.

I am using mgcv, and my model looks something like this:

model- gam(x ~ s(lat,long))

I can plot the output of the model using plot(model) or plot.gam(model) and I 
get a surface plot. 

That is ok, but what I want to do now is to extract the data used to perform 
the surface plot. Like that I would be able to superpose them to a geographical 
map, and to plot these data using other programs.

Thank you very much in advance

Lucía
_ 
Lucia Zarauz 
AZTI - Tecnalia / Unidad de Investigación Marina
Herrera kaia portualdea z/g
20110 Pasaia (Gipuzkoa)
Tel: 943 004 800 - Fax: 943 004 801
e-mail: [EMAIL PROTECTED]
www.azti.es ; www.tecnalia.info
_ 
  
 LEGE OHARRA  AVISO LEGAL  
DISCLAIMER 
Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua 
dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. 
Este mensaje es personal y confidencial y su uso no autorizado está prohibido 
legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo 
ni conservarlo.
This message is personal and confidential, unauthorised use is legally 
prohibited. If you are not the intended recipient, delete it without resending 
or backing it.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fwd: Re: Fitting exponential curve to data points

2007-07-24 Thread Stephen Tucker

Hope these help for alternatives to lm()? I show the use of a 2nd order
polynomial as an example to generalize a bit.

Sometimes from the subject line two separate responses can appear as reposts
when in fact they are not... (though there are identical reposts too). I
should probably figure a way around that.

--- Stephen Tucker [EMAIL PROTECTED] wrote:

 ## Data input
 input -
 Year Count
 1999  3
 2000  5
 2001  9
 2002  30
 2003  62
 2004  154
 2005  245
 2006  321
 
 dat - read.table(textConnection(input),header=TRUE)
 dat[,] - lapply(dat,function(x) x-x[1])
   # shifting in origin; will need to add back in later
 
 ## Nonlinear least squares
 plot(dat)
 out - nls(Count~b0*exp(b1*Year),data=dat,
start=list(b0=1,b1=1))
 lines(dat[,1],fitted(out),col=2)
 out - nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial
start=list(b0=0,b1=1,b2=1))
 lines(dat[,1],fitted(out),col=3)
 
 ## Optim
 f - function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2)
 fitFun - function(b,x) cbind(1,x,x^2)%*%b
 expFun - function(b,x) b[1]*exp(b[2]*x)
 
 plot(dat)
 out - optim(c(0,1,1),f,.dat=dat,.fun=fitFun)
 lines(dat[,1],fitFun(out$par,dat[,1]),col=2)
 out - optim(c(1,1),f,.dat=dat,.fun=expFun)
 lines(dat[,1],expFun(out$par,dat[,1]),col=3)



   

Got a little couch potato? 
Check out fun summer activities for kids.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting gam models

2007-07-24 Thread Henrique Dallazuanna

see ?predict.gam

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

On 24/07/07, Lucia Zarauz [EMAIL PROTECTED] wrote:


 Hi everybody,

 I am working with gams and I have found some questions when plotting gams
 models.

 I am using mgcv, and my model looks something like this:

 model- gam(x ~ s(lat,long))

 I can plot the output of the model using plot(model) or plot.gam(model)
 and I get a surface plot.

 That is ok, but what I want to do now is to extract the data used to
 perform the surface plot. Like that I would be able to superpose them to a
 geographical map, and to plot these data using other programs.

 Thank you very much in advance

 Lucía
 _
 Lucia Zarauz
 AZTI - Tecnalia / Unidad de Investigación Marina
 Herrera kaia portualdea z/g
 20110 Pasaia (Gipuzkoa)
 Tel: 943 004 800 - Fax: 943 004 801
 e-mail: [EMAIL PROTECTED]
 www.azti.es ; www.tecnalia.info
 _

  LEGE OHARRA  AVISO LEGAL 
 DISCLAIMER 
 Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera
 debekatua dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta
 kontserbatu gabe.
 Este mensaje es personal y confidencial y su uso no autorizado está
 prohibido legalmente. Si usted no es el destinatario, proceda a borrarlo,
 sin reenviarlo ni conservarlo.
 This message is personal and confidential, unauthorised use is legally
 prohibited. If you are not the intended recipient, delete it without
 resending or backing it.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting gam models

2007-07-24 Thread Lucia Zarauz

Hi Henrique,

Thank you for your suggestion. 

Actually, I have already tried it, but I am confused because the plot I get is 
not the same as the output of plot(model) or plot.gam(model). The yaxis is 
different

On the other hand, if I build a more complex model, as for example:

model- gam(x ~ s(lat,long) + s(temperature))

I would like to extract the information to build the effects for each 
explanatory factor (one graph for s(lat,long) and another for s(temperature)), 
as R does when you use 'plot(model)' and you press return for subsequent pages.

My final aim is to plot the influence of s(lat,long) as a contourplot 
superposed on a geographical map. Maybe there is an easier way to do it...

Thank you very much

lucía



_ 
Lucia Zarauz 
AZTI - Tecnalia / Unidad de Investigación Marina
Herrera kaia portualdea z/g
20110 Pasaia (Gipuzkoa)
Tel: 943 004 800 - Fax: 943 004 801
e-mail: [EMAIL PROTECTED]
www.azti.es ; www.tecnalia.info
_ 
  
 LEGE OHARRA  AVISO LEGAL  
DISCLAIMER 
Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua 
dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. 
Este mensaje es personal y confidencial y su uso no autorizado está prohibido 
legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo 
ni conservarlo.
This message is personal and confidential, unauthorised use is legally 
prohibited. If you are not the intended recipient, delete it without resending 
or backing it.


De: Henrique Dallazuanna [mailto:[EMAIL PROTECTED] 
Enviado el: martes, 24 de julio de 2007 13:36
Para: Lucia Zarauz
CC: r-help@stat.math.ethz.ch
Asunto: Re: [R] plotting gam models

see ?predict.gam

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O 
On 24/07/07, Lucia Zarauz [EMAIL PROTECTED] wrote:

Hi everybody,

I am working with gams and I have found some questions when plotting gams 
models.

I am using mgcv, and my model looks something like this:

model- gam(x ~ s(lat,long))

I can plot the output of the model using plot(model) or plot.gam(model) and I 
get a surface plot.

That is ok, but what I want to do now is to extract the data used to perform 
the surface plot. Like that I would be able to superpose them to a geographical 
map, and to plot these data using other programs. 

Thank you very much in advance

Lucía
_
Lucia Zarauz
AZTI - Tecnalia / Unidad de Investigación Marina
Herrera kaia portualdea z/g
20110 Pasaia (Gipuzkoa) 
Tel: 943 004 800 - Fax: 943 004 801
e-mail: [EMAIL PROTECTED]
www.azti.es ; www.tecnalia.info
_

 LEGE OHARRA  AVISO LEGAL  
DISCLAIMER 
Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua 
dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. 
Este mensaje es personal y confidencial y su uso no autorizado está prohibido 
legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo 
ni conservarlo.
This message is personal and confidential, unauthorised use is legally 
prohibited. If you are not the intended recipient, delete it without resending 
or backing it. 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting gam models

2007-07-24 Thread Rod

2007/7/24, Lucia Zarauz [EMAIL PROTECTED]:

 Hi everybody,

 I am working with gams and I have found some questions when plotting gams 
 models.

 I am using mgcv, and my model looks something like this:

 model- gam(x ~ s(lat,long))

 I can plot the output of the model using plot(model) or plot.gam(model) and I 
 get a surface plot.

 That is ok, but what I want to do now is to extract the data used to perform 
 the surface plot. Like that I would be able to superpose them to a 
 geographical map, and to plot these data using other programs.

 Thank you very much in advance

 Lucía
 _
 Lucia Zarauz
 AZTI - Tecnalia / Unidad de Investigación Marina
 Herrera kaia portualdea z/g
 20110 Pasaia (Gipuzkoa)
 Tel: 943 004 800 - Fax: 943 004 801
 e-mail: [EMAIL PROTECTED]
 www.azti.es ; www.tecnalia.info
 _

  LEGE OHARRA  AVISO LEGAL  
 DISCLAIMER 
 Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua 
 dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu 
 gabe.
 Este mensaje es personal y confidencial y su uso no autorizado está prohibido 
 legalmente. Si usted no es el destinatario, proceda a borrarlo, sin 
 reenviarlo ni conservarlo.
 This message is personal and confidential, unauthorised use is legally 
 prohibited. If you are not the intended recipient, delete it without 
 resending or backing it.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Also see ?vis.gam.
Produces perspective or contour plot views of gam model predictions.

Rod.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting gam models

2007-07-24 Thread Simon Wood


 Thank you for your suggestion.

 Actually, I have already tried it, but I am confused because the plot I get
 is not the same as the output of plot(model) or plot.gam(model). The yaxis
 is different
-- `plot.gam' will always plot the `centered smooth', i.e.  the smooth 
constrained to sum to zero over the data. By default (i.e. using type=link) 
`predict.gam' will include the intercept in the predctions. If you want 
centered terms out of `predict.gam' use the `type=terms option.


 On the other hand, if I build a more complex model, as for example:

 model- gam(x ~ s(lat,long) + s(temperature))

 I would like to extract the information to build the effects for each
 explanatory factor (one graph for s(lat,long) and another for
 s(temperature)), as R does when you use 'plot(model)' and you press return
 for subsequent pages.

Again, try predict.gam with type=terms

best,
Simon
-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] persp and greek symbols in the axes labels

2007-07-24 Thread Prof Brian Ripley


On Tue, 24 Jul 2007, Stephen Tucker wrote:


I don't know why it doesn't work but I think people generally recommend that


It has never been implemented, and I believe the main reason is that the 
labels are plotted at an angle other than a multiple of 90 degrees.  Not 
all devices can do that, and rotated plotmath text can look quite ugly.


And of course this _is_ documented in ?persp

xlab, ylab, zlab: titles for the axes.  N.B. These must be character
  strings; expressions are not accepted.  Numbers will be
  coerced to character strings.


you use wireframe() in lattice rather than persp(), because wireframe is more
customizable (the pdf document referred to in this post is pretty good):
http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html

Here's an example:

library(lattice)
library(reshape)
x - 1:5
y - 1:3
z - matrix(1:15,ncol=3,dimnames=list(NULL,y))
M - melt(data.frame(x,z,check.names=FALSE),id=1,variable=y)
wireframe(value~x*y,data=M,
 screen=list(z=45,x=-75),
 xlab=expression(kappa[lambda]),
 ylab=as.expression(substitute(paste(phi,=,true,sigma),
 list(true=5))),
 zlab = Z)

[you can play around with the 'screen' argument to rotate the view, analogous
to phi and theta in persp()]


Of course, that does not rotate the labels.   If unrotated labels are 
acceptable, you can easily set up a new coordinate system (by par(usr=)) 
and call text() to put labels where you want on that.  You can even try 
rotating them via srt=.


There would be no harm in implementing this for use on devices where it 
will work: a nice self-contained project for someone who would like to 
learn about R internals.




--- Nathalie Peyrard [EMAIL PROTECTED] wrote:


Hello,

I am plotting a 3D function using persp and I would like to use greek
symbols in the axes labels.
I  have found examples like  this one on the web:



plot(0,0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)))


this works well with plot but not with persp:
with the command

persp(M,theta = -20,phi =


0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)),zlab


= Z)

I get the labels as in toto.eps

Any suggestion? Thanks!

Nathalie

--
~~
INRA  Toulouse - Unité de Biométrie et  Intelligence Artificielle
Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE
Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35
Web :http://mia.toulouse.inra.fr/index.php?id=217
~~



__

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Ready
 for the edge of your seat?
Check out tonight's top picks on Yahoo! TV.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] recursive vectorized arithmetics

2007-07-24 Thread Boguslaw Kluge

Hi,

I would like to write something similar to:
for(t in 1:100)
v[x[t]] - v[y[t]] + v[z[t]]
in a vectorized form. The x, y, and z vectors may contain duplicates (so v[x] 
- v[y] + v[z] has different semantics). The for loop is not efficient enough 
for my purposes and I would like to avoid using C/Fortran.

This problem occurred to me on several occasions and I feel it is quite 
general. Does anyone have an idea how to solve it nicely?

Thanks,
bk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] persp and greek symbols in the axes labels

2007-07-24 Thread Nathalie Peyrard

Thank you for these answers.

I ended up modifying the ps file directly. But next time I will consider 
wireframe.

Nathalie

Prof Brian Ripley wrote:
 On Tue, 24 Jul 2007, Stephen Tucker wrote:

 I don't know why it doesn't work but I think people generally 
 recommend that

 It has never been implemented, and I believe the main reason is that 
 the labels are plotted at an angle other than a multiple of 90 
 degrees.  Not all devices can do that, and rotated plotmath text can 
 look quite ugly.

 And of course this _is_ documented in ?persp

 xlab, ylab, zlab: titles for the axes.  N.B. These must be character
   strings; expressions are not accepted.  Numbers will be
   coerced to character strings.

 you use wireframe() in lattice rather than persp(), because wireframe 
 is more
 customizable (the pdf document referred to in this post is pretty good):
 http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html

 Here's an example:

 library(lattice)
 library(reshape)
 x - 1:5
 y - 1:3
 z - matrix(1:15,ncol=3,dimnames=list(NULL,y))
 M - melt(data.frame(x,z,check.names=FALSE),id=1,variable=y)
 wireframe(value~x*y,data=M,
  screen=list(z=45,x=-75),
  xlab=expression(kappa[lambda]),
  ylab=as.expression(substitute(paste(phi,=,true,sigma),
  list(true=5))),
  zlab = Z)

 [you can play around with the 'screen' argument to rotate the view, 
 analogous
 to phi and theta in persp()]

 Of course, that does not rotate the labels.   If unrotated labels are 
 acceptable, you can easily set up a new coordinate system (by 
 par(usr=)) and call text() to put labels where you want on that.  You 
 can even try rotating them via srt=.

 There would be no harm in implementing this for use on devices where 
 it will work: a nice self-contained project for someone who would like 
 to learn about R internals.


 --- Nathalie Peyrard [EMAIL PROTECTED] wrote:

 Hello,

 I am plotting a 3D function using persp and I would like to use greek
 symbols in the axes labels.
 I  have found examples like  this one on the web:


 plot(0,0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)))
  


 this works well with plot but not with persp:
 with the command

 persp(M,theta = -20,phi =

 0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)),zlab
  


 = Z)

 I get the labels as in toto.eps

 Any suggestion? Thanks!

 Nathalie

 -- 
 ~~
 INRA  Toulouse - Unité de Biométrie et  Intelligence Artificielle
 Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE
 Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35
 Web :http://mia.toulouse.inra.fr/index.php?id=217
 ~~


 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





 Ready
  
 for the edge of your seat?
 Check out tonight's top picks on Yahoo! TV.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Andrew Clegg

On 7/24/07, Stephen Tucker [EMAIL PROTECTED] wrote:
 Hope these help for alternatives to lm()? I show the use of a 2nd order
 polynomial as an example to generalize a bit.

Great, thanks. If I want to demonstrate that a non-linear curve fits
better than an exponential, what's the best measure for that? Given
that neither of nls() or optim() provide R-squared. Sorry if these are
really silly questions.

 Sometimes from the subject line two separate responses can appear as reposts
 when in fact they are not... (though there are identical reposts too). I
 should probably figure a way around that.

Nope, my fault, I didn't read them properly and thought you were
demonstrating a different way to do exponential curves.

Cheers,

Andrew.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Karl Ove Hufthammer

Andrew Clegg:

 Great, thanks. If I want to demonstrate that a non-linear curve fits
 better than an exponential, what's the best measure for that? Given
 that neither of nls() or optim() provide R-squared.

You really need to *very* careful when trying to interprete R² (which can
be defined in many nonequivalent ways) in the nonlinear case. Recommended
(and, dare I say, *required* reading):

Anderson-Sprecher R. (1994). ‘Model comparisons and R²’. The American
  Statistician, volume 48, no. 2, pages 113–117. DOI: 10.2307/2684259

Kvålseth T.O. (1985). ‘Cautionary note about R²’. The American Statistician,
  volume 39, no. 4, pages 279–285. DOI: 10.2307/2683704

Scott A. and Wild C. (1991). ‘Transformations and R²’. The American
  Statistician, volume 45, no. 2, pages 127–129. ISSN 0003-1305.
  DOI: 10.2307/2684375

The Scott  Wild paper has an example that looks very similar to yours, and
that may be instructive.

FYI, in case you’re not used to DOIs: you can resolve the above DOIs to
fulltext URLs using http://dx.doi.org/

-- 
Karl Ove Hufthammer

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Comparing a distance vs. correlation matrices

2007-07-24 Thread Urmi Trivedi

Hi all,

I am a student of MRes Bioinformatics and Comp. Biology at University of Leeds, 
UK. 

I am trying to compare two matrices in R. One is the matrix containing distance 
and the other one contains the correlation values. I would like to create a XY 
plot of these matrices and do some linear regression on it. But, I am a bit new 
to R, so i have a few questions (I searched in the documentation with no 
success).

I have been successful so far in reading both matrices but dont exactly know 
how can I get the plot for the same.

Thanks very much.

Urmi


   
-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fit t distribution

2007-07-24 Thread livia


Hi all, I am trying to fit t distribution using the function tFit in the
library(fBasics). 

I am using the code tFit(datac[[2]]) and it returns the following list.

Title:
 Student-t Parameter Estimation 

Call:
 tFit(x = datac[[2]])

Model:
 Student-t Distribution

Estimated Parameter(s):
 df 
78.4428 

I just wonder how can I refer to the estimated parameters. I tried
tFit(datac[[2]]) $df,tFit(datac[[2]])@df, but neither of them work.
Could anyone give me some advice?

-- 
View this message in context: 
http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764680
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fit t distribution

2007-07-24 Thread Henrique Dallazuanna

Hi,

tFit(datac[[2]])@fit$estimate


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

On 24/07/07, livia [EMAIL PROTECTED] wrote:


 Hi all, I am trying to fit t distribution using the function tFit in the
 library(fBasics).

 I am using the code tFit(datac[[2]]) and it returns the following list.

 Title:
 Student-t Parameter Estimation

 Call:
 tFit(x = datac[[2]])

 Model:
 Student-t Distribution

 Estimated Parameter(s):
  df
 78.4428

 I just wonder how can I refer to the estimated parameters. I tried
 tFit(datac[[2]]) $df,tFit(datac[[2]])@df, but neither of them work.
 Could anyone give me some advice?

 --
 View this message in context:
 http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764680
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fit t distribution

2007-07-24 Thread livia


It works. Many thanks

Henrique Dallazuanna wrote:
 
 Hi,
 
 tFit(datac[[2]])@fit$estimate
 
 
 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O
 
 On 24/07/07, livia [EMAIL PROTECTED] wrote:


 Hi all, I am trying to fit t distribution using the function tFit in
 the
 library(fBasics).

 I am using the code tFit(datac[[2]]) and it returns the following list.

 Title:
 Student-t Parameter Estimation

 Call:
 tFit(x = datac[[2]])

 Model:
 Student-t Distribution

 Estimated Parameter(s):
  df
 78.4428

 I just wonder how can I refer to the estimated parameters. I tried
 tFit(datac[[2]]) $df,tFit(datac[[2]])@df, but neither of them work.
 Could anyone give me some advice?

 --
 View this message in context:
 http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764680
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
   [[alternative HTML version deleted]]
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764994
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] apply incompatible dimensions error

2007-07-24 Thread Bernzweig, Bruce \(Consultant\)

Hi,

I've created the following two matrices (mat1 and mat2) and a function
(f) to calculate the correlations between the two on a row by row basis.

mat1 - matrix(sample(1:500,50), ncol = 5, 
dimnames=list(paste(row, 1:10, sep=), 
paste(col, 1:5, sep=)))

mat2 - matrix(sample(501:1000,50), ncol = 5, 
dimnames=list(paste(row, 1:10, sep=), 
paste(col, 1:5, sep=)))

f - function(x,y) cor(x,y)

When the matrices are squares (# rows = # columns) I have no problems.

However, when they are not (as in the example above with 5 columns and
10 rows), I get the following error:

 apply(mat1, 1, f, y=mat2)
Error in cor(x, y, na.method, method == kendall) : 
incompatible dimensions

Any help would be appreciated.  Thanks!

- Bruce



**
Please be aware that, notwithstanding the fact that the pers...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] randomForest importance problem with combine [Broadcast]

2007-07-24 Thread Liaw, Andy

I've been fixing some problems in the combine() function, but that's
only for regression data.  Looks like you are doing classification, and
I don't see the problem:

R library(randomForest)
randomForest 4.5-19 
Type rfNews() to see new features/changes/bug fixes.
R set.seed(1)
R rflist - replicate(50, randomForest(iris[-5], iris[[5]], ntree=50,
importance=TRUE), simplify=FALSE)
R rfall - do.call(combine, rflist)
R importance(rfall)
setosa versicolor virginica MeanDecreaseAccuracy
Sepal.Length 0.4457861 0.53883425 0.55806570.4120840
Sepal.Width  0.3266790 0.07652383 0.36202400.2128450
Petal.Length 1.1950989 1.42014628 1.32204710.7989841
Petal.Width  1.1986973 1.40855969 1.36406200.7951053
 MeanDecreaseGini
Sepal.Length 9.578580
Sepal.Width  2.301172
Petal.Length42.935832
Petal.Width 44.409058
R importance(rflist[[1]])
   setosa  versicolor virginica MeanDecreaseAccuracy
Sepal.Length 0.401714  0.71583422 0.49464200.4166555
Sepal.Width  0.00 -0.03155946 0.68292870.2317111
Petal.Length 1.290430  1.47915219 1.34567700.8219003
Petal.Width  1.110142  1.44996777 1.35847990.7881210
 MeanDecreaseGini
Sepal.Length 6.168439
Sepal.Width  2.240723
Petal.Length48.821726
Petal.Width 42.059112

Please provide a reproducible example.

Andy
 

From: Joseph Retzer
 
 My apologies, subject corrected.
 
 
 I'm building a RF 50 trees at a time due to memory limitations (I have
  roughly .5 million observations and around 20 variables). I thought I
  could combine some or all of my forests later and look at global
  importance. 
 
 If I have say 2 forests : tree1 and tree2, they have similar Gini and
  Raw importances and, additionally, are similar to one another. After
  combining (using the combine command) the trees into one however, the
  combined tree Raw importances have changed in rank order 
 rather dramtically
  (e.g. the top most important becomes least important. It is not
  however a completely reversed ordering). In addtion, the 
 scale of both the
  Raw and Gini importances is orders of magnitude smaller for 
 the combined
  tree.
 
 Note that the combined tree Gini importance looks roughly similar to
  the individual tree Gini (and Raw) importance, at least in 
 terms of rank
  ordering.
 
 I'm using the non-formula randomForest specification  along  with
   norm.votes=FALSE to facilitate  large sample  estimation  and  tree
  combining.
 
 I'm using R 2.5.0 on a windows XP machine with 2 gig RAM. I'm also
  using randomForest 4.5-18.
 
 Any advice is appreciated,
 Many thanks,
 Joe
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply incompatible dimensions error

2007-07-24 Thread Gabor Grothendieck

Your apply is trying to take the correlations of the rows of mat1 with the
columns of mat2 which, of course, does not work if they have different
numbers of columns. I think you mean to take the correlations of the columns
of mat1 with the columns of mat2.  For example, to take the correlations
of the 5 columns of mat1 with the first 4 columns of mat2 try:

 cor(mat1, mat2[,1:4])
col1   col2   col3   col4
col1 -0.34624254 -0.2669519 -0.2705077  0.2183249
col2 -0.26553255 -0.2687643 -0.0865895  0.1819025
col3  0.19474613 -0.2334986  0.1746522  0.2326915
col4  0.09328338  0.5117784  0.2413143 -0.3374916
col5  0.27519716  0.1605331 -0.4057137  0.3282105


On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 Hi,

 I've created the following two matrices (mat1 and mat2) and a function
 (f) to calculate the correlations between the two on a row by row basis.

mat1 - matrix(sample(1:500,50), ncol = 5,
dimnames=list(paste(row, 1:10, sep=),
paste(col, 1:5, sep=)))

mat2 - matrix(sample(501:1000,50), ncol = 5,
dimnames=list(paste(row, 1:10, sep=),
paste(col, 1:5, sep=)))

f - function(x,y) cor(x,y)

 When the matrices are squares (# rows = # columns) I have no problems.

 However, when they are not (as in the example above with 5 columns and
 10 rows), I get the following error:

  apply(mat1, 1, f, y=mat2)
 Error in cor(x, y, na.method, method == kendall) :
incompatible dimensions

 Any help would be appreciated.  Thanks!

 - Bruce



 **
 Please be aware that, notwithstanding the fact that the pers...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] cor inside/outside a function has different output

2007-07-24 Thread Bernzweig, Bruce \(Consultant\)

I'm calculating correlations between two matrices 

 

mat1 - matrix(sample(1:500,25), ncol = 5, 

dimnames=list(paste(mat1row, 1:5, sep=), 

paste(mat1col, 1:5, sep=)))

 

mat2 - matrix(sample(501:1000,25), ncol = 5, 

dimnames=list(paste(mat2row, 1:5, sep=), 

paste(mat2col, 1:5, sep=)))

 

using what would seem to be two similar methods:

 

  Method 1:

 

f - function(x,y) cor(x,y)

apply(mat1, 1, f, y=mat2)

 

  Method 2:

 

 cor(mat1, mat2)

 

However, the results (see blow) are different:

 

 apply(mat1, 1, f, y=mat2)

 

   mat1row1   mat1row2mat1row3mat1row4mat1row5

[1,] -0.27601028 -0.1352143  0.03538690 -0.03084075 -0.60171704

[2,] -0.01595532 -0.3881197 -0.43663982  0.49081806  0.33291995

[3,]  0.35969624 -0.0582948  0.57462169  0.09926796 -0.02948423

[4,] -0.41435920 -0.7164638 -0.21213496 -0.55183934 -0.25341790

[5,]  0.33802803  0.5371508  0.05219095  0.83533575  0.17850291

 

 cor(mat1, mat2)

mat2col1mat2col2   mat2col3   mat2col4   mat2col5

mat1col1 -0.84077496 -0.01538414 -0.6078933 -0.2263840 -0.1421335

mat1col2  0.23074421  0.54606286 -0.2354733  0.5214255 -0.2129077

mat1col3 -0.8528  0.19550100 -0.5920509 -0.8694040  0.6853990

mat1col4  0.08050976 -0.55449840  0.6225666  0.6187971 -0.8971584

mat1col5 -0.10199564 -0.43854767 -0.5803077 -0.5100285  0.2848351

 

Also, for method 2, the calculations are done on a column x column
basis.  Is there any way to do this on a row by row basis.  Looking at
the help page for cor, I don't see any parameters that could be used to
do this.

 

Thanks,

 

- Bruce



**
Please be aware that, notwithstanding the fact that the pers...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply incompatible dimensions error

2007-07-24 Thread Benilton Carvalho

are you positive that your function is doing what you expect it to do?

it looks like you want something like:

sapply(1:10, function(i) cor(mat1[i,], mat2[i,]))

b

On Jul 24, 2007, at 11:05 AM, Bernzweig, Bruce ((Consultant)) wrote:

 Hi,

 I've created the following two matrices (mat1 and mat2) and a function
 (f) to calculate the correlations between the two on a row by row  
 basis.

   mat1 - matrix(sample(1:500,50), ncol = 5,
   dimnames=list(paste(row, 1:10, sep=),
   paste(col, 1:5, sep=)))

   mat2 - matrix(sample(501:1000,50), ncol = 5,
   dimnames=list(paste(row, 1:10, sep=),
   paste(col, 1:5, sep=)))

   f - function(x,y) cor(x,y)

 When the matrices are squares (# rows = # columns) I have no problems.

 However, when they are not (as in the example above with 5 columns and
 10 rows), I get the following error:

 apply(mat1, 1, f, y=mat2)
 Error in cor(x, y, na.method, method == kendall) :
 incompatible dimensions

 Any help would be appreciated.  Thanks!

 - Bruce



 **
 Please be aware that, notwithstanding the fact that the pers... 
 {{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply incompatible dimensions error

2007-07-24 Thread Bernzweig, Bruce \(Consultant\)

Thanks Gabor!

You state that my apply is taking rows of mat1 with columns of mat2.

Is this because I have the y=mat2 parameter?

 apply(mat1, 1, f, y=mat2)

Actually, what I would like is to run the correlations on a row against
row basis:  mat1 row1 x mat2 row1, etc.

Thanks again,

- Bruce


-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 24, 2007 11:31 AM
To: Bernzweig, Bruce (Consultant)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] apply  incompatible dimensions error

Your apply is trying to take the correlations of the rows of mat1 with
the
columns of mat2 which, of course, does not work if they have different
numbers of columns. I think you mean to take the correlations of the
columns
of mat1 with the columns of mat2.  For example, to take the correlations
of the 5 columns of mat1 with the first 4 columns of mat2 try:

 cor(mat1, mat2[,1:4])
col1   col2   col3   col4
col1 -0.34624254 -0.2669519 -0.2705077  0.2183249
col2 -0.26553255 -0.2687643 -0.0865895  0.1819025
col3  0.19474613 -0.2334986  0.1746522  0.2326915
col4  0.09328338  0.5117784  0.2413143 -0.3374916
col5  0.27519716  0.1605331 -0.4057137  0.3282105


On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 Hi,

 I've created the following two matrices (mat1 and mat2) and a function
 (f) to calculate the correlations between the two on a row by row
basis.

mat1 - matrix(sample(1:500,50), ncol = 5,
dimnames=list(paste(row, 1:10, sep=),
paste(col, 1:5, sep=)))

mat2 - matrix(sample(501:1000,50), ncol = 5,
dimnames=list(paste(row, 1:10, sep=),
paste(col, 1:5, sep=)))

f - function(x,y) cor(x,y)

 When the matrices are squares (# rows = # columns) I have no problems.

 However, when they are not (as in the example above with 5 columns and
 10 rows), I get the following error:

  apply(mat1, 1, f, y=mat2)
 Error in cor(x, y, na.method, method == kendall) :
incompatible dimensions

 Any help would be appreciated.  Thanks!

 - Bruce



 **
 Please be aware that, notwithstanding the fact that the
pers...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




**
Please be aware that, notwithstanding the fact that the pers...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply incompatible dimensions error

2007-07-24 Thread Gabor Grothendieck

Then try this:

cor(t(mat1), t(mat2))

Also note

1. the above implies that mat1 and mat2 must have the same
number of columns since if x and y are vectors cor(x,y) only makes
sense if they have the same length.

2. the usual convention is that variables are stored as columns
andt that rows correspond to cases so typically you would have
(in terms of your mat1 and mat2):

Mat1 - t(mat1)
Mat2 - t(mat2)

and then use Mat1 and Mat2, e.g. cor(Mat1, Mat2)



On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 Thanks Gabor!

 You state that my apply is taking rows of mat1 with columns of mat2.

 Is this because I have the y=mat2 parameter?

  apply(mat1, 1, f, y=mat2)

 Actually, what I would like is to run the correlations on a row against
 row basis:  mat1 row1 x mat2 row1, etc.

 Thanks again,

 - Bruce


 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, July 24, 2007 11:31 AM
 To: Bernzweig, Bruce (Consultant)
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] apply  incompatible dimensions error

 Your apply is trying to take the correlations of the rows of mat1 with
 the
 columns of mat2 which, of course, does not work if they have different
 numbers of columns. I think you mean to take the correlations of the
 columns
 of mat1 with the columns of mat2.  For example, to take the correlations
 of the 5 columns of mat1 with the first 4 columns of mat2 try:

  cor(mat1, mat2[,1:4])
col1   col2   col3   col4
 col1 -0.34624254 -0.2669519 -0.2705077  0.2183249
 col2 -0.26553255 -0.2687643 -0.0865895  0.1819025
 col3  0.19474613 -0.2334986  0.1746522  0.2326915
 col4  0.09328338  0.5117784  0.2413143 -0.3374916
 col5  0.27519716  0.1605331 -0.4057137  0.3282105


 On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
  Hi,
 
  I've created the following two matrices (mat1 and mat2) and a function
  (f) to calculate the correlations between the two on a row by row
 basis.
 
 mat1 - matrix(sample(1:500,50), ncol = 5,
 dimnames=list(paste(row, 1:10, sep=),
 paste(col, 1:5, sep=)))
 
 mat2 - matrix(sample(501:1000,50), ncol = 5,
 dimnames=list(paste(row, 1:10, sep=),
 paste(col, 1:5, sep=)))
 
 f - function(x,y) cor(x,y)
 
  When the matrices are squares (# rows = # columns) I have no problems.
 
  However, when they are not (as in the example above with 5 columns and
  10 rows), I get the following error:
 
   apply(mat1, 1, f, y=mat2)
  Error in cor(x, y, na.method, method == kendall) :
 incompatible dimensions
 
  Any help would be appreciated.  Thanks!
 
  - Bruce
 
 
 
  **
  Please be aware that, notwithstanding the fact that the
 pers...{{dropped}}
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 **
 Please be aware that, notwithstanding the fact that the person sending
 this communication has an address in Bear Stearns' e-mail system, this
 person is not an employee, agent or representative of Bear Stearns.
 Accordingly, this person has no power or authority to represent, make
 any recommendation, solicitation, offer or statements or disclose
 information on behalf of or in any way bind Bear Stearns or any of its
 affiliates.
 **


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply incompatible dimensions error

2007-07-24 Thread Benilton Carvalho

that's garbor's suggestion then.
sorry for the misunderstanding. :-)
b

On Jul 24, 2007, at 11:35 AM, Bernzweig, Bruce ((Consultant)) wrote:

 Thanks Benilton,

 I know what I want to do, just not sure how to do it using R.  The  
 help
 documentation is not very clear.

 What I am trying to do is calculate correlations on a row against row
 basis:  mat1 row1 x mat2 row1, mat1 row1 x mat2 row2, ... mat1 row1 x
 mat2 row-n, mat1 row-n, mat2 row-n

 - Bruce

 -Original Message-
 From: Benilton Carvalho [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, July 24, 2007 11:31 AM
 To: Bernzweig, Bruce (Consultant)
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] apply  incompatible dimensions error

 are you positive that your function is doing what you expect it to do?

 it looks like you want something like:

 sapply(1:10, function(i) cor(mat1[i,], mat2[i,]))

 b

 On Jul 24, 2007, at 11:05 AM, Bernzweig, Bruce ((Consultant)) wrote:

 Hi,

 I've created the following two matrices (mat1 and mat2) and a  
 function
 (f) to calculate the correlations between the two on a row by row
 basis.

  mat1 - matrix(sample(1:500,50), ncol = 5,
  dimnames=list(paste(row, 1:10, sep=),
  paste(col, 1:5, sep=)))

  mat2 - matrix(sample(501:1000,50), ncol = 5,
  dimnames=list(paste(row, 1:10, sep=),
  paste(col, 1:5, sep=)))

  f - function(x,y) cor(x,y)

 When the matrices are squares (# rows = # columns) I have no  
 problems.

 However, when they are not (as in the example above with 5 columns  
 and
 10 rows), I get the following error:

 apply(mat1, 1, f, y=mat2)
 Error in cor(x, y, na.method, method == kendall) :
 incompatible dimensions

 Any help would be appreciated.  Thanks!

 - Bruce



 * 
 *
 Please be aware that, notwithstanding the fact that the pers...
 {{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



 **
 Please be aware that, notwithstanding the fact that the person sending
 this communication has an address in Bear Stearns' e-mail system, this
 person is not an employee, agent or representative of Bear Stearns.
 Accordingly, this person has no power or authority to represent, make
 any recommendation, solicitation, offer or statements or disclose
 information on behalf of or in any way bind Bear Stearns or any of its
 affiliates.
 **

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply incompatible dimensions error

2007-07-24 Thread Bernzweig, Bruce \(Consultant\)

Thanks for the explanation.

As for the rows/columns thing, the data I receive is given to me in that
way.  I currently read it in using read.csv.  Is there a function I
should look at that can take that and transpose it or should I just
process the data first outside of R?

Thanks,

- Bruce

-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 24, 2007 11:43 AM
To: Bernzweig, Bruce (Consultant)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] apply  incompatible dimensions error

Then try this:

cor(t(mat1), t(mat2))

Also note

1. the above implies that mat1 and mat2 must have the same
number of columns since if x and y are vectors cor(x,y) only makes
sense if they have the same length.

2. the usual convention is that variables are stored as columns
andt that rows correspond to cases so typically you would have
(in terms of your mat1 and mat2):

Mat1 - t(mat1)
Mat2 - t(mat2)

and then use Mat1 and Mat2, e.g. cor(Mat1, Mat2)



On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 Thanks Gabor!

 You state that my apply is taking rows of mat1 with columns of mat2.

 Is this because I have the y=mat2 parameter?

  apply(mat1, 1, f, y=mat2)

 Actually, what I would like is to run the correlations on a row
against
 row basis:  mat1 row1 x mat2 row1, etc.

 Thanks again,

 - Bruce


 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, July 24, 2007 11:31 AM
 To: Bernzweig, Bruce (Consultant)
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] apply  incompatible dimensions error

 Your apply is trying to take the correlations of the rows of mat1 with
 the
 columns of mat2 which, of course, does not work if they have different
 numbers of columns. I think you mean to take the correlations of the
 columns
 of mat1 with the columns of mat2.  For example, to take the
correlations
 of the 5 columns of mat1 with the first 4 columns of mat2 try:

  cor(mat1, mat2[,1:4])
col1   col2   col3   col4
 col1 -0.34624254 -0.2669519 -0.2705077  0.2183249
 col2 -0.26553255 -0.2687643 -0.0865895  0.1819025
 col3  0.19474613 -0.2334986  0.1746522  0.2326915
 col4  0.09328338  0.5117784  0.2413143 -0.3374916
 col5  0.27519716  0.1605331 -0.4057137  0.3282105


 On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
  Hi,
 
  I've created the following two matrices (mat1 and mat2) and a
function
  (f) to calculate the correlations between the two on a row by row
 basis.
 
 mat1 - matrix(sample(1:500,50), ncol = 5,
 dimnames=list(paste(row, 1:10, sep=),
 paste(col, 1:5, sep=)))
 
 mat2 - matrix(sample(501:1000,50), ncol = 5,
 dimnames=list(paste(row, 1:10, sep=),
 paste(col, 1:5, sep=)))
 
 f - function(x,y) cor(x,y)
 
  When the matrices are squares (# rows = # columns) I have no
problems.
 
  However, when they are not (as in the example above with 5 columns
and
  10 rows), I get the following error:
 
   apply(mat1, 1, f, y=mat2)
  Error in cor(x, y, na.method, method == kendall) :
 incompatible dimensions
 
  Any help would be appreciated.  Thanks!
 
  - Bruce
 
 
 
 
**
  Please be aware that, notwithstanding the fact that the
 pers...{{dropped}}
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 **
 Please be aware that, notwithstanding the fact that the person sending
 this communication has an address in Bear Stearns' e-mail system, this
 person is not an employee, agent or representative of Bear Stearns.
 Accordingly, this person has no power or authority to represent, make
 any recommendation, solicitation, offer or statements or disclose
 information on behalf of or in any way bind Bear Stearns or any of its
 affiliates.
 **




**
Please be aware that, notwithstanding the fact that the pers...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Using lmer with huge amount of data

2007-07-24 Thread Gang Chen

Based on the examples I've seen in using statistical analysis  
packages such as lmer, it seems that people usually tabulate all the  
input data into one file with the first line indicating the variable  
names (or labels), and then read the file inside R. However, in my  
case I can't do that because of the huge amount of imaging data.

Suppose I have a one-way within-subject ANCOVA with one covariate,  
and I would like to use lmer in R package lme4 to analyze the data.  
In the terminology of linear mixed models, I have a fixed factor A  
with 3 levels, a random factor B (subject), and a covariate (age)  
with a model like this

MyResult - lmer(Response ~ FactorA + Age + (1 | subject), MyData, ...)

My input data are like this: For each subject I have a file (a huge  
matrix) storing the response values of the subject at many locations  
(~30,000 voxels) corresponding to factor A at the 1st level, another  
file for factor A at the 2nd level, and a 3rd file for factor A at  
the 3rd level. Then I have another file storing the age of those  
subjects. The analysis with the linear mixed model above would be  
done at each voxel separately.

It seems impractical to create one gigantic file or matrix to feed  
into the above command line because of the big number of voxels. I'm  
not sure how to proceed in this case. Any suggestions would be highly  
appreciated.

Also if I'm concerned about any potential violation of sphericity  
among the 3 levels of factor A, how can I test sphericity violation  
in lmer? And if violation exists, how can I make corrections in  
contrast testing?

Thank you very much,
Gang

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cor inside/outside a function has different output

2007-07-24 Thread Bernzweig, Bruce \(Consultant\)

Sorry.  I looked up t after writing the previous email and realized that
was what I was looking for!



-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 24, 2007 11:48 AM
To: Bernzweig, Bruce (Consultant)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] cor inside/outside a function has different output

I think this is really answered already in my previous post but just in
case
try this:

 res1 - t(apply(mat1, 1, cor, t(mat2)))
 res2 - cor(t(mat1), t(mat2))
 all.equal(res1, res2, check.attributes = FALSE)
[1] TRUE


On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 I'm calculating correlations between two matrices



 mat1 - matrix(sample(1:500,25), ncol = 5,

 dimnames=list(paste(mat1row, 1:5, sep=),

 paste(mat1col, 1:5, sep=)))



 mat2 - matrix(sample(501:1000,25), ncol = 5,

 dimnames=list(paste(mat2row, 1:5, sep=),

 paste(mat2col, 1:5, sep=)))



 using what would seem to be two similar methods:



  Method 1:



f - function(x,y) cor(x,y)

apply(mat1, 1, f, y=mat2)



  Method 2:



 cor(mat1, mat2)



 However, the results (see blow) are different:



  apply(mat1, 1, f, y=mat2)



   mat1row1   mat1row2mat1row3mat1row4mat1row5

 [1,] -0.27601028 -0.1352143  0.03538690 -0.03084075 -0.60171704

 [2,] -0.01595532 -0.3881197 -0.43663982  0.49081806  0.33291995

 [3,]  0.35969624 -0.0582948  0.57462169  0.09926796 -0.02948423

 [4,] -0.41435920 -0.7164638 -0.21213496 -0.55183934 -0.25341790

 [5,]  0.33802803  0.5371508  0.05219095  0.83533575  0.17850291



  cor(mat1, mat2)

mat2col1mat2col2   mat2col3   mat2col4   mat2col5

 mat1col1 -0.84077496 -0.01538414 -0.6078933 -0.2263840 -0.1421335

 mat1col2  0.23074421  0.54606286 -0.2354733  0.5214255 -0.2129077

 mat1col3 -0.8528  0.19550100 -0.5920509 -0.8694040  0.6853990

 mat1col4  0.08050976 -0.55449840  0.6225666  0.6187971 -0.8971584

 mat1col5 -0.10199564 -0.43854767 -0.5803077 -0.5100285  0.2848351



 Also, for method 2, the calculations are done on a column x column
 basis.  Is there any way to do this on a row by row basis.  Looking at
 the help page for cor, I don't see any parameters that could be used
to
 do this.



 Thanks,



 - Bruce




 **
 Please be aware that, notwithstanding the fact that the
pers...{{dropped}}


 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





**
Please be aware that, notwithstanding the fact that the pers...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cor inside/outside a function has different output

2007-07-24 Thread Gabor Grothendieck

I think this is really answered already in my previous post but just in case
try this:

 res1 - t(apply(mat1, 1, cor, t(mat2)))
 res2 - cor(t(mat1), t(mat2))
 all.equal(res1, res2, check.attributes = FALSE)
[1] TRUE


On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 I'm calculating correlations between two matrices



 mat1 - matrix(sample(1:500,25), ncol = 5,

 dimnames=list(paste(mat1row, 1:5, sep=),

 paste(mat1col, 1:5, sep=)))



 mat2 - matrix(sample(501:1000,25), ncol = 5,

 dimnames=list(paste(mat2row, 1:5, sep=),

 paste(mat2col, 1:5, sep=)))



 using what would seem to be two similar methods:



  Method 1:



f - function(x,y) cor(x,y)

apply(mat1, 1, f, y=mat2)



  Method 2:



 cor(mat1, mat2)



 However, the results (see blow) are different:



  apply(mat1, 1, f, y=mat2)



   mat1row1   mat1row2mat1row3mat1row4mat1row5

 [1,] -0.27601028 -0.1352143  0.03538690 -0.03084075 -0.60171704

 [2,] -0.01595532 -0.3881197 -0.43663982  0.49081806  0.33291995

 [3,]  0.35969624 -0.0582948  0.57462169  0.09926796 -0.02948423

 [4,] -0.41435920 -0.7164638 -0.21213496 -0.55183934 -0.25341790

 [5,]  0.33802803  0.5371508  0.05219095  0.83533575  0.17850291



  cor(mat1, mat2)

mat2col1mat2col2   mat2col3   mat2col4   mat2col5

 mat1col1 -0.84077496 -0.01538414 -0.6078933 -0.2263840 -0.1421335

 mat1col2  0.23074421  0.54606286 -0.2354733  0.5214255 -0.2129077

 mat1col3 -0.8528  0.19550100 -0.5920509 -0.8694040  0.6853990

 mat1col4  0.08050976 -0.55449840  0.6225666  0.6187971 -0.8971584

 mat1col5 -0.10199564 -0.43854767 -0.5803077 -0.5100285  0.2848351



 Also, for method 2, the calculations are done on a column x column
 basis.  Is there any way to do this on a row by row basis.  Looking at
 the help page for cor, I don't see any parameters that could be used to
 do this.



 Thanks,



 - Bruce




 **
 Please be aware that, notwithstanding the fact that the pers...{{dropped}}


 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] plotting a summary.rq object in using pkg quantreg

2007-07-24 Thread Jeff G.

Hello,

I am having problems adjusting the plot output from the quantreg 
package.  Anyone know what I'm doing wrong?

For example (borrowing from the help files):

plot(summary(rq(foodexp~income,tau = 1:49/50,data=engel)), nrow=1, 
ncol=2,alpha = .4, ols = TRUE, xlab=test)

The alpha= parameter seems to have no effect on my output, even when I 
set it to a ridiculous value like 0.4.  Also, though in the help file it 
says |...| = optional arguments to plot, xlab (as an example) seems 
to do nothing.  If the answer is that I should extract the values I need 
and construct the plot I want independently of the rq.process object, 
that it okay I suppose, if inefficient.  Maybe a more fundamental 
question is how do I get in and see how plot is working in this case so 
that I can modify.

Thanks much!

J

P.S.  I've explored using plot.summary.rqs but the problems seem to be 
the same.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Fitting exponential curve to data points

2007-07-24 Thread Dieter Menne

Andrew Clegg andrew.clegg at gmail.com writes:

 
 ... If I want to demonstrate that a non-linear curve fits
 better than an exponential, what's the best measure for that? Given
 that neither of nls() or optim() provide R-squared. 

To supplement Karl's comment, try Douglas Bates' (author of nls) comments on the
matter

http://www.ens.gu.edu.au/ROBERTK/R/HELP/00B/0399.HTML

Short summary:
* ... the lack of automatic ANOVA, R^2 and adj. R^2 from nls is a feature,
not a bug :-)
* My best advice regarding R^2 statistics with nonlinear models is, as Nancy
Reagan suggested, Just say no.

Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] crimtab related question

2007-07-24 Thread Jean lobry

Dear all,

the dataset documented under ?crimtab was also used in:

@article{TreloarAE1934,
 title = {The adequacy of {S}tudent's criterion of
  deviations in small sample means},
 author = {Treloar, A.E. and Wilder, M.A.},
 journal = {The Annals of Mathematical Statistics},
 volume = {5},
 pages = {324-341},
 year = {1934}
}

The following is from page 335 of the above paper:

From the table provided by MacDonell (1902) on
the associated variation of stature (to the nearest inch)
and length of the left middle finger (to the nearest
millimeter) in 3000 British criminals, the measusurements
were transferred to 3000 numbered Denison metal-rim
tags from which the cords has been removed. After
thorough checking and mixing of these circular disks,
samples of 5 tags each were drawn at random until the
supply was exhausted. Unfortunately, three of these
samples were erroneously returned to a receiving box
before being copied, and the records of 597 samples only
are available.

Could someone give me a clue about the kind of device
that was used here? Is it a kind of lottery machine?
I don't understand why three samples were lost. What
is this receiving box?

Thanks for any hint,

Best,
-- 
Jean R. Lobry([EMAIL PROTECTED])
Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,
43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE
allo  : +33 472 43 27 56 fax: +33 472 43 13 88
http://pbil.univ-lyon1.fr/members/lobry/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-24 Thread Allan Kamau

Hi all,
If the question below as been answered before I
apologize for the posting.
I would like to get the frequencies of occurrence of
all values in a given variable in a multivariate
dataset. In short for each variable (or field) a
summary of values contained with in a value:frequency
pair, there can be many such pairs for a given
variable. I would like to do the same for several such
variables.
I have used table() but am unable to extract the
individual value and frequency values.
Please advise.

Allan.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Fitting the best line to the plot of distance vs. correlation matrix

2007-07-24 Thread Urmi Trivedi

Hi all, 

Thanks for your help in generating the matrix of distance vs correlation. I did 
it using 

 plot(as.vector(as.matrix(cormat)), as.vector(as.matrix(distmat)))

Now I want to quantitate the same. May be on linear regression or some other 
statistical functions. I have tried using linmod for linear regression. But as 
I have two matrices in the form of the dataframes, I'm wondering if it is the 
right way to do it in this? Or are there even better options than this 
available. 

Can anyone please help me for that. 

Thanks. 

Urmi

   
-
 Once upon a time there was 1 GB storage in your inbox. Click here for happy 
ending.
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] lme or gls prediction intervals

2007-07-24 Thread Martin Henry H. Stevens

Hi folks,
I am trying to generate 95% confidence intervals for a gls model  
using predict.nlme
with
R version 2.5.1 (2007-06-27)
. nlme: Linear  and Nonlinear Mixed Effects Models. R package version
   3.1-83.

I have looked in help, and I can do it for lm and glm models, and I  
can generate simple predictions for lme models with various levels --  
I am familiar with the basics.

Is there a way to get prediction intervals for gls models? My best  
model uses varPower(), so I am reluctant to fall back on lm predictions.

Thank you,

Hank


Dr. Hank Stevens, Associate Professor
338 Pearson Hall
Botany Department
Miami University
Oxford, OH 45056

Office: (513) 529-4206
Lab: (513) 529-4262
FAX: (513) 529-4243
http://www.cas.muohio.edu/~stevenmh/
http://www.muohio.edu/ecology/
http://www.muohio.edu/botany/

E Pluribus Unum

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] x,y,z table to matrix with x as rows and y as columns

2007-07-24 Thread jiho

Hello all,

I am sure I am missing something obvious but I cannot find the  
function I am looking for. I have a data frame with three columns: X,  
Y and Z, with X and Y being grid coordinates and Z the value  
associated with these coordinates. I want to transform this data  
frame in a matrix of Z values, on the grid defined by X and Y (and,  
as a plus, fill the X.Y combinations which do no exist in the  
original data frame with NAs in the resulting matrix). I could do  
this manually but I guess the appropriate function should be  
somewhere around. I just can't find it.

Thank you in advance for your help.

JiHO
---
http://jo.irisson.free.fr/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting a summary.rq object in using pkg quantreg

2007-07-24 Thread roger koenker

Package questions to package maintainers  please.

The short answer is that your alpha = .4 parameter needs to
be passed to  summary not to plot.  Try this:

 plot(summary(rq(foodexp~income,tau = 1:49/50,data=engel),alpha =. 
 4), nrow=1,
 ncol=2, ols = TRUE)

A longer answer would involve a boring disquisition about various  
fitting methods
and standard error estimation methods and their historical evolution  
and defaults.
(By default rank-based confidence bands are being used for the engel  
data since
the sample size is relatively small.)

Regarding your more fundamental question:  you can always modify   
functions
such as summary.rq or plot.summary.rqs  -- see for example ?fix.




url:www.econ.uiuc.edu/~rogerRoger Koenker
email   [EMAIL PROTECTED]   Department of Economics
vox:217-333-4558University of Illinois
fax:217-244-6678Champaign, IL 61820


On Jul 24, 2007, at 11:07 AM, Jeff G. wrote:

 Hello,

 I am having problems adjusting the plot output from the quantreg
 package.  Anyone know what I'm doing wrong?

 For example (borrowing from the help files):

 plot(summary(rq(foodexp~income,tau = 1:49/50,data=engel)), nrow=1,
 ncol=2,alpha = .4, ols = TRUE, xlab=test)

 The alpha= parameter seems to have no effect on my output, even  
 when I
 set it to a ridiculous value like 0.4.  Also, though in the help  
 file it
 says |...| = optional arguments to plot, xlab (as an example)  
 seems
 to do nothing.  If the answer is that I should extract the values I  
 need
 and construct the plot I want independently of the rq.process object,
 that it okay I suppose, if inefficient.  Maybe a more fundamental
 question is how do I get in and see how plot is working in this  
 case so
 that I can modify.

 Thanks much!

 J

 P.S.  I've explored using plot.summary.rqs but the problems seem to be
 the same.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Calculating subsets of row pairs using somthing faster than a for loop.

2007-07-24 Thread Bernzweig, Bruce \(Consultant\)

Hi all,

 

Situation:

 

 - I have two matrices each w/ 4 rows and 20 columns.

 

mat1 - matrix(sample(1:500,80), ncol = 20, 

dimnames=list(paste(mat1row, 1:4, sep=), 

paste(mat1col, 1:20, sep=)))

 

mat2 - matrix(sample(501:1000,80), ncol = 20, 

dimnames=list(paste(mat2row, 1:4, sep=), 

paste(mat2col, 1:20, sep=)))

 

 - Each column represents a value in a time series.

 

Q: What do I want:

 

   Calculate moving average correlations for each row x row pair:

 

   For each row x row pair I want 10 values representing moving average

   correlations for 10 sets of time-values:

 

   cor(mat1[1,1:10], mat2[1,1:10])

   cor(mat1[1,2:11], mat2[1,2:11])

   ...

   cor(mat1[1,11:20], mat2[1,11:20])

   cor(mat1[1,1:10], mat2[2,1:10])

   ...

   cor(mat1[4,11:20], mat2[4,11:20])

 

   Result would be a 16 (rows) x 10 (col) matrix matMA

 

  ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row1)

  ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row2)

  ...

  ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row3)

  ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row4) 

 

   I would like to be able to do this without using a for loop

   due to the slowness of that method.

 

   Is it possible to iterate through subsets w/o using a for loop?

 

Thanks,

 

- Bruce

 

  P



**
Please be aware that, notwithstanding the fact that the pers...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] apply incompatible dimensions error

2007-07-24 Thread Bernzweig, Bruce \(Consultant\)

Thanks Benilton,

I know what I want to do, just not sure how to do it using R.  The help
documentation is not very clear.

What I am trying to do is calculate correlations on a row against row
basis:  mat1 row1 x mat2 row1, mat1 row1 x mat2 row2, ... mat1 row1 x
mat2 row-n, mat1 row-n, mat2 row-n

- Bruce

-Original Message-
From: Benilton Carvalho [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 24, 2007 11:31 AM
To: Bernzweig, Bruce (Consultant)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] apply  incompatible dimensions error

are you positive that your function is doing what you expect it to do?

it looks like you want something like:

sapply(1:10, function(i) cor(mat1[i,], mat2[i,]))

b

On Jul 24, 2007, at 11:05 AM, Bernzweig, Bruce ((Consultant)) wrote:

 Hi,

 I've created the following two matrices (mat1 and mat2) and a function
 (f) to calculate the correlations between the two on a row by row  
 basis.

   mat1 - matrix(sample(1:500,50), ncol = 5,
   dimnames=list(paste(row, 1:10, sep=),
   paste(col, 1:5, sep=)))

   mat2 - matrix(sample(501:1000,50), ncol = 5,
   dimnames=list(paste(row, 1:10, sep=),
   paste(col, 1:5, sep=)))

   f - function(x,y) cor(x,y)

 When the matrices are squares (# rows = # columns) I have no problems.

 However, when they are not (as in the example above with 5 columns and
 10 rows), I get the following error:

 apply(mat1, 1, f, y=mat2)
 Error in cor(x, y, na.method, method == kendall) :
 incompatible dimensions

 Any help would be appreciated.  Thanks!

 - Bruce



 **
 Please be aware that, notwithstanding the fact that the pers... 
 {{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



**
Please be aware that, notwithstanding the fact that the pers...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Computing the rank of a matrix.

2007-07-24 Thread Martin Maechler

 RV == RAVI VARADHAN [EMAIL PROTECTED]
 on Sat, 07 Apr 2007 18:39:36 -0400 writes:
 

this is a bit a late reply...  better late than never

RV Hi Martin,

Hi Ravi, thanks for your research.


RV I played a bit with rankMat.  I will first state the
RV conclusions of my numerical experiments and then present
RV the results to support them:

RV 1.  I don't believe that rankMat, or equivalently
RV Matlab's rank computation, is necessarily better than
RV qr(.)$rank or (qr., LAPACK=TRUE)$rank.  In fact, for the
RV notorious Hilbert matrix, rankMat can give poor
RV estimates of rank.

RV 2.  There exists no universally optimal (i.e. optimal
RV for all A) tol in qr(A, tol)$rank that would be
RV optimally close to rankMat.  The discrepancy in the
RV ranks computed by qr(A)$rank and rankMat(A) obviously
RV depends on the matrix A. This is evident from the tol
RV used in rankMat:
RV  tol - max(d) * .Machine$double.eps * abs(singValA[1])
RV So, this value of tol in qr will also minimize the discrepancy. 

RV 3.  The tol parameter is irrelevant in qr(A, tol,
RV LAPACK=TRUE)$rank, i.e. LAPACK doesn't seem to utilize
RV the tol parameter when computing the rank.  However,
RV qr(A, tol, LAPACK=FALSE)$rank does depend on tol.

Yes, indeed!  The help page of qr() actually says so 
{at least now, don't know about 3 months ago}.

RV Now, here are the results:
RV 1.
 hilbert.rank - matrix(NA,20,3)
 hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) }
 for (i in 1:20) {
RV + himat - hilbert(i)
RV + hilbert.rank[i,1] - rankMat(himat)
RV + hilbert.rank[i,2] - qr(himat,tol=1.0e-14)$rank
RV + hilbert.rank[i,3] - qr(himat, tol = .Machine$double.eps, LAPACK = 
TRUE)$rank
RV + }
 hilbert.rank
RV [,1] [,2] [,3]
RV [1,]111
RV [2,]222
RV [3,]333
RV [4,]444
RV [5,]555
RV [6,]666
RV [7,]777
RV [8,]888
RV [9,]999
RV [10,]   10   10   10
RV [11,]   10   11   11
RV [12,]   11   12   12
RV [13,]   11   12   13
RV [14,]   11   13   14
RV [15,]   12   13   15
RV [16,]   12   15   16
RV [17,]   12   16   17
RV [18,]   12   16   18
RV [19,]   13   17   19
RV [20,]   13   17   20

RV We see that rankMat underestimates the rank for n  10, but qr(himat, 
tol = .Machine$double.eps, LAPACK=TRUE)$rank gets it right.

Yes, indeed;  and that's  seems a bit  against the  ``common
knowledge'' that svd() is more reliable than qr()

Hmm I'm still baffled a bit..

Note that with the Hilbert matrices, 
one might argue that  hilbert(20) might really not have a
correct estimated rank of 20,
but at least for hilbert(13) or so, the rank should be == n.

BTW, there's a nice plot, slightly related to this problem,
using rcond() from the Matrix package :

  library(Matrix)

  hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) }
  rcHilb - sapply(1:20, function(n) rcond(Matrix(hilbert(n

  plot(rcHilb, xlab = n, log = y, col = 2, type = b,
   main = reciprocal condition numbers of Hilbert(n))

where one sees that the LAPACK code that underlies
Matrix::rcond() also gets into problem when estimating the
condition number for hilbert(n) when n ~ 16 .

I think if we wanted to make real progress here, we'd have to
consult with numerical analyist specialists.
But for me the topic is too remote to be followed up further at
the moment.

One conclusion for me is that to estimate the rank of a matrix
in current versions of R, one should use

  rankMat - function(x)  qr(x, LAPACK = TRUE)$rank

(as was suggested as one possibility in the original thread)

Regards, and thank you again, Ravi.

Martin Maechler, ETH Zurich


RV 2.  
RV Here I first, created a random rectangular matrix, and then added a 
number of rows to it, where these new rows are the same as some of the original 
rows except for a tiny amount of noise, which I call eps.  So, the degree of 
rank deficiency is controlled by eps.  I let eps take on 3 values: 1.e-07, 
1.e-10, and 1.e-14, and show that the optimal tol (in terms of being close to 
rankMat) depends on the level of precision (eps) in the matrix.
 set.seed(123)
 nrow - 15
 ncol - 20
 nsim - 5000
 ndefic - 4  # number of nearly dependent rows
 eps - 1.e-07
 rnk - matrix(NA, nsim, 5)
 for (j in 1:nsim) {
RV + A - matrix(rnorm(nrow*ncol),nrow, ncol)
RV + newrows - matrix(NA, ndefic, ncol)
RV + for (i in 1:ndefic) {
RV + newrows[i,] - A[nrow-i,] + rnorm(ncol, sd=min(abs(A[nrow-i+1,]))* 
eps)
RV + }
RV + 
RV + B - rbind(A,newrows)
RV + rnk[j,1] - rankMat(B)
RV + rnk[j,2] - qr(B, tol = 1.e-07)$rank
RV + rnk[j,3] - qr(B, tol = 1.e-11)$rank
RV + rnk[j,4] - qr(B, tol = 1.0e-14)$rank
RV + rnk[j,5] - qr(B,

[R] quantreg behavior changes for N1000

2007-07-24 Thread Jeff G.

Hello again R-experts and novices (like me),

This seems like a bug to me - or maybe it's intentional...can anyone 
confirm?  Up to 1000 reps, summary() of a rq object gives different 
output and subtly different confidence interval estimates.

ThanksJeff

testx=runif(1200)
testy=rnorm(1200, 5)

test.rq=summary(rq(testy[1:1000]~testx[1:1000], tau=2:98/100))
test.rq[[1]]
Gives this output:
Call: rq(formula = testy[1:1000] ~ testx[1:1000], tau = 2:98/100)

tau: [1] 0.02

Coefficients:
  coefficients   lower bd   upper bd
(Intercept)3.00026 2.45142 3.17098
testx[1:1000] -0.00870 -0.39817  0.49946

test.rq=summary(rq(testy[1:1001]~testx[1:1001], tau=2:98/100))
test.rq[[1]]

Gives this (different) output:
Call: rq(formula = testy[1:1001] ~ testx[1:1001], tau = 2:98/100)

tau: [1] 0.02

Coefficients:
  ValueStd. Error t value  Pr(|t|)
   (Intercept)3.00026  0.21605   13.88658  0.0
testx[1:1001] -0.00870  0.32976   -0.02638  0.97896


plot(test.rq, nrow=2, ncol=2) # The slope estimates appear to be the 
same but there are subtle differences in the confidence intervals, which 
shouldn't be due simply to the inclusion of one more point.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] quantreg behavior changes for N1000

2007-07-24 Thread roger koenker

When in doubt:  RTFM --  Quoting from ?summary.rq

se: specifies the method used to compute standard standard
   errors.  There are currently five available methods:

  1.  'rank' which produces confidence intervals for the
 estimated parameters by inverting a rank test as
 described in Koenker (1994).  The default option
 assumes that the errors are iid, while the option iid =
 FALSE implements the proposal of Koenker Machado
 (1999).  This is the default method unless the sample
 size exceeds 1001, or cov = FALSE in which case se =
 nid is used.

url:www.econ.uiuc.edu/~rogerRoger Koenker
email[EMAIL PROTECTED]Department of Economics
vox: 217-333-4558University of Illinois
fax:   217-244-6678Champaign, IL 61820


On Jul 24, 2007, at 12:57 PM, Jeff G. wrote:

 Hello again R-experts and novices (like me),

 This seems like a bug to me - or maybe it's intentional...can anyone
 confirm?  Up to 1000 reps, summary() of a rq object gives different
 output and subtly different confidence interval estimates.

 ThanksJeff

 testx=runif(1200)
 testy=rnorm(1200, 5)

 test.rq=summary(rq(testy[1:1000]~testx[1:1000], tau=2:98/100))
 test.rq[[1]]
 Gives this output:
 Call: rq(formula = testy[1:1000] ~ testx[1:1000], tau = 2:98/100)

 tau: [1] 0.02

 Coefficients:
   coefficients   lower bd   upper bd
 (Intercept)3.00026 2.45142 3.17098
 testx[1:1000] -0.00870 -0.39817  0.49946

 test.rq=summary(rq(testy[1:1001]~testx[1:1001], tau=2:98/100))
 test.rq[[1]]

 Gives this (different) output:
 Call: rq(formula = testy[1:1001] ~ testx[1:1001], tau = 2:98/100)

 tau: [1] 0.02

 Coefficients:
   ValueStd. Error t value  Pr(|t|)
(Intercept)3.00026  0.21605   13.88658  0.0
 testx[1:1001] -0.00870  0.32976   -0.02638  0.97896


 plot(test.rq, nrow=2, ncol=2) # The slope estimates appear to be the
 same but there are subtle differences in the confidence intervals,  
 which
 shouldn't be due simply to the inclusion of one more point.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [R-pkgs] New package: pomp, inference for partially-observed Markov processes

2007-07-24 Thread Aaron King

To: [EMAIL PROTECTED]
Subject: New package: pomp, inference for partially-observed Markov processes

The new package 'pomp' is built around a very general realization of nonlinear 
partially-observed Markov processes (AKA state-space models, nonlinear 
stochastic dynamical systems). The user provides functions specifying the 
model's process and measurement components. The package's algorithms are 
built on top of these functions. 

At the moment, algorithms are provided for particle filtering (AKA sequential 
importance sampling or sequential Monte Carlo) and the likelihood 
maximization by iterated filtering (MIF) method of Ionides, Breto, and King 
(PNAS, 103:18438-18443, 2006). Future support for a variety of other 
algorithms is envisioned. A working group of the National Center for 
Ecological Analysis and Synthesis (NCEAS), Inference for Mechanistic 
Models, is currently implementing additional methods for this package.

Simple worked examples are provided in the form of a 
vignette, random_walk_example.

The package is provided under the GPL. Contributions are welcome, as are 
comments, suggestions, examples, and bug reports.

The development of this package has been aided by support from the U.S. N.S.F 
(Grants #EF-0545276, #EF-0430120) and by the Inference for Mechanistic 
Models Working Group supported by the National Center for Ecological 
Analysis and Synthesis, a Center funded by NSF (Grant #DEB-0553768), the 
University of California, Santa Barbara, and the State of California.

-- 
Aaron A. King, Ph.D.
Ecology  Evolutionary Biology
University of Michigan
http://www.umich.edu/~kingaa
GPG Public Key: 0x2B00840F

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Renamig a factor

2007-07-24 Thread Agustin Lobo

Which is the proper way to rename a factor?
If I do:
test$Parc[test$Parc==OlÞrdola]-Olèrdola
R complains that
Warning message:
invalid factor level, NAs generated in: `[-.factor`(`*tmp*`, test$Parc 
== OlÞrdola, value = Olèrdola)

Thanks
Agus
-- 
Dr. Agustin Lobo
Institut de Ciencies de la Terra Jaume Almera (CSIC)
LLuis Sole Sabaris s/n
08028 Barcelona
Spain
Tel. 34 934095410
Fax. 34 934110012
email: [EMAIL PROTECTED]
http://www.ija.csic.es/gt/obster

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Latent class analysis with ordinal manifest variables

2007-07-24 Thread Eduard Bonet

Hi,

I need a package to run Latent Class Analysis with ordinal manifest
variables but i have not found it. LCA is for dichotomous variables and
poLCA is for nominal ones. Can anyone tell me where i can find such a
package if it exists.

Thanks.

Eduard

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] x,y,z table to matrix with x as rows and y as columns

2007-07-24 Thread Mark Lyman

jiho jo.irisson at gmail.com writes:

 
 Hello all,
 
 I am sure I am missing something obvious but I cannot find the  
 function I am looking for. I have a data frame with three columns: X,  
 Y and Z, with X and Y being grid coordinates and Z the value  
 associated with these coordinates. I want to transform this data  
 frame in a matrix of Z values, on the grid defined by X and Y (and,  
 as a plus, fill the X.Y combinations which do no exist in the  
 original data frame with NAs in the resulting matrix). I could do  
 this manually but I guess the appropriate function should be  
 somewhere around. I just can't find it.

I have used xtabs in the past, but xtabs returns 0 instead of NA, which makes 
great for cross-tabulation, the offending line is x[is.na(x)] - 0. So you 
would need to either modify the xtabs function or trust that z is never 0 and 
replace all 0's with NA.

 mydat - expand.grid(x=1:5, y=1:5)
 mydat - data.frame(mydat, z=rnorm(25))
 mydat$z[sample(1:25,4)] - NA
 mytab - xtabs(z~x+y, mydat)

Mark Lyman

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to add circular text for a graph with concentric circles

2007-07-24 Thread sparandekar

Dear R experts,

I am plotting the population of students who live in a city, and in 
successive circular bands made of the contiguous districts that surround 
the city. This is a stylized figure, where I specify the area of each 
successive circle based on the cumulative population of students. I want 
to compare two sets of concentric circles across different populations - 
such as 'All students' and 'Private students' (those attending private 
school) by using the same colours and the same dimension of the outer 
circle. I have attached the .pdf file with the output, and the R code to 
generate the first set of circles.

I would appreciate any tips about how to rotate the text label that marks 
each concentric circle (except the central circle) to be curved around the 
circle, located in the middle of each band, thus following the circle 
instead of being horizontal, as I have it now.

Thank you very much,

best regards,
Suhas

# Conurbano1a.R
# R Program to generate circles, radii based on proportion of ALL STUDENTS
# Prepared by Suhas, Tuesday, July 24, 2007

grid.circle(x=0.5, y=0.5, r=3*(.1268), default.units=npc, name=NULL,
 gp=gpar(fill=olivedrab1,col=NULL), draw=TRUE, vp=NULL)
grid.circle(x=0.5, y=0.5, r=3*(0.095882), default.units=npc, name=NULL,
 gp=gpar(fill=cornflowerblue,col=NULL), draw=TRUE, 
vp=NULL)
grid.circle(x=0.5, y=0.5, r=3*(0.077894), default.units=npc, name=NULL,
 gp=gpar(fill=aliceblue,col=NULL), draw=TRUE, vp=NULL)
grid.circle(x=0.5, y=0.5, r=3*(0.061884), default.units=npc, name=NULL,
 gp=gpar(fill=seagreen,col=NULL), draw=TRUE, vp=NULL)
grid.circle(x=0.5, y=0.5, r=3*(0.050740), default.units=npc, name=NULL,
 gp=gpar(fill=lightpink2,col=NULL), draw=TRUE, vp=NULL)
grid.circle(x=0.5, y=0.5, r=3*(0.045906), default.units=npc, name=NULL,
 gp=gpar(fill=navy,col=NULL), draw=TRUE, vp=NULL)

# Now to add the labels ? used trial and error for x and y values

  grid.text(Provincia Interior, x=0.5, y=0.85,
   gp=gpar(fontsize=6, col=black))

  grid.text(Conurbano 4, x=0.5, y=0.75,
   gp=gpar(fontsize=6, col=black))

  grid.text(Conurbano 3, x=0.5, y=0.71,
   gp=gpar(fontsize=6, col=black))

  grid.text(Conurbano 2, x=0.5, y=0.67,
   gp=gpar(fontsize=6, col=white))

  grid.text(Conurbano 1, x=0.5, y=0.645,
   gp=gpar(fontsize=6, col=black))
  grid.text(Ciudad de Buenos Aires, x=0.5, y=0.56,
   gp=gpar(fontsize=6, col=white))

# Title of graph 
grid.text(All Students, x=0.2,y=0.95,gp=gpar(fontsize=20,col=navy))




**
Suhas D. Parandekar
Senior Education Economist
Latin American and Caribbean Region
Tel: 202 458 7622
e-mail: [EMAIL PROTECTED]__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Using lmer with huge amount of data

2007-07-24 Thread Gang Chen

Thank you very much for the response, Prof. Ripley.


 I think I am missing something here: how do you make this 'huge'  
 and 'gigantic'?  You have not told us how many subjects you have,  
 but in imaging experiments it is usually no more than 50 and often  
 less.

Usually we have 10-30 subjects.


 For each subject you have 3 x 30,000 responses plus an age.  That  
 is under 1Mb of data per subject, so the problem looks modest  
 unless you have many hundreds of subjects.

 Nothing says you need to read the data in one go, but it will be  
 helpful to have all the data available to R at once (although this  
 could be alleviated by using a DBMS interface).


In the hypothetical situation I mentioned in my previous mail  
(suppose we have 12 subjects), all the input data would be stored in  
3 X 12 files each of which contains 30,000 numbers,  plus one more  
file for age. Sure I can read in those 27 files at once, and I'm not  
concerned about the data size at this point, but my question is: do I  
have to reshuffle those 36 files and create 30,000 separate arrays  
(one for each voxel) in R so that I could run lmer voxel-wise?


 I think the problem is rather going to be running 30,000 lmer fits,  
 which in my experience often take seconds each.  Each fit will only  
 need a modest amount of data (3 responses and one age per subject).

Right. What is the most efficient strategy to run such an analysis  
voxel-wise? Write a function, and then use apply()? Or simply do it  
in a loop?

Thanks,
Gang



 On Tue, 24 Jul 2007, Gang Chen wrote:

 Based on the examples I've seen in using statistical analysis
 packages such as lmer, it seems that people usually tabulate all the
 input data into one file with the first line indicating the variable
 names (or labels), and then read the file inside R. However, in my
 case I can't do that because of the huge amount of imaging data.

 Suppose I have a one-way within-subject ANCOVA with one covariate,
 and I would like to use lmer in R package lme4 to analyze the data.
 In the terminology of linear mixed models, I have a fixed factor A
 with 3 levels, a random factor B (subject), and a covariate (age)
 with a model like this

 MyResult - lmer(Response ~ FactorA + Age + (1 | subject),  
 MyData, ...)

 My input data are like this: For each subject I have a file (a huge
 matrix) storing the response values of the subject at many locations
 (~30,000 voxels) corresponding to factor A at the 1st level, another
 file for factor A at the 2nd level, and a 3rd file for factor A at
 the 3rd level. Then I have another file storing the age of those
 subjects. The analysis with the linear mixed model above would be
 done at each voxel separately.

 It seems impractical to create one gigantic file or matrix to feed
 into the above command line because of the big number of voxels. I'm
 not sure how to proceed in this case. Any suggestions would be highly
 appreciated.

 Also if I'm concerned about any potential violation of sphericity
 among the 3 levels of factor A, how can I test sphericity violation
 in lmer? And if violation exists, how can I make corrections in
 contrast testing?

 Thank you very much,
 Gang

 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] x,y,z table to matrix with x as rows and y as columns

2007-07-24 Thread Mark Lyman

 I am sure I am missing something obvious but I cannot find the  
 function I am looking for. I have a data frame with three columns: X,  
 Y and Z, with X and Y being grid coordinates and Z the value  
 associated with these coordinates. I want to transform this data  
 frame in a matrix of Z values, on the grid defined by X and Y (and,  
 as a plus, fill the X.Y combinations which do no exist in the  
 original data frame with NAs in the resulting matrix). I could do  
 this manually but I guess the appropriate function should be  
 somewhere around. I just can't find it.

Immediately after my last post I realized there was a much better solution

 mydat - expand.grid(x=1:5, y=1:5)
 mydat - data.frame(mydat, z=rnorm(25))
 mydat$z[sample(1:25,4)] - NA
 data2mat - function(x, y, z)
+ {
+ out - matrix(unlist(split(z, interaction(x,y))), ncol=length(unique(y)))
+ dimnames(out) - list(unique(x), unique(y))
+ out
+ }
 with(mydat, data2mat(x, y, z))


Mark Lyman

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Renamig a factor

2007-07-24 Thread Henrique Dallazuanna

Try:

levels(test$Parc)[levels(test$Parc)==OlÞrdola] - Olèrdola

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

On 24/07/07, Agustin Lobo [EMAIL PROTECTED] wrote:

 Which is the proper way to rename a factor?
 If I do:
 test$Parc[test$Parc==OlÞrdola]-Olèrdola
 R complains that
 Warning message:
 invalid factor level, NAs generated in: `[-.factor`(`*tmp*`, test$Parc
 == OlÞrdola, value = Olèrdola)

 Thanks
 Agus
 --
 Dr. Agustin Lobo
 Institut de Ciencies de la Terra Jaume Almera (CSIC)
 LLuis Sole Sabaris s/n
 08028 Barcelona
 Spain
 Tel. 34 934095410
 Fax. 34 934110012
 email: [EMAIL PROTECTED]
 http://www.ija.csic.es/gt/obster

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ggplot2 axis color

2007-07-24 Thread Felipe Carrillo

Hi:
Does anyone have an idea on how to color the axis and
labels using ggplot2? This is what I got:

library(ggplot2)
 p - qplot(total_bill, tip, data = tips)
 NewPlot-  p + geom_abline(slope=c(0.1,0.15,0.2),
colour=c(red,blue,yellow),size=c(2,5,2))
NewPlot + geom_smooth(colour=green,
size=3,linetype=3)
NewPlot$background.fill-cornsilk 
NewPlot$background.colour - blue
NewPlot$axis.colour-red  ? it doesn't do it
Thanks

 Felipe D. Carrillo
  Fishery Biologist
  US Fish  Wildlife Service
  Red Bluff, California 96080



  


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Calculating subsets of row pairs using somthing faster than a for loop.

2007-07-24 Thread Gabor Grothendieck

I doubt its any faster than using a loop but probably less code is:

library(zoo)
z1 - zoo(t(mat1)); z2 - zoo(t(mat2))
idx - 1:ncol(z1)
out - rollapply(cbind(z1, z2), 11, by.column = FALSE,
FUN = function(x) cor(x[,idx],x[,-idx]))

will give you a 10 x 16 multivariate zoo series such that:

out[1, ] is c(cor(t(mat1[,1:11]), t(mat2[,1:11])))
out[2, ] is c(cor(t(mat1[,2:12]), t(mat2[,2:12])))
etc.

and t(out) is a matrix in the orientation you asked for.

Try

library(zoo)
vignette(zoo)

for an intro to zoo.


On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote:
 Hi all,



 Situation:



  - I have two matrices each w/ 4 rows and 20 columns.



 mat1 - matrix(sample(1:500,80), ncol = 20,

dimnames=list(paste(mat1row, 1:4, sep=),

paste(mat1col, 1:20, sep=)))



 mat2 - matrix(sample(501:1000,80), ncol = 20,

dimnames=list(paste(mat2row, 1:4, sep=),

paste(mat2col, 1:20, sep=)))



  - Each column represents a value in a time series.



 Q: What do I want:



   Calculate moving average correlations for each row x row pair:



   For each row x row pair I want 10 values representing moving average

   correlations for 10 sets of time-values:



   cor(mat1[1,1:10], mat2[1,1:10])

   cor(mat1[1,2:11], mat2[1,2:11])

   ...

   cor(mat1[1,11:20], mat2[1,11:20])

   cor(mat1[1,1:10], mat2[2,1:10])

   ...

   cor(mat1[4,11:20], mat2[4,11:20])



   Result would be a 16 (rows) x 10 (col) matrix matMA



  ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row1)

  ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row2)

  ...

  ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row3)

  ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row4)



   I would like to be able to do this without using a for loop

   due to the slowness of that method.



   Is it possible to iterate through subsets w/o using a for loop?



 Thanks,



 - Bruce



  P




 **
 Please be aware that, notwithstanding the fact that the pers...{{dropped}}


 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.

2007-07-24 Thread Adaikalavan Ramasamy

The name of the table should give you the value. And if you have a 
matrix, you just need to convert it into a vector first.

  m - matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 )
  m
  [,1] [,2] [,3]
[1,] A  C  B
[2,] B  D  C
[3,] C  E  D
  tb - table( as.vector(m) )
  tb

A B C D E
1 2 3 2 1
  paste( names(tb), :, tb, sep= )
[1] A:1 B:2 C:3 D:2 E:1

If this is not what you want, then please give a simple example.

Regards, Adai



Allan Kamau wrote:
 Hi all,
 If the question below as been answered before I
 apologize for the posting.
 I would like to get the frequencies of occurrence of
 all values in a given variable in a multivariate
 dataset. In short for each variable (or field) a
 summary of values contained with in a value:frequency
 pair, there can be many such pairs for a given
 variable. I would like to do the same for several such
 variables.
 I have used table() but am unable to extract the
 individual value and frequency values.
 Please advise.
 
 Allan.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Regression trees using Goodness-of-Split

2007-07-24 Thread Fiona Callaghan

Hi
I have two questions:
1)
I would like to know if there is a package in R that constructs a
regression tree using the 'goodness-of-split' algorithm for survival
analysis proposed by Le Blanc and Crowley (1993) (rather than the usual
CART algorithm that uses within-node difference and impurity functions). 
In other words, is there a tree package based on between-node difference
using a two sample test e.g log rank test.

2)If such a package exists, are there user-defined split functions (can we
substitute other split functions).

Thanks for any help
Fiona


-- 
Fiona Callaghan, MS
A432 Crabtree Hall
Department of Biostatistics
Graduate School of Public Health
University of Pittsburgh
Phone 412 624 3063

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot2 axis color

2007-07-24 Thread hadley wickham

Hi Felipe,

Looks like a bug!  I'll try and get it fixed for the next version.  In
the meantime, you can read the last chapter of the ggplot book to see
how to fix it with grid.

Hadley

On 7/24/07, Felipe Carrillo [EMAIL PROTECTED] wrote:
 Hi:
 Does anyone have an idea on how to color the axis and
 labels using ggplot2? This is what I got:

 library(ggplot2)
  p - qplot(total_bill, tip, data = tips)
  NewPlot-  p + geom_abline(slope=c(0.1,0.15,0.2),
 colour=c(red,blue,yellow),size=c(2,5,2))
 NewPlot + geom_smooth(colour=green,
 size=3,linetype=3)
 NewPlot$background.fill-cornsilk
 NewPlot$background.colour - blue
 NewPlot$axis.colour-red  ? it doesn't do it
 Thanks

  Felipe D. Carrillo
   Fishery Biologist
   US Fish  Wildlife Service
   Red Bluff, California 96080



   
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Using lmer with huge amount of data

2007-07-24 Thread Prof Brian Ripley

On Tue, 24 Jul 2007, Gang Chen wrote:

 Thank you very much for the response, Prof. Ripley.


 I think I am missing something here: how do you make this 'huge' and 
 'gigantic'?  You have not told us how many subjects you have, but in 
 imaging experiments it is usually no more than 50 and often less.

 Usually we have 10-30 subjects.


 For each subject you have 3 x 30,000 responses plus an age.  That is under 
 1Mb of data per subject, so the problem looks modest unless you have many 
 hundreds of subjects.
 
 Nothing says you need to read the data in one go, but it will be helpful to 
 have all the data available to R at once (although this could be alleviated 
 by using a DBMS interface).


 In the hypothetical situation I mentioned in my previous mail (suppose we 
 have 12 subjects), all the input data would be stored in 3 X 12 files each of 
 which contains 30,000 numbers,  plus one more file for age. Sure I can read 
 in those 27 files at once, and I'm not concerned about the data size at this 
 point, but my question is: do I have to reshuffle those 36 files and create 
 30,000 separate arrays (one for each voxel) in R so that I could run lmer 
 voxel-wise?

No. You can index data structures in R.

 I think the problem is rather going to be running 30,000 lmer fits, which 
 in my experience often take seconds each.  Each fit will only need a modest 
 amount of data (3 responses and one age per subject).

 Right. What is the most efficient strategy to run such an analysis 
 voxel-wise? Write a function, and then use apply()? Or simply do it in a 
 loop?

apply() is a loop internally.  I would just use a for() loop here, 
probably running groups of voxels in different jobs run simultaneously on 
multi-CPU machines.


 Thanks,
 Gang


 
 On Tue, 24 Jul 2007, Gang Chen wrote:
 
 Based on the examples I've seen in using statistical analysis
 packages such as lmer, it seems that people usually tabulate all the
 input data into one file with the first line indicating the variable
 names (or labels), and then read the file inside R. However, in my
 case I can't do that because of the huge amount of imaging data.
 
 Suppose I have a one-way within-subject ANCOVA with one covariate,
 and I would like to use lmer in R package lme4 to analyze the data.
 In the terminology of linear mixed models, I have a fixed factor A
 with 3 levels, a random factor B (subject), and a covariate (age)
 with a model like this
 
 MyResult - lmer(Response ~ FactorA + Age + (1 | subject), MyData, ...)
 
 My input data are like this: For each subject I have a file (a huge
 matrix) storing the response values of the subject at many locations
 (~30,000 voxels) corresponding to factor A at the 1st level, another
 file for factor A at the 2nd level, and a 3rd file for factor A at
 the 3rd level. Then I have another file storing the age of those
 subjects. The analysis with the linear mixed model above would be
 done at each voxel separately.
 
 It seems impractical to create one gigantic file or matrix to feed
 into the above command line because of the big number of voxels. I'm
 not sure how to proceed in this case. Any suggestions would be highly
 appreciated.
 
 Also if I'm concerned about any potential violation of sphericity
 among the 3 levels of factor A, how can I test sphericity violation
 in lmer? And if violation exists, how can I make corrections in
 contrast testing?
 
 Thank you very much,
 Gang
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Computing the rank of a matrix.

2007-07-24 Thread Ravi Varadhan

Hi Martin,

I just realized (courtesy: ?qr) that LAPACK=TRUE always gives full rank, no
matter what the matrix and tolerance are. So, clearly my results using
LAPACK=TRUE should be ignored.  So, the real comparison is only between
rankMat and qr(., LAPACK=FALSE)$rank.  I cant help but fell that there can
be no correct solution to an ill-posed problem.  Furthermore, I haven't
come across a real application where the numerical estimate of a rank is
directly useful.

Best,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: Martin Maechler [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, July 24, 2007 1:43 PM
To: RAVI VARADHAN
Cc: Martin Maechler; r-help@stat.math.ethz.ch
Subject: Re: [R] Computing the rank of a matrix.

 RV == RAVI VARADHAN [EMAIL PROTECTED]
 on Sat, 07 Apr 2007 18:39:36 -0400 writes:
 

this is a bit a late reply...  better late than never

RV Hi Martin,

Hi Ravi, thanks for your research.


RV I played a bit with rankMat.  I will first state the
RV conclusions of my numerical experiments and then present
RV the results to support them:

RV 1.  I don't believe that rankMat, or equivalently
RV Matlab's rank computation, is necessarily better than
RV qr(.)$rank or (qr., LAPACK=TRUE)$rank.  In fact, for the
RV notorious Hilbert matrix, rankMat can give poor
RV estimates of rank.

RV 2.  There exists no universally optimal (i.e. optimal
RV for all A) tol in qr(A, tol)$rank that would be
RV optimally close to rankMat.  The discrepancy in the
RV ranks computed by qr(A)$rank and rankMat(A) obviously
RV depends on the matrix A. This is evident from the tol
RV used in rankMat:
RV  tol - max(d) * .Machine$double.eps * abs(singValA[1])
RV So, this value of tol in qr will also minimize the discrepancy. 

RV 3.  The tol parameter is irrelevant in qr(A, tol,
RV LAPACK=TRUE)$rank, i.e. LAPACK doesn't seem to utilize
RV the tol parameter when computing the rank.  However,
RV qr(A, tol, LAPACK=FALSE)$rank does depend on tol.

Yes, indeed!  The help page of qr() actually says so 
{at least now, don't know about 3 months ago}.

RV Now, here are the results:
RV 1.
 hilbert.rank - matrix(NA,20,3)
 hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) }
 for (i in 1:20) {
RV + himat - hilbert(i)
RV + hilbert.rank[i,1] - rankMat(himat)
RV + hilbert.rank[i,2] - qr(himat,tol=1.0e-14)$rank
RV + hilbert.rank[i,3] - qr(himat, tol = .Machine$double.eps, LAPACK =
TRUE)$rank
RV + }
 hilbert.rank
RV [,1] [,2] [,3]
RV [1,]111
RV [2,]222
RV [3,]333
RV [4,]444
RV [5,]555
RV [6,]666
RV [7,]777
RV [8,]888
RV [9,]999
RV [10,]   10   10   10
RV [11,]   10   11   11
RV [12,]   11   12   12
RV [13,]   11   12   13
RV [14,]   11   13   14
RV [15,]   12   13   15
RV [16,]   12   15   16
RV [17,]   12   16   17
RV [18,]   12   16   18
RV [19,]   13   17   19
RV [20,]   13   17   20

RV We see that rankMat underestimates the rank for n  10, but
qr(himat, tol = .Machine$double.eps, LAPACK=TRUE)$rank gets it right.

Yes, indeed;  and that's  seems a bit  against the  ``common
knowledge'' that svd() is more reliable than qr()

Hmm I'm still baffled a bit..

Note that with the Hilbert matrices, 
one might argue that  hilbert(20) might really not have a
correct estimated rank of 20,
but at least for hilbert(13) or so, the rank should be == n.

BTW, there's a nice plot, slightly related to this problem,
using rcond() from the Matrix package :

  library(Matrix)

  hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) }
  rcHilb - sapply(1:20, function(n) rcond(Matrix(hilbert(n

  plot(rcHilb, xlab = n, log = y, col = 2, type = b,
   main = reciprocal condition numbers of Hilbert(n))

where one sees that the LAPACK code that underlies
Matrix::rcond() also gets into problem when estimating the
condition number for hilbert(n) when n ~ 16 .

I think if we wanted to make real progress here, we'd have to
consult with numerical analyist specialists.
But for me the topic is too remote to be followed up further at
the moment.

One conclusion for me is that to estimate the rank of a matrix
in current versions of R, one should use

  rankMat - function(x)  qr(x, LAPACK = TRUE)$rank

(as was suggested as one possibility in the original

Re: [R] Using lmer with huge amount of data

2007-07-24 Thread Prof Brian Ripley

I think I am missing something here: how do you make this 'huge' and 
'gigantic'?  You have not told us how many subjects you have, but in 
imaging experiments it is usually no more than 50 and often less.

For each subject you have 3 x 30,000 responses plus an age.  That is under 
1Mb of data per subject, so the problem looks modest unless you have many 
hundreds of subjects.

Nothing says you need to read the data in one go, but it will be helpful 
to have all the data available to R at once (although this could be 
alleviated by using a DBMS interface).

I think the problem is rather going to be running 30,000 lmer fits, which 
in my experience often take seconds each.  Each fit will only need a 
modest amount of data (3 responses and one age per subject).

On Tue, 24 Jul 2007, Gang Chen wrote:

 Based on the examples I've seen in using statistical analysis
 packages such as lmer, it seems that people usually tabulate all the
 input data into one file with the first line indicating the variable
 names (or labels), and then read the file inside R. However, in my
 case I can't do that because of the huge amount of imaging data.

 Suppose I have a one-way within-subject ANCOVA with one covariate,
 and I would like to use lmer in R package lme4 to analyze the data.
 In the terminology of linear mixed models, I have a fixed factor A
 with 3 levels, a random factor B (subject), and a covariate (age)
 with a model like this

 MyResult - lmer(Response ~ FactorA + Age + (1 | subject), MyData, ...)

 My input data are like this: For each subject I have a file (a huge
 matrix) storing the response values of the subject at many locations
 (~30,000 voxels) corresponding to factor A at the 1st level, another
 file for factor A at the 2nd level, and a 3rd file for factor A at
 the 3rd level. Then I have another file storing the age of those
 subjects. The analysis with the linear mixed model above would be
 done at each voxel separately.

 It seems impractical to create one gigantic file or matrix to feed
 into the above command line because of the big number of voxels. I'm
 not sure how to proceed in this case. Any suggestions would be highly
 appreciated.

 Also if I'm concerned about any potential violation of sphericity
 among the 3 levels of factor A, how can I test sphericity violation
 in lmer? And if violation exists, how can I make corrections in
 contrast testing?

 Thank you very much,
 Gang

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] crimtab related question

2007-07-24 Thread Mark Difford


Hi Jean,

You haven't yet had a reply from an authoratitive source, so here is my
tuppence worth to part of your enquiry.

It's almost certain that the receiving box is a receptacle into which tags
were placed after they had been drawn and the inscribed measurement noted
down.  Measurements on three tags were unwittingly not noted before the tags
were transferred to the receiving box.  They lay there with a good many
other tags, so the inscribed measurement/tag couldn't be recovered.

I hope this clarifies some points.

Regards,
Mark.


Jean lobry wrote:
 
 Dear all,
 
 the dataset documented under ?crimtab was also used in:
 
 @article{TreloarAE1934,
  title = {The adequacy of {S}tudent's criterion of
   deviations in small sample means},
  author = {Treloar, A.E. and Wilder, M.A.},
  journal = {The Annals of Mathematical Statistics},
  volume = {5},
  pages = {324-341},
  year = {1934}
 }
 
 The following is from page 335 of the above paper:
 
 From the table provided by MacDonell (1902) on
 the associated variation of stature (to the nearest inch)
 and length of the left middle finger (to the nearest
 millimeter) in 3000 British criminals, the measusurements
 were transferred to 3000 numbered Denison metal-rim
 tags from which the cords has been removed. After
 thorough checking and mixing of these circular disks,
 samples of 5 tags each were drawn at random until the
 supply was exhausted. Unfortunately, three of these
 samples were erroneously returned to a receiving box
 before being copied, and the records of 597 samples only
 are available.
 
 Could someone give me a clue about the kind of device
 that was used here? Is it a kind of lottery machine?
 I don't understand why three samples were lost. What
 is this receiving box?
 
 Thanks for any hint,
 
 Best,
 -- 
 Jean R. Lobry([EMAIL PROTECTED])
 Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,
 43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE
 allo  : +33 472 43 27 56 fax: +33 472 43 13 88
 http://pbil.univ-lyon1.fr/members/lobry/
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/crimtab-related-question-tf4137237.html#a11772414
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] crimtab related question

2007-07-24 Thread roger koenker

While on the subject of mechanical methods of statistical research  I  
can't
resist quoting Doob's (1997) Statistical Science interview:

 My system, complicated by my inaccurate typing, led to retyping  
 material over and over, and for some time I had an electric drill  
 on my desk, provided with an eraser bit which I used to erase  
 typing. I rarely used the system of brushing white fluid over a  
 typed error because I was not patient enough to let the fluid dry  
 before retyping. Long after my first book was done I discovered the  
 tape rolls which cover lines of type. As I typed and retyped my  
 work it became so repugnant to me that I had more and more  
 difficulty even to look at it to check it. This fact accounts for  
 many slips that a careful reading would have discovered. I commonly  
 used a stochastic system of checking, picking a page and then a  
 place on the page at random and reading a few sentences, in order  
 to avoid reading it in context and thereby to avoid reading what  
 was in my mind rather than what I had written. At first I would  
 catch something at almost every trial, and I would continue until  
 several trials would yield nothing. I have tried this system on  
 other authors, betting for example that I would find something to  
 correct on a randomly chosen printed page of text, and  
 nonmathematicans suffering under the delusion that mathematics is  
 errorless would be surprised at how many bets I have won.

The relevance to the present inquiry is confirmed by the misspelling  
of Dennison in the Annals reference
quoted below.  See, for example:

http://www.amazon.com/Avery-Dennison-Metal-Rim-Tags/dp/B000AN376G

On the substance of Jean's question, Mark's interpretation seems very  
plausible.

Thanks to Jean and to Martin Maechler for adding this dataset to R.


url:www.econ.uiuc.edu/~rogerRoger Koenker
email[EMAIL PROTECTED]Department of Economics
vox: 217-333-4558University of Illinois
fax:   217-244-6678Champaign, IL 61820


On Jul 24, 2007, at 4:42 PM, Mark Difford wrote:


 Hi Jean,

 You haven't yet had a reply from an authoratitive source, so here  
 is my
 tuppence worth to part of your enquiry.

 It's almost certain that the receiving box is a receptacle into  
 which tags
 were placed after they had been drawn and the inscribed measurement  
 noted
 down.  Measurements on three tags were unwittingly not noted before  
 the tags
 were transferred to the receiving box.  They lay there with a good  
 many
 other tags, so the inscribed measurement/tag couldn't be recovered.

 I hope this clarifies some points.

 Regards,
 Mark.


 Jean lobry wrote:

 Dear all,

 the dataset documented under ?crimtab was also used in:

 @article{TreloarAE1934,
  title = {The adequacy of {S}tudent's criterion of
   deviations in small sample means},
  author = {Treloar, A.E. and Wilder, M.A.},
  journal = {The Annals of Mathematical Statistics},
  volume = {5},
  pages = {324-341},
  year = {1934}
 }

 The following is from page 335 of the above paper:

 From the table provided by MacDonell (1902) on
 the associated variation of stature (to the nearest inch)
 and length of the left middle finger (to the nearest
 millimeter) in 3000 British criminals, the measusurements
 were transferred to 3000 numbered Denison metal-rim
 tags from which the cords has been removed. After
 thorough checking and mixing of these circular disks,
 samples of 5 tags each were drawn at random until the
 supply was exhausted. Unfortunately, three of these
 samples were erroneously returned to a receiving box
 before being copied, and the records of 597 samples only
 are available.

 Could someone give me a clue about the kind of device
 that was used here? Is it a kind of lottery machine?
 I don't understand why three samples were lost. What
 is this receiving box?

 Thanks for any hint,

 Best,
 -- 
 Jean R. Lobry([EMAIL PROTECTED])
 Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I,
 43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE
 allo  : +33 472 43 27 56 fax: +33 472 43 13 88
 http://pbil.univ-lyon1.fr/members/lobry/

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 -- 
 View this message in context: http://www.nabble.com/crimtab-related- 
 question-tf4137237.html#a11772414
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

[R] Passing equations as arguments

2007-07-24 Thread Anup Nandialath

Friends,

I'm trying to pass an equation as an argument to a function. The idea is as 
follows. Let us say i write an independent function 

Ideal Situation:

ifunc - function(x)
{
return((x*x)-2)
}

mainfunc - function(a,b)
{
evala - ifunc(a)
evalb - ifunc(b)
if (evalaevalb){return(evala)}
else
return(evalb)
}

Now I want to try and write this entire program in a single function with the 
user specifying the equation as an argument to the function. 

myfunc - function(a, b, eqn)
{
func1 - function (x) ??
{
return(eqn in terms of x)  ??
   }

Further arguments to check

The  imply that this does not seem to be correct. The idea is how to 
assign the equation expression from the main equation into the inner function. 
Is there anyway to do that within this set up?
 

Thanks in advance
Regards

Anup

   
-

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Overlaying a single contour from a new data array in levelplot

2007-07-24 Thread Deepayan Sarkar

On 7/24/07, Jenny Barnes [EMAIL PROTECTED] wrote:
 Dear R-Help community,

 I am trying to overlay a single contour line over a correlation plot using
 levelplot in the lattice package. These are the two arrays:

 1) a correlation plot over Africa - so each grid square is a different colour
 dependent on correlation - this is in an array: result_cor with dim[465,465]

 2) a single contour line from a ***different data source*** - this is from 
 data
 related to the p-values for the above correlation plot - I want to overlay 
 only
 the 95% confidence contour. The p-values are stored in an array: 
 result.p.values
 with same dimensions as above.

 I have read about using panel.levelplot and panel.contourplot in the R-help
 mailing list but I don't know the right way to call two different data arrays,
 can anybody help me please? I appreciate your time and help with this 
 question.

I can think of a couple of different ways, but the simplest will
probably be to compute the single contour beforehand and add it after
the standard levelplot using a panel function.  E.g., using the
'volcano' data for both matrices:

## you need the explicit x and y arguments because
## the default is different from levelplot.

vcl - contourLines(x = seq_len(nrow(volcano)),
   y = seq_len(ncol(volcano)),
   z = volcano,
   levels = c(172, 182))

levelplot(volcano, add.cl = vcl,
  panel = function(..., add.cl) {
  panel.levelplot(...)
  lapply(add.cl, panel.polygon, border = 'red')
  })

-Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Passing equations as arguments

2007-07-24 Thread jim holtman

Here is one possible solution:

ifun -
function(a, b, FUN){
evala - FUN(a)
evalb - FUN(b)
if (evala  evalb) return(evala) else return(evalb)
}
ifun(1,2,function(x) (x*x) - 2)



On 7/24/07, Anup Nandialath [EMAIL PROTECTED] wrote:
 Friends,

 I'm trying to pass an equation as an argument to a function. The idea is as 
 follows. Let us say i write an independent function

 Ideal Situation:

 ifunc - function(x)
 {
 return((x*x)-2)
 }

 mainfunc - function(a,b)
 {
 evala - ifunc(a)
 evalb - ifunc(b)
 if (evalaevalb){return(evala)}
 else
 return(evalb)
 }

 Now I want to try and write this entire program in a single function with the 
 user specifying the equation as an argument to the function.

 myfunc - function(a, b, eqn)
 {
func1 - function (x) ??
{
return(eqn in terms of x)  ??
   }

 Further arguments to check

 The  imply that this does not seem to be correct. The idea is how to 
 assign the equation expression from the main equation into the inner 
 function. Is there anyway to do that within this set up?


 Thanks in advance
 Regards

 Anup


 -

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] aggregate.ts

2007-07-24 Thread Achim Zeileis

On Wed, 25 Jul 2007, laimonis wrote:

 Consider the following scrap of code:

...slightly modified to
  x1 - ts(1:24, start = c(2000, 10), freq = 12)
  x2 - ts(1:24, start = c(2000, 11), freq = 12)

and then
  y1 - aggregate(x1, nfreq = 4)
gives the desired result while
  y2 - aggregate(x2, nfreq = 4)
probably does not what you would like it to do. In both cases, the 24
observations are broken into 8 segments of equal length (as documented on
?aggregate.ts) and then aggregated. Therefore
  as.vector(y1)
  as.vector(y2)
are identical (and not matched by quarter...as you would probably like).

 And don't tell me that the aggregating a monthly series into quarters
 makes no sense!! (see response to Bug 9798).

1. Your tone is not appropriate.
2. You're not quoting the reply correctly. It said: You cannot aggregate
   a time series that does not run over quarters into quarters. The
   speculation is plain wrong.
   The reply means that aggregate.ts() does not do what you think
   it does. As I tried to illustrate with the example above.

One can probably argue about whether it makes sense to aggregate a monthly
time series into quarter when I don't have complete observations in each
quarter. But maybe it might be worth considering a change in
aggregate.ts() so that the old and new frequency are matched even with
incomplete observations?

Currently, the zoo implementation allows this: Coercing back and forth
gives:
  library(zoo)
  z1 - as.ts(aggregate(as.zoo(x1), as.yearqtr, sum))
  z2 - as.ts(aggregate(as.zoo(x2), as.yearqtr, sum))
where z1 is identical to y1, and z2 is what you probably want.

hth,
Z

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] aggregate.ts

2007-07-24 Thread laimonis

Consider the following scrap of code:

  x- ts(1:50,start=c(1,11),freq=12)
  y - aggregate(x,nfreq=4)
  c(y)
 [1]   6  15  24  33  42  51  60  69  78  87  96 105 114 123 132 141
  y
Error in rep.int(, start.pad) : invalid number of copies in rep.int()
  tsp(y)
[1] 1.83 5.58 4.00

So we can aggregate into quarters, but we cannot print it using 
print.ts  Even if print.ts cannot line the series into columns as it 
normally does for quarterly data, we would expect it to behave as it 
does when we  aggregate into thirds.

  y3 - aggregate(x,nfreq=3)
  y3
Time Series:
Start = 1.83
End = 5.5
Frequency = 3
 [1]  10  26  42  58  74  90 106 122 138 154 170 186

And don't tell me that the aggregating a monthly series into quarters 
makes no sense!! (see response to Bug 9798).

Laimonis Kavalieris

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] initalizing and checking validity of S4 classes

2007-07-24 Thread Bojanowski, M.J. (Michal)

Dear useRs and wizaRds,

I am currently developing a set of functions using S4 classes. On the way I 
encountered the problem exemplified with the code below. For some reason the 
'validity' method does not seem to work, i.e. does not check for errors in the 
specification of the slots of the defined class. Any hints?

My understanding of the whole S4 system was that validity checks are made 
*after* class initialization. Is that correct?

Thanks a lot in advance!

PS. Session info:

R version 2.5.1 (2007-06-27) 
i386-pc-mingw32 

locale:
LC_COLLATE=Polish_Poland.1250;LC_CTYPE=Polish_Poland.1250;LC_MONETARY=Polish_Poland.1250;LC_NUMERIC=C;LC_TIME=Polish_Poland.1250

attached base packages:
[1] stats graphics  grDevices utils datasets  methods  
[7] base 
 




-b-e-g-i-n--r--c-o-d-e-

setClass( someclass, representation(v=numeric, l=character),
prototype( v=numeric(0),
l=character(0) )
)

setMethod(initialize, someclass,
function(.Object, v=numeric(0), l=character(0))
{
# strip the vector names
cv - v
cl - l
names(cv) - NULL
names(cl) - NULL
rval - .Object
[EMAIL PROTECTED] - cv
[EMAIL PROTECTED] - cl
rval
} )

# at this point this should be OK
o - new(someclass, v=1:2, l=letters[1:3])
o

# check validity
f - function(object)
{
rval - NULL
if( length([EMAIL PROTECTED]) != length([EMAIL PROTECTED]) )
rval - c( rval, lengths dont match)
if( is.null(rval) ) return(TRUE)
else return(rval)
}
 
# this should return error description
f(o)


# define the validity method
setValidity( someclass, f)

# this should return an error
new(someclass, v=1:2, l=letters[1:3])

# but it doesn't...

-e-n-d--r--c-o-d-e-





Michal Bojanowski
ICS / Department of Sociology
Utrecht University
Heidelberglaan 2; 3584 CS Utrecht
Room 1428
m.j.bojanowski at uu dot nl
http://www.fss.uu.nl/soc/bojanowski/


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] initalizing and checking validity of S4 classes

2007-07-24 Thread Martin Morgan

Hi Michal --

Add validObject to your initialize method:

 setMethod(initialize, someclass,
+ function(.Object, v=numeric(0), l=character(0))
+ {
+ # strip the vector names
+ cv - v
+ cl - l
+ names(cv) - NULL
+ names(cl) - NULL
+ rval - .Object
+ [EMAIL PROTECTED] - cv
+ [EMAIL PROTECTED] - cl
+ validObject(rval)
+ rval
+ } )
[1] initialize
 new(someclass, v=1:2, l=letters[1:3])
Error in validObject(rval) : 
  invalid class someclass object: lengths dont match

Note that without initialize, we have:

 setClass(A,
+  representation=representation(
+v=numeric, l=character))
[1] A
 setValidity(A, f)

Slots:
  
Name:  v l
Class:   numeric character
 new(A, v=1:2, l=letters[1:3])
Error in validObject(.Object) : 
  invalid class A object: lengths dont match
 
Here are two interpretations of this. (1) using 'initialize' means
that you are taking control of the initialization process, and hence
know when you need to call validObject. (2) Responsibility for object
validity is ambiguous -- does it belong with 'new', 'initialize', or a
'constructor' that the programmer might write? This is particularly
problematic with R's copy semantics, where creating transiently
invalid objects seems to be almost necessary (e.g., callNextMethod()
in 'initialize' might initialize the inherited slots of the object,
but the object itself is of the derived class and could well be
invalid 'invalid' after the base class has finished with initialize).

Martin

Bojanowski, M.J.  (Michal) [EMAIL PROTECTED] writes:

 Dear useRs and wizaRds,

 I am currently developing a set of functions using S4 classes. On the way I 
 encountered the problem exemplified with the code below. For some reason the 
 'validity' method does not seem to work, i.e. does not check for errors in 
 the specification of the slots of the defined class. Any hints?

 My understanding of the whole S4 system was that validity checks are made 
 *after* class initialization. Is that correct?

 Thanks a lot in advance!

 PS. Session info:

 R version 2.5.1 (2007-06-27) 
 i386-pc-mingw32 

 locale:
 LC_COLLATE=Polish_Poland.1250;LC_CTYPE=Polish_Poland.1250;LC_MONETARY=Polish_Poland.1250;LC_NUMERIC=C;LC_TIME=Polish_Poland.1250

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods  
 [7] base 
 




 -b-e-g-i-n--r--c-o-d-e-

 setClass( someclass, representation(v=numeric, l=character),
 prototype( v=numeric(0),
   l=character(0) )
 )

 setMethod(initialize, someclass,
 function(.Object, v=numeric(0), l=character(0))
 {
 # strip the vector names
 cv - v
 cl - l
 names(cv) - NULL
 names(cl) - NULL
 rval - .Object
 [EMAIL PROTECTED] - cv
 [EMAIL PROTECTED] - cl
 rval
 } )

 # at this point this should be OK
 o - new(someclass, v=1:2, l=letters[1:3])
 o

 # check validity
 f - function(object)
 {
 rval - NULL
 if( length([EMAIL PROTECTED]) != length([EMAIL PROTECTED]) )
   rval - c( rval, lengths dont match)
 if( is.null(rval) ) return(TRUE)
 else return(rval)
 }
  
 # this should return error description
 f(o)


 # define the validity method
 setValidity( someclass, f)

 # this should return an error
 new(someclass, v=1:2, l=letters[1:3])

 # but it doesn't...

 -e-n-d--r--c-o-d-e-




 
 Michal Bojanowski
 ICS / Department of Sociology
 Utrecht University
 Heidelberglaan 2; 3584 CS Utrecht
 Room 1428
 m.j.bojanowski at uu dot nl
 http://www.fss.uu.nl/soc/bojanowski/


   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] question on using gl1ce from lasso2 package

2007-07-24 Thread Li Li

Hi,
I tried several settings by using the family=gaussian
in gl1ce, but none of them works.
For the case glm can work.
Here is the error message I got:

 glm(Petal.Width~Sepal.Length+Sepal.Width+Petal.Length
,data=iris,family=gaussian())
 gl1ce(Petal.Width~Sepal.Length+Sepal.Width+Petal.Length
,data=iris,family=gaussian())
Error in eval(expr, envir, enclos) : object etastart not found

Does anyone have experience with this function?
Any help will be appreciated,

Li

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

89 matches

Mail list logo