date:20090217

See ?which.max :

library(zoo)
z <- read.zoo("myfile.dat", sep = ",", format = "%m/%d/%Y")
f <- function(x) time(x)[which.max(x)]
ix <- tapply(z, floor(as.yearmon(time(z))), f)
z[ix]


On Tue, Feb 17, 2009 at 3:58 AM, CJ Rubio  wrote:
>
> i have the following code - assimilating the maximum annual discharge each
> year ffrom a daily discharge record from year 1989-2005.
>
> m <- read.table("D:/documents/5 stations/01014000.csv", sep =",")
> z <- zoo(m[,4],as.Date(as.character(m[,3]), "%m/%d/%Y"))
> x <- aggregate(z, floor(as.numeric(as.yearmon(time(z, max)   #code
> suggested by Gabor Grothendieck
>
> which gives me the maximum discharge each year and the year itself.. what i
> am trying now is to produce the date when the maximum discharge was observed
> in the pattern -mm-dd, like:
>
> 1988 11/9/1988  18600
> 1989  5/8/1989   49000
> ...
> 2005 4/26/2005111000
>
> thank you all in advance...
> --
> View this message in context: 
> http://www.nabble.com/problem-with-dates-in-zoo-package-tp22053103p22053103.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Combination

2009-02-17 Thread Dani Valverde


Hello,
I have a sequence of numbers:
seq(1:50)
and I would like to have all the possible combinations with this numbers 
without repeating any combination:

11, 12, 13, ... ,22,23,24,...
How can I do it?
Best,

Dani

--
Daniel Valverde Saubí

Grup de Biologia Molecular de Llevats
Facultat de Veterinària de la Universitat Autònoma de Barcelona
Edifici V, Campus UAB
08193 Cerdanyola del Vallès- SPAIN

Centro de Investigación Biomédica en Red
en Bioingeniería, Biomateriales y
Nanomedicina (CIBER-BBN)

Grup d'Aplicacions Biomèdiques de la RMN
Facultat de Biociències
Universitat Autònoma de Barcelona
Edifici Cs, Campus UAB
08193 Cerdanyola del Vallès- SPAIN
+34 93 5814126

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Combination

2009-02-17 Thread Henrique Dallazuanna

Try this:

apply(expand.grid(1:50, 1:50), 1, paste, collapse = '')

On Tue, Feb 17, 2009 at 8:37 AM, Dani Valverde wrote:

> Hello,
> I have a sequence of numbers:
> seq(1:50)
> and I would like to have all the possible combinations with this numbers
> without repeating any combination:
> 11, 12, 13, ... ,22,23,24,...
> How can I do it?
> Best,
>
> Dani
>
> --
> Daniel Valverde Saubí
>
> Grup de Biologia Molecular de Llevats
> Facultat de Veterinària de la Universitat Autònoma de Barcelona
> Edifici V, Campus UAB
> 08193 Cerdanyola del Vallès- SPAIN
>
> Centro de Investigación Biomédica en Red
> en Bioingeniería, Biomateriales y
> Nanomedicina (CIBER-BBN)
>
> Grup d'Aplicacions Biomèdiques de la RMN
> Facultat de Biociències
> Universitat Autònoma de Barcelona
> Edifici Cs, Campus UAB
> 08193 Cerdanyola del Vallès- SPAIN
> +34 93 5814126
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Combination

2009-02-17 Thread Eik Vettorazzi


Hi Dani,
see ?combn .

combn(1:50,2)
gives you all combinations as matrix.
you can do sth like

apply(combn(1:50,2),2, paste, sep="", collapse="")

to get concatenated results.

hth.

Dani Valverde schrieb:

Hello,
I have a sequence of numbers:
seq(1:50)
and I would like to have all the possible combinations with this 
numbers without repeating any combination:

11, 12, 13, ... ,22,23,24,...
How can I do it?
Best,

Dani



--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/42803-8243
F ++49/40/42803-7790

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with calculating the residuals in an AR(p)-model

2009-02-17 Thread Spiderschwein


Hi,

I'm trying to calculate the residuals in a one-sided AR(p) model:
sum (phi_j (X_t-j - EX)) = epsilon

My X is an ARMA(1,1)-model, which can be represented as an AR(infty) model.
I calculate the order of the model with AIC and the parameters with the
Yule-Walker method.

For the residuals, I tried:

for (k in (p+1):n){ 
 for (j in 1:p){
  eps[k] <- sum(phi[j] * (X[k-j] - mean(X)))
 }
 ceps[k] <- eps[k] - sum(eps)/(n-p+1) #centering the residuals
}


The residuals in my ARMA(1,1)-models are t-distributed with 6 degrees of
freedom.
Unfortunately my code results in residuals, which are all around zero. So
ecdf(eps) or ecdf(ceps) are very different from a t(6)-distribution.

Where's the bug in my code?

Regards,
Martin
-- 
View this message in context: 
http://www.nabble.com/Problem-with-calculating-the-residuals-in-an-AR%28p%29-model-tp22055764p22055764.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Using eval in multinom argument

2009-02-17 Thread Crouch, Daniel

Thanks for your help,

I managed to get it working like this:

fml<-as.formula(paste("y ~", paste(PCnames, collapse="+")))
y = class.ind(grp[outgroup])
z1=multinom(formula = fml, data=data.frame(scores))

Daniel Crouch



**QUOTE:
Forget eval(parse(text = ))

See

?as.formula
?update.formula

and try out the example() s there.

HTH,

Chuck




On Mon, 16 Feb 2009, Crouch, Daniel wrote:

> > Hi,
> >
> > I am having difficulty entering a 'programmable' argument into the multinom 
> > function from the nnet package. Interactively, I can get the function to 
> > work fine by calling it this way:
> >
> > z1=multinom(formula = class.ind(grp[-outgroup])~ (PC1 + PC2 + PC3), 
> > data=data.frame(scores))
> >
> > However I need to be able to change the number of variables I am looking 
> > for in 'scores' and so am trying to call it this way...
> >
> > z1=multinom(formula = class.ind(grp[-outgroup])~ eval(parse(text=PCnames)), 
> > data=data.frame(scores))
> >
> > ...where, for example, PCnames = c("PC1", "+",   "PC2", "+",   "PC3")
> >
> > This gives no error messages, but only the last variable (in this case PC3) 
> > gets considered in the model. z1 looks like this:
> >
> >  (Intercept) eval(parse(text = PCnames))
> > 23.530352  -116.87140
> > 3   -1.30861313.59134
> > 43.662172   -57.52198
> > 5   -1.216041  -242.38827
> > 6   -9.377894  -367.71614
> > 7   -3.145738  -286.19766
> >
> > Rather than this:
> >
> >  (Intercept)   PC1PC2   PC3
> > 2   288.97131   889.281  3776.5837 -2105.751
> > 3  -712.53519  2775.663  8490.5724  8602.834
> > 4   229.17772  4234.950   329.6995 -2182.238
> > 585.54585 -3036.657  3968.2517 -3450.070
> > 6  -676.55377 -9545.785  2422.5340 -7183.686
> > 7  -631.91921 10997.432 -3310.2905 -5348.513
> >
> > Any help would be much appreciated.
> >
> > Thanks,
> > Daniel Crouch
> >
> >
> > Daniel Crouch
> > Research Student
> > Department of Medical & Molecular Genetics
> > King's College London
> > 8th Floor, Tower Wing
> > Guy's Hospital
> > London SE1 9RT
> > United Kingdom
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >

Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901





Daniel Crouch
Research Student
Department of Medical & Molecular Genetics
King's College London
8th Floor, Tower Wing
Guy's Hospital
London SE1 9RT
United Kingdom
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Mark Difford


Hi Dylan,

>> Am I trying to use contrast.Design() for something that it was not
>> intended for? ...

I think Prof. Harrell's main point had to do with how interactions are
handled. You can also get the kind of contrasts that Patrick was interested
in via multcomp. If we do this using your artificial data set we see that
the contrasts are the same as those got by fitting the model using
contr.sdif, but a warning is generated about interactions in the model &c.
[see example code]

Part of Prof. Harrell's "system" is that in generating contrasts via
contr.Design an appropriate value is automatically chosen for the
interacting variable (in this case the median value of x) so that a
reasonable default set of contrasts is calculated. This can of course be
changed.

Coming to your question [?] about how to generate the kind of contrasts that
Patrick wanted using contrast.Design. Well, it is not that straightforward,
though I may have missed something in the documentation to the function. In
the past I have cobbled them together using the kind of hack given below:

## Exampe code
x <- rnorm(100) 
y1 <- x[1:25] * 2 + rnorm(25, mean=1)
y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)


d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25))) 


# now try with contrast.Design(): 
library(multcomp)
dd <- datadist(d) 
options(datadist='dd') 
l <- ols(y ~ x * f, data=d)

## model fitted using successive difference contrasts
library(MASS)
T.lm <- lm(y ~ x * f, contrasts=list(f="contr.sdif"), data=d)
summary(T.lm)

## show the warning: model fitted using ols() and treatment contrasts
summary(glht(l, linfct=mcp(f="Seq")))

## "custom" successive difference contrasts using contrast.Design
TFin <- {}
for (i in 1:(length(levels(d$f))-1)) {
tcont <- contrast(l, a=list(f=levels(d$f)[i]),
b=list(f=levels(d$f)[i+1]))
TFin <- rbind(TFin, tcont)
rownames(TFin)[i] <-  paste(levels(d$f)[i], levels(d$f)[i+1], sep="-")
}
TFin[,1:9]

Regards, Mark.



Dylan Beaudette-2 wrote:
> 
> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>  wrote:
>> Greg Snow a écrit :
>>> One approach is to create your own contrasts matrix:
>>>
>>>
 mycmat <- diag(8)
 mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
 mycmati <- solve(mycmat)
 contrasts(agefactor) <- mycmati[,-1]

>>>
>>> Now when you use agefactor, the intercept will be the first age group
>>> and the slopes will be the differences between the pairs of groups (make
>>> sure that the order of the levels of agefactor is correct).
>>>
>>> The difference between this method and the contr.sdif function in MASS
>>> is how the intercept will end up being interpreted (and the dimnames).
>>>
>>> Hope this helps,
>>>
>>>
>>
>> Actually, exactly what I needed including the reference to contr.sdif in
>> MASS I did not spot before (although I am a faithful reader of the
>> yellow book... but so many things still escape to me). Again thanks a
>> lot.
>>
>> Patrick
>>
> 
> Glad you were able to solve your problem. Frank replied earlier with
> the suggestion to use contrast.Design() to perform the tests after the
> initial model has been fit. Perhaps I am a little denser than normal,
> but I cannot figure out how to apply contrast.Design() to a similar
> model with several levels of a factor. Example data:
> 
> # need these
> library(lattice)
> library(Design)
> 
> # replicate an important experimental dataset
> set.seed(10101010)
> x <- rnorm(100)
> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
> 
> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4],
> each=25)))
> 
> # plot
> xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard
> Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'),
> ylab='Number of Pirates', xlab='Distance from Land')
> 
> # standard comparison to base level of f
> summary(lm(y ~ x * f, data=d))
> 
> # compare to level 4 of f
> summary(lm(y ~ x * C(f, base=4), data=d))
> 
> 
> # now try with contrast.Design():
> dd <- datadist(d)
> options(datadist='dd')
> l <- ols(y ~ x * f, data=d)
> 
> # probably the wrong approach, as the results do not look too familiar:
> contrast(l, list(f=levels(d$f)))
> 
>  x  f Contrast  S.E.  Lower   Upper t Pr(>|t|)
>  -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515  2.02 0.0460
>  -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456  2.96 0.0039
>  -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0.
>  -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0.
> 
> 
> This is adjusting the output to a given value of 'x'... Am I trying to
> use contrast.Design() for something that it was not intended for? Any
> tips Frank?
> 
> Cheers,
> Dylan
> 
> __
> R-help@r-proje

Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Chuck Cleland

On 2/16/2009 10:18 PM, Dylan Beaudette wrote:
> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>  wrote:
>> Greg Snow a écrit :
>>> One approach is to create your own contrasts matrix:
>>>
>>>
 mycmat <- diag(8)
 mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
 mycmati <- solve(mycmat)
 contrasts(agefactor) <- mycmati[,-1]

>>> Now when you use agefactor, the intercept will be the first age group and 
>>> the slopes will be the differences between the pairs of groups (make sure 
>>> that the order of the levels of agefactor is correct).
>>>
>>> The difference between this method and the contr.sdif function in MASS is 
>>> how the intercept will end up being interpreted (and the dimnames).
>>>
>>> Hope this helps,
>>>
>>>
>> Actually, exactly what I needed including the reference to contr.sdif in
>> MASS I did not spot before (although I am a faithful reader of the
>> yellow book... but so many things still escape to me). Again thanks a lot.
>>
>> Patrick
>>
> 
> Glad you were able to solve your problem. Frank replied earlier with
> the suggestion to use contrast.Design() to perform the tests after the
> initial model has been fit. Perhaps I am a little denser than normal,
> but I cannot figure out how to apply contrast.Design() to a similar
> model with several levels of a factor. Example data:
> 
> # need these
> library(lattice)
> library(Design)
> 
> # replicate an important experimental dataset
> set.seed(10101010)
> x <- rnorm(100)
> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
> 
> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25)))
> 
> # plot
> xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard
> Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'),
> ylab='Number of Pirates', xlab='Distance from Land')
> 
> # standard comparison to base level of f
> summary(lm(y ~ x * f, data=d))
> 
> # compare to level 4 of f
> summary(lm(y ~ x * C(f, base=4), data=d))
> 
> # now try with contrast.Design():
> dd <- datadist(d)
> options(datadist='dd')
> l <- ols(y ~ x * f, data=d)
> 
> # probably the wrong approach, as the results do not look too familiar:
> contrast(l, list(f=levels(d$f)))
> 
>  x  f Contrast  S.E.  Lower   Upper t Pr(>|t|)
>  -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515  2.02 0.0460
>  -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456  2.96 0.0039
>  -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0.
>  -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0. 
> 
> This is adjusting the output to a given value of 'x'... Am I trying to
> use contrast.Design() for something that it was not intended for? Any
> tips Frank?

  Does this help?

# Compare with summary(lm(y ~ x * f, data=d))
contrast(l, a=list(f=levels(d$f)[2:4], x=0),
b=list(f=levels(d$f)[1],   x=0))

 f Contrast  S.E.  Lower  Upper t Pr(>|t|)
 b 0.3673455 0.2724247 -0.1737135 0.9084046  1.35 0.1808
 c 4.1310015 0.2714011  3.5919754 4.6700275 15.22 0.
 d 4.4308653 0.2731223  3.8884207 4.9733098 16.22 0.

Error d.f.= 92

# Compare with summary(lm(y ~ x * C(f, base=4), data=d))
contrast(l, a=list(f=levels(d$f)[1:3], x=0),
b=list(f=levels(d$f)[4],   x=0))

 f Contrast   S.E.  Lower  Upper  t  Pr(>|t|)
 a -4.4308653 0.2731223 -4.9733098 -3.8884207 -16.22 0.
 b -4.0635198 0.2782704 -4.6161888 -3.5108507 -14.60 0.
 c -0.2998638 0.2772684 -0.8505427  0.2508151  -1.08 0.2823

Error d.f.= 92

> Cheers,
> Dylan
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] assuming AR(1) residuals in OLS

2009-02-17 Thread constantine

Thank you for the lightening replies.

I tested various corStruct objects (?corClasses) using the nlme
package and all work flawlessly.

My best regards to all...

Constantine Tsardounis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Combination

2009-02-17 Thread Dimitris Rizopoulos


Eik Vettorazzi wrote:

Hi Dani,
see ?combn .

combn(1:50,2)
gives you all combinations as matrix.
you can do sth like

apply(combn(1:50,2),2, paste, sep="", collapse="")


note that you could simplify the above by

combn(50, 2, paste, collapse = "")


Best,
Dimitris



to get concatenated results.

hth.

Dani Valverde schrieb:

Hello,
I have a sequence of numbers:
seq(1:50)
and I would like to have all the possible combinations with this 
numbers without repeating any combination:

11, 12, 13, ... ,22,23,24,...
How can I do it?
Best,

Dani





--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Mark Difford


Hi Dylan, Chuck,

>> contrast(l, a=list(f=levels(d$f)[1:3], x=0), b=list(f=levels(d$f)[4],  
>> x=0))

There is a subtlety here that needs to be emphasized. Setting the
interacting variable (x) to zero is reasonable in this case, because the
mean value of rnorm(n) is zero. However, in the real world of real data it
is somewhat unusual for zero to be a reasonable value for an interacting
variable. In fact, it is this setting-to-zero that causes multcomp's "bells
to ring" with a word of warning.

##
glht(l, linfct=mcp(f="Seq"))$linfct

Regards, Mark.



Chuck Cleland wrote:
> 
> On 2/16/2009 10:18 PM, Dylan Beaudette wrote:
>> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>>  wrote:
>>> Greg Snow a écrit :
 One approach is to create your own contrasts matrix:


> mycmat <- diag(8)
> mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
> mycmati <- solve(mycmat)
> contrasts(agefactor) <- mycmati[,-1]
>
 Now when you use agefactor, the intercept will be the first age group
 and the slopes will be the differences between the pairs of groups
 (make sure that the order of the levels of agefactor is correct).

 The difference between this method and the contr.sdif function in MASS
 is how the intercept will end up being interpreted (and the dimnames).

 Hope this helps,


>>> Actually, exactly what I needed including the reference to contr.sdif in
>>> MASS I did not spot before (although I am a faithful reader of the
>>> yellow book... but so many things still escape to me). Again thanks a
>>> lot.
>>>
>>> Patrick
>>>
>> 
>> Glad you were able to solve your problem. Frank replied earlier with
>> the suggestion to use contrast.Design() to perform the tests after the
>> initial model has been fit. Perhaps I am a little denser than normal,
>> but I cannot figure out how to apply contrast.Design() to a similar
>> model with several levels of a factor. Example data:
>> 
>> # need these
>> library(lattice)
>> library(Design)
>> 
>> # replicate an important experimental dataset
>> set.seed(10101010)
>> x <- rnorm(100)
>> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
>> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
>> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
>> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
>> 
>> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4],
>> each=25)))
>> 
>> # plot
>> xyplot(y ~ x, groups=f, data=d, auto.key=list(columns=4, title='Beard
>> Type', lines=TRUE, points=FALSE, cex=0.75), type=c('p','r'),
>> ylab='Number of Pirates', xlab='Distance from Land')
>> 
>> # standard comparison to base level of f
>> summary(lm(y ~ x * f, data=d))
>> 
>> # compare to level 4 of f
>> summary(lm(y ~ x * C(f, base=4), data=d))
>> 
>> # now try with contrast.Design():
>> dd <- datadist(d)
>> options(datadist='dd')
>> l <- ols(y ~ x * f, data=d)
>> 
>> # probably the wrong approach, as the results do not look too familiar:
>> contrast(l, list(f=levels(d$f)))
>> 
>>  x  f Contrast  S.E.  Lower   Upper t Pr(>|t|)
>>  -0.3449623 a 0.3966682 0.1961059 0.007184856 0.7861515  2.02 0.0460
>>  -0.3449623 b 0.5587395 0.1889173 0.183533422 0.9339456  2.96 0.0039
>>  -0.3449623 c 4.1511677 0.1889170 3.775962254 4.5263730 21.97 0.
>>  -0.3449623 d 4.3510407 0.120 3.975904726 4.7261766 23.04 0. 
>> 
>> This is adjusting the output to a given value of 'x'... Am I trying to
>> use contrast.Design() for something that it was not intended for? Any
>> tips Frank?
> 
>   Does this help?
> 
> # Compare with summary(lm(y ~ x * f, data=d))
> contrast(l, a=list(f=levels(d$f)[2:4], x=0),
> b=list(f=levels(d$f)[1],   x=0))
> 
>  f Contrast  S.E.  Lower  Upper t Pr(>|t|)
>  b 0.3673455 0.2724247 -0.1737135 0.9084046  1.35 0.1808
>  c 4.1310015 0.2714011  3.5919754 4.6700275 15.22 0.
>  d 4.4308653 0.2731223  3.8884207 4.9733098 16.22 0.
> 
> Error d.f.= 92
> 
> # Compare with summary(lm(y ~ x * C(f, base=4), data=d))
> contrast(l, a=list(f=levels(d$f)[1:3], x=0),
> b=list(f=levels(d$f)[4],   x=0))
> 
>  f Contrast   S.E.  Lower  Upper  t  Pr(>|t|)
>  a -4.4308653 0.2731223 -4.9733098 -3.8884207 -16.22 0.
>  b -4.0635198 0.2782704 -4.6161888 -3.5108507 -14.60 0.
>  c -0.2998638 0.2772684 -0.8505427  0.2508151  -1.08 0.2823
> 
> Error d.f.= 92
> 
>> Cheers,
>> Dylan
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code. 
> 
> -- 
> Chuck Cleland, Ph.D.
> NDRI, Inc. (www.ndri.org)
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailm

[R] Chromatogram deconvolution and peak matching

2009-02-17 Thread bartjoosen


Hi,

I'm trying to match peaks between chromatographic runs.
I'm able to match peaks when they are chromatographed with the same method,
but not when there are different methods are used and spectra comes in to
play.

While searching I found the ALS package which should be usefull for my
application, but I couldn't figure it out.

I made some dummy chroms with R, which mimic my actual datasets, to play
with, but after looking at the manuals of ALS, I'm affraid I can't get the
job done. Can someone put me on the right way?

Here is my code to generate the dummy chroms, which also plots the 2 chroms
and the spectra of the 3 peaks:

#2D chromatogram generation
par(mfrow=c(3,1))
time <- seq(0,20,by=0.05)
f <- function(x,rt) dnorm((x-rt),mean=0,sd=rt/35)
c1 <- f(time,6.1)
c2 <- f(time,5.6)
c3 <- f(time,15)
plot(c1+c2+c3~time,type="l",main="chrom1")

#spectrum generation
spectra <- function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3)
x <- 220:300
s1 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0)
s2 <- spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20)
s3 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20)

chrom1.tot <-
data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
names(chrom.tot)[-1] <- x

#generation of chromatogram 2
c1 <- f(time,2.1)
c2 <- f(time,4)
c3 <- f(time,8) 
plot(c1+c2+c3~time,type="l",main="chrom2")

chrom2.tot <-
data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
names(chrom.tot)[-1] <- x

plot(s1~x,type="l",main="spectra")
lines(s2~x,col=2)
lines(s3~x,col=3)

Thanks for your time

Kind Regards

Bart
-- 
View this message in context: 
http://www.nabble.com/Chromatogram-deconvolution-and-peak-matching-tp22057592p22057592.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] stopping stepAIC (MASS package)

Peter Jepsen  DCE.AU.DK> writes:
> [snip]
>  ... I would save valuable time if
> the procedure would stop automatically when it encounters any anomaly
> that causes it to warn me. Is this possible? And if so, how?
> 

  Maybe options(warn=2)  will do what you want.

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] assuming AR(1) residuals in OLS

2009-02-17 Thread constantine

Thank you Gabor Grothendieck for your message.
I would surely like to say, that if someone wants to assume AR(1)
residuals, running the regression y ~ x, could run


gls(y~x, correlation = corAR1(0, ~1))




Constantine Tsardounis
http://www.costis.name

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] joining "one-to-many"

2009-02-17 Thread Monica Pisica


Hello list,
 
I am wondering if a joining "one-to-many" can be done a little bit easier. I 
tried merge function but I was not able to do it, so I end up using for and if.
 
Suppose you have a table with locations, each location repeated several times, 
and some attributes at that location. The second table has the same locations, 
but only once with a different set of attributes. I would like to add the 
second set of attributes to the first table.
 
Example:
 
set.seed <- 123
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
val1 <- round(rnorm(10),2)
val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
t1 <- data.frame(loc, val1, val2)
t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
67, 48))
 
# join one-to-many
 
n <- nrow(t1)
m <- nrow(t2)
t1$val3 <- rep(1, n)
t1$val4 <- rep(1, n)
 
for (i in 1:n) {
for (j in 1:m){
if (t1$loc[i]==t2$loc[j]) {
t1$val3[i] <- as.character(t2$val3[j])
t1$val4[i] <- t2$val4[j]
}
}
}
 
Desired result:

t1
   loc  val1 val2 val3 val4
1   L1 -0.41am   25
2   L1 -0.69bm   25
3   L1  0.36cm   25
4   L2  1.11an   67
5   L2  0.15bn   67
6   L2 -0.80dn   67
7   L2 -0.08fn   67
8   L2 -1.01en   67
9   L3 -1.01bp   48
10  L3 -2.50ep   48

 
This code works OK but it is slow if the data frames are actually bigger than 
my little example. I hope somebody knows of a better way of doing these type of 
things.
 
Thanks,
 
Monica
_


22009
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] joining "one-to-many"

2009-02-17 Thread hadley wickham

On Tue, Feb 17, 2009 at 8:33 AM, Monica Pisica  wrote:
>
> Hello list,
>
> I am wondering if a joining "one-to-many" can be done a little bit easier. I 
> tried merge function but I was not able to do it, so I end up using for and 
> if.
>
> Suppose you have a table with locations, each location repeated several 
> times, and some attributes at that location. The second table has the same 
> locations, but only once with a different set of attributes. I would like to 
> add the second set of attributes to the first table.
>
> Example:
>
> set.seed <- 123
> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
> val1 <- round(rnorm(10),2)
> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
> t1 <- data.frame(loc, val1, val2)
> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
> 67, 48))
>
> # join one-to-many
>
> n <- nrow(t1)
> m <- nrow(t2)
> t1$val3 <- rep(1, n)
> t1$val4 <- rep(1, n)
>
> for (i in 1:n) {
>for (j in 1:m){
>if (t1$loc[i]==t2$loc[j]) {
>t1$val3[i] <- as.character(t2$val3[j])
>t1$val4[i] <- t2$val4[j]
>}
>}
> }
>
> Desired result:
>
> t1
>   loc  val1 val2 val3 val4
> 1   L1 -0.41am   25
> 2   L1 -0.69bm   25
> 3   L1  0.36cm   25
> 4   L2  1.11an   67
> 5   L2  0.15bn   67
> 6   L2 -0.80dn   67
> 7   L2 -0.08fn   67
> 8   L2 -1.01en   67
> 9   L3 -1.01bp   48
> 10  L3 -2.50ep   48
>
>
> This code works OK but it is slow if the data frames are actually bigger than 
> my little example. I hope somebody knows of a better way of doing these type 
> of things.

merge(t1, t2)


Hadley
-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] joining "one-to-many"

2009-02-17 Thread Marc Schwartz

on 02/17/2009 08:33 AM Monica Pisica wrote:
> Hello list,
>  
> I am wondering if a joining "one-to-many" can be done a little bit
easier. I tried merge function but I was not able to do it, so I end up
using for and if.
> 
> Suppose you have a table with locations, each location repeated
several times, and some attributes at that location. The second table
has the same locations, but only once with a different set of
attributes. I would like to add the second set of attributes to the
first table.

> Example:
>  
> set.seed <- 123

This needs to be set.seed(123)

See ?set.seed

:-)

> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
> val1 <- round(rnorm(10),2)
> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
> t1 <- data.frame(loc, val1, val2)
> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
> 67, 48))
>  
> # join one-to-many
>  
> n <- nrow(t1)
> m <- nrow(t2)
> t1$val3 <- rep(1, n)
> t1$val4 <- rep(1, n)
>  
> for (i in 1:n) {
> for (j in 1:m){
> if (t1$loc[i]==t2$loc[j]) {
> t1$val3[i] <- as.character(t2$val3[j])
> t1$val4[i] <- t2$val4[j]
> }
> }
> }
>  
> Desired result:
> 
> t1
>loc  val1 val2 val3 val4
> 1   L1 -0.41am   25
> 2   L1 -0.69bm   25
> 3   L1  0.36cm   25
> 4   L2  1.11an   67
> 5   L2  0.15bn   67
> 6   L2 -0.80dn   67
> 7   L2 -0.08fn   67
> 8   L2 -1.01en   67
> 9   L3 -1.01bp   48
> 10  L3 -2.50ep   48
> 
>  
> This code works OK but it is slow if the data frames are actually
bigger than my little example. I hope somebody knows of a better way of
doing these type of things.

> merge(t1, t2, by = "loc")
   loc  val1 val2 val3 val4
1   L1 -0.32am   25
2   L1 -1.50bm   25
3   L1 -0.31cm   25
4   L2  1.42an   67
5   L2  0.32bn   67
6   L2 -0.12dn   67
7   L2  0.33fn   67
8   L2 -1.74en   67
9   L3  0.88bp   48
10  L3  1.88ep   48

> system.time(merge(t1, t2, by = "loc"))
   user  system elapsed
  0.004   0.000   0.019


HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] frequency table for multiple variables

2009-02-17 Thread Hans Ekbrand

Hi r-help!

Consider the following data-frame:

   var1 var2 var3 
1 314 
2 223 
3 223 
4 44   NA 
5 435 
6 223 
7 343 

How can I get R to convert this into the following?

Value 1  2  3  4  5 
var1  0  3  2  2  0
var2  1  3  1  2  0 
var3  0  0  4  1  1

TIA,

-- 
Hans Ekbrand (http://sociologi.cjb.net) 
Q. What is that strange attachment in this mail?
A. My digital signature, see www.gnupg.org for info on how you could
   use it to ensure that this mail is from me and has not been
   altered on the way to you.


signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] joining "one-to-many"

Try merge(t1, t2)


On Tue, Feb 17, 2009 at 9:33 AM, Monica Pisica  wrote:
>
> Hello list,
>
> I am wondering if a joining "one-to-many" can be done a little bit easier. I 
> tried merge function but I was not able to do it, so I end up using for and 
> if.
>
> Suppose you have a table with locations, each location repeated several 
> times, and some attributes at that location. The second table has the same 
> locations, but only once with a different set of attributes. I would like to 
> add the second set of attributes to the first table.
>
> Example:
>
> set.seed <- 123
> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
> val1 <- round(rnorm(10),2)
> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
> t1 <- data.frame(loc, val1, val2)
> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = c(25, 
> 67, 48))
>
> # join one-to-many
>
> n <- nrow(t1)
> m <- nrow(t2)
> t1$val3 <- rep(1, n)
> t1$val4 <- rep(1, n)
>
> for (i in 1:n) {
>for (j in 1:m){
>if (t1$loc[i]==t2$loc[j]) {
>t1$val3[i] <- as.character(t2$val3[j])
>t1$val4[i] <- t2$val4[j]
>}
>}
> }
>
> Desired result:
>
> t1
>   loc  val1 val2 val3 val4
> 1   L1 -0.41am   25
> 2   L1 -0.69bm   25
> 3   L1  0.36cm   25
> 4   L2  1.11an   67
> 5   L2  0.15bn   67
> 6   L2 -0.80dn   67
> 7   L2 -0.08fn   67
> 8   L2 -1.01en   67
> 9   L3 -1.01bp   48
> 10  L3 -2.50ep   48
>
>
> This code works OK but it is slow if the data frames are actually bigger than 
> my little example. I hope somebody knows of a better way of doing these type 
> of things.
>
> Thanks,
>
> Monica
> _
>
>
> 22009
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Harry Haupt


Hi,
after updating to foreign version 0.8-32, I experienced the following error
when I tried to load a SPSS file:

Fehler in inherits(x, "factor") : objekt "cp" nicht gefunden
Zusätzlich: Warning message:
In read.spss("***l.sav", use.value.labels = TRUE, to.data.frame = TRUE) :
  ***.sav: File-indicated character representation code (1252) looks like a
Windows codepage

It worked without problems with earlier versions.

Thanks for any clues,
best,
Harry
-- 
View this message in context: 
http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22059259.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Uninstall question

2009-02-17 Thread ANJAN PURKAYASTHA

I need to uninstall R 2.7.1 from my Mac. What is the best way to uninstall
it? Simply delete the R icon in the Applications folder?
Or is it more involved?
TIA,
Anjan

-- 
=
anjan purkayastha, phd
bioinformatics analyst
whitehead institute for biomedical research
nine cambridge center
cambridge, ma 02142

purkayas [at] wi [dot] mit [dot] edu
703.740.6939

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] joining "one-to-many"

2009-02-17 Thread ONKELINX, Thierry

Hi Monica,

merge(t1, t2) works on your example. So why don't you use merge?

HTH,

Thierry




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
thierry.onkel...@inbo.be 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens Monica Pisica
Verzonden: dinsdag 17 februari 2009 15:33
Aan: R help project
Onderwerp: [R] joining "one-to-many"


Hello list,

I am wondering if a joining "one-to-many" can be done a little bit
easier. I tried merge function but I was not able to do it, so I end up
using for and if.

Suppose you have a table with locations, each location repeated several
times, and some attributes at that location. The second table has the
same locations, but only once with a different set of attributes. I
would like to add the second set of attributes to the first table.

Example:

set.seed <- 123
loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
val1 <- round(rnorm(10),2)
val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
t1 <- data.frame(loc, val1, val2)
t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 =
c(25, 67, 48))

# join one-to-many

n <- nrow(t1)
m <- nrow(t2)
t1$val3 <- rep(1, n)
t1$val4 <- rep(1, n)

for (i in 1:n) {
for (j in 1:m){
if (t1$loc[i]==t2$loc[j]) {
t1$val3[i] <- as.character(t2$val3[j])
t1$val4[i] <- t2$val4[j]
}
}
}

Desired result:

t1
   loc  val1 val2 val3 val4
1   L1 -0.41am   25
2   L1 -0.69bm   25
3   L1  0.36cm   25
4   L2  1.11an   67
5   L2  0.15bn   67
6   L2 -0.80dn   67
7   L2 -0.08fn   67
8   L2 -1.01en   67
9   L3 -1.01bp   48
10  L3 -2.50ep   48


This code works OK but it is slow if the data frames are actually bigger
than my little example. I hope somebody knows of a better way of doing
these type of things.

Thanks,

Monica
_


22009
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Uninstall question

2009-02-17 Thread Sundar Dorai-Raj

This is on the Mac FAQ:

http://cran.cnr.berkeley.edu/bin/macosx/RMacOSX-FAQ.html#How-can-R-for-Mac-OS-X-be-uninstalled_003f

HTH,

--sundar

On Tue, Feb 17, 2009 at 7:17 AM, ANJAN PURKAYASTHA
 wrote:
> I need to uninstall R 2.7.1 from my Mac. What is the best way to uninstall
> it? Simply delete the R icon in the Applications folder?
> Or is it more involved?
> TIA,
> Anjan
>
> --
> =
> anjan purkayastha, phd
> bioinformatics analyst
> whitehead institute for biomedical research
> nine cambridge center
> cambridge, ma 02142
>
> purkayas [at] wi [dot] mit [dot] edu
> 703.740.6939
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] joining "one-to-many"

2009-02-17 Thread Monica Pisica



Ok,
 
I feel properly ashamed. I suppose my "real" data is a little bit different 
than my toy data (although i don't know how) because i did try the merge 
function as simple as merge(t1, t2) and did not work. Maybe a reset of my 
session will solve my problems and more coffee my confusion.
 
Again, thanks for your help,
 
Monica
 

> Date: Tue, 17 Feb 2009 10:09:17 -0500
> Subject: Re: [R] joining "one-to-many"
> From: ggrothendi...@gmail.com
> To: pisican...@hotmail.com
> CC: r-help@r-project.org
>
> Try merge(t1, t2)
>
>
> On Tue, Feb 17, 2009 at 9:33 AM, Monica Pisica wrote:
>>
>> Hello list,
>>
>> I am wondering if a joining "one-to-many" can be done a little bit easier. I 
>> tried merge function but I was not able to do it, so I end up using for and 
>> if.
>>
>> Suppose you have a table with locations, each location repeated several 
>> times, and some attributes at that location. The second table has the same 
>> locations, but only once with a different set of attributes. I would like to 
>> add the second set of attributes to the first table.
>>
>> Example:
>>
>> set.seed <- 123
>> loc <- c(rep("L1", 3), rep("L2", 5), rep("L3", 2))
>> val1 <- round(rnorm(10),2)
>> val2 <- c("a", "b", "c", "a", "b", "d", "f", "e", "b", "e")
>> t1 <- data.frame(loc, val1, val2)
>> t2 <- data.frame(loc=c("L1","L2","L3"), val3 = c("m", "n", "p"), val4 = 
>> c(25, 67, 48))
>>
>> # join one-to-many
>>
>> n <- nrow(t1)
>> m <- nrow(t2)
>> t1$val3 <- rep(1, n)
>> t1$val4 <- rep(1, n)
>>
>> for (i in 1:n) {
>> for (j in 1:m){
>> if (t1$loc[i]==t2$loc[j]) {
>> t1$val3[i] <- as.character(t2$val3[j])
>> t1$val4[i] <- t2$val4[j]
>> }
>> }
>> }
>>
>> Desired result:
>>
>> t1
>> loc val1 val2 val3 val4
>> 1 L1 -0.41 a m 25
>> 2 L1 -0.69 b m 25
>> 3 L1 0.36 c m 25
>> 4 L2 1.11 a n 67
>> 5 L2 0.15 b n 67
>> 6 L2 -0.80 d n 67
>> 7 L2 -0.08 f n 67
>> 8 L2 -1.01 e n 67
>> 9 L3 -1.01 b p 48
>> 10 L3 -2.50 e p 48
>>
>>
>> This code works OK but it is slow if the data frames are actually bigger 
>> than my little example. I hope somebody knows of a better way of doing these 
>> type of things.
>>
>> Thanks,
>>
>> Monica
>> _
>>
>>
>> 22009
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Bernhard Reinhardt


Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John Fox.

As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function out 
of these values and how I extract the values from the object? I´m sure 
the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Arthur Allignol


Hi,

See ?survfit.object

if fit is the object you get using survfit,
fit$surv will give you the survival probability.

Best,
arthur

Bernhard Reinhardt wrote:

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John Fox.

As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function out 
of these values and how I extract the values from the object? I´m sure 
the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Peter Dalgaard

Harry Haupt wrote:
> Hi,
> after updating to foreign version 0.8-32, I experienced the following error
> when I tried to load a SPSS file:
> 
> Fehler in inherits(x, "factor") : objekt "cp" nicht gefunden
> Zusätzlich: Warning message:
> In read.spss("***l.sav", use.value.labels = TRUE, to.data.frame = TRUE) :
>   ***.sav: File-indicated character representation code (1252) looks like a
> Windows codepage
> 
> It worked without problems with earlier versions.
> 
> Thanks for any clues,
> best,
> Harry

Yes, something in the logic appears to have gotten garbled.

It's in this part of read,spss:

if (is.character(reencode)) {
cp <- reencode
reencode <- TRUE
}
else if (codepage <= 500 || codepage >= 2000) {
attr(rval, "codepage") <- NULL
reencode <- FALSE
}
else if (m <- match(cp, knownCP, 0L))
cp <- names(knownCP)[m]

if you get to the 2nd "else if" then cp is unset. Possible the attempted
match should be of codepage? But it still looks wrong: Why restrict to
codepages between 500 and 2000 when knownCP contains several values
above 1???

A workaround is to set reencode="ascii" (or whatever is relevant).

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R scripts and parameters

2009-02-17 Thread mauede

A couple of weeks ago I asked how it is possible to run an R script (not a 
function) passing some parameters.
Someone suggested the function "commandArgs()".
I read the on-line help and found no clarifying example. Therefore I do not 
know how to use it appropriately.
I noticed this function returns the pathname of the R executable which is not 
what I need.

I meant to ask if it is possible to open an R session and launch a script 
passing parameters that the script can retrieve and use itself.
Just like in C I can run a program and call it with  some arguments 

> Example_Prog A B C
  
The program "Example_Prog" can acess its own arguments through the data 
structures "argc" an "argv".

How can I launch an R script simulating the above mechanism ? 
Shall I use source ("script-name") ?
Where are the arguments to be passed, as part of the source call ?
Is the function "commandArgs" to be places as one of the first code lines of 
the script in order to access its own arguments ?
Is there any "commandArgs" usage example ?

Thank you very much in advance.
Maura






tutti i telefonini TIM!


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] frequency table for multiple variables

2009-02-17 Thread Marc Schwartz

on 02/17/2009 09:06 AM Hans Ekbrand wrote:
> Hi r-help!
> 
> Consider the following data-frame:
> 
>var1 var2 var3 
> 1 314 
> 2 223 
> 3 223 
> 4 44   NA 
> 5 435 
> 6 223 
> 7 343 
> 
> How can I get R to convert this into the following?
> 
> Value 1  2  3  4  5 
> var1  0  3  2  2  0
> var2  1  3  1  2  0 
> var3  0  0  4  1  1

> t(sapply(DF, function(x) table(factor(x, levels = 1:5
 1 2 3 4 5
var1 0 3 2 2 0
var2 1 3 1 2 0
var3 0 0 4 1 1

The key is to turn each column into a factor with explicitly defined
common levels for tabulation. This enables the table result to have a
consistent format across each column, allowing for a matrix to be
created, rather than a list.

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R crash after fGarch update

2009-02-17 Thread John Kerpel

Prof Ripley:

Many thanks - it did indeed say it cannot find fGarch after I tried your
advice - but a completely clean re-install did the trick.

John

On Tue, Feb 17, 2009 at 1:09 AM, Prof Brian Ripley wrote:

> Start R with --vanilla, or rename youe saved workspace (.RData).
> Then
>
> library(fGarch)
> load(".RData")  # or whatever you renamed it to.
>
> This will either work or (more ikely) tell you it cannot find fGarch or a
> package it depends on).
>
> On Mon, 16 Feb 2009, John Kerpel wrote:
>
> Hi folks!
>> After updating my packages my R seems to have completely crashed as will
>> not
>> start up - even after I installed 2.8.1 from 2.8.0.
>>
>
> You haven't told us your OS: I am guesing Windows.
>
>  I get the following:
>>
>> Fatal error: unable to restore saved data in .Rdata
>>
>> Error in loadNamespeace(name): there is no package called fGarch
>>
>> But I do have a package called fGarch.
>>
>> After I hit ok, it crashes and exits.  I cannot use any functionality at
>> all.  What do I do?
>>
>> John
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] word wrap using Hershey fonts

2009-02-17 Thread SLCMSR


Hi,

is it possible to wrap a text using Hershey fonts? "\n" does not work!

Thanks in advance,
Martina

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

I seriously doubt that a survfit object could only contain that  
information. I suspect that you are erroneously thinking that what  
print.survfit offers is the entire story.

What does str(survfit(, data=) ) show you?

> data(aml)
> aml.mdl <- survfit(Surv(time, status) ~ x, data=aml)
# this is with survfit.Design since I load Hmisc/Design by default now.
> str(aml.mdl)
List of 18
 $ n: int 23
 $ time : num [1:20] 9 13 18 23 28 31 34 45 48 161 ...
 $ n.risk   : num [1:20] 11 10 8 7 6 5 4 3 2 1 ...
 $ n.event  : num [1:20] 1 1 1 1 0 1 1 0 1 0 ...
 $ surv : num [1:20] 0.909 0.818 0.716 0.614 0.614 ...
 $ type : chr "right"
 $ ntimes.strata: Named int [1:2] 10 10
  ..- attr(*, "names")= chr [1:2] "1" "2"
 $ strata   : Named num [1:2] 10 10
  ..- attr(*, "names")= chr [1:2] "Maintained" "Nonmaintained"
 $ strata.all   : Named int [1:2] 11 12
  ..- attr(*, "names")= chr [1:2] "Maintained" "Nonmaintained"
 $ std.err  : num [1:20] 0.0953 0.1421 0.1951 0.2487 0.2487 ...
 $ upper: num [1:20] 0.987 0.951 0.899 0.835 0.835 ...
 $ lower: num [1:20] 0.508 0.447 0.35 0.266 0.266 ...
 $ conf.type: chr "log-log"
 $ conf.int : num 0.95
 $ maxtime  : num 161
 $ units: chr "Day"
 $ time.label   : chr "time"
 $ call : language survfit(formula = Surv(time, status) ~ x,  
data = aml)

 - attr(*, "class")= chr "survfit"
>

I also don't think survfit returns a Cox model.

On Feb 17, 2009, at 10:37 AM, Bernhard Reinhardt wrote:

Hi!

I came across R just a few days ago since I was looking for a  
toolbox for cox-regression.

I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from  
John Fox.

As described therein plotting survival-functions works well  
(plot(survfit(model))). But I´d like to do some manipulation with  
the survival-functions before plotting them e.g. dividing one  
survival-function by another.

survfit() only returns an object of the following structure

n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function  
out of these values and how I extract the values from the object? I 
´m sure the calculation is done by the corresponding plot-routine,  
but I couldn´t find that one either.

Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R scripts and parameters

Try this:

A <- 1
B <- 2
C <- 3
source("myfile.R")

Now the code in myfile can access A, B and C.

On Tue, Feb 17, 2009 at 10:55 AM,   wrote:
> A couple of weeks ago I asked how it is possible to run an R script (not a 
> function) passing some parameters.
> Someone suggested the function "commandArgs()".
> I read the on-line help and found no clarifying example. Therefore I do not 
> know how to use it appropriately.
> I noticed this function returns the pathname of the R executable which is not 
> what I need.
>
> I meant to ask if it is possible to open an R session and launch a script 
> passing parameters that the script can retrieve and use itself.
> Just like in C I can run a program and call it with  some arguments
>
>> Example_Prog A B C
>
> The program "Example_Prog" can acess its own arguments through the data 
> structures "argc" an "argv".
>
> How can I launch an R script simulating the above mechanism ?
> Shall I use source ("script-name") ?
> Where are the arguments to be passed, as part of the source call ?
> Is the function "commandArgs" to be places as one of the first code lines of 
> the script in order to access its own arguments ?
> Is there any "commandArgs" usage example ?
>
> Thank you very much in advance.
> Maura
>
>
>
>
>
>
> tutti i telefonini TIM!
>
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Harry Haupt



Peter Dalgaard wrote:
> 
> Yes, something in the logic appears to have gotten garbled.
> 
> It's in this part of read,spss:
> 
> if (is.character(reencode)) {
> cp <- reencode
> reencode <- TRUE
> }
> else if (codepage <= 500 || codepage >= 2000) {
> attr(rval, "codepage") <- NULL
> reencode <- FALSE
> }
> else if (m <- match(cp, knownCP, 0L))
> cp <- names(knownCP)[m]
> 
> if you get to the 2nd "else if" then cp is unset. Possible the attempted
> match should be of codepage? But it still looks wrong: Why restrict to
> codepages between 500 and 2000 when knownCP contains several values
> above 1???
> 
> A workaround is to set reencode="ascii" (or whatever is relevant).
> 
> -- 
>O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
> ~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

Dear Peter,
thanks, reencode="ascii" fixed it (and leaves just the warning message,
which seems to have no effect).
Best,
Harry

-
---
Centre for Statistics
Bielefeld University, Germany
-- 
View this message in context: 
http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22060774.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R scripts and parameters

2009-02-17 Thread Duncan Murdoch


On 2/17/2009 10:55 AM, mau...@alice.it wrote:

A couple of weeks ago I asked how it is possible to run an R script (not a 
function) passing some parameters.
Someone suggested the function "commandArgs()".
I read the on-line help and found no clarifying example. Therefore I do not 
know how to use it appropriately.
I noticed this function returns the pathname of the R executable which is not 
what I need.

I meant to ask if it is possible to open an R session and launch a script 
passing parameters that the script can retrieve and use itself.
Just like in C I can run a program and call it with  some arguments 


Example_Prog A B C
  
The program "Example_Prog" can acess its own arguments through the data structures "argc" an "argv".


How can I launch an R script simulating the above mechanism ? 
Shall I use source ("script-name") ?

Where are the arguments to be passed, as part of the source call ?
Is the function "commandArgs" to be places as one of the first code lines of 
the script in order to access its own arguments ?
Is there any "commandArgs" usage example ?



Gabor gave you a solution from within R.  If you want to run a script 
from the command line, then use commandArgs(TRUE).  For example, put 
this into the file test.R:


commandArgs(TRUE)

(The TRUE says you only want to see the trailing arguments, not 
everything else on the command line.)


Then from the command line, do

Rscript test.R A B C

and you'll see the output

[1] "A" "B" "C"

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ARIMA models

2009-02-17 Thread emj83


is there some sort of R function which can advise me of the best ARIMA(p,q,r)
model to use based on the Schwarz criterion e.g for e.g p=0-5, q =0, r=0-5
or for example p+r< 5???

or is this something I will have to write my own code for?

Thanks Emma
-- 
View this message in context: 
http://www.nabble.com/ARIMA-models-tp22059382p22059382.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to connect R and WinBUGS/OpenBUGS/LinBUGS in Linux in Feb. 2009

2009-02-17 Thread Paul Heinrich Dietrich

Hi all,
I've managed to get JAGS working on my Ubuntu Hardy Linux with a 32-bit
computer and AMD processors using R 2.8.1. JAGS is great. I've read that
JAGS is the fastest, but that hasn't been my experience. At any rate, I
have more experience with WinBUGS under Windows and would like a version of
that working as well.

It seems like I've read a lot on the subject and tried a lot, but haven't
managed to get BUGS to work yet. The most success I've had is to install
WinBUGS or OpenBUGS using this method:
http://www.math.aau.dk/~slb/kurser/bayes-08/install.html

What you also need to know is that you need to open Wine and add a drive.
Although Z is recommended, I haven't been able to specify it, but have
gotten a D drive to work, using:

wine D:/opt/OpenBUGS/winbugs.exe

Using this method, OpenBUGS opens. Now, to be able to open it with R. I've
read all sorts of discussions about BRugs (which is no longer on CRAN, but
old versions can still be found), rbugs, and R2WinBUGS (which I'm used to
using on Windows with WinBUGS). Some people say R2WinBUGS cannot run
OpenBUGS on Linux, some claim they've done it (I think). It seems the same
thing with everything else. I've tried making the linbugs and cbugs file
recommended elsewhere online. It's all very confusing.

Can someone show a method that works currently, along with some sample code?
I'm also new to Linux, and confused by path conventions. For example, in
rbugs, it shows an example of a path such as
"/var/scratch/jyan/wine-20040408/wine", and I don't see how to modify this.
I have no /var/scratch to begin with, and think Wine is installed in
/home/me/.wine...(I don't have Linux in front of me right now).

Please help. Thanks.
--
View this message in context:
http://www.nabble.com/How-to-connect-R-and-WinBUGS-OpenBUGS-LinBUGS-in-Linux-in-Feb.-2009-tp22058716p22058716.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Survival-Analysis: How to get numerical values from survfit (and not just a plot)?

2009-02-17 Thread Eik Vettorazzi


Hi Bernhard,
I'm wondering what you will expect to get in "dividing" two proportional 
survival curves from a fitted cox model.
Anyway, you can provide a newdata object to the survfit function 
containing any combination of cofactors you are interested in and then 
use summary, eg:


fit <- coxph( Surv(futime,fustat)~resid.ds+rx+ecog.ps,data=ovarian)
summary(survfit( fit,newdata=data.frame(rx=1, ecog.ps=2, resid.ds=0)))

hth.

Bernhard Reinhardt schrieb:

Hi!

I came across R just a few days ago since I was looking for a toolbox 
for cox-regression.


I´ve read
"Cox Proportional-Hazards Regression for Survival Data
Appendix to An R and S-PLUS Companion to Applied Regression" from John 
Fox.


As described therein plotting survival-functions works well 
(plot(survfit(model))). But I´d like to do some manipulation with the 
survival-functions before plotting them e.g. dividing one 
survival-function by another.


survfit() only returns an object of the following structure

 n events median 0.9LCL 0.9UCL
55.000 55.000  1.033  0.696  1.637

Can you tell me how I can calculate a survival- or baseline-function 
out of these values and how I extract the values from the object? I´m 
sure the calculation is done by the corresponding plot-routine, but I 
couldn´t find that one either.


Regards

Bernhard

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/42803-8243
F ++49/40/42803-7790

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Comparison of age categories using contrasts

2009-02-17 Thread Mark Difford


Hi Dylan, Chuck,

Mark Difford wrote:
>> Coming to your question [?] about how to generate the kind of contrasts
>> that Patrick wanted 
>> using contrast.Design. Well, it is not that straightforward, though I may
>> have missed 
>> something in the documentation to the function. In the past I have
>> cobbled them together 
>> using the kind of hack given below:

Here is a much simpler route, and a correction to my earlier posting (helped
by a little off-list prompting from Prof. Harrell). All that is required to
get successive difference (i.e. sdif) contrasts using contrast.Design is the
following:


contrast(l, a=list(f=c("a","b","c")), b=list(f=c("b","c","d")))

## Run a full example
set.seed(10101010) 
x <- rnorm(100) 
y1 <- x[1:25] * 2 + rnorm(25, mean=1) 
y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5) 
y3 <- x[51:75] * 2.9 + rnorm(25, mean=5) 
y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5) 

d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4], each=25)))

dd <- datadist(d) 
options(datadist='dd') 
l <- ols(y ~ x * f, data=d)

## compare with multcomp by setting x=0
summary(glht(l, linfct=mcp(f="Seq")))
contrast(l, b=list(x=0, f=c("a","b","c")), a=list(x=0, f=c("b","c","d")))

Regards, and apologies for any confusion, Mark.


Mark Difford wrote:
> 
> Hi Dylan,
> 
>>> Am I trying to use contrast.Design() for something that it was not
>>> intended for? ...
> 
> I think Prof. Harrell's main point had to do with how interactions are
> handled. You can also get the kind of contrasts that Patrick was
> interested in via multcomp. If we do this using your artificial data set
> we see that the contrasts are the same as those got by fitting the model
> using contr.sdif, but a warning is generated about interactions in the
> model &c. [see example code]
> 
> Part of Prof. Harrell's "system" is that in generating contrasts via
> contr.Design an appropriate value is automatically chosen for the
> interacting variable (in this case the median value of x) so that a
> reasonable default set of contrasts is calculated. This can of course be
> changed.
> 
> Coming to your question [?] about how to generate the kind of contrasts
> that Patrick wanted using contrast.Design. Well, it is not that
> straightforward, though I may have missed something in the documentation
> to the function. In the past I have cobbled them together using the kind
> of hack given below:
> 
> ## Exampe code
> x <- rnorm(100) 
> y1 <- x[1:25] * 2 + rnorm(25, mean=1)
> y2 <- x[26:50] * 2.6 + rnorm(25, mean=1.5)
> y3 <- x[51:75] * 2.9 + rnorm(25, mean=5)
> y4 <- x[76:100] * 3.5 + rnorm(25, mean=5.5)
> 
> 
> d <- data.frame(x=x, y=c(y1,y2,y3,y4), f=factor(rep(letters[1:4],
> each=25))) 
> 
> 
> # now try with contrast.Design(): 
> library(multcomp)
> dd <- datadist(d) 
> options(datadist='dd') 
> l <- ols(y ~ x * f, data=d)
> 
> ## model fitted using successive difference contrasts
> library(MASS)
> T.lm <- lm(y ~ x * f, contrasts=list(f="contr.sdif"), data=d)
> summary(T.lm)
> 
> ## show the warning: model fitted using ols() and treatment contrasts
> summary(glht(l, linfct=mcp(f="Seq")))
> 
> ## "custom" successive difference contrasts using contrast.Design
> TFin <- {}
> for (i in 1:(length(levels(d$f))-1)) {
> tcont <- contrast(l, a=list(f=levels(d$f)[i]),
> b=list(f=levels(d$f)[i+1]))
> TFin <- rbind(TFin, tcont)
> rownames(TFin)[i] <-  paste(levels(d$f)[i], levels(d$f)[i+1], sep="-")
> }
> TFin[,1:9]
> 
> Regards, Mark.
> 
> 
> 
> Dylan Beaudette-2 wrote:
>> 
>> On Mon, Feb 16, 2009 at 5:28 PM, Patrick Giraudoux
>>  wrote:
>>> Greg Snow a écrit :
 One approach is to create your own contrasts matrix:


> mycmat <- diag(8)
> mycmat[ row(mycmat) == col(mycmat) + 1 ] <- -1
> mycmati <- solve(mycmat)
> contrasts(agefactor) <- mycmati[,-1]
>

 Now when you use agefactor, the intercept will be the first age group
 and the slopes will be the differences between the pairs of groups
 (make sure that the order of the levels of agefactor is correct).

 The difference between this method and the contr.sdif function in MASS
 is how the intercept will end up being interpreted (and the dimnames).

 Hope this helps,


>>>
>>> Actually, exactly what I needed including the reference to contr.sdif in
>>> MASS I did not spot before (although I am a faithful reader of the
>>> yellow book... but so many things still escape to me). Again thanks a
>>> lot.
>>>
>>> Patrick
>>>
>> 
>> Glad you were able to solve your problem. Frank replied earlier with
>> the suggestion to use contrast.Design() to perform the tests after the
>> initial model has been fit. Perhaps I am a little denser than normal,
>> but I cannot figure out how to apply contrast.Design() to a similar
>> model with several levels of a factor. Example data:
>> 
>> # need these
>> library(lattice)
>> library(Design)
>> 
>> # replicate an important experimental dataset
>> set.seed(10101010)
>> x <- rnorm(100)
>> y

Re: [R] Chromatogram deconvolution and peak matching

2009-02-17 Thread Katharine Mullen

Hoi Bart,

I think you're right that ALS should be applicable to this problem.
Unfortunately in writing I see that there is a bug when the spectra are
NOT constrained to nonnegative values (the package has been used to my
knowledge only in fitting multiway mass spectra thus far, where this
constraint is always applied); I'll have a patched version soon.

I'd be interested in hearing about where the manual could be improved,
too.

#2D chromatogram generation

time <- seq(0,20,by=0.05)
f <- function(x,rt) dnorm((x-rt),mean=0,sd=rt/35)
C1 <- matrix(c( f(time,6.1), f(time,5.6), f(time,15)), nrow=401,ncol=3)
C2 <- matrix(c( f(time,2.1), f(time,4), f(time,8)), nrow=401,ncol=3)

#spectrum generation
spectra <- function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3)
x <- 220:300
s1 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0)
s2 <- spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20)
s3 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20)
spec <- matrix(c(s1,s2,s3), nrow=81,ncol=3)

## data matrix 1
chrom1 <- C1 %*% t(spec)

## data matrix 2
chrom2 <- C2 %*% t(spec)

require(ALS)

## you want to recover 2 (401 x 3) matrices C1 and C2 and a
## (82 x 3) matrix E such that
## chrom1 = C1 E^T, chrom2 = C2 E^T

## some starting guess for elution profiles
## these need to be pretty good
Cstart1 <- matrix(c( f(time,4), f(time,6), f(time,12)), nrow=401,ncol=3)
Cstart2 <- matrix(c( f(time,3), f(time,5), f(time,10)), nrow=401,ncol=3)

xx <- als( CList = list(Cstart1, Cstart2), PsiList = list(chrom1, chrom2),
  S=matrix(1, nrow=81, ncol=3), x=time, x2=x, uniC=TRUE,
  normS=TRUE, nonnegS=TRUE)

par(mfrow=c(3,1))

matplot(time, xx$CList[[1]], col=2, type="l",
main="elution profiles chrom1", lty=1,
ylab="amplitude")

matplot(time, C1, col=1, add=TRUE, type="l", lty=1)

legend(10,2,legend=c("real", "estimated"), col=1:2, lty=1)

matplot(time, xx$CList[[2]], col=2, type="l",
main="elution profiles chrom2", lty=1,
ylab="amplitude")

matplot(time, C2, col=1, add=TRUE, type="l", lty=1)

legend(10,6,legend=c("real", "estimated"), col=1:2, lty=1)

matplot(x, xx$S, col=2, type="l", main="spectra", lty=1,
ylab="amplitude")

matplot(x, spec, col=1, add=TRUE, type="l", lty=1)

legend(270,.95,legend=c("real", "estimated"), col=1:2, lty=1)


On Tue, 17 Feb 2009, bartjoosen wrote:

>
> Hi,
>
> I'm trying to match peaks between chromatographic runs.
> I'm able to match peaks when they are chromatographed with the same method,
> but not when there are different methods are used and spectra comes in to
> play.
>
> While searching I found the ALS package which should be usefull for my
> application, but I couldn't figure it out.
>
> I made some dummy chroms with R, which mimic my actual datasets, to play
> with, but after looking at the manuals of ALS, I'm affraid I can't get the
> job done. Can someone put me on the right way?
>
> Here is my code to generate the dummy chroms, which also plots the 2 chroms
> and the spectra of the 3 peaks:
>
> #2D chromatogram generation
> par(mfrow=c(3,1))
> time <- seq(0,20,by=0.05)
> f <- function(x,rt) dnorm((x-rt),mean=0,sd=rt/35)
> c1 <- f(time,6.1)
> c2 <- f(time,5.6)
> c3 <- f(time,15)
> plot(c1+c2+c3~time,type="l",main="chrom1")
>
> #spectrum generation
> spectra <- function(x,a,b,c,d,e) a + b*(x-e) + c*((x-e)^2) + d*((x-e)^3)
> x <- 220:300
> s1 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),0)
> s2 <- spectra(x,(-1.054e02),1.3,(-5.239e-03),(6.927e-06),-20)
> s3 <- spectra(x,(-194.2),2.386,(-0.009617),(1.275e-05),20)
>
> chrom1.tot <-
> data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
> names(chrom.tot)[-1] <- x
>
> #generation of chromatogram 2
> c1 <- f(time,2.1)
> c2 <- f(time,4)
> c3 <- f(time,8)
> plot(c1+c2+c3~time,type="l",main="chrom2")
>
> chrom2.tot <-
> data.frame(time,outer(c1,s1,"*")+outer(c2,s2,"*")+outer(c2,s2,"*"))
> names(chrom.tot)[-1] <- x
>
> plot(s1~x,type="l",main="spectra")
> lines(s2~x,col=2)
> lines(s3~x,col=3)
>
> Thanks for your time
>
> Kind Regards
>
> Bart
> --
> View this message in context: 
> http://www.nabble.com/Chromatogram-deconvolution-and-peak-matching-tp22057592p22057592.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] creating a map

2009-02-17 Thread Alina Sheyman

I'm trying to create a fairly basic map using R. What i want to get is the
map of the country with circles representing a count of students in each
state.
What I've done so far is as following -
  map("state")
 symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inches=F)

this gives me the map of the country, but one that's not populated by my
counts.
Does anyone know what I'm doing wrong?

Also, if anyone can recommend a good reference for creating maps in R, I'd
really appreciate that.

thank you

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ARIMA models

see auto.arima in the forecast package.

On Tue, Feb 17, 2009 at 10:20 AM, emj83  wrote:
>
> is there some sort of R function which can advise me of the best ARIMA(p,q,r)
> model to use based on the Schwarz criterion e.g for e.g p=0-5, q =0, r=0-5
> or for example p+r< 5???
>
> or is this something I will have to write my own code for?
>
> Thanks Emma
> --
> View this message in context: 
> http://www.nabble.com/ARIMA-models-tp22059382p22059382.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to connect R and WinBUGS/OpenBUGS/LinBUGS in Linux in Feb. 2009

2009-02-17 Thread Uwe Ligges




Paul Heinrich Dietrich wrote:

Hi all,
I've managed to get JAGS working on my Ubuntu Hardy Linux with a 32-bit
computer and AMD processors using R 2.8.1.  JAGS is great.  I've read that
JAGS is the fastest, but that hasn't been my experience.  At any rate, I
have more experience with WinBUGS under Windows and would like a version of
that working as well.

It seems like I've read a lot on the subject and tried a lot, but haven't
managed to get BUGS to work yet.  The most success I've had is to install
WinBUGS or OpenBUGS using this method:
http://www.math.aau.dk/~slb/kurser/bayes-08/install.html

What you also need to know is that you need to open Wine and add a drive. 
Although Z is recommended, I haven't been able to specify it, but have

gotten a D drive to work, using:

wine D:/opt/OpenBUGS/winbugs.exe

Using this method, OpenBUGS opens.  Now, to be able to open it with R.  I've
read all sorts of discussions about BRugs (which is no longer on CRAN, but
old versions can still be found), rbugs, and R2WinBUGS (which I'm used to
using on Windows with WinBUGS).  Some people say R2WinBUGS cannot run
OpenBUGS on Linux, some claim they've done it (I think).  It seems the same
thing with everything else.  I've tried making the linbugs and cbugs file
recommended elsewhere online.  It's all very confusing.


For short: It is quite unlikely that BRugs / OpenBUGS (which is called 
LinBUGS under Linux) works natively under your Linux (although it might 
work under very specific settings). BRugs is available for Windows users 
from the "CRAN extras" repsository maintained by Brian Ripley. We moved 
it in order to meet GPL compliance issues.


Hence a standard recommendation is to use R2WinBUGS under native R under 
Linux with WinBUGS running under wine. R2WinBUGS can use wine to do so. 
See the help page ?bugs once you have loaded R2WinBUGS.


Best wishes,
Uwe Ligges


Can someone show a method that works currently, along with some sample code? 
I'm also new to Linux, and confused by path conventions.  For example, in

rbugs, it shows an example of a path such as
"/var/scratch/jyan/wine-20040408/wine", and I don't see how to modify this. 
I have no /var/scratch to begin with, and think Wine is installed in

/home/me/.wine...(I don't have Linux in front of me right now).

Please help.  Thanks.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Percentiles/Quantiles with Weighting

2009-02-17 Thread Brigid Mooney

Hi All,

I am looking at applications of percentiles to time sequenced data.  I had
just been using the quantile function to get percentiles over various
periods, but am more interested in if there is an accepted (and/or
R-implemented) method to apply weighting to the data so as to weigh recent
data more heavily.

I wrote the following function, but it seems quite inefficient, and not
really very flexible in its applications - so if anyone has any suggestions
on how to look at quantiles/percentiles within R while also using a
weighting schema, I would be very interested.

Note - this function supposes the data in X is time-sequenced, with the most
recent (and thus heaviest weighted) data at the end of the vector

WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
{
  Xprime <- NA

  for(i in 1:length(X))
  {
Xprime <- c(Xprime, rep(X[i], times=i))
  }

  print("Percentiles:")
  print(quantile(X, pctile))
  print("Weighted:")
  print(Xprime)
  print("Weighted Percentiles:")
  print(quantile(Xprime, pctile, na.rm=TRUE))
}

WtPercentile(1:10)
WtPercentile(rnorm(10))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] creating a map


Two places that have worked examples leap to mind:

--- Sarkar's online accompaniment to his book:
http://lmdvr.r-forge.r-project.org/figures/figures.html
Thumbing through the hard copy I see Figure 6.5 might of interest.

--- Addicted to R's graphics gallery:
http://addictedtor.free.fr/graphiques/search.php?q=map&engine=RGG

--  
David Winsemius


On Feb 17, 2009, at 11:53 AM, Alina Sheyman wrote:

I'm trying to create a fairly basic map using R. What i want to get  
is the
map of the country with circles representing a count of students in  
each

state.
What I've done so far is as following -
 map("state")
symbols 
(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inches=F)


this gives me the map of the country, but one that's not populated  
by my

counts.
Does anyone know what I'm doing wrong?

Also, if anyone can recommend a good reference for creating maps in  
R, I'd

really appreciate that.

thank you

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] merging files with different structures

2009-02-17 Thread Grant Gillis

Hello list,

Thanks in advance for any help.

I have many (approx 20) files that I have merged.  For example

d1<-read.csv("AlleleReport.csv")
d2<-read.csv("AlleleReport.csv")

m1 <- merge(d1, d2,  by = c("IND", intersect(colnames(d1), colnames(d2))),
all = TRUE)
m2 <- merge(m1, d3,  by = c("IND", intersect(colnames(m1), colnames(d3))),
all = TRUE)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] spss-file problem with foreign 0.8-32

2009-02-17 Thread Prof Brian Ripley

It is a issue specific to 0.8-32 and some files (most likely those 
with some (not all) Windows codepages declared).


We are trying to collect together some examples, and will update 
foreign accordingly later in the week.


On Tue, 17 Feb 2009, Harry Haupt wrote:



Hi,
after updating to foreign version 0.8-32, I experienced the following error
when I tried to load a SPSS file:

Fehler in inherits(x, "factor") : objekt "cp" nicht gefunden
Zusätzlich: Warning message:
In read.spss("***l.sav", use.value.labels = TRUE, to.data.frame = TRUE) :
 ***.sav: File-indicated character representation code (1252) looks like a
Windows codepage

It worked without problems with earlier versions.

Thanks for any clues,
best,
Harry
--
View this message in context: 
http://www.nabble.com/spss-file-problem-with-foreign-0.8-32-tp22059259p22059259.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] merging files with different structures

2009-02-17 Thread Grant Gillis

> Hello list,
>
> I am sorry for the previous half post.  I accidentily hit send.  Thanks
> again in advance for any help.
>
> I have many (approx 20) files that I have merged.  Each data set contains
> rows for individuals and data in 2 - 5 columns (depending upon which data
> set).  The individuals in each data set are not necessarily the same and are
> all duplicated (different data in columns) across sheets I am trying to
> merge.  I have used the merge function
>
For example
>
> d1<-read.csv("AlleleReport.csv")
> d2<-read.csv("AlleleReport.csv")
>
> m1 <- merge(d1, d2,  by = c("IND", intersect(colnames(d1), colnames(d2))),
> all = TRUE)
> m2 <- merge(m1, d3,  by = c("IND", intersect(colnames(m1), colnames(d3))),
> all = TRUE)



My problem is that when the data is merged it looks something like


Ind   L1 L1.1L2 L2.1L3   L3.1
a 12  13  NA NA  NA  NA
a  NA NA 22 43   34   45
b  14  1545   64   NA  NA
b   NANANA  NA 99  84

Is there a way that I can merge the rows for each individual?

Cheers


Grant







>
>
>
>
>
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Percentiles/Quantiles with Weighting

I do know that Harrell's Quantile function in the Hmisc package will  
allow quantile estimates from models. Whether it is general enough to  
extend to time series, I have no experience and cannot say.


--
David Winsemius


On Feb 17, 2009, at 11:57 AM, Brigid Mooney wrote:


Hi All,

I am looking at applications of percentiles to time sequenced data.   
I had

just been using the quantile function to get percentiles over various
periods, but am more interested in if there is an accepted (and/or
R-implemented) method to apply weighting to the data so as to weigh  
recent

data more heavily.

I wrote the following function, but it seems quite inefficient, and  
not
really very flexible in its applications - so if anyone has any  
suggestions

on how to look at quantiles/percentiles within R while also using a
weighting schema, I would be very interested.

Note - this function supposes the data in X is time-sequenced, with  
the most

recent (and thus heaviest weighted) data at the end of the vector

WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
{
 Xprime <- NA

 for(i in 1:length(X))
 {
   Xprime <- c(Xprime, rep(X[i], times=i))
 }

 print("Percentiles:")
 print(quantile(X, pctile))
 print("Weighted:")
 print(Xprime)
 print("Weighted Percentiles:")
 print(quantile(Xprime, pctile, na.rm=TRUE))
}

WtPercentile(1:10)
WtPercentile(rnorm(10))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] creating a map

2009-02-17 Thread Greg Snow

You need to give the symbols function the locations where you want the centers 
of the circles to be.  Some datesets with map information also have centers of 
the states that you can use, for the USA, there is the state.center dataset 
that may work for you, or the maptools package function get.Pcent will compute 
a center for polygons (there are probably other similar functions in other 
packages).

For adding circles to a map of the USA, you may want to look at the state.vbm 
data in the TeachingDemos package (works with maptools package), but you will 
need to computer the centers of the polygons, they don't match state.center or 
others.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of Alina Sheyman
> Sent: Tuesday, February 17, 2009 9:53 AM
> To: r-help@r-project.org
> Subject: [R] creating a map
> 
> I'm trying to create a fairly basic map using R. What i want to get is
> the
> map of the country with circles representing a count of students in
> each
> state.
> What I've done so far is as following -
>   map("state")
> 
> symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch
> es=F)
> 
> this gives me the map of the country, but one that's not populated by
> my
> counts.
> Does anyone know what I'm doing wrong?
> 
> Also, if anyone can recommend a good reference for creating maps in R,
> I'd
> really appreciate that.
> 
> thank you
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Overdispersion with binomial distribution

Jessica L Hite/hitejl/O/VCU  vcu.edu> writes:

> I am attempting to run a glm with a binomial model to analyze proportion
> data.
> I have been following Crawley's book closely and am wondering if there is
> an accepted standard for how much is too much overdispersion? (e.g. change
> in AIC has an accepted standard of 2).

  In principle, in the null case (i.e. data are really binomial)
the deviance is  chi-squared distributed with the df equal
to the residual df.

  For example:

example(glm)
deviance(glm.D93) ## 5.13
summary(glm.D93)$dispersion ## 1 (by definition)
dfr <- df.residual(glm.D93)
deviance(glm.D93)/dfr ## 1.28
d2 <- sum(residuals(glm.D93,"pearson")^2) ## 5.17
(disp2 <- d2/dfr)  ## 1.293

gg2 <- update(glm.D93,family=quasipoisson)
summary(gg2)$dispersion  ## 1.293, same as above

pchisq(d2,df=dfr,lower.tail=FALSE)

all.equal(coef(glm.D93),coef(gg2)) ## TRUE

se1 <- coef(summary(glm.D93))[,"Std. Error"]
se2 <- coef(summary(gg2))[,"Std. Error"]
se2/se1

# (Intercept)outcome2outcome3  treatment2  treatment3 
#   1.1372341.1372341.1372341.1372341.137234 

sqrt(disp2)
# [1] 1.137234

> My code and output are below, given the example in the book, these data are
> WAY overdispersed .do I mention this and go on or does this signal the
> need to try a different model? If so, any suggestions on the type of
> distribution (gamma or negative binomial ?)?

  Way overdispersed may indicate model lack of fit.  Have
you examined residuals/data for outliers etc.?  

  quasibinomial should be fine, or you can try beta-binomial
(see the aod package) ...


> attach(Clutch2)
>  y<-cbind(Total,Size-Total)
> glm1<-glm(y~Pred,"binomial")
> summary(glm1)
> 
> Call:
> glm(formula = y ~ Pred, family = "binomial")
> 
> Deviance Residuals:
> Min   1Q   Median   3Q  Max
> -9.1022  -2.7899  -0.4781   2.6058   8.4852
> 
> Coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept)  1.350950.06612  20.433  < 2e-16 ***
> PredF   -0.348110.11719  -2.970  0.00297 **
> PredSN  -3.291560.10691 -30.788  < 2e-16 ***
> PredW   -1.464510.12787 -11.453  < 2e-16 ***
> PredWF  -0.564120.13178  -4.281 1.86e-05 ***
> ---
>  the output for residual deviance and df does not change even when I
> use quasibinomial, is this ok?  #

  That's as expected.
 
>  library(MASS)

  you don't really need MASS for quasibinomial.

> > glm2<-glm(y~Pred,"quasibinomial")
> > summary(glm2)
> 
> Call:
> glm(formula = y ~ Pred, family = "quasibinomial")
> 
> Deviance Residuals:
> Min   1Q   Median   3Q  Max
> -9.1022  -2.7899  -0.4781   2.6058   8.4852
> 
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept)   1.3510 0.2398   5.633 1.52e-07 ***
> PredF-0.3481 0.4251  -0.819  0.41471
> PredSN   -3.2916 0.3878  -8.488 1.56e-13 ***
> PredW-1.4645 0.4638  -3.157  0.00208 **
> PredWF   -0.5641 0.4780  -1.180  0.24063
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> 
> (Dispersion parameter for quasibinomial family taken to be 13.15786)
> 
> Null deviance: 2815.5  on 108  degrees of freedom
> Residual deviance: 1323.5  on 104  degrees of freedom
>   (3 observations deleted due to missingness)
> AIC: NA
> 
> Number of Fisher Scoring iterations: 5
> 
> > anova(glm2,test="F")
> Analysis of Deviance Table
> 
> Model: quasibinomial, link: logit
> 
> Response: y
> 
> Terms added sequentially (first to last)
> 
>   Df Deviance Resid. Df Resid. Dev  F   Pr(>F)
> NULL108 2815.5
> Pred   4   1492.0   104 1323.5 28.349 6.28e-16 ***
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> > model1<-update(glm2,~.-Pred)
> > anova(glm2,model1,test="F")
> Analysis of Deviance Table
> 
> Model 1: y ~ Pred
> Model 2: y ~ 1
>   Resid. Df Resid. Dev  Df Deviance  F   Pr(>F)
> 1   104 1323.5
> 2   108 2815.5  -4  -1492.0 28.349 6.28e-16 ***
> ---
> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
> > coef(glm2)
> (Intercept)   PredF  PredSN   PredW  PredWF
>   1.3509550  -0.3481096  -3.2915601  -1.4645097  -0.5641223
> 
> Thanks
> Jessica
> hitejl  vcu.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] creating a map

2009-02-17 Thread Alina Sheyman

Thanks Greg,
  do you know where i can find the sate.center dataset that you mention?

On Tue, Feb 17, 2009 at 12:28 PM, Greg Snow  wrote:

> You need to give the symbols function the locations where you want the
> centers of the circles to be.  Some datesets with map information also have
> centers of the states that you can use, for the USA, there is the
> state.center dataset that may work for you, or the maptools package function
> get.Pcent will compute a center for polygons (there are probably other
> similar functions in other packages).
>
> For adding circles to a map of the USA, you may want to look at the
> state.vbm data in the TeachingDemos package (works with maptools package),
> but you will need to computer the centers of the polygons, they don't match
> state.center or others.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> > project.org] On Behalf Of Alina Sheyman
> > Sent: Tuesday, February 17, 2009 9:53 AM
> > To: r-help@r-project.org
> > Subject: [R] creating a map
> >
> > I'm trying to create a fairly basic map using R. What i want to get is
> > the
> > map of the country with circles representing a count of students in
> > each
> > state.
> > What I've done so far is as following -
> >   map("state")
> >
> > symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch
> > es=F)
> >
> > this gives me the map of the country, but one that's not populated by
> > my
> > counts.
> > Does anyone know what I'm doing wrong?
> >
> > Also, if anyone can recommend a good reference for creating maps in R,
> > I'd
> > really appreciate that.
> >
> > thank you
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error with "make" with R-devel

2009-02-17 Thread Lana Schaffer

Hi,
I am getting an error compiling the R-devel on a suse architecture
64-bit architecture.  The cp attribute is sending 'trusted.lov'
and an error.  This is a sample of the output:

>  make[3]: Entering directory
>`/lustre/people/schaffer/R-devel/src/library/base'
>building package 'base'
>make[4]: Entering directory
>`/lustre/people/schaffer/R-devel/src/library/base'
>all.R is unchanged
>cmp: EOF on ../../../library/base/R/base
>cp: getting attribute `trusted.lov' of `all.R': Operation not permitted
>make[4]: *** [mkR] Error 1
>make[4]: Leaving directory
>`/lustre/people/schaffer/R-devel/src/library/base'
>make[3]: *** [all] Error 2

Is there someone who can modify the Makefile so that I am able
to compile R-devel?

Lana Schaffer
Biostatistics/Informatics
The Scripps Research Institute
DNA Array Core Facility
La Jolla, CA 92037
(858) 784-2263
(858) 784-2994
schaf...@scripps.edu 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] printing out the summary for lm into a txt file

2009-02-17 Thread kayj


Hi All,


I am trying to run several linear regressions and print out the summay and
the anova reslts on the top of 
each other for each model. Below is a sample progarm that did not work. is
it possible to print the
anova below the summary of lm in one file?

thanks for your help



##

data<-read.table("data.txt", header=T, sep='\t')

for (i in 1:100){

y<-data[,i]

lm.model<-lm(y~data$x1+data$x2+data$x3+data)

sink("results.txt", append=T)

s<-summary(lm.Model)
print(s)
sink()

an<-anova(lm.Model)
print(an)
sink()

} 
-- 
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to control the overall shape of a figure?

2009-02-17 Thread Oliver

Thanks very much, exactly what I need.

Oliver

On Feb 16, 10:36 pm, Marc Schwartz  wrote:
> on 02/16/2009 07:51 PM Oliver wrote:
>
>
>
> > hi,
>
> > I am a R beginner. One thing I notice is that when do graphing is,
>
> > if I want to draw two figures in a row such as this:
>
> > par(mfrow(1, 2))
> > plot(...)
> > plot(...)
>
> > Each figure inside will be rectangle instead of the familiar square
> > shape.
> > Though you can drag the edge the window to resize it. I would have
> > prefer this can be done automatically ... also when I do the pdf
> > export for example. Is this possible?
>
> > The thing is if you do par (mfrow(2,2)), then you got all 4 figures in
> > perfect square shape ... but one row two figures are pretty common to
> > me.
>
> > thanks for help
>
> > Oliver
>
> The overall shape of the plot region is controlled by par("pty"), which
> is set to "m" by default, for 'maximal' plotting region.
>
> Set this to "s" to create a square plot region. For example:
>
> par(mfrow = c(1, 2), pty = "s")
> plot(1)
> plot(1)
>
> In order to do this with a PDF file, ensure that the overall plot
> dimensions are in the proper relative aspect ratio. For example:
>
> pdf(file = "SquarePlots.pdf", height = 4, width = 8)
> par(mfrow = c(1, 2), pty= "s")
> plot(1)
> plot(1)
> dev.off()
>
> This will create 2 square plots, side-by-side, within an overall plot
> size of 4x8. You can further adjust the plot margins and other
> characteristics as may be required.
>
> See ?par and ?Devices for more information regarding these
> customizations and options to set size arguments for specific devices.
>
> HTH,
>
> Marc Schwartz
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] multiple levels of nesting with trellis plots

2009-02-17 Thread R User R User

Hi guys,
I have a tricky problem that I'd appreciate your help with.

I have two categorical variables, say varA and varB and an associated
frequency Freq for combinations of the levels of varA and varB. This was
created with a table() call.

I'd now like to make panel plots of the frequency. I can do a panel plot by
varA:
barchart(data$Freq ~ data$varB | data$varA)

How do I achieve two levels of nesting for say a third categorical variable
varC?
That is, how do I specify 2 'given's in the the panel plot, and preferably
arranged the panels vertically within the 'inner' panel/

thanks very much,
Richie

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Overdispersion with binomial distribution

2009-02-17 Thread Prof Brian Ripley


On Tue, 17 Feb 2009, Ben Bolker wrote:


Jessica L Hite/hitejl/O/VCU  vcu.edu> writes:


I am attempting to run a glm with a binomial model to analyze proportion
data.
I have been following Crawley's book closely and am wondering if there is
an accepted standard for how much is too much overdispersion? (e.g. change
in AIC has an accepted standard of 2).



 In principle, in the null case (i.e. data are really binomial)
the deviance is  chi-squared distributed with the df equal
to the residual df.


*Approximately*, provided the expected counts are not near or below 
one.  See MASS §7.5 for an analysis of the size of the approximation 
errors (which can be large and in both directions).


Given that I once had a consulting job where the over-dispersion was 
causing something close ot panic and was entirely illusory, the lack 
of the 'approximately' can have quite serious consequences.



 For example:

example(glm)
deviance(glm.D93) ## 5.13
summary(glm.D93)$dispersion ## 1 (by definition)
dfr <- df.residual(glm.D93)
deviance(glm.D93)/dfr ## 1.28
d2 <- sum(residuals(glm.D93,"pearson")^2) ## 5.17
(disp2 <- d2/dfr)  ## 1.293

gg2 <- update(glm.D93,family=quasipoisson)
summary(gg2)$dispersion  ## 1.293, same as above

pchisq(d2,df=dfr,lower.tail=FALSE)

all.equal(coef(glm.D93),coef(gg2)) ## TRUE

se1 <- coef(summary(glm.D93))[,"Std. Error"]
se2 <- coef(summary(gg2))[,"Std. Error"]
se2/se1

# (Intercept)outcome2outcome3  treatment2  treatment3
#   1.1372341.1372341.1372341.1372341.137234

sqrt(disp2)
# [1] 1.137234


My code and output are below, given the example in the book, these data are
WAY overdispersed .do I mention this and go on or does this signal the
need to try a different model? If so, any suggestions on the type of
distribution (gamma or negative binomial ?)?


 Way overdispersed may indicate model lack of fit.  Have
you examined residuals/data for outliers etc.?

 quasibinomial should be fine, or you can try beta-binomial
(see the aod package) ...



attach(Clutch2)
 y<-cbind(Total,Size-Total)
glm1<-glm(y~Pred,"binomial")
summary(glm1)

Call:
glm(formula = y ~ Pred, family = "binomial")

Deviance Residuals:
Min   1Q   Median   3Q  Max
-9.1022  -2.7899  -0.4781   2.6058   8.4852

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.350950.06612  20.433  < 2e-16 ***
PredF   -0.348110.11719  -2.970  0.00297 **
PredSN  -3.291560.10691 -30.788  < 2e-16 ***
PredW   -1.464510.12787 -11.453  < 2e-16 ***
PredWF  -0.564120.13178  -4.281 1.86e-05 ***
---
 the output for residual deviance and df does not change even when I
use quasibinomial, is this ok?  #


 That's as expected.


 library(MASS)


 you don't really need MASS for quasibinomial.


glm2<-glm(y~Pred,"quasibinomial")
summary(glm2)


Call:
glm(formula = y ~ Pred, family = "quasibinomial")

Deviance Residuals:
Min   1Q   Median   3Q  Max
-9.1022  -2.7899  -0.4781   2.6058   8.4852

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.3510 0.2398   5.633 1.52e-07 ***
PredF-0.3481 0.4251  -0.819  0.41471
PredSN   -3.2916 0.3878  -8.488 1.56e-13 ***
PredW-1.4645 0.4638  -3.157  0.00208 **
PredWF   -0.5641 0.4780  -1.180  0.24063
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

(Dispersion parameter for quasibinomial family taken to be 13.15786)

Null deviance: 2815.5  on 108  degrees of freedom
Residual deviance: 1323.5  on 104  degrees of freedom
  (3 observations deleted due to missingness)
AIC: NA

Number of Fisher Scoring iterations: 5


anova(glm2,test="F")

Analysis of Deviance Table

Model: quasibinomial, link: logit

Response: y

Terms added sequentially (first to last)

  Df Deviance Resid. Df Resid. Dev  F   Pr(>F)
NULL108 2815.5
Pred   4   1492.0   104 1323.5 28.349 6.28e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

model1<-update(glm2,~.-Pred)
anova(glm2,model1,test="F")

Analysis of Deviance Table

Model 1: y ~ Pred
Model 2: y ~ 1
  Resid. Df Resid. Dev  Df Deviance  F   Pr(>F)
1   104 1323.5
2   108 2815.5  -4  -1492.0 28.349 6.28e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

coef(glm2)

(Intercept)   PredF  PredSN   PredW  PredWF
  1.3509550  -0.3481096  -3.2915601  -1.4645097  -0.5641223

Thanks
Jessica
hitejl  vcu.edu



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http:

Re: [R] printing out the summary for lm into a txt file


" did not work."

Might that mean errors? Care to share?

Running that through my wetware R interpreter, I think I am seeing you  
ask for creation of models with variable outcomes from a the first 100  
columns of "data". And then you are specifying a formula that includes  
a weird mixture of 3 vectors, data[,"x1"],x2 x3,... plus all of the  
data.frame, "data"? Have you tested this loop without the sink's to  
see if it produces anything meaningful?


Once you test the process, then issue the sink(filename, append=TRUE)  
once, before the loop, do your analyses in the loop, and then close  
with sink() after the loop.


--
David Winsemius
On Feb 17, 2009, at 12:52 PM, kayj wrote:



Hi All,


I am trying to run several linear regressions and print out the  
summay and

the anova reslts on the top of
each other for each model. Below is a sample progarm that did not  
work. is

it possible to print the
anova below the summary of lm in one file?

thanks for your help



##

data<-read.table("data.txt", header=T, sep='\t')

for (i in 1:100){

y<-data[,i]

lm.model<-lm(y~data$x1+data$x2+data$x3+data)

sink("results.txt", append=T)

s<-summary(lm.Model)
print(s)
sink()

an<-anova(lm.Model)
print(an)
sink()

}
--
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Overdispersion with binomial distribution

 Thanks for the clarification.
  I actually had MASS open to that page while
I was composing my reply but forgot to mention
it (trying to do too many things at once) ...

  Ben Bolker

Prof Brian Ripley wrote:
> On Tue, 17 Feb 2009, Ben Bolker wrote:
> 
>> Jessica L Hite/hitejl/O/VCU  vcu.edu> writes:
>>
>>> I am attempting to run a glm with a binomial model to analyze proportion
>>> data.
>>> I have been following Crawley's book closely and am wondering if there is
>>> an accepted standard for how much is too much overdispersion? (e.g. change
>>> in AIC has an accepted standard of 2).
> 
>>  In principle, in the null case (i.e. data are really binomial)
>> the deviance is  chi-squared distributed with the df equal
>> to the residual df.
> 
> *Approximately*, provided the expected counts are not near or below 
> one.  See MASS §7.5 for an analysis of the size of the approximation 
> errors (which can be large and in both directions).
> 
> Given that I once had a consulting job where the over-dispersion was 
> causing something close ot panic and was entirely illusory, the lack 
> of the 'approximately' can have quite serious consequences.
> 
>>  For example:
>>
>> example(glm)
>> deviance(glm.D93) ## 5.13
>> summary(glm.D93)$dispersion ## 1 (by definition)
>> dfr <- df.residual(glm.D93)
>> deviance(glm.D93)/dfr ## 1.28
>> d2 <- sum(residuals(glm.D93,"pearson")^2) ## 5.17
>> (disp2 <- d2/dfr)  ## 1.293
>>
>> gg2 <- update(glm.D93,family=quasipoisson)
>> summary(gg2)$dispersion  ## 1.293, same as above
>>
>> pchisq(d2,df=dfr,lower.tail=FALSE)
>>
>> all.equal(coef(glm.D93),coef(gg2)) ## TRUE
>>
>> se1 <- coef(summary(glm.D93))[,"Std. Error"]
>> se2 <- coef(summary(gg2))[,"Std. Error"]
>> se2/se1
>>
>> # (Intercept)outcome2outcome3  treatment2  treatment3
>> #   1.1372341.1372341.1372341.1372341.137234
>>
>> sqrt(disp2)
>> # [1] 1.137234
>>
>>> My code and output are below, given the example in the book, these data are
>>> WAY overdispersed .do I mention this and go on or does this signal the
>>> need to try a different model? If so, any suggestions on the type of
>>> distribution (gamma or negative binomial ?)?
>>  Way overdispersed may indicate model lack of fit.  Have
>> you examined residuals/data for outliers etc.?
>>
>>  quasibinomial should be fine, or you can try beta-binomial
>> (see the aod package) ...
>>
>>
>>> attach(Clutch2)
>>>  y<-cbind(Total,Size-Total)
>>> glm1<-glm(y~Pred,"binomial")
>>> summary(glm1)
>>>
>>> Call:
>>> glm(formula = y ~ Pred, family = "binomial")
>>>
>>> Deviance Residuals:
>>> Min   1Q   Median   3Q  Max
>>> -9.1022  -2.7899  -0.4781   2.6058   8.4852
>>>
>>> Coefficients:
>>> Estimate Std. Error z value Pr(>|z|)
>>> (Intercept)  1.350950.06612  20.433  < 2e-16 ***
>>> PredF   -0.348110.11719  -2.970  0.00297 **
>>> PredSN  -3.291560.10691 -30.788  < 2e-16 ***
>>> PredW   -1.464510.12787 -11.453  < 2e-16 ***
>>> PredWF  -0.564120.13178  -4.281 1.86e-05 ***
>>> ---
>>>  the output for residual deviance and df does not change even when I
>>> use quasibinomial, is this ok?  #
>>  That's as expected.
>>
>>>  library(MASS)
>>  you don't really need MASS for quasibinomial.
>>
 glm2<-glm(y~Pred,"quasibinomial")
 summary(glm2)
>>> Call:
>>> glm(formula = y ~ Pred, family = "quasibinomial")
>>>
>>> Deviance Residuals:
>>> Min   1Q   Median   3Q  Max
>>> -9.1022  -2.7899  -0.4781   2.6058   8.4852
>>>
>>> Coefficients:
>>> Estimate Std. Error t value Pr(>|t|)
>>> (Intercept)   1.3510 0.2398   5.633 1.52e-07 ***
>>> PredF-0.3481 0.4251  -0.819  0.41471
>>> PredSN   -3.2916 0.3878  -8.488 1.56e-13 ***
>>> PredW-1.4645 0.4638  -3.157  0.00208 **
>>> PredWF   -0.5641 0.4780  -1.180  0.24063
>>> ---
>>> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
>>>
>>> (Dispersion parameter for quasibinomial family taken to be 13.15786)
>>>
>>> Null deviance: 2815.5  on 108  degrees of freedom
>>> Residual deviance: 1323.5  on 104  degrees of freedom
>>>   (3 observations deleted due to missingness)
>>> AIC: NA
>>>
>>> Number of Fisher Scoring iterations: 5
>>>
 anova(glm2,test="F")
>>> Analysis of Deviance Table
>>>
>>> Model: quasibinomial, link: logit
>>>
>>> Response: y
>>>
>>> Terms added sequentially (first to last)
>>>
>>>   Df Deviance Resid. Df Resid. Dev  F   Pr(>F)
>>> NULL108 2815.5
>>> Pred   4   1492.0   104 1323.5 28.349 6.28e-16 ***
>>> ---
>>> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
 model1<-update(glm2,~.-Pred)
 anova(glm2,model1,test="F")
>>> Analysis of Deviance Table
>>>
>>> Model 1: y ~ Pred
>>> Model 2: y ~ 1
>>>   Resid. Df Resid. Dev  Df Deviance  F   Pr(>F)
>>> 1   104 1323.5
>>> 2   108 2815.5  -4  -1492.0 28.349 6.28e-16 ***
>>> ---
>>> Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.0

Re: [R] creating a map

2009-02-17 Thread Greg Snow

It should be in the datasets package that is automatically loaded with R (at 
least my copy), try ?state and you should see the help for it and others.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: Alina Sheyman [mailto:alina...@gmail.com]
Sent: Tuesday, February 17, 2009 11:02 AM
To: Greg Snow
Cc: r-help@r-project.org
Subject: Re: [R] creating a map

Thanks Greg,
  do you know where i can find the sate.center dataset that you mention?
On Tue, Feb 17, 2009 at 12:28 PM, Greg Snow 
mailto:greg.s...@imail.org>> wrote:
You need to give the symbols function the locations where you want the centers 
of the circles to be.  Some datesets with map information also have centers of 
the states that you can use, for the USA, there is the state.center dataset 
that may work for you, or the maptools package function get.Pcent will compute 
a center for polygons (there are probably other similar functions in other 
packages).

For adding circles to a map of the USA, you may want to look at the state.vbm 
data in the TeachingDemos package (works with maptools package), but you will 
need to computer the centers of the polygons, they don't match state.center or 
others.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-
> project.org] On Behalf Of Alina Sheyman
> Sent: Tuesday, February 17, 2009 9:53 AM
> To: r-help@r-project.org
> Subject: [R] creating a map
>
> I'm trying to create a fairly basic map using R. What i want to get is
> the
> map of the country with circles representing a count of students in
> each
> state.
> What I've done so far is as following -
>   map("state")
>
> symbols(data1$count,circles=log(data1$count)*3,fg=col,bg=col,add=T,inch
> es=F)
>
> this gives me the map of the country, but one that's not populated by
> my
> counts.
> Does anyone know what I'm doing wrong?
>
> Also, if anyone can recommend a good reference for creating maps in R,
> I'd
> really appreciate that.
>
> thank you
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] printing out the summary for lm into a txt file

2009-02-17 Thread Greg Snow

The sink() command stops the sinking, so you send the lm output to the file, 
then stop the sinking before printing out the anova result.  So the simplest 
thing to try is to put the first sink (with the filename and append=T) before 
you start the loop, remove all calls to sink within the loop, then do a single 
sink() after the loop ends.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of kayj
> Sent: Tuesday, February 17, 2009 10:52 AM
> To: r-help@r-project.org
> Subject: [R] printing out the summary for lm into a txt file
> 
> 
> Hi All,
> 
> 
> I am trying to run several linear regressions and print out the summay
> and
> the anova reslts on the top of
> each other for each model. Below is a sample progarm that did not work.
> is
> it possible to print the
> anova below the summary of lm in one file?
> 
> thanks for your help
> 
> 
> 
> ##
> 
> data<-read.table("data.txt", header=T, sep='\t')
> 
> for (i in 1:100){
> 
> y<-data[,i]
> 
> lm.model<-lm(y~data$x1+data$x2+data$x3+data)
> 
> sink("results.txt", append=T)
> 
> s<-summary(lm.Model)
> print(s)
> sink()
> 
> an<-anova(lm.Model)
> print(an)
> sink()
> 
> }
> --
> View this message in context: http://www.nabble.com/printing-out-the-
> summary-for-lm-into-a-txt-file-tp22062643p22062643.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Percentiles/Quantiles with Weighting

2009-02-17 Thread Brigid Mooney

Thanks for pointing me to the quantreg package as a resource.  I was hoping
to ask be able to address one quick follow-up question...

I get slightly different variants between using the rq funciton with formula
= mydata ~ 1 as I would if I ran the same data using the quantile function.

Example:

mydata <- (1:10)^2/2
pctile <- seq(.59, .99, .1)

quantile(mydata, pctile)
59%69%79%89%99%
20.015 26.075 32.935 40.595 49.145

rq(mydata~1, tau=pctile)
Call:
rq(formula = mydata ~ 1, tau = pctile)
Coefficients:
tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99
(Intercept)18  24.532  40.550
Degrees of freedom: 10 total; 9 residual

Is it correct to assume this is due to the different accepted methods of
calculating quantiles?  If so, do you know where I would be able to see the
algorithms used in these functions?  I'm not finding it in the documentation
for function rq, and am new enough to R that I don't know where those
references would generally be.




On Tue, Feb 17, 2009 at 12:29 PM, roger koenker  wrote:

> http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869
>
> gives one possibility...
>
> url:www.econ.uiuc.edu/~rogerRoger Koenker
> emailrkoen...@uiuc.eduDepartment of Economics
> vox: 217-333-4558University of Illinois
> fax:   217-244-6678Champaign, IL 61820
>
>
>
>
> On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote:
>
>   Hi All,
>>
>> I am looking at applications of percentiles to time sequenced data.  I had
>> just been using the quantile function to get percentiles over various
>> periods, but am more interested in if there is an accepted (and/or
>> R-implemented) method to apply weighting to the data so as to weigh recent
>> data more heavily.
>>
>> I wrote the following function, but it seems quite inefficient, and not
>> really very flexible in its applications - so if anyone has any
>> suggestions
>> on how to look at quantiles/percentiles within R while also using a
>> weighting schema, I would be very interested.
>>
>> Note - this function supposes the data in X is time-sequenced, with the
>> most
>> recent (and thus heaviest weighted) data at the end of the vector
>>
>> WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
>> {
>>  Xprime <- NA
>>
>>  for(i in 1:length(X))
>>  {
>>   Xprime <- c(Xprime, rep(X[i], times=i))
>>  }
>>
>>  print("Percentiles:")
>>  print(quantile(X, pctile))
>>  print("Weighted:")
>>  print(Xprime)
>>  print("Weighted Percentiles:")
>>  print(quantile(Xprime, pctile, na.rm=TRUE))
>> }
>>
>> WtPercentile(1:10)
>> WtPercentile(rnorm(10))
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Percentiles/Quantiles with Weighting

2009-02-17 Thread roger koenker



url:www.econ.uiuc.edu/~rogerRoger Koenker
emailrkoen...@uiuc.eduDepartment of Economics
vox: 217-333-4558University of Illinois
fax:   217-244-6678Champaign, IL 61820



On Feb 17, 2009, at 1:58 PM, Brigid Mooney wrote:

Thanks for pointing me to the quantreg package as a resource.  I was  
hoping to ask be able to address one quick follow-up question...


I get slightly different variants between using the rq funciton with  
formula = mydata ~ 1 as I would if I ran the same data using the  
quantile function.

Example:

mydata <- (1:10)^2/2
pctile <- seq(.59, .99, .1)

quantile(mydata, pctile)
59%69%79%89%99%
20.015 26.075 32.935 40.595 49.145

rq(mydata~1, tau=pctile)
Call:
rq(formula = mydata ~ 1, tau = pctile)
Coefficients:
tau= 0.59 tau= 0.69 tau= 0.79 tau= 0.89 tau= 0.99
(Intercept)18  24.532  40.550
Degrees of freedom: 10 total; 9 residual

Is it correct to assume this is due to the different accepted  
methods of calculating quantiles?  If so, do you know where I would  
be able to see the algorithms used in these functions?  I'm not  
finding it in the documentation for function rq, and am new enough  
to R that I don't know where those references would generally be.




Yes,  quantile() in base R documents 9 varieties of quantiles, 2 more  
than

William Empson's famous 7 Types of Ambiguity.  In quantreg the function
rq() finds a solution to an underlying optimization problem and  
doesn't ask

any further into the nature of the ambiguity -- it does often produce a
warning indicating that there may be more than one solution.  The  
default
base R quantile is interpolated, while the default rq() with method =  
"br"
using the simplex algorithm finds an order statistic, typically.  If  
you prefer

something more like interpolation, you can try rq() with method = "fn"
which is using an interior point algorithm and when there are multiple
solutions it tends to produce something more like the centroid of the
solution set.  I hope that this helps.




On Tue, Feb 17, 2009 at 12:29 PM, roger koenker   
wrote:

http://www.nabble.com/weighted-quantiles-to19864562.html#a19865869

gives one possibility...

url:www.econ.uiuc.edu/~rogerRoger Koenker
emailrkoen...@uiuc.eduDepartment of Economics
vox: 217-333-4558University of Illinois
fax:   217-244-6678Champaign, IL 61820




On Feb 17, 2009, at 10:57 AM, Brigid Mooney wrote:

Hi All,

I am looking at applications of percentiles to time sequenced data.   
I had

just been using the quantile function to get percentiles over various
periods, but am more interested in if there is an accepted (and/or
R-implemented) method to apply weighting to the data so as to weigh  
recent

data more heavily.

I wrote the following function, but it seems quite inefficient, and  
not
really very flexible in its applications - so if anyone has any  
suggestions

on how to look at quantiles/percentiles within R while also using a
weighting schema, I would be very interested.

Note - this function supposes the data in X is time-sequenced, with  
the most

recent (and thus heaviest weighted) data at the end of the vector

WtPercentile <- function(X=rnorm(100), pctile=seq(.1,1,.1))
{
 Xprime <- NA

 for(i in 1:length(X))
 {
  Xprime <- c(Xprime, rep(X[i], times=i))
 }

 print("Percentiles:")
 print(quantile(X, pctile))
 print("Weighted:")
 print(Xprime)
 print("Weighted Percentiles:")
 print(quantile(Xprime, pctile, na.rm=TRUE))
}

WtPercentile(1:10)
WtPercentile(rnorm(10))

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] printing out the summary for lm into a txt file

2009-02-17 Thread kayj


Thanks a lot for your help, it worked.




Greg Snow-2 wrote:
> 
> The sink() command stops the sinking, so you send the lm output to the
> file, then stop the sinking before printing out the anova result.  So the
> simplest thing to try is to put the first sink (with the filename and
> append=T) before you start the loop, remove all calls to sink within the
> loop, then do a single sink() after the loop ends.
> 
> Hope this helps,
> 
> -- 
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
> 
> 
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
>> project.org] On Behalf Of kayj
>> Sent: Tuesday, February 17, 2009 10:52 AM
>> To: r-help@r-project.org
>> Subject: [R] printing out the summary for lm into a txt file
>> 
>> 
>> Hi All,
>> 
>> 
>> I am trying to run several linear regressions and print out the summay
>> and
>> the anova reslts on the top of
>> each other for each model. Below is a sample progarm that did not work.
>> is
>> it possible to print the
>> anova below the summary of lm in one file?
>> 
>> thanks for your help
>> 
>> 
>> 
>> ##
>> 
>> data<-read.table("data.txt", header=T, sep='\t')
>> 
>> for (i in 1:100){
>> 
>> y<-data[,i]
>> 
>> lm.model<-lm(y~data$x1+data$x2+data$x3+data)
>> 
>> sink("results.txt", append=T)
>> 
>> s<-summary(lm.Model)
>> print(s)
>> sink()
>> 
>> an<-anova(lm.Model)
>> print(an)
>> sink()
>> 
>> }
>> --
>> View this message in context: http://www.nabble.com/printing-out-the-
>> summary-for-lm-into-a-txt-file-tp22062643p22062643.html
>> Sent from the R help mailing list archive at Nabble.com.
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22065292.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] printing out the summary for lm into a txt file

2009-02-17 Thread kayj



this is the error message that I was getting 

 "In sink() ... : no sink to remove"


i got it to work . thanks for the help


David Winsemius wrote:
> 
> " did not work."
> 
> Might that mean errors? Care to share?
> 
> Running that through my wetware R interpreter, I think I am seeing you  
> ask for creation of models with variable outcomes from a the first 100  
> columns of "data". And then you are specifying a formula that includes  
> a weird mixture of 3 vectors, data[,"x1"],x2 x3,... plus all of the  
> data.frame, "data"? Have you tested this loop without the sink's to  
> see if it produces anything meaningful?
> 
> Once you test the process, then issue the sink(filename, append=TRUE)  
> once, before the loop, do your analyses in the loop, and then close  
> with sink() after the loop.
> 
> -- 
> David Winsemius
> On Feb 17, 2009, at 12:52 PM, kayj wrote:
> 
>>
>> Hi All,
>>
>>
>> I am trying to run several linear regressions and print out the  
>> summay and
>> the anova reslts on the top of
>> each other for each model. Below is a sample progarm that did not  
>> work. is
>> it possible to print the
>> anova below the summary of lm in one file?
>>
>> thanks for your help
>>
>>
>>
>> ##
>>
>> data<-read.table("data.txt", header=T, sep='\t')
>>
>> for (i in 1:100){
>>
>> y<-data[,i]
>>
>> lm.model<-lm(y~data$x1+data$x2+data$x3+data)
>>
>> sink("results.txt", append=T)
>>
>> s<-summary(lm.Model)
>> print(s)
>> sink()
>>
>> an<-anova(lm.Model)
>> print(an)
>> sink()
>>
>> }
>> -- 
>> View this message in context:
>> http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22062643.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/printing-out-the-summary-for-lm-into-a-txt-file-tp22062643p22065365.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Subset Regression Package