[R] fitting t copula

2010-06-08 Thread Roslina Zakaria
Hi r-users,

I am trying to fit a t copula with gamma marginals, but I get an error message
that I don't really understand.

Thank you for any help given.

myCop.t <- ellipCopula(family = "t", dim = 2, dispstr = "toep", param = 0.5, df = 8)
myCop.t
myMvd <- mvdc(copula = myCop.t, margins = c("gamma", "gamma"), paramMargins = list(list(mean = 0, sd = 2), list(mean = 0, sd = 1)))
myMvd
dat <- stn_pos ## observed data
loglikMvdc(c(0, 2, 0, 1, 0.5,8), dat, myMvd)   ## loglikelihood
mm <- apply(dat, 2, mean)
vv <- apply(dat, 2, var)
rho <- rcorr(stn_pos,type="spearman")[[1]]; round(rho,2)
rbind(mm,vv)
b1.0 <- c(mm[1]^2/vv[1], vv[1]/mm[1])
b2.0 <- c(mm[2]^2/vv[2], vv[2]/mm[2])
a.0  <- sin(cor(dat[, 1], dat[, 2], method = "kendall") * pi/2)
start <- c(b1.0, b2.0, a.0)
fit <- fitMvdc(dat, myMvd, start = start, optim.control = list(trace = TRUE, maxit = 2000))
fit

#

> head(dat);tail(dat)
      [,1]  [,2]
[1,]  28.4  43.5
[2,]   9.2   7.4
[3,]  48.8  68.9
[4,] 185.2 115.7
[5,]   4.1  11.7
[6,]  67.6  29.8
        [,1]  [,2]
[946,]   8.8  17.2
[947,] 119.6 164.0
[948,]  68.4 163.6
[949,]  45.8  61.6
[950,]  77.0 101.0
[951,]  56.6  74.8

> myCop.t <- ellipCopula(family = "t", dim = 2, dispstr = "toep", param = 0.5, df = 8)
> myCop.t
t copula family  
Dimension:  2 
Parameters:
   rho.1  =  0.5 
   df  =  8 
> myMvd <- mvdc(copula = myCop.t, margins = c("gamma", "gamma"), paramMargins = list(list(mean = 0, sd = 2), list(mean = 0, sd = 1)))
> myMvd
An object of class “mvdc”
Slot "copula":
t copula family  
Dimension:  2 
Parameters:
   rho.1  =  0.5 
   df  =  8 
Slot "margins":
[1] "gamma" "gamma"
Slot "paramMargins":
[[1]]
[[1]]$mean
[1] 0
[[1]]$sd
[1] 2

[[2]]
[[2]]$mean
[1] 0
[[2]]$sd
[1] 1

Slot "marginsIdentical":
[1] FALSE

> dat <- stn_pos ## observed data

> loglikMvdc(c(0, 2, 0, 1, 0.5,8), dat, myMvd)   ## loglikelihood
Error in pgamma(x, mean = 0, sd = 2) : 
  unused argument(s) (mean = 0, sd = 2)

[1] NaN
> mm <- apply(dat, 2, mean)
> vv <- apply(dat, 2, var)
> rho <- rcorr(stn_pos,type="spearman")[[1]]; round(rho,2)
     [,1] [,2]
[1,]  1.0  0.9
[2,]  0.9  1.0
> rbind(mm,vv)
         [,1]      [,2]
mm   58.63912   83.7224
vv 1789.51116 3315.2367
> b1.0 <- c(mm[1]^2/vv[1], vv[1]/mm[1])
> b2.0 <- c(mm[2]^2/vv[2], vv[2]/mm[2])
> a.0  <- sin(cor(dat[, 1], dat[, 2], method = "kendall") * pi/2)
> start <- c(b1.0, b2.0, a.0)
> fit <- fitMvdc(dat, myMvd, start = start, optim.control = list(trace = TRUE, maxit = 2000))
Error in fitMvdc(dat, myMvd, start = start, optim.control = list(trace = 
TRUE,  : 
  The length of start and mvdc parameters do not match.
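For the archive, a sketch of a likely fix (untested against the poster's data): R's gamma functions are parameterised by `shape` and `rate` (or `scale`), not `mean`/`sd`, which explains the `unused argument(s)` error; and because the copula's `df` is treated as a free parameter here, `fitMvdc` expects a 6-element start vector, which explains the length mismatch. Assuming the `rate` parameterisation and simulated stand-in data:

```r
library(copula)

myCop.t <- ellipCopula(family = "t", dim = 2, dispstr = "toep",
                       param = 0.5, df = 8)
# Gamma margins must use shape/rate (or shape/scale), not mean/sd:
myMvd <- mvdc(copula = myCop.t, margins = c("gamma", "gamma"),
              paramMargins = list(list(shape = 2, rate = 1),
                                  list(shape = 2, rate = 1)))

set.seed(1)
dat <- rMvdc(500, myMvd)   # stand-in for stn_pos (rmvdc in older copula)

mm <- apply(dat, 2, mean)
vv <- apply(dat, 2, var)
b1.0 <- c(mm[1]^2 / vv[1], mm[1] / vv[1])   # method-of-moments shape, rate
b2.0 <- c(mm[2]^2 / vv[2], mm[2] / vv[2])
a.0  <- sin(cor(dat[, 1], dat[, 2], method = "kendall") * pi / 2)
start <- c(b1.0, b2.0, a.0, 8)   # shape1, rate1, shape2, rate2, rho, df
fit <- fitMvdc(dat, myMvd, start = start,
               optim.control = list(trace = TRUE, maxit = 2000))
```

Note that the original start vector used the scale (`vv/mm`); whichever parameterisation goes into `paramMargins` must also be used in `start`.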
> 


  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Run Rscript.exe, how to capture error message?

2010-06-08 Thread Prof Brian Ripley
R error messages go to stderr, not stdout, so you need to redirect stderr
as well.  In most shells, including cmd.exe (which, since you don't say
otherwise, is likely what you are using), you can use


Rscript test.R xxx.txt > error.log 2>&1

to redirect both stdout and stderr to the same file (see the rw-FAQ 
Q2.12).
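A sketch of both variants; the `sh -c` command is a stand-in for the real `Rscript` invocation, and the redirection syntax is the same in cmd.exe:

```shell
# Keep normal output and errors in separate files:
sh -c 'echo result; echo oops >&2' > output.log 2> error.log
cat error.log    # contains only "oops"

# Or send both streams to one file, as in the answer above:
sh -c 'echo result; echo oops >&2' > run.log 2>&1
```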


On Tue, 8 Jun 2010, A Huang wrote:


Hi there,

I use Rscript.exe for batch run (actually it is used in ASP.net code)

c:>"C:\Program Files\R\R-2.10.1\bin\Rscript.exe" test.r xxx.txt

Where test.r is the R program and xxx.txt is a file name test.r will read in;
it comes from a web form. This works fine when the file is in the required
format. Sometimes xxx.txt has the wrong format; test.r reads it in but can't
process it. If I ran in interactive mode, I could probably see the error
message on the screen. But since this is running as a background process, I
don't really have a screen to see.

I tried to redirect:

c:>"C:\Program Files\R\R-2.10.1\bin\Rscript.exe" test.r xxx.txt > error.log

But it doesn't work even in local windows.

Thanks

A. Huang



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] Extracting Elements By Date

2010-06-08 Thread Jeff08

edit: I found out how to declare empty variables in R, but the code still
does not work. I get an index out of bounds error since my data is
irregular (some ids have more dates than others, and the matrix will not
allow for different-sized rows).

Dear R Gurus,

Thanks for any help in advance!

Data frame: Returns.names

   X   id ticker  date_ adjClose   totret RankStk
258060 258060 13645T10 CP 2001-06-29   18.125 1877.758 

My data frame is in the above format. I would like to filter by period, per
id (every 125 days), each period consisting of 250 days, i.e. 1-250, 126-375, etc.

One important thing to note is that not all ID's have the same number of
dates. 

x <- unique(Returns.names$date_) gives me the ordered list of all the dates,
so each period I want would be x[((n-1)*125+1):((n+1)*125)].

When I tried to filter for just one period, it worked fine, but alas, I do
not know how to work around the problem that you cannot assign dynamic
variable names in loops (say I named the variable for the extracted dates
Returns.period; it would get overwritten on every iteration of the loop, and
I can't name something Returns.n, where n is the index of the loop).

"Thus, I tried to adjust by dynamically assigning each period to a column.
However, I get the error: object 'Returns.period1' not found. And R is not
like Java where you can simply declare a variable by typing say
Returns.period1. 

I do not know how to create a flexible empty matrix that would allow this
code to work.

If anyone knows how to do this without a loop, that would be even better!

##Filtering by Period
n<-1
while(n<=19) {
Returns.period<-lapply(((n-1)*125+1):((n+1)*125), function (i)
which(Returns.filter$date_==x[i]))
Returns.period<-unlist(Returns.period)

Returns.period1[n,1:(length(Returns.period))]<-Returns.filter[Returns.period,]
n<-n+1
}"


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Extracting-Elements-By-Date-tp2248227p2248241.html
Sent from the R help mailing list archive at Nabble.com.



[R] specifying plot symbol sizes in qplot or ggplot2

2010-06-08 Thread cfriedl

Hi.

first things first ... thanks for ggplot2.

Now my question: I'm using qplot to generate a plot as follows, where X, Y,
Z, and A are columns in a data frame.

qplot(X, Y, data=XYDATA, color=Z, geom=c("point"), size=A)

This works as expected. Factor A has three levels so there are three sizes
of the point plot symbol. I understand that the factor levels are mapped to
symbol sizes. However the sizes are too small for my liking. Is there any
way I can specify the range of plot symbol sizes that factor A levels are
mapped to?
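One possibility (a sketch, untested against the poster's data): map the levels of A to explicit sizes with ggplot2's scale_size_manual(). The data frame below is made up.

```r
library(ggplot2)

# Made-up stand-in for the poster's data frame
XYDATA <- data.frame(X = runif(30), Y = runif(30), Z = runif(30),
                     A = factor(sample(c("low", "mid", "high"), 30,
                                       replace = TRUE)))
qplot(X, Y, data = XYDATA, colour = Z, geom = "point", size = A) +
  scale_size_manual(values = c(2, 4, 6))   # one size per level of A
```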

-- 
View this message in context: 
http://r.789695.n4.nabble.com/specifying-plot-symbol-sizes-in-qplot-or-ggplot2-tp2248217p2248217.html
Sent from the R help mailing list archive at Nabble.com.



[R] correcting a few data in an unreshaped data frame

2010-06-08 Thread Mr. Natural

Thanks for the excellent help on my recent question on this topic, in which
the data frame had been reshaped by cast.
Now, I would like to access and change erroneous data in a data frame that
has not been reshaped.

The file is lupepn1, with identifier variables bushno & bout and
dependent variables survival and wwG.
I know the bushno and bout of the erroneous survival and wwG data.
I could correct these in the csv file before reading it in, but I would like
to learn some more R.

head(lupepn1)
  bushno bout survival wwG
2      1    2        0   0
3      1    3        0   0
4      1    4        0   0
5      1    5        0   0
6      1    6        0   2
7      1    7        0   0

> str(lupepn1)
'data.frame':   5023 obs. of  4 variables:
 $ bushno  : int  1 1 1 1 1 1 1 2 2 2 ...
 $ bout: int  2 3 4 5 6 7 8 1 2 3 ...
 $ survival: int  0 0 0 0 0 0 0 1 1 1 ...
 $ wwG : int  0 0 0 0 2 0 0 5 1 0 ...
 - attr(*, "na.action")=Class 'omit'  Named int [1:81] 1 49 65 177 201 257
337 417 449 505 ...
  .. ..- attr(*, "names")= chr [1:81] "1" "49" "65" "177" ...
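A minimal base-R sketch of one way to do this (the bushno/bout values and corrections below are made up; the real ones are the poster's): build a logical index from the known identifiers and assign in place.

```r
# Toy stand-in for lupepn1
lupepn1 <- data.frame(bushno = c(1, 1, 2), bout = c(2, 3, 1),
                      survival = c(0, 0, 1), wwG = c(0, 2, 5))
i <- lupepn1$bushno == 1 & lupepn1$bout == 3   # the erroneous record
lupepn1$survival[i] <- 1                       # corrected survival
lupepn1$wwG[i] <- 4                            # corrected wwG
```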

Your kind advice is very much appreciated.
Mr. Natural.



-- 
View this message in context: 
http://r.789695.n4.nabble.com/correcting-a-few-data-in-an-unreshaped-data-frame-tp2248219p2248219.html
Sent from the R help mailing list archive at Nabble.com.



[R] Extracting Elements By Date

2010-06-08 Thread Jeff08

Dear R Gurus,

Thanks for any help in advance!

Data frame: Returns.names

   X   id ticker  date_ adjClose   totret RankStk
258060 258060 13645T10 CP 2001-06-29   18.125 1877.758 

My data frame is in the above format. I would like to filter by period, per
id (every 125 days), each period consisting of 250 days, i.e. 1-250, 126-375, etc.

One important thing to note is that not all ID's have the same number of
dates. 

x <- unique(Returns.names$date_) gives me the ordered list of all the dates,
so each period I want would be x[((n-1)*125+1):((n+1)*125)].

When I tried to filter for just one period, it worked fine, but alas, I do
not know how to work around the problem that you cannot assign dynamic
variable names in loops (say I named the variable for the extracted dates
Returns.period; it would get overwritten on every iteration of the loop, and
I can't name something Returns.n, where n is the index of the loop).

Thus, I tried to adjust by dynamically assigning each period to a column.
However, I get the error: object 'Returns.period1' not found. And R is not
like Java, where you can simply declare a variable by typing, say,
Returns.period1.

I do not know how to create a flexible empty matrix that would allow this
code to work.

If anyone knows how to do this without a loop, that would be even better!

##Filtering by Period
n<-1
while(n<=19) {
Returns.period<-lapply(((n-1)*125+1):((n+1)*125), function (i)
which(Returns.filter$date_==x[i]))
Returns.period<-unlist(Returns.period)

Returns.period1[n,1:(length(Returns.period))]<-Returns.filter[Returns.period,]
n<-n+1
}
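A common way around the dynamic-variable-name problem is to store each period in a list element instead of inventing numbered variables. A sketch with toy data (Returns.filter, x, and the period sizes here stand in for the poster's objects; the real values would be 19 periods of 125 days):

```r
# Toy stand-in: 20 dates, two rows per date
Returns.filter <- data.frame(date_ = rep(1:20, each = 2), v = rnorm(40))
x <- unique(Returns.filter$date_)
n.periods <- 3
period.len <- 5
Returns.period <- vector("list", n.periods)
for (n in seq_len(n.periods)) {
  idx <- ((n - 1) * period.len + 1):((n + 1) * period.len)
  idx <- idx[idx <= length(x)]                 # guard the last period
  rows <- Returns.filter$date_ %in% x[idx]
  Returns.period[[n]] <- Returns.filter[rows, ]
}
# Returns.period[[2]] is the second period's data frame; elements may have
# different sizes, so irregular ids are not a problem and no cbind is needed.
```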

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Extracting-Elements-By-Date-tp2248227p2248227.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] combining expressions in mathplot

2010-06-08 Thread RICHARD M. HEIBERGER
Seb,
Thanks.  That doesn't solve the problem of combining two expressions.

My aa and bb are expressions constructed somewhere else and passed to the
current function, which wants to use them together. Your solution moves the
construction of aa and bb into the function and is equivalent to my aabb.

Rich
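For the archive, one splice-based option for pre-built expressions (a sketch): bquote() substitutes the .() pieces, so aa and bb can be constructed elsewhere and passed in.

```r
aa <- expression(alpha == .05)
bb <- expression(beta == .80)
# Splice the two language objects into a single plotmath expression:
aabb2 <- as.expression(bquote(.(aa[[1]]) ~ ", " ~ .(bb[[1]])))
plot(1:10)
text(5, 8, aabb2)   # renders like the hand-written aabb
```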




Re: [R] combining expressions in mathplot

2010-06-08 Thread Sebastian P. Luque
On Wed, 9 Jun 2010 00:15:19 -0400,
"RICHARD M. HEIBERGER"  wrote:

> text(5,1, parse(text=paste(deparse(aa[[1]]), deparse(bb[[1]]), sep="~")))
> text(5,2, parse(text=paste(deparse(aa[[1]]), deparse(bb[[1]]), sep="~', '~")))

> Is there a cleaner way of combining the expressions aa and bb to get
> the effect of the last two lines?

How about:

aa <- 0.05; bb <- 0.80
text(5, 1, bquote(alpha==.(aa) ~~ beta==.(bb)))

?plotmath

-- 
Seb



[R] combining expressions in mathplot

2010-06-08 Thread RICHARD M. HEIBERGER
Is there a cleaner way of combining two expressions?

This example works and gives what I want

plot(1:10)
aa <- expression(alpha==.05)
bb <- expression(beta ==.80)
aabb <- expression(alpha==.05 ~ ", " ~ beta ==.80)

text(5, 10, aa)
text(5,  9, bb)
text(5,  8, aabb)

text(5,1, parse(text=paste(deparse(aa[[1]]), deparse(bb[[1]]), sep="~")))
text(5,2, parse(text=paste(deparse(aa[[1]]), deparse(bb[[1]]), sep="~', '~")))

Is there a cleaner way of combining the expressions aa and bb to get the
effect of the last two lines?




[R] ordisurf (pkg vegan) gives implausible result

2010-06-08 Thread Matt Bakker
I'm having trouble with the ordisurf function in the vegan package.

I have created an ordination plot (cmdscale) of 60 samples based on
Bray-Curtis dissimilarities, and would like to overlay various soil edaphic
characteristics as possible clues to the clustering I observe in my plot.
However, I find that ordisurf creates a surface on the plot that is a
perfect, even gradient - and doesn't match my data. An example of this
output is attached (as .ps file), while these are the data for the
environmental variable of interest:

> edaphic$K
 [1] 28 61 48 29 28 26 45 28 34 33 55 62 44 51 60 68 51 31 54 32 58 50 37 35
35 34 52 29 53 24 37 50 62 51
[35] 28 39 47 41 49 83 33 50 55 71 59 50 57 47 46 49 43 30 56 23 49 35 31 27
38 52

The contour lines suggest that there are only two values below 32, when in
fact there are 11.
Any ideas? Thanks!

>>>
> bc <- read.table("braycurtis_noheader.dist",header=F)
> sample.mds <- cmdscale(bc)
> edaphic <- read.table("edaphic.txt", header=T)
> library(vegan)
This is vegan 1.17-2
> ordisurf(sample.mds, edaphic$K, xlab="PC 1", ylab="PC 2", main="K")
Loading required package: mgcv
This is mgcv 1.6-1. For overview type `help("mgcv-package")'.
Family: gaussian
Link function: identity
Formula:
y ~ s(x1, x2, k = knots)
Estimated degrees of freedom:
2  total = 3
GCV score: 140.9051

-- 
Matthew Bakker
Ph.D. Candidate
Department of Plant Pathology
University of Minnesota
495 Borlaug Hall
1991 Upper Buford Circle
Saint Paul, MN  55108 USA

612-624-2253
matt.g.bak...@gmail.com
http://plpa.cfans.umn.edu/Matt_Bakker.html


ordisurf_output.ps
Description: PostScript document


[R] how to draw the probability ellipse circle figure?

2010-06-08 Thread Jie TANG
hi, R user folks.

I recently read a paper that draws a probability ellipse figure, shown in
the appendix.
I wonder how to draw this kind of figure in R?
The x-axis and y-axis both express the error, but in different directions.
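One hedged sketch, assuming a roughly bivariate-normal error cloud, using the ellipse package (car::dataEllipse is an alternative); the error data here are made up:

```r
library(ellipse)

set.seed(1)
dx <- rnorm(100)            # made-up errors in one direction
dy <- rnorm(100, sd = 2)    # made-up errors in the other
plot(dx, dy, xlab = "error (x)", ylab = "error (y)")
# 95% probability ellipse from the sample mean and covariance:
lines(ellipse(cov(cbind(dx, dy)),
              centre = c(mean(dx), mean(dy)), level = 0.95), col = "red")
```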
-- 
TANG Jie
Email: totang...@gmail.com
Tel: 0086-2154896104
Shanghai Typhoon Institute,China


Re: [R] textbox in lattice

2010-06-08 Thread Paul Murrell


Shifting this to r-devel ...

On 6/6/2010 11:20 PM, baptiste auguie wrote:

Hi,

I've just added width/heightDetails methods following Paul's
suggestion. I kept a duplicate of all on-the-fly grob size
calculations; it is necessary to ensure that the table cells adjust to
the content, which can subsequently be edited (e.g. changing the
colnames to plotmath expressions). Drawing the full iris dataset (150
x 5) takes 12 seconds where it was 8 seconds before. That is still
reasonable; I don't think anyone would want to use it for huge tables
anyway.

Out of curiosity, could drawDetails and height/widthDetails be altered
to share some information (thereby avoiding such duplication of
calculations at drawing time), or do they have to be completely
independent in the implementation?

Best,

baptiste

On 3 June 2010 07:58, baptiste auguie  wrote:

Hi,

On 3 June 2010 05:26, Paul Murrell  wrote:


Or the same drawing calculations have to be repeated within
width/heightDetails - those methods should get run within the same graphical
context as the drawDetails method.



Yes, the idea crossed my mind, but I did not find it very appealing
(already the function is slower than it could/should be). Something to
consider though, in a future version.

Thanks,

baptiste






Paul


Best,

baptiste



--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/





--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
p...@stat.auckland.ac.nz
http://www.stat.auckland.ac.nz/~paul/



[R] Run Rscript.exe, how to capture error message?

2010-06-08 Thread A Huang
Hi there,
I use Rscript.exe for batch runs (actually it is used in ASP.net code):
c:>"C:\Program Files\R\R-2.10.1\bin\Rscript.exe" test.r xxx.txt
Where test.r is the R program and xxx.txt is a file name test.r will read in;
it comes from a web form. This works fine when the file is in the required
format. Sometimes xxx.txt has the wrong format; test.r reads it in but can't
process it. If I ran in interactive mode, I could probably see the error
message on the screen. But since this is running as a background process, I
don't really have a screen to see.
I tried to redirect:
c:>"C:\Program Files\R\R-2.10.1\bin\Rscript.exe" test.r xxx.txt > error.log
But it doesn't work even in local Windows.
Thanks
A. Huang



Re: [R] color of histgram in Psych package (pairs.panels)

2010-06-08 Thread William Revelle

At 10:56 AM +0800 6/8/10, elaine kuo wrote:

Hello,

I searched the archives but found no answers.
How do I modify the histogram color in the pairs.panels function of the
psych package?

I tried col=, but it modified the line color.

Thanks.


Elaine,

Good question.

 Right now it cannot be done without going into the function, changing
the color in the col="cyan" line of the pairs.hist.density function, and
copying the entire set of functions back into your workspace. (That is
to say, it can be done, but it is a pain.)


I am about to release psych version 1.0-89, probably this weekend,
and have added this feature to it.


In the meantime, if you want the fixed version of pairs.panels, I can 
just send it to you.


Bill





--
William Revelle http://personality-project.org/revelle.html
Professor   http://personality-project.org
Department of Psychology http://www.wcas.northwestern.edu/psych/
Northwestern University http://www.northwestern.edu/
Use R for psychology   http://personality-project.org/r
It is 6 minutes to midnight http://www.thebulletin.org



Re: [R] Intra-Class correlation psych package missing data

2010-06-08 Thread William Revelle

Ross,
 My apologies; I just discovered your email (from April) to the
R-help mailing list asking about ICC in psych.


ICC does not remove missing data but rather lets the ANOVA handle it.
It is probably more appropriate to work on complete cases (as does the
icc in the irr package).

that is

my.data <- na.omit(my.data)
ICC(my.data)


I have just fixed ICC so that it will by default use na.omit and thus
work on complete cases. This will be in the psych 1.0-89 release,
which should be ready by this weekend.


Bill





At 4:21 AM -0800 4/8/10, RCulloch wrote:

Hello R users, and perhaps William Revelle in particular,

I'm curious as to how ICC deals with missing data. For example, you are
sampling individuals over set periods in time and one individual is missing
or was not recaptured at a given time point, leading to NA in the
dataset. My thought was that it should then omit data by individual, but I'm
not convinced that that is what it is doing.


You are completely correct.



Does anyone know? I have looked at ?ICC but there is no information there.
Apologies if I have missed it in any other help file; I have looked, but to
no avail!





Thanks in advance,

Ross
--
View this message in context: 
http://n4.nabble.com/Intra-Class-correlation-psych-package-missing-data-tp1773942p1773942.html

Sent from the R help mailing list archive at Nabble.com.





Re: [R] geom_ribbon removes missing values

2010-06-08 Thread Paul Murrell

Hi

grid.polygon() can do multiple polygons in a single call, but rather 
than using NA's to separate sub-polygons, it uses an 'id' argument (or 
an 'id.lengths' argument) to identify sub-polygons within the vectors of 
x- and y-values (see the examples in ?grid.polygon).  So a ggplot2 patch 
that makes use of that facility might make more sense.


Paul

On 6/7/2010 5:46 AM, Karsten Loesing wrote:

Hi Hadley,

On 5/31/10 9:51 PM, Hadley Wickham wrote:

There's no easy way to do this because behind the scenes geom_ribbon
uses grid.polygon.


A possible workaround might be to have grid.polygon draw multiple
polygons, one for each interval. We can do this by constructing vectors
with coordinates for the first polygon, then NA, then coordinates for
the second polygon, etc. Here are the vectors for my initial example:

x<- c(x[1:4], x[4:1], NA, x[7:10], x[10:7])
y<- c(ymax[1:4], ymin[4:1], NA, ymax[7:10], ymin[10:7])

I worked on a simple (but ugly) patch to GeomRibbon in ggplot2 that does
the job using an iteration:


/Library/Frameworks/R.framework/Versions/2.10/Resources/library/ggplot2/R$
diff ggplot2-orig ggplot2
5047,5048d5046
<  data<- remove_missing(data, na.rm,
  length(data$x) || is.na(data$ymin[i]) ||
   is.na(data$ymax[i])) {
 if (start>  0) {
   polyx<- c(polyx, data$x[start:(i-1)],
   data$x[(i-1):start], NA)
   polyy<- c(polyy, data$ymax[start:(i-1)],
   data$ymin[start:(i-1)], NA)
   start<- 0
 }
   } else {
 if (start == 0) {
   start<- i
 }
   }
 }
 polyx<- head(polyx, length(polyx) - 1)
 polyy<- head(polyy, length(polyy) - 1)

5052c5071



Re: [R] Logical vector question

2010-06-08 Thread Jorge Ivan Velez
Try

v1<.3 | v1 > .7

HTH,
Jorge


On Tue, Jun 8, 2010 at 6:38 PM, Worik R <> wrote:

> If I create a vector thusly
>
> > v1 <- runif(20, min=0, max=1)
> > v1
>  [1] 0.9754443 0.6306228 0.3238158 0.3175769 0.6791534 0.6956507 0.3840803
>  [8] 0.1421328 0.8592398 0.4388306 0.9472040 0.4727435 0.5645302 0.7391616
> [15] 0.6116199 0.2727754 0.2657867 0.5261744 0.8764804 0.2032126
>
> And I want to create a logical vector the same length that is true if v1<.3
> or v1 > .7  how do I do it without a loop?
>
> > v1>.7
>  [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE
> FALSE
> [13] FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
> > v1<.3
>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
> FALSE
> [13] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE
>
> OK so far but I try...
> > v1<.3 || v1>.7
> [1] TRUE
> >
>
> Not a vector.
>
> I have tried a lot of combinations culminating in...
>
> ifelse(((v1<.3) || (v1>.7)), TRUE, FALSE)
> [1] TRUE
> >
>
> What can I do?
>
> cheers
> Worik
>
>
>
> > v1
>  [1] 1 1 1 0 0 0 0 0 0 1
> > v2
>  [1] 1 1 1 0 1 0 0 0 1 1
>
> The
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>




Re: [R] Logical vector question

2010-06-08 Thread Joshua Wiley
Hello,

This should work

v1 <- runif(20, min=0, max=1)
v2 <- v1 <.3 | v1 >.7 #you just need one | not two
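A base-R aside on why the `||` attempts collapsed to a single value: `||` (and `&&`) consult only the first element of each operand, and recent versions of R raise an error when a longer vector is supplied, while `|` is fully vectorised.

```r
v1 <- c(0.9, 0.5, 0.1)
v1 < .3 | v1 > .7          # TRUE FALSE TRUE -- one result per element
# '||' is for single TRUE/FALSE conditions, e.g. inside if():
v1[1] < .3 || v1[1] > .7   # TRUE
```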

HTH,

Josh

On Tue, Jun 8, 2010 at 3:38 PM, Worik R  wrote:
> If I create a vector thusly
>
>> v1 <- runif(20, min=0, max=1)
>> v1
>  [1] 0.9754443 0.6306228 0.3238158 0.3175769 0.6791534 0.6956507 0.3840803
>  [8] 0.1421328 0.8592398 0.4388306 0.9472040 0.4727435 0.5645302 0.7391616
> [15] 0.6116199 0.2727754 0.2657867 0.5261744 0.8764804 0.2032126
>
> And I want to create a logical vector the same length that is true if v1<.3
> or v1 > .7  how do I do it without a loop?
>
>> v1>.7
>  [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
> [13] FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
>> v1<.3
>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
> [13] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE
>
> OK so far but I try...
>> v1<.3 || v1>.7
> [1] TRUE
>>
>
> Not a vector.
>
> I have tried a lot of combinations culminating in...
>
> ifelse(((v1<.3) || (v1>.7)), TRUE, FALSE)
> [1] TRUE
>>
>
> What can I do?
>
> cheers
> Worik
>
>
>
>> v1
>  [1] 1 1 1 0 0 0 0 0 0 1
>> v2
>  [1] 1 1 1 0 1 0 0 0 1 1
>
> The
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/



Re: [R] more dates and data frames

2010-06-08 Thread Bill.Venables
Here is one way

...

DF4 <- cast(formula=Date~V2,data=DF3,value="X1",fill=0)

d <- with(DF4, seq(min(Date), max(Date), by = 1))  ### full set
m <- as.Date(setdiff(d, DF4$Date)) ### missing dates
if(length(m) > 0) {
extras <- cbind(data.frame(Date = m), cat = 0, dog = 0, tree = 0)
DF4 <- rbind(DF4, extras)
rm(extras)
DF4 <- DF4[order(DF4$Date), ]
}
rm(d, m)  ### clean up
...
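An alternative sketch using merge(), with the cast output hard-coded here for illustration: left-join the counts onto the full date sequence, then zero-fill the missing days.

```r
DF4 <- data.frame(Date = as.Date(c("2000-01-01", "2000-01-02",
                                   "2000-01-03", "2000-01-06")),
                  cat = c(1, 3, 0, 1), dog = c(2, 0, 1, 1),
                  tree = c(1, 1, 1, 0))
full <- data.frame(Date = seq(min(DF4$Date), max(DF4$Date), by = 1))
out <- merge(full, DF4, all.x = TRUE)    # keeps every date in the range
out[is.na(out)] <- 0                     # zero-fill the missing days
```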

Bill. 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Erin Hodgess
Sent: Wednesday, 9 June 2010 5:52 AM
To: R help
Subject: [R] more dates and data frames

Dear R People:

So thanks to your help, I have the following:


> dog3.df <- read.delim("c:/Users/erin/Documents/dog1.txt",header=FALSE,sep="\t")
> dog3.df
 V1   V2
1  1/1/2000  dog
2  1/1/2000  cat
3  1/1/2000 tree
4  1/1/2000  dog
5  1/2/2000  cat
6  1/2/2000  cat
7  1/2/2000  cat
8  1/2/2000 tree
9  1/3/2000  dog
10 1/3/2000 tree
11 1/6/2000  dog
12 1/6/2000  cat
> dog3.df$V1 <- as.Date(dog3.df$V1,"%m/%d/%Y")
> DF3 <- with(dog3.df,data.frame(Date=V1,V2,1))
> library(reshape)
> cast(formula=Date~V2,data=DF3,value="X1",fill=0)
Aggregation requires fun.aggregate: length used as default
        Date cat dog tree
1 2000-01-01   1   2    1
2 2000-01-02   3   0    1
3 2000-01-03   0   1    1
4 2000-01-06   1   1    0
>

So far, so good.  My new question: can I fill in the days which are
"missing", i.e. 2000-01-04 and 2000-01-05, with zeros for each set,
please?

thanks,
Erin

-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



[R] Logical vector question

2010-06-08 Thread Worik R
If I create a vector thusly

> v1 <- runif(20, min=0, max=1)
> v1
 [1] 0.9754443 0.6306228 0.3238158 0.3175769 0.6791534 0.6956507 0.3840803
 [8] 0.1421328 0.8592398 0.4388306 0.9472040 0.4727435 0.5645302 0.7391616
[15] 0.6116199 0.2727754 0.2657867 0.5261744 0.8764804 0.2032126

And I want to create a logical vector the same length that is true if v1<.3
or v1 > .7  how do I do it without a loop?

> v1>.7
 [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
[13] FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE FALSE
> v1<.3
 [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE

OK so far but I try...
> v1<.3 || v1>.7
[1] TRUE
>

Not a vector.

I have tried a lot of combinations culminating in...

ifelse(((v1<.3) || (v1>.7)), TRUE, FALSE)
[1] TRUE
>

What can I do?

cheers
Worik
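For reference, the elementwise operator is the single `|`; the double `||` only looks at the first elements (and is meant for scalar conditions in `if`). A minimal sketch:

```r
set.seed(1)
v1 <- runif(20, min = 0, max = 1)
# single | is vectorized: one TRUE/FALSE per element of v1
flag <- v1 < .3 | v1 > .7
length(flag)   # same length as v1; no loop and no ifelse() needed
```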



> v1
 [1] 1 1 1 0 0 0 0 0 0 1
> v2
 [1] 1 1 1 0 1 0 0 0 1 1

The

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iterating over groups of columns

2010-06-08 Thread 09wkj
In the code fragment, I used 'by' to actually compute the min value (part of 
the statement with the eval) - and I agree that an apply would work there 
wonderfully. 

However, my hope was to use an apply for the subsetting of the data.frame's 
columns, so that I could then use an apply to compute the min across each row 
of the subsets.

Something that would give me the results of the following, but programmatically:
apply(the.data[, 1, drop = FALSE], 1, min)   # min of the first column
apply(the.data[, 2:3], 1, min)               # min of the next 2 columns
apply(the.data[, 4:6], 1, min)               # min of the next 3 columns
apply(the.data[, 7:10], 1, min)              # min of the next 4 columns
...
apply(the.data[, 46:55], 1, min)             # min of the next 10 columns



Like, can I make a vector of levels with 'rep(1:10,1:10)', and then apply the 
function across all columns in each level? And then how could I cbind them 
together?
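Yes — a sketch along those lines (stand-in data; the column layout of 1, 2, 3, ..., 10 consecutive columns is as described above): build the column groups once with rep()/split(), then loop with sapply() instead of eval/parse.

```r
# Stand-in for the real data: 20 rows, 55 columns
the.data <- as.data.frame(matrix(runif(20 * 55), nrow = 20))
# Columns belonging to each of the 10 measurements
groups <- split(1:55, rep(1:10, 1:10))
# Row minimum within each group; sapply cbinds the results into a matrix
mins <- sapply(groups, function(cols)
  apply(the.data[, cols, drop = FALSE], 1, min))
colnames(mins) <- paste("min.k", 1:10, sep = ".")
the.data <- cbind(the.data, mins)   # append the 10 new columns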


Thanks for any help,
Bill




On Jun 8, 2010, at 5:08 PM, Jannis wrote:

> you should have found a solution for that in the help page of apply.
> 
> just run
> 
> min.values = apply(the.data,1,min)
> 
> the '1' marks the direction (e.g. whether apply is applied to rows or 
> columns), it could be a 2 as well. Check that yourself in the apply 
> documentation.
> 
> Then run rbind(the.data,min.values) (could be cbind as well, I am not sure 
> again ;-) ) and you get what you want.
> 
> 09wkj schrieb:
>> I am mainly a Java/C++ programmer, so my mind is used to iterating over data 
>> with for loops. After a long break, I am trying to get back into the "R 
>> mindset", but I could not find a solution in the documentation for the 
>> applys, aggregate, or by.
>> 
>> I have a data.frame where each row is an entry with 10 groups of 
>> measurements. The first measurement spans 1 column, the second spans 2 
>> columns, third 3, and so on (55 total columns). What I want to do is add to 
>> my data.frame 10 new columns containing the minimum value of each 
>> measurement.
>> 
>> dim(the.data)
>> [1] 1679  55
>> 
>> > colnames(the.data)
>>  [1] "k.1.1"   "k.2.1"   "k.2.2"   "k.3.1"   "k.3.2"   "k.3.3"   "k.4.1"
>> [8] "k.4.2"   "k.4.3"   "k.4.4"   "k.5.1"   "k.5.2"   "k.5.3"   "k.5.4"   
>> [15] "k.5.5"   "k.6.1"   "k.6.2"   "k.6.3"   "k.6.4"   "k.6.5"   "k.6.6"   
>> [22] "k.7.1"   "k.7.2"   "k.7.3"   "k.7.4"   "k.7.5"   "k.7.6"   "k.7.7"   
>> [29] "k.8.1"   "k.8.2"   "k.8.3"   "k.8.4"   "k.8.5"   "k.8.6"   "k.8.7"   
>> [36] "k.8.8"   "k.9.1"   "k.9.2"   "k.9.3"   "k.9.4"   "k.9.5"   "k.9.6"   
>> [43] "k.9.7"   "k.9.8"   "k.9.9"   "k.10.1"  "k.10.2"  "k.10.3"  "k.10.4"  
>> [50] "k.10.5"  "k.10.6"  "k.10.7"  "k.10.8"  "k.10.9"  "k.10.10"
>> 
>> I want to add to the.data new columns: min.k.1, min.k.2, ..., min.k.10
>> 
>> This is the section of code I would like to improve, hopefully getting rid 
>> of the eval and the for loop:
>> 
>> for(k in 1:10){
>>s <- subset(the.data, select=paste("k", k, 1:k, sep="."))
>>eval(parse(text = paste("the.data$min.k.", k, "<-as.vector(by(s, 
>> 1:nrow(s), min))", sep="")))
>> }
>> 
>> Thanks for any help,
>> Bill
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
>>  
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to ignore rows missing arguments of a function when creating a function?

2010-06-08 Thread edmund jones
Hi,

I am relatively new to R; when creating functions, I run into problems with
missing values. I would like my functions to ignore rows with missing values
in the arguments of my function (as is the case, for example, in Stata).
Note that I don't want my function to drop a row if values are missing
elsewhere in that row, i.e. for variables that are not arguments of my
function.

As an example: here is a clustering function I wrote:


cl <- function(dat, na.rm = TRUE, fm, cluster){
  attach(dat, warn.conflicts = FALSE)
  library(sandwich)
  library(lmtest)
  M <- length(unique(cluster))
  N <- length(cluster)
  K <- fm$rank
  dfc <- (M/(M-1)) * ((N-1)/(N-K))
  uj <- data.frame(apply(estfun(fm), 2,
        function(x) data.frame(tapply(x, cluster, sum))))
  vcovCL <- dfc * sandwich(fm, meat = crossprod(uj)/N)
  coeftest(fm, vcovCL)
}


When I run my function, I get the message:


Error in tapply(x, cluster, sum) : arguments must have same length


If I instead specify attach(na.omit(dat), warn.conflicts = F) and drop the
"na.rm = TRUE" argument, then my function runs, but only on the rows with no
missing values AT ALL; however, I don't care whether values are missing for
variables to which I am not applying my function.


For example, I have information on children's size; if I want regress scores
on age and parents' education, clustering on class, I would like missing
values in size not to interfere (ie if I have scores, age, parents'
education, and class, but not size, I don't want to drop this observation).


I tried to look at the code of "lm" to see how the na.action part works, but
I couldn't figure it out... This is exactly how I would like to deal with
missing values.


I tried to write

cl <- function(dat, fm, cluster, na.action){
  attach(dat, warn.conflicts = FALSE)
  library(sandwich)
  library(lmtest)
  M <- length(unique(cluster))
  N <- length(cluster)
  K <- fm$rank
  dfc <- (M/(M-1)) * ((N-1)/(N-K))
  uj <- data.frame(apply(estfun(fm), 2,
        function(x) data.frame(tapply(x, cluster, sum))))
  vcovCL <- dfc * sandwich(fm, meat = crossprod(uj)/N)
  coeftest(fm, vcovCL)
}

attr(cl, "na.action") <- na.exclude


but it still didn't work...


Any ideas of how to deal with this issue?

Thank you for your answers!

Edmund
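One common idiom (a sketch with made-up data, not from the thread): subset with complete.cases() applied only to the variables the model actually uses, so a missing value in an unused column does not drop the row.

```r
# Made-up example: `size` is NOT in the model, so its NA should not drop row 1
dat <- data.frame(score = c(10, 12, NA, 15),
                  age   = c(6, 7, 8, NA),
                  size  = c(NA, 110, 120, 130),
                  class = c("a", "a", "b", "b"))
# TRUE where all *model* variables are observed; `size` is ignored
keep <- complete.cases(dat[, c("score", "age", "class")])
dat.used <- dat[keep, ]
nrow(dat.used)   # row 1 survives despite its missing `size`
```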

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Efficiency question

2010-06-08 Thread Worik R
Given the following snippet
 m.nf.xts <- xts(rep(0, length(index(m.xts))), order.by=index(m.xts))

Does R know to cache the index(m.xts) or is it more efficient to say...

m.i <- index(m.xts)
m.nf.xts <- xts(rep(0, length(m.i)), order.by = m.i)

?

cheers
Worik
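For what it's worth, R performs no common-subexpression caching: each occurrence of index(m.xts) is evaluated afresh, so storing the result once is never slower and avoids repeated extraction on large objects. A stand-in sketch (no xts required; `slow.index` is a hypothetical expensive accessor, not a real function):

```r
# Hypothetical stand-in for an expensive index() extraction
slow.index <- function(v) { Sys.sleep(0.05); seq_along(v) }
v <- 1:10
# Uncached: slow.index(v) is evaluated every time it appears
r1 <- rep(0, length(slow.index(v)))
# Cached: evaluate once, reuse the stored value
v.i <- slow.index(v)
r2 <- rep(0, length(v.i))
identical(r1, r2)   # same result, one fewer extraction
```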

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] restructuring "by" output for use with write.table

2010-06-08 Thread Joris Meys
don't forget to make the "by" option a list :
vegMeans = aggregate(SoilVegHydro[3:37], list(Physiogomy = SoilVegHydro$Physiogomy), mean)
and
 vegSd = aggregate(SoilVegHydro[3:37], list(Physiogomy = SoilVegHydro$Physiogomy), sd)

see also ?aggregate

on a side note, it would be handy if that transformation to a list
were built into the function...

Cheers
Joris
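For completeness, aggregate() returns an ordinary data.frame, which write.table() accepts directly (unlike the class "by" object that triggered the error). A toy example with made-up data standing in for SoilVegHydro:

```r
# Made-up stand-in for SoilVegHydro
dat <- data.frame(Physiogomy = c("marsh", "marsh", "prairie"),
                  ph = c(6.1, 6.5, 7.2), depth = c(10, 12, 30))
# Group means come back as a plain data.frame
vegMeans <- aggregate(dat[c("ph", "depth")],
                      list(Physiogomy = dat$Physiogomy), mean)
vegMeans
# write.table(vegMeans, file = "vegMeans.txt")   # now works
```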

On Tue, Jun 8, 2010 at 11:39 PM, Phil Spector  wrote:
> Not much to go on, but you might find
>
>  vegMeans = aggregate(SoilVegHydro[3:37],SoilVegHydro['Physiogomy'],mean)
> and
>  vegSd = aggregate(SoilVegHydro[3:37],SoilVegHydro['Physiogomy'],sd)
>
> more suitable for your needs.  (Not run because I don't know what
> SoilVegHydro is.)
>
>                                        - Phil Spector
>                                         Statistical Computing Facility
>                                         Department of Statistics
>                                         UC Berkeley
>                                         spec...@stat.berkeley.edu
>
>
> On Tue, 8 Jun 2010, steve_fried...@nps.gov wrote:
>
>>
>> Hello,
>>
>> vegMeans <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, mean)
>>
>> vegSD <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, sd)
>>
>>
>> write.table(vegMeans,
>> file="A:\\Work_Area\\Steve\\Hydrology_Data\\data\\vegMeans.txt")
>> Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors =
>> stringsAsFactors) :
>>  cannot coerce class '"by"' into a data.frame
>>
>> Is there anyway to convert these output objects for use with write.table
>> or
>> an alternate method such that I can output to a word processing system?
>>
>> Windows R 2.11.1
>>
>> Thanks
>> Steve
>>
>>
>> Steve Friedman Ph. D.
>> Spatial Statistical Analyst
>> Everglades and Dry Tortugas National Park
>> 950 N Krome Ave (3rd Floor)
>> Homestead, Florida 33034
>>
>> steve_fried...@nps.gov
>> Office (305) 224 - 4282
>> Fax     (305) 224 - 4147
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] restructuring "by" output for use with write.table

2010-06-08 Thread Phil Spector

Not much to go on, but you might find

  vegMeans = aggregate(SoilVegHydro[3:37],SoilVegHydro['Physiogomy'],mean)
and
  vegSd = aggregate(SoilVegHydro[3:37],SoilVegHydro['Physiogomy'],sd)

more suitable for your needs.  (Not run because I don't know what
SoilVegHydro is.)

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Tue, 8 Jun 2010, steve_fried...@nps.gov wrote:



Hello,

vegMeans <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, mean)

vegSD <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, sd)


write.table(vegMeans,
file="A:\\Work_Area\\Steve\\Hydrology_Data\\data\\vegMeans.txt")
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors =
stringsAsFactors) :
 cannot coerce class '"by"' into a data.frame

Is there anyway to convert these output objects for use with write.table or
an alternate method such that I can output to a word processing system?

Windows R 2.11.1

Thanks
Steve


Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] partial matches across rows not columns

2010-06-08 Thread Jannis
I did not go too deep into your zoology problem ;-) but as far as I
understood you, you want to omit all rows where

ID and TO_ID are A1 and A1.1 (or A2 and A2.1, etc.), correct?

If the data you sent us is all the data and no other situations occur,
the following should be sufficient:


Transform the vectors ID and TO_ID to values without the "." and the
number following it (e.g. A1.1 -> A1):


ID.clean <- gsub("\\..*$", "", data$ID)
TO_ID.clean <- gsub("\\..*$", "", data$TO_ID)


And then use logical indexing to keep only the rows where the cleaned
IDs differ:

data.clean <- data[ID.clean != TO_ID.clean, ]


HTH
Jannis


RCulloch schrieb:

Hi R users,

I am trying to omit rows of data based on partial matches; an example of my
data (seal_dist) is below.

A quick breakdown of my coding and why I need to answer this: I am dealing
with a colony of seals where, for example, A1 is a female with a pup and A1.1
is that female's pup. The important part of the data here is DIST, which gives
the distance between one seal (ID) and another (TO_ID). What I want to do is
take a mean of these data for a nearest-neighbour analysis, but I want to
omit any case that is the distance between a female and her own pup,
i.e. in the previous e.g. omit rows where A1 and A1.1 occur.


I have looked at grep and pmatch, but these appear to work across columns and
don't appear to do what I'm looking for.


If anyone can point me in the right direction, I'd be most grateful,

Best wishes,

Ross 



   FROM TO     DIST    ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
2     1  2  4.81803    A1  1 30  9  9      1   1 MALE1        12
3     1  3  2.53468    A1  1 30  9  9      1   1    A2         3
4     1  4  7.57332    A1  1 30  9  9      1   1  A1.1         7
5     1  1  7.57332  A1.1  1 30  9  9      7   1    A1         1
6     1  2  7.89665  A1.1  1 30  9  9      7   1 MALE1        12
7     1  3  6.47847  A1.1  1 30  9  9      7   1    A2         3
9     1  1  2.53468    A2  1 30  9  9      3   1    A1         1
10    1  2  2.59051    A2  1 30  9  9      3   1 MALE1        12
12    1  4  6.47847    A2  1 30  9  9      3   1  A1.1         7
13    1  1  4.81803 MALE1  1 30  9  9     12   1    A1         1
15    1  3  2.59051 MALE1  1 30  9  9     12   1    A2         3
16    1  4  7.89665 MALE1  1 30  9  9     12   1  A1.1         7
17    1  1  3.85359    A1  2 30  9  9      1   1 MALE1        12
19    1  3  4.88826    A1  2 30  9  9      1   1    A2         3
20    1  4  7.25773    A1  2 30  9  9      1   1  A1.1         7
21    1  1  9.96431  A1.1  2 30  9  9      7   1 MALE1        12
22    1  2  7.25773  A1.1  2 30  9  9      7   1    A1         1
23    1  3  5.71725  A1.1  2 30  9  9      7   1    A2         3
25    1  1  8.73759    A2  2 30  9  9      3   1 MALE1        12
26    1  2  4.88826    A2  2 30  9  9      3   1    A1         1
28    1  4  5.71725    A2  2 30  9  9      3   1  A1.1         7
30    1  2  3.85359 MALE1  2 30  9  9     12   1    A1         1
31    1  3  8.73759 MALE1  2 30  9  9     12   1    A2         3
32    1  4  9.96431 MALE1  2 30  9  9     12   1  A1.1         7
33    1  1  7.95399    A1  3 30  9  9      1   1 MALE1        12
35    1  3  0.60443    A1  3 30  9  9      1   1  A1.1         7
36    1  4  1.91136    A1  3 30  9  9      1   1    A2         3
37    1  1  8.29967  A1.1  3 30  9  9      7   1 MALE1        12
38    1  2  0.60443  A1.1  3 30  9  9      7   1    A1         1
40    1  4  1.43201  A1.1  3 30  9  9      7   1    A2         3
41    1  1  9.71659    A2  3 30  9  9      3   1 MALE1        12
42    1  2  1.91136    A2  3 30  9  9      3   1    A1         1
43    1  3  1.43201    A2  3 30  9  9      3   1  A1.1         7
46    1  2  7.95399 MALE1  3 30  9  9     12   1    A1         1
47    1  3  8.29967 MALE1  3 30  9  9     12   1  A1.1         7
48    1  4  9.71659 MALE1  3 30  9  9     12   1    A2         3



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] type conversion with apply or not

2010-06-08 Thread Horace Tso
Guys, many thanks. lapply works. It did not occur to me because I thought
lapply returns a list while the receiving object is a data frame.

H

  

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Tuesday, June 08, 2010 2:22 PM
To: Horace Tso; r-help@r-project.org
Subject: RE: [R] type conversion with apply or not

The short answer is, don't use apply() on data.frame's.
Use lapply to loop over the columns of a data.frame.



> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Horace Tso
> Sent: Tuesday, June 08, 2010 2:19 PM
> To: r-help@r-project.org
> Subject: [R] type conversion with apply or not
> 
> Folks, i thought it should be straightforward but after a few 
> hours poking around, I decided it's best to post my question 
> on this list.
> 
> I have a data frame consisting of a (large) number of date 
> columns, which are read in from a csv file as character 
> string. I want to convert them to Date type. Following is an 
> example, where the first column is of integer type, while the 
> rest are type character.
> 
> > head(df)
>   TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
> 1  250897 7/1/2010  7/31/2010  PSTRUCTZ
> 2  250617 8/1/2010  8/31/2010  PSTRUCTZ
> 3  250364 4/1/2011  6/30/2011  PLRZ
> 4  250176 4/1/2011  6/30/2011  PLRZ
> 5  250176 4/1/2011  6/30/2011  PLRZ
> 6  250364 4/1/2011  6/30/2011  PLRZ
> > sapply(df, class)
>  TRANSNO TRANS.START_DATE   TRANS.END_DATE 
> DIVISION FASB
>"integer"  "character"  "character"  
> "character"  "character"
> I thought it's just a matter of applying with a as.Date,
> 
> df[,c(2,3)] = apply(df[,c(2,3)], 2, function(x)as.Date(x,"%m/%d/%Y"))
> 
> Well, the Date conversion fails and I got,
> 
>   TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
> 1  25089714791  14821  PSTRUCTZ
> 2  25061714822  14852  PSTRUCTZ
> 3  25036415065  15155  PLRZ
> 4  25017615065  15155  PLRZ
> 5  25017615065  15155  PLRZ
> 6  25036415065  15155  PLRZ
> The character columns are indeed converted, but they became 
> integer, not Date type. OK, that's strange and so I started 
> reading the help pages.
> 
> It turns out in apply, the result is coerced to some basic 
> vector types.  And apparently Date is not one of the basic 
> vector types.
> 
> "In all cases the result is coerced by 
> as.vector 
> to one of the basic vector types before the dimensions are 
> set, so that (for example) factor results will be coerced to 
> a character array. "
> 
> The question then is how type conversion can be carried out 
> on some columns of a data frame without using a loop.
> 
> Thanks.
> 
> Horace Tso
> 
> 
> 
> 
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] type conversion with apply or not

2010-06-08 Thread Henrique Dallazuanna
You need lapply here:

df[2:3] <- lapply(df[2:3], as.Date, '%m/%d/%Y')
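A quick check with made-up data (mirroring Horace's frame) confirms that this form preserves the Date class, where apply() would have coerced the result to numeric:

```r
# Two date columns read in as character, as from read.csv
df <- data.frame(TRANSNO = c(250897L, 250617L),
                 TRANS.START_DATE = c("7/1/2010", "8/1/2010"),
                 TRANS.END_DATE   = c("7/31/2010", "8/31/2010"),
                 stringsAsFactors = FALSE)
# lapply works column by column and keeps the Date class intact
df[2:3] <- lapply(df[2:3], as.Date, '%m/%d/%Y')
sapply(df, class)   # TRANS.START_DATE and TRANS.END_DATE are now "Date"
```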





On Tue, Jun 8, 2010 at 6:19 PM, Horace Tso  wrote:

> Folks, i thought it should be straightforward but after a few hours poking
> around, I decided it's best to post my question on this list.
>
> I have a data frame consisting of a (large) number of date columns, which
> are read in from a csv file as character string. I want to convert them to
> Date type. Following is an example, where the first column is of integer
> type, while the rest are type character.
>
> > head(df)
>  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
> 1  250897 7/1/2010  7/31/2010  PSTRUCTZ
> 2  250617 8/1/2010  8/31/2010  PSTRUCTZ
> 3  250364 4/1/2011  6/30/2011  PLRZ
> 4  250176 4/1/2011  6/30/2011  PLRZ
> 5  250176 4/1/2011  6/30/2011  PLRZ
> 6  250364 4/1/2011  6/30/2011  PLRZ
> > sapply(df, class)
> TRANSNO TRANS.START_DATE   TRANS.END_DATE DIVISION
> FASB
>   "integer"  "character"  "character"  "character"
>  "character"
> I thought it's just a matter of applying with a as.Date,
>
> df[,c(2,3)] = apply(df[,c(2,3)], 2, function(x)as.Date(x,"%m/%d/%Y"))
>
> Well, the Date conversion fails and I got,
>
>  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
> 1  25089714791  14821  PSTRUCTZ
> 2  25061714822  14852  PSTRUCTZ
> 3  25036415065  15155  PLRZ
> 4  25017615065  15155  PLRZ
> 5  25017615065  15155  PLRZ
> 6  25036415065  15155  PLRZ
> The character columns are indeed converted, but they became integer, not
> Date type. OK, that's strange and so I started reading the help pages.
>
> It turns out in apply, the result is coerced to some basic vector types.
>  And apparently Date is not one of the basic vector types.
>
> "In all cases the result is coerced by as.vector to one of the basic
> vector types before the dimensions are set, so that (for example) factor
> results will be coerced to a character array. "
>
> The question then is how type conversion can be carried out on some columns
> of a data frame without using a loop.
>
> Thanks.
>
> Horace Tso
>
>
>
>
>
>
>
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] restructuring "by" output for use with write.table

2010-06-08 Thread Steve_Friedman

Hello,

vegMeans <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, mean)

vegSD <- by(SoilVegHydro[3:37] , SoilVegHydro$Physiogomy, sd)


 write.table(vegMeans,
file="A:\\Work_Area\\Steve\\Hydrology_Data\\data\\vegMeans.txt")
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors =
stringsAsFactors) :
  cannot coerce class '"by"' into a data.frame

Is there any way to convert these output objects for use with write.table or
an alternate method such that I can output to a word processing system?

Windows R 2.11.1

Thanks
Steve


Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034

steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] type conversion with apply or not

2010-06-08 Thread William Dunlap
The short answer is, don't use apply() on data.frame's.
Use lapply to loop over the columns of a data.frame.



> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Horace Tso
> Sent: Tuesday, June 08, 2010 2:19 PM
> To: r-help@r-project.org
> Subject: [R] type conversion with apply or not
> 
> Folks, i thought it should be straightforward but after a few 
> hours poking around, I decided it's best to post my question 
> on this list.
> 
> I have a data frame consisting of a (large) number of date 
> columns, which are read in from a csv file as character 
> string. I want to convert them to Date type. Following is an 
> example, where the first column is of integer type, while the 
> rest are type character.
> 
> > head(df)
>   TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
> 1  250897 7/1/2010  7/31/2010  PSTRUCTZ
> 2  250617 8/1/2010  8/31/2010  PSTRUCTZ
> 3  250364 4/1/2011  6/30/2011  PLRZ
> 4  250176 4/1/2011  6/30/2011  PLRZ
> 5  250176 4/1/2011  6/30/2011  PLRZ
> 6  250364 4/1/2011  6/30/2011  PLRZ
> > sapply(df, class)
>  TRANSNO TRANS.START_DATE   TRANS.END_DATE 
> DIVISION FASB
>"integer"  "character"  "character"  
> "character"  "character"
> I thought it's just a matter of applying with a as.Date,
> 
> df[,c(2,3)] = apply(df[,c(2,3)], 2, function(x)as.Date(x,"%m/%d/%Y"))
> 
> Well, the Date conversion fails and I got,
> 
>   TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
> 1  25089714791  14821  PSTRUCTZ
> 2  25061714822  14852  PSTRUCTZ
> 3  25036415065  15155  PLRZ
> 4  25017615065  15155  PLRZ
> 5  25017615065  15155  PLRZ
> 6  25036415065  15155  PLRZ
> The character columns are indeed converted, but they became 
> integer, not Date type. OK, that's strange and so I started 
> reading the help pages.
> 
> It turns out in apply, the result is coerced to some basic 
> vector types.  And apparently Date is not one of the basic 
> vector types.
> 
> "In all cases the result is coerced by 
> as.vector 
> to one of the basic vector types before the dimensions are 
> set, so that (for example) factor results will be coerced to 
> a character array. "
> 
> The question then is how type conversion can be carried out 
> on some columns of a data frame without using a loop.
> 
> Thanks.
> 
> Horace Tso
> 
> 
> 
> 
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] type conversion with apply or not

2010-06-08 Thread Horace Tso
Folks, i thought it should be straightforward but after a few hours poking 
around, I decided it's best to post my question on this list.

I have a data frame consisting of a (large) number of date columns, which are 
read in from a csv file as character string. I want to convert them to Date 
type. Following is an example, where the first column is of integer type, while 
the rest are type character.

> head(df)
  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
1  250897         7/1/2010      7/31/2010  PSTRUCT    Z
2  250617         8/1/2010      8/31/2010  PSTRUCT    Z
3  250364         4/1/2011      6/30/2011      PLR    Z
4  250176         4/1/2011      6/30/2011      PLR    Z
5  250176         4/1/2011      6/30/2011      PLR    Z
6  250364         4/1/2011      6/30/2011      PLR    Z
> sapply(df, class)
         TRANSNO TRANS.START_DATE   TRANS.END_DATE         DIVISION             FASB
       "integer"      "character"      "character"      "character"      "character"
I thought it's just a matter of applying with a as.Date,

df[,c(2,3)] = apply(df[,c(2,3)], 2, function(x)as.Date(x,"%m/%d/%Y"))

Well, the Date conversion fails and I got,

  TRANSNO TRANS.START_DATE TRANS.END_DATE DIVISION FASB
1  250897            14791          14821  PSTRUCT    Z
2  250617            14822          14852  PSTRUCT    Z
3  250364            15065          15155      PLR    Z
4  250176            15065          15155      PLR    Z
5  250176            15065          15155      PLR    Z
6  250364            15065          15155      PLR    Z
The character columns are indeed converted, but they became integer, not Date 
type. OK, that's strange and so I started reading the help pages.

It turns out in apply, the result is coerced to some basic vector types.  And 
apparently Date is not one of the basic vector types.

"In all cases the result is coerced by 
as.vector to one of the 
basic vector types before the dimensions are set, so that (for example) factor 
results will be coerced to a character array. "

The question then is how type conversion can be carried out on some columns of 
a data frame without using a loop.

Thanks.

Horace Tso








[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] partial matches across rows not columns

2010-06-08 Thread jim holtman
Is this what you are looking for:

> # assume females start with "A"
> # extract first part if female from ID
> x.id <- sub("(A[[:digit:]]+).*", "\\1", x$ID)
> # now see if this pattern matches first part of TO_ID
> x.match <- x.id == substring(x$TO_ID, 1, nchar(x.id))
> # here are the ones that would be eliminated
> x[x.match,]
   FROM TODIST   ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
4 1  4 7.57332   A1  1 30  9  9  1   1  A1.1 7
5 1  1 7.57332 A1.1  1 30  9  9  7   1A1 1
201  4 7.25773   A1  2 30  9  9  1   1  A1.1 7
221  2 7.25773 A1.1  2 30  9  9  7   1A1 1
351  3 0.60443   A1  3 30  9  9  1   1  A1.1 7
381  2 0.60443 A1.1  3 30  9  9  7   1A1 1
>
>


On Tue, Jun 8, 2010 at 1:43 PM, RCulloch  wrote:
>
> Hi R users,
>
> I am trying to omit rows of data based on partial matches an example of my
> data (seal_dist) is below:
>
> A quick break down of my coding and why I need to answer this - I am dealing
> with a colony of seals where for example A1 is a female with pup and A1.1 is
> that female's pup, the important part of the data here is DIST which tells
> the distance between one seal (ID) and another (TO_ID). What I want to do is
> take a mean for these data for a nearest neighbour analysis but I want to
> omit any cases where there is the distance between a female and her pup,
> i.e. in the previous e.g. omit rows where A1 and A1.1 occur.
>
> I have looked at grep and pmatch but these appear to work across columns and
> don't appear to do what I'm looking to do,
>
> If anyone can point me in the right direction, I'd be most greatful,
>
> Best wishes,
>
> Ross
>
>
>    FROM TO     DIST    ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
> 2      1  2  4.81803    A1  1 30  9  9      1   1 MALE1        12
> 3      1  3  2.53468    A1  1 30  9  9      1   1    A2         3
> 4      1  4  7.57332    A1  1 30  9  9      1   1  A1.1         7
> 5      1  1  7.57332  A1.1  1 30  9  9      7   1    A1         1
> 6      1  2  7.89665  A1.1  1 30  9  9      7   1 MALE1        12
> 7      1  3  6.47847  A1.1  1 30  9  9      7   1    A2         3
> 9      1  1  2.53468    A2  1 30  9  9      3   1    A1         1
> 10     1  2  2.59051    A2  1 30  9  9      3   1 MALE1        12
> 12     1  4  6.47847    A2  1 30  9  9      3   1  A1.1         7
> 13     1  1  4.81803 MALE1  1 30  9  9     12   1    A1         1
> 15     1  3  2.59051 MALE1  1 30  9  9     12   1    A2         3
> 16     1  4  7.89665 MALE1  1 30  9  9     12   1  A1.1         7
> 17     1  1  3.85359    A1  2 30  9  9      1   1 MALE1        12
> 19     1  3  4.88826    A1  2 30  9  9      1   1    A2         3
> 20     1  4  7.25773    A1  2 30  9  9      1   1  A1.1         7
> 21     1  1  9.96431  A1.1  2 30  9  9      7   1 MALE1        12
> 22     1  2  7.25773  A1.1  2 30  9  9      7   1    A1         1
> 23     1  3  5.71725  A1.1  2 30  9  9      7   1    A2         3
> 25     1  1  8.73759    A2  2 30  9  9      3   1 MALE1        12
> 26     1  2  4.88826    A2  2 30  9  9      3   1    A1         1
> 28     1  4  5.71725    A2  2 30  9  9      3   1  A1.1         7
> 30     1  2  3.85359 MALE1  2 30  9  9     12   1    A1         1
> 31     1  3  8.73759 MALE1  2 30  9  9     12   1    A2         3
> 32     1  4  9.96431 MALE1  2 30  9  9     12   1  A1.1         7
> 33     1  1  7.95399    A1  3 30  9  9      1   1 MALE1        12
> 35     1  3  0.60443    A1  3 30  9  9      1   1  A1.1         7
> 36     1  4  1.91136    A1  3 30  9  9      1   1    A2         3
> 37     1  1  8.29967  A1.1  3 30  9  9      7   1 MALE1        12
> 38     1  2  0.60443  A1.1  3 30  9  9      7   1    A1         1
> 40     1  4  1.43201  A1.1  3 30  9  9      7   1    A2         3
> 41     1  1  9.71659    A2  3 30  9  9      3   1 MALE1        12
> 42     1  2  1.91136    A2  3 30  9  9      3   1    A1         1
> 43     1  3  1.43201    A2  3 30  9  9      3   1  A1.1         7
> 46     1  2  7.95399 MALE1  3 30  9  9     12   1    A1         1
> 47     1  3  8.29967 MALE1  3 30  9  9     12   1  A1.1         7
> 48     1  4  9.71659 MALE1  3 30  9  9     12   1    A2         3



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] more dates and data frames

2010-06-08 Thread Gabor Grothendieck
Once again my message got held up for moderator approval so I
am deleting it and trying again. Hopefully this one goes through.

In general, we will get the simplest usage if we match the problem to
the appropriate OO class. In this case we are using time series so it
is advantageous to use a time series class, i.e. zoo, instead of data
frames.   We can use data frames but then each time we run into a
problem that would be trivial with time series we have to reinvent the
wheel all over again.

We read the data into a data frame, append a column of ones and then
read it into zoo, converting the index to Date class with the
indicated format, splitting it on column 2 and aggregating using sum
(since unlike the prior example we now have duplicate dates within cat
and also within dog).  See ?read.zoo for more.

To fill in the dates we just convert the zoo series to ts and back
again.  This loses the Date class (since ts has no notion of index
class) but we can put it back again.  Since this fills the newly added
entries with NAs we replace the NAs with zeros.

Lines <- "V1 V2
1  1/1/2000  dog
2  1/1/2000  cat
3  1/1/2000 tree
4  1/1/2000  dog
5  1/2/2000  cat
6  1/2/2000  cat
7  1/2/2000  cat
8  1/2/2000 tree
9  1/3/2000  dog
10 1/3/2000 tree
11 1/6/2000  dog
12 1/6/2000  cat"

library(zoo)
source("http://r-forge.r-project.org/scm/viewvc.php/*checkout*/pkg/zoo/R/read.zoo.R?revision=719&root=zoo")
DF <- read.table(textConnection(Lines))
z <- read.zoo(cbind(DF, 1), format = "%m/%d/%Y", split = 2, aggregate = sum)
zz <- as.zoo(as.ts(z))
time(zz) <- as.Date(time(zz))
zz[is.na(zz)] <- 0
zz

plot(zz)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iterating over groups of columns

2010-06-08 Thread Jannis

you should have found a solution for that in the help page of apply.

just run

min.values = apply(the.data,1,min)

the '1' is the MARGIN argument: 1 applies the function over rows, 2 over 
columns. Check the details in the apply documentation.


Then run cbind(the.data, min.values) (cbind rather than rbind, since you 
are appending the row minima as a new column) and you get what you want.
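A minimal sketch of those two steps (the small data frame here is made up for illustration):

```r
# toy stand-in for the.data
the.data <- data.frame(a = c(3, 1, 5), b = c(2, 4, 0))

# MARGIN = 1: apply min over rows; MARGIN = 2 would apply it over columns
min.values <- apply(the.data, 1, min)
unname(min.values)  # 2 1 0

# cbind (not rbind): the row minima become a new column
the.data <- cbind(the.data, min.values)
```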


09wkj schrieb:

I am mainly a Java/C++ programmer, so my mind is used to iterating over data with for 
loops. After a long break, I am trying to get back into the "R mindset", but I 
could not find a solution in the documentation for the applys, aggregate, or by.

I have a data.frame where each row is an entry with 10 groups of measurements. 
The first measurement spans 1 column, the second spans 2 columns, third 3, and 
so on (55 total columns). What I want to do is add to my data.frame 10 new 
columns containing the minimum value of each measurement.

dim(the.data)
[1] 1679  55

  

colnames(the.data)

  [1] "k.1.1"   "k.2.1"   "k.2.2"   "k.3.1"   "k.3.2"   "k.3.3"   "k.4.1"  
  [8] "k.4.2"   "k.4.3"   "k.4.4"   "k.5.1"   "k.5.2"   "k.5.3"   "k.5.4"  
 [15] "k.5.5"   "k.6.1"   "k.6.2"   "k.6.3"   "k.6.4"   "k.6.5"   "k.6.6"  
 [22] "k.7.1"   "k.7.2"   "k.7.3"   "k.7.4"   "k.7.5"   "k.7.6"   "k.7.7"  
 [29] "k.8.1"   "k.8.2"   "k.8.3"   "k.8.4"   "k.8.5"   "k.8.6"   "k.8.7"  
 [36] "k.8.8"   "k.9.1"   "k.9.2"   "k.9.3"   "k.9.4"   "k.9.5"   "k.9.6"  
 [43] "k.9.7"   "k.9.8"   "k.9.9"   "k.10.1"  "k.10.2"  "k.10.3"  "k.10.4" 
 [50] "k.10.5"  "k.10.6"  "k.10.7"  "k.10.8"  "k.10.9"  "k.10.10"


I want to add to the.data new columns: min.k.1, min.k.2, ..., min.k.10

This is the section of code I would like to improve, hopefully getting rid of 
the eval and the for loop:

for(k in 1:10){
s <- subset(the.data, select=paste("k", k, 1:k, sep="."))
eval(parse(text = paste("the.data$min.k.", k, "<-as.vector(by(s, 1:nrow(s), min))", 
sep="")))
}

Thanks for any help,
Bill




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cor.test() -- how to get the value of a coefficient

2010-06-08 Thread Erik Iverson



Ekaterina Pek wrote:

Hi, all.

Yet another beginner to R : )

I wonder, how it's possible to get the value of a coefficient from the
object produced by cor.test() ?


cor.test(a, b, method="spearman")


You can always assign the value of a function to a variable, and then 
use ?str to see the structure of that object.


x <- cor.test(a, b, method = "spearman")

str(x)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cor.test() -- how to get the value of a coefficient

2010-06-08 Thread Joris Meys
result <- cor.test(a,b,method="spearman")
result$estimate
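Since the original question also asked about saving the value to an external file, a small sketch with made-up data (the output path is just an example):

```r
a <- c(1, 3, 2, 5, 4)
b <- c(2, 3, 1, 5, 4)

result <- cor.test(a, b, method = "spearman")
rho <- unname(result$estimate)   # drop the "rho" name from the named scalar
rho                              # 0.9 for these toy vectors

# write it out; the file name here is only an example
write.csv(data.frame(rho = rho), file.path(tempdir(), "rho.csv"),
          row.names = FALSE)
```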

Cheers
Joris

On Tue, Jun 8, 2010 at 10:40 PM, Ekaterina Pek  wrote:
> Hi, all.
>
> Yet another beginner to R : )
>
> I wonder, how it's possible to get the value of a coefficient from the
> object produced by cor.test() ?
>
>> cor.test(a, b, method="spearman")
>
>        Spearman's rank correlation rho
>
> data:  a and b
> S = 21554.28, p-value = 2.496e-11
> alternative hypothesis: true rho is not equal to 0
> sample estimates:
>      rho
> 0.6807955
>
> Warning message:
> In cor.test.default(a, b, method = "spearman") :
>  Cannot compute exact p-values with ties
>
>
> So, I'd like to do something like :
>
>> result = cor.test(...)
>> rho = result.rho()
>
> -- in order to save then the value in some format into external file.
>
> Thanks in advance for helping !
>
> Regards,
> Kate.
>



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cor.test() -- how to get the value of a coefficient

2010-06-08 Thread Ekaterina Pek
Hi, all.

Yet another beginner to R : )

I wonder, how it's possible to get the value of a coefficient from the
object produced by cor.test() ?

> cor.test(a, b, method="spearman")

Spearman's rank correlation rho

data:  a and b
S = 21554.28, p-value = 2.496e-11
alternative hypothesis: true rho is not equal to 0
sample estimates:
  rho
0.6807955

Warning message:
In cor.test.default(a, b, method = "spearman") :
  Cannot compute exact p-values with ties


So, I'd like to do something like :

> result = cor.test(...)
> rho = result.rho()

-- in order to save then the value in some format into external file.

Thanks in advance for helping !

Regards,
Kate.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] more dates and data frames

2010-06-08 Thread Henrique Dallazuanna
Try this:

xtabs(~ V1 + V2, transform(dog3.df, V1 = factor(V1, levels =
  as.character(seq(min(dog3.df$V1), max(dog3.df$V1), by = "days")))))
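For instance, on a small stand-in for dog3.df (made-up rows with a one-day gap), expanding the factor levels to every day in the range makes xtabs emit zero counts for the absent dates:

```r
dog3.df <- data.frame(
  V1 = as.Date(c("2000-01-01", "2000-01-01", "2000-01-03")),
  V2 = c("dog", "cat", "dog")
)

# give V1 a level for every day in the range, so 2000-01-02 shows up as zeros
tab <- xtabs(~ V1 + V2, transform(dog3.df,
  V1 = factor(V1, levels = as.character(seq(min(dog3.df$V1),
                                            max(dog3.df$V1), by = "days")))))
tab
```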


On Tue, Jun 8, 2010 at 4:52 PM, Erin Hodgess wrote:

> Dear R People:
>
> So thanks to your help, I have the following:
>
>
> > dog3.df <-
> read.delim("c:/Users/erin/Documents/dog1.txt",header=FALSE,sep="\t")
> > dog3.df
> V1   V2
> 1  1/1/2000  dog
> 2  1/1/2000  cat
> 3  1/1/2000 tree
> 4  1/1/2000  dog
> 5  1/2/2000  cat
> 6  1/2/2000  cat
> 7  1/2/2000  cat
> 8  1/2/2000 tree
> 9  1/3/2000  dog
> 10 1/3/2000 tree
> 11 1/6/2000  dog
> 12 1/6/2000  cat
> > dog3.df$V1 <- as.Date(dog3.df$V1,"%m/%d/%Y")
> > DF3 <- with(dog3.df,data.frame(Date=V1,V2,1))
> > library(reshape)
> > cast(formula=Date~V2,data=DF3,value="X1",fill=0)
> Aggregation requires fun.aggregate: length used as default
>         Date cat dog tree
> 1 2000-01-01   1   2    1
> 2 2000-01-02   3   0    1
> 3 2000-01-03   0   1    1
> 4 2000-01-06   1   1    0
> >
>
> So far, so good.  My new question:  Can I fill in the days which are
> "missing"; i.e., 2000-01-04 and 2000-01-05, with zeros for each set,
> please?
>
> thanks,
> Erin
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] partial matches across rows not columns

2010-06-08 Thread RCulloch

Hi R users,

I am trying to omit rows of data based on partial matches; an example of my
data (seal_dist) is below:

A quick breakdown of my coding and why I need this: I am dealing with a
colony of seals where, for example, A1 is a female with a pup and A1.1 is
that female's pup. The important column here is DIST, which gives the
distance between one seal (ID) and another (TO_ID). I want to take a mean
of these data for a nearest-neighbour analysis, but I want to omit any
case where the distance is between a female and her own pup, i.e. omit
rows where A1 and A1.1 occur together.

I have looked at grep and pmatch but these appear to work across columns and
don't appear to do what I'm looking to do, 

If anyone can point me in the right direction, I'd be most grateful,

Best wishes,

Ross 


   FROM TO    DIST    ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
2     1  2 4.81803    A1  1 30  9  9      1   1 MALE1        12
3     1  3 2.53468    A1  1 30  9  9      1   1    A2         3
4     1  4 7.57332    A1  1 30  9  9      1   1  A1.1         7
5     1  1 7.57332  A1.1  1 30  9  9      7   1    A1         1
6     1  2 7.89665  A1.1  1 30  9  9      7   1 MALE1        12
7     1  3 6.47847  A1.1  1 30  9  9      7   1    A2         3
9     1  1 2.53468    A2  1 30  9  9      3   1    A1         1
10    1  2 2.59051    A2  1 30  9  9      3   1 MALE1        12
12    1  4 6.47847    A2  1 30  9  9      3   1  A1.1         7
13    1  1 4.81803 MALE1  1 30  9  9     12   1    A1         1
15    1  3 2.59051 MALE1  1 30  9  9     12   1    A2         3
16    1  4 7.89665 MALE1  1 30  9  9     12   1  A1.1         7
17    1  1 3.85359    A1  2 30  9  9      1   1 MALE1        12
19    1  3 4.88826    A1  2 30  9  9      1   1    A2         3
20    1  4 7.25773    A1  2 30  9  9      1   1  A1.1         7
21    1  1 9.96431  A1.1  2 30  9  9      7   1 MALE1        12
22    1  2 7.25773  A1.1  2 30  9  9      7   1    A1         1
23    1  3 5.71725  A1.1  2 30  9  9      7   1    A2         3
25    1  1 8.73759    A2  2 30  9  9      3   1 MALE1        12
26    1  2 4.88826    A2  2 30  9  9      3   1    A1         1
28    1  4 5.71725    A2  2 30  9  9      3   1  A1.1         7
30    1  2 3.85359 MALE1  2 30  9  9     12   1    A1         1
31    1  3 8.73759 MALE1  2 30  9  9     12   1    A2         3
32    1  4 9.96431 MALE1  2 30  9  9     12   1  A1.1         7
33    1  1 7.95399    A1  3 30  9  9      1   1 MALE1        12
35    1  3 0.60443    A1  3 30  9  9      1   1  A1.1         7
36    1  4 1.91136    A1  3 30  9  9      1   1    A2         3
37    1  1 8.29967  A1.1  3 30  9  9      7   1 MALE1        12
38    1  2 0.60443  A1.1  3 30  9  9      7   1    A1         1
40    1  4 1.43201  A1.1  3 30  9  9      7   1    A2         3
41    1  1 9.71659    A2  3 30  9  9      3   1 MALE1        12
42    1  2 1.91136    A2  3 30  9  9      3   1    A1         1
43    1  3 1.43201    A2  3 30  9  9      3   1  A1.1         7
46    1  2 7.95399 MALE1  3 30  9  9     12   1    A1         1
47    1  3 8.29967 MALE1  3 30  9  9     12   1  A1.1         7
48    1  4 9.71659 MALE1  3 30  9  9     12   1    A2         3
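One possible direction, sketched on a few toy rows (this assumes a pup's ID is always its mother's ID with ".1" appended, as in A1/A1.1): compare ID and TO_ID directly rather than pattern-matching within a single column.

```r
# toy rows standing in for seal_dist, keeping only the relevant columns
seal <- data.frame(ID    = c("A1",   "A1.1", "A2",  "MALE1"),
                   TO_ID = c("A1.1", "A1",   "A1",  "A2"),
                   DIST  = c(7.57332, 7.57332, 2.53468, 2.59051),
                   stringsAsFactors = FALSE)

# a row links a mother and her pup if either ID equals the other plus ".1"
mother.pup <- with(seal,
                   TO_ID == paste(ID, ".1", sep = "") |
                   ID    == paste(TO_ID, ".1", sep = ""))

seal[!mother.pup, ]           # the two A1 / A1.1 rows are dropped
mean(seal$DIST[!mother.pup])  # nearest-neighbour mean without mother-pup pairs
```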
-- 
View this message in context: 
http://r.789695.n4.nabble.com/partial-matches-across-rows-not-columns-tp2247757p2247757.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] iterating over groups of columns

2010-06-08 Thread 09wkj
I am mainly a Java/C++ programmer, so my mind is used to iterating over data 
with for loops. After a long break, I am trying to get back into the "R 
mindset", but I could not find a solution in the documentation for the applys, 
aggregate, or by.

I have a data.frame where each row is an entry with 10 groups of measurements. 
The first measurement spans 1 column, the second spans 2 columns, third 3, and 
so on (55 total columns). What I want to do is add to my data.frame 10 new 
columns containing the minimum value of each measurement.

dim(the.data)
[1] 1679  55

> colnames(the.data)
  [1] "k.1.1"   "k.2.1"   "k.2.2"   "k.3.1"   "k.3.2"   "k.3.3"   "k.4.1"  
  [8] "k.4.2"   "k.4.3"   "k.4.4"   "k.5.1"   "k.5.2"   "k.5.3"   "k.5.4"  
 [15] "k.5.5"   "k.6.1"   "k.6.2"   "k.6.3"   "k.6.4"   "k.6.5"   "k.6.6"  
 [22] "k.7.1"   "k.7.2"   "k.7.3"   "k.7.4"   "k.7.5"   "k.7.6"   "k.7.7"  
 [29] "k.8.1"   "k.8.2"   "k.8.3"   "k.8.4"   "k.8.5"   "k.8.6"   "k.8.7"  
 [36] "k.8.8"   "k.9.1"   "k.9.2"   "k.9.3"   "k.9.4"   "k.9.5"   "k.9.6"  
 [43] "k.9.7"   "k.9.8"   "k.9.9"   "k.10.1"  "k.10.2"  "k.10.3"  "k.10.4" 
 [50] "k.10.5"  "k.10.6"  "k.10.7"  "k.10.8"  "k.10.9"  "k.10.10"

I want to add to the.data new columns: min.k.1, min.k.2, ..., min.k.10

This is the section of code I would like to improve, hopefully getting rid of 
the eval and the for loop:

for(k in 1:10){
s <- subset(the.data, select=paste("k", k, 1:k, sep="."))
eval(parse(text = paste("the.data$min.k.", k, "<-as.vector(by(s, 1:nrow(s), 
min))", sep="")))
}
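For reference, one loop- and eval-free way to sketch the same computation (the data frame below is a made-up stand-in with three groups, since the real the.data is not available; sapply over the groups plus apply over the rows is just one of several possible approaches):

```r
# toy version of the.data using the same k.<group>.<index> naming, 3 groups
set.seed(1)
the.data <- as.data.frame(matrix(runif(5 * 6), nrow = 5))
names(the.data) <- c("k.1.1", "k.2.1", "k.2.2", "k.3.1", "k.3.2", "k.3.3")

# row-wise minimum over each group's columns; no eval/parse needed
mins <- sapply(1:3, function(k) {
  cols <- paste("k", k, 1:k, sep = ".")
  apply(the.data[, cols, drop = FALSE], 1, min)
})
colnames(mins) <- paste("min.k", 1:3, sep = ".")

the.data <- cbind(the.data, mins)
```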

Thanks for any help,
Bill
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GEE: estimate of predictor with high time dependency

2010-06-08 Thread Charles C. Berry

On Tue, 8 Jun 2010, Sachi Ito wrote:


Hi,

I'm analyzing my data using GEE, which looks like below:


interact <- geeglm(L ~ O + A + O:A,

+ data = data1, id = id,
+ family = binomial, corstr = "ar1")


summary(interact)


Call:
geeglm(formula = lateral ~ ontask + attachment + ontask:attachment,
   family = binomial, data = firstgroupnowalking, id = id, corstr = "ar1")

Coefficients:
  Estimate  Std.err  Wald Pr(>|W|)
(Intercept)-1.89133  0.30363 38.80  4.7e-10 ***
O0.00348  0.00100 12.03  0.00052 ***
A1  -0.21729  0.37350  0.34  0.56073
A2  -0.14151  0.43483  0.11  0.74486
O:A1   -0.37540  0.16596  5.12  0.02370 *
O:A2   -0.27626  0.16651  2.75  0.09708 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Estimated Scale Parameters:
   Estimate Std.err
(Intercept) 1.27   0.369

Correlation: Structure = ar1  Link = identity

Estimated Correlation Parameters:
 Estimate Std.err
alpha0.979 0.00586
Number of clusters:   49   Maximum cluster size: 533



I decided to use auto-regression as the correlation structure because of the
high auto-correlation of the dependent variable, "L".  However, because one
of the predictors, "O", also has high time dependency (high
autocorrelation), the estimate of "O" (0.00348) seems to be too small.  In
this case, how shall I interpret the parameter?


First off, do you know how to interpret main effects in the presence of an 
interaction involving them?? I suspect not, but feel free to offer 
evidence to the contrary and then tell us why discussing 'the estimate of 
"O"' is sensible.


Secondly, without much more detail on the data it is hard to know what to 
make of a question like this even if the business of main 
effects/interactions is handled. As suggested, providing a minimal, 
reproducible example of R code will go a long way.


Chuck


Should I be using another analysis, such as loglm?

Thank you in advance for your help!

Sachi

[[alternative HTML version deleted]]




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] GEE: estimate of predictor with high time dependency

2010-06-08 Thread Sachi Ito
Hi,

I'm analyzing my data using GEE, which looks like below:

> interact <- geeglm(L ~ O + A + O:A,
+ data = data1, id = id,
+ family = binomial, corstr = "ar1")

> summary(interact)

Call:
geeglm(formula = lateral ~ ontask + attachment + ontask:attachment,
family = binomial, data = firstgroupnowalking, id = id, corstr = "ar1")

 Coefficients:
   Estimate  Std.err  Wald Pr(>|W|)
(Intercept)-1.89133  0.30363 38.80  4.7e-10 ***
O0.00348  0.00100 12.03  0.00052 ***
A1  -0.21729  0.37350  0.34  0.56073
A2  -0.14151  0.43483  0.11  0.74486
O:A1   -0.37540  0.16596  5.12  0.02370 *
O:A2   -0.27626  0.16651  2.75  0.09708 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Estimated Scale Parameters:
Estimate Std.err
(Intercept) 1.27   0.369

Correlation: Structure = ar1  Link = identity

Estimated Correlation Parameters:
  Estimate Std.err
alpha0.979 0.00586
Number of clusters:   49   Maximum cluster size: 533



I decided to use auto-regression as the correlation structure because of the
high auto-correlation of the dependent variable, "L".  However, because one
of the predictors, "O", also has high time dependency (high
autocorrelation), the estimate of "O" (0.00348) seems to be too small.  In
this case, how shall I interpret the parameter?  Should I be using another
analysis, such as loglm?

Thank you in advance for your help!

Sachi

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] more dates and data frames

2010-06-08 Thread Erin Hodgess
Dear R People:

So thanks to your help, I have the following:


> dog3.df <- 
> read.delim("c:/Users/erin/Documents/dog1.txt",header=FALSE,sep="\t")
> dog3.df
 V1   V2
1  1/1/2000  dog
2  1/1/2000  cat
3  1/1/2000 tree
4  1/1/2000  dog
5  1/2/2000  cat
6  1/2/2000  cat
7  1/2/2000  cat
8  1/2/2000 tree
9  1/3/2000  dog
10 1/3/2000 tree
11 1/6/2000  dog
12 1/6/2000  cat
> dog3.df$V1 <- as.Date(dog3.df$V1,"%m/%d/%Y")
> DF3 <- with(dog3.df,data.frame(Date=V1,V2,1))
> library(reshape)
> cast(formula=Date~V2,data=DF3,value="X1",fill=0)
Aggregation requires fun.aggregate: length used as default
        Date cat dog tree
1 2000-01-01   1   2    1
2 2000-01-02   3   0    1
3 2000-01-03   0   1    1
4 2000-01-06   1   1    0
>

So far, so good.  My new question:  Can I fill in the days which are
"missing"; i.e., 2000-01-04 and 2000-01-05, with zeros for each set,
please?

thanks,
Erin

-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hypergeometric series in R

2010-06-08 Thread Ben Bolker
Arnau Mir  uib.es> writes:

> 
> Hello.
> 
> Somebody knows how to compute generalized hypergeometric series in R?
> (see 
> http://functions.wolfram.com/HypergeometricFunctions/HypergeometricPFQ/02/ 
> to understand what I mean)

library(sos)
findFn("generalized hypergeometric function")

 -> see the hypergeo package, and the Hyperg function in the gsl package.
 You'll have to work out for yourself whether the particular version and
parameterization match what you need.
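As a cross-check on whichever package is chosen, the series itself can be summed directly in base R. A naive truncated sum (only sensible where the series converges; the function name genhyper is made up here):

```r
# pFq(a; b; z) = sum_n  prod_i (a_i)_n / prod_j (b_j)_n * z^n / n!
# built via the term ratio t_{n+1} = t_n * prod(a + n) / prod(b + n) * z / (n + 1)
genhyper <- function(a, b, z, nmax = 200) {
  term  <- 1       # the n = 0 term is always 1
  total <- term
  for (n in 0:(nmax - 1)) {
    term  <- term * prod(a + n) / prod(b + n) * z / (n + 1)
    total <- total + term
  }
  total
}

# sanity checks against closed forms:
genhyper(numeric(0), numeric(0), 0.5)  # 0F0(z) = exp(z)
genhyper(1, numeric(0), 0.25)          # 1F0(1; z) = 1/(1 - z)
```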

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] scatterplot function - double check: dashed lines

2010-06-08 Thread Greg Snow
While it is possible to set your own dash patterns as you show below, it is 
unlikely that the resulting graph will be very meaningful.  Most people cannot 
keep the detailed dash patterns separate, and if they need to refer to a legend 
then it makes it even harder (See Bert Gunter's rant on the "Symbols in R" 
thread).

If the curves are distinct enough it is best to just have them all black and 
solid and label the different lines.  If that does not work then maybe using 
trellis style graphs where each line is in its own panel would be better (and 
still solid).

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of K F Pearce
> Sent: Tuesday, June 08, 2010 4:45 AM
> To: r-help@r-project.org
> Subject: [R] scatterplot function - double check: dashed lines
> 
> Hello everyone,
> 
> This is just a quick double check.  It concerns the 'scatterplot
> function' in R.
> 
>  I have 6 curves and I wish to represent each of them by a different
> kind of line (their colour must be black).
> 
> The curves are derived from the cuminc function...the coordinates of
> which are in 'xx'.
> 
> Upon reading the documentation in R, it looks like I can use the
> 'on/off' command for lty and I can merely run:
> 
> plot(xx,color="black",lty=c("11","13","1343","73","2262","35"))
> 
> according to the documentation, "13" (for example) means 1 'on' and 3
> 'off' .
> 
> Does the above look OK ?
> 
> Say, in another study, I wish to draw my 6 lines all in different
> colours (solid lines), I suppose that I could type:
> 
> plot(xx, color=c("red","black","purple","yellow","brown","violet"),
> lty=1)
> 
> Thanks so much for your help,
> All the Best,
> Kim
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ols function in rms package

2010-06-08 Thread Mark Seeto

> On 06/08/2010 05:29 AM, Mark Seeto wrote:
>>
>>> On 06/06/2010 10:49 PM, Mark Seeto wrote:
 Hello,

 I have a couple of questions about the ols function in Frank Harrell's
 rms
 package.

 Is there any way to specify variables by their column number in the
 data
 frame rather than by the variable name?

 For example,

 library(rms)
 x1<- rnorm(100, 0, 1)
 x2<- rnorm(100, 0, 1)
 x3<- rnorm(100, 0, 1)
 y<- x2 + x3 + rnorm(100, 0, 5)
 d<- data.frame(x1, x2, x3, y)
 rm(x1, x2, x3, y)
 lm(y ~ d[,2] + d[,3], data = d)  # This works
 ols(y ~ d[,2] + d[,3], data = d) # Gives error
 Error in if (!length(fname) || !any(fname == zname)) { :
 missing value where TRUE/FALSE needed

 However, this works:
 ols(y ~ x2 + d[,3], data = d)

 The reason I want to do this is to program variable selection for
 bootstrap model validation.

 A related question: does ols allow "y ~ ." notation?

 lm(y ~ ., data = d[, 2:4])  # This works
 ols(y ~ ., data = d[, 2:4]) # Gives error
 Error in terms.formula(formula) : '.' in formula and no 'data'
 argument

 Thanks for any help you can give.

 Regards,
 Mark
>>>
>>> Hi Mark,
>>>
>>> It appears that you answered the questions yourself.  rms wants real
>>> variables or transformations of them.  It makes certain assumptions
>>> about names of terms.   The y ~ . should work though; sometime I'll
>>> have
>>> a look at that.
>>>
>>> But these are the small questions compared to what you really want.
>>> Why
>>> do you need variable selection, i.e., what is wrong with having
>>> insignificant variables in a model?  If you indeed need variable
>>> selection see if backwards stepdown works for you.  It is built-in to
>>> rms bootstrap validation and calibration functions.
>>>
>>> Frank
>>>
>>
>> Thank you for your reply, Frank. I would have reached the conclusion
>> that rms only accepts real variables had this not worked:
>> ols(y ~ x2 + d[,3], data = d)
>
> Hi Mark - that probably worked by accident.
>
>>
>> The reason I want to program variable selection is so that I can use the
>> bootstrap to check the performance of a model-selection method. My
>> co-workers and I have used a variable selection method which combines
>> forward selection, backward elimination, and best subsets (the forward
>> and
>> backward methods were run using different software).
>>
>> I want to do bootstrap validation to (1) check the over-optimism in R^2,
>> and (2) justify using a different approach, if R^2 turns out to be very
>> over-optimistic. The different approach would probably be data reduction
>> using variable clustering, as you describe in your book.
>

Again, thanks for your reply Frank.

> The validate.ols function which calls the predab.resample function may
> give you some code to start with.  Note however that the performance of
> the approach you are suggestion has already been shown to be poor in
> many cases.

I think I've worked out how to program forward selection, and backward
elimination should be no more difficult. Other reasons for doing this are
(1) while I believe your criticism of stepwise variable selection methods,
it would be good to verify their poor performance for myself, and (2) I've
done very little programming in R, so it's good practice.

>You might run the following in parallel: full model fits
> and penalized least squares using penalties selected by AIC (using
> special arguments to ols along with the pentrace function).
>
> Frank

Thanks for the suggestion; I'll try this.

Regards,
Mark
-- 
Mark Seeto
Statistician

National Acoustic Laboratories 
A Division of Australian Hearing

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Rsymphony

2010-06-08 Thread Kaveh Vakili
Hi list,

I have two question relating to the Rsymphony package:

a)  is there a way to use 'warm starts' ?
i.e. if i successively call Rsymphony_solve_LP()
with only one constraints change, does it perform the optimization
from the simplex-origin or does it uses the previously found solution
[i.e. as is common usage, see for instance lpsolveAPI]

b) to what extend are sparse matrix supported ?
i.e. if a constraint matrix is set as 
cbind(1,x,diag(dim(x)[1]),-diag(dim(x)[1])) is it understood as sparse ?


Best,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] constructing a contingency table (ftable, table)

2010-06-08 Thread stefan.d...@gmail.com
Hi Joris,
thanks for your help. I just had to alter it slightly (basically just
transposing):

tmp <- array(rbind(t(test),t(test2)),
   dim=c(9,2,2),
   dimnames=list(colnames(test),rownames(test),c("Test","Test2")))
 ftable(tmp)

Thanks again!
Best,
Stefan


On Tue, Jun 8, 2010 at 5:46 PM, Joris Meys  wrote:
> I could get something close to what you asked using a little hack,
> emulating a table using an array based on your two matrices :
>
> test <- matrix(rpois(18,10),ncol=9,nrow=2)
> colnames(test) <- paste("Dis",1:9,sep="")
> rownames(test) <- c("2010","2020")
>
> test2 <- matrix(rpois(18,10),ncol=9,nrow=2)
> colnames(test2) <- paste("Dis",1:9,sep="")
> rownames(test2) <- c("2010","2020")
>
> tmp <- array(cbind(test,test2),
>        dim=c(2,9,2),
>        dimnames=list(rownames(test),colnames(test),c("Test","Test2")))
>
> ftable(as.table(tmp))
>
> Cheers
> Joris
>
> On Tue, Jun 8, 2010 at 3:42 PM, stefan.d...@gmail.com
>  wrote:
>> Dear all,
>> an hopefully quick table question.
>>
>> I have the following data:
>> Two objects that are 2*9 matrix with nine column names (Dis1, ...,
>> Dis9) and the row names (2010,2020). The content are frequencies
>> (numeric).
>>
>> I want to create a table along the lines of
>> ftable(UCBAdmissions), which should look like this:
>> Dis1 | ... | Dis9
>> 2010 | 2020 | ... | 2010 | 2020
>> (first row, first column is the value of Dis1 in 2010 from the first
>> object; first row, second column is the value of Dis1 in 2020 from the
>> first object;
>> second row, first column is the value of Dis1 in 2010 from the second
>> object; second row, second column is the value of Dis1 in 2020 from the
>> second object;
>> and so on)
>>
>> So basically what ftable does. But I do not understand how I can turn
>> my two matrices (which already contain the frequencies) into the
>> appropriate table object (if that's the way to go).
>>
>> Thanks and best,
>> Stefan
>>
>
>
>
> --
> Joris Meys
> Statistical consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Applied mathematics, biometrics and process control
>
> tel : +32 9 264 59 87
> joris.m...@ugent.be
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Symbols in R

2010-06-08 Thread Ted Harding
On 08-Jun-10 18:00:18, Bert Gunter wrote:
> Mount soapbox; begin rant {
> 
> ... However I think it should be added that rarely does this work with
> more than about a half dozen different symbols: a viewer of a graphic
> simply cannot keep the distinctions straight -- or often even decode
> them. Using color to distinguish groups is typically more effective
> (there is actual research to justify this), but of course one then
> runs up against the subjectivity of color perception: aside from the
> obvious (and fairly common) red-green color blindness issue, there's
> also the problem that people perceive different colors differently,
> making some look more prominent than others, which may interfere with
> the decoding of the graphical features.
> 
> In general, in the age of Excel graphics, I would argue that the use of
> point/line characteristics like shape, fill, and color that are decoded
> through legends are much overused and frequently result in visual
> puzzles and illusions, which is the opposite of what a good data
> graphic should do.
> 
> An often better approach is the use of small multiples -- trellis
> graphics -- for which the lattice and ggplot packages provide excellent
> functionality.
> 
> } end rant; dismount soapbox
> 
> Cheers,
> Bert

Hear Hear! Well ranted! About colour, I would add that it is useful
(as Bert hints) only up to the point where the colours remain clearly
distinguishable. When (as one too often does) one sees blobs in
different shades of yellow scattered all over a plot, or in different
shades of purple and magenta, then you cannot see what is what.

You get a similar problem (even with relatively distinct colours)
when the bars of a bar-chart, or the sectors of a pie chart, are in
different colours with the only clue as to what is what being a
"colour key" well off to one side, with no direct visible link to
the graphic, so that the eye has to travel laboriously from one
to the other, searching out the match, and having to consciously
remember which colour is being tracked; by the time you've gone from
chart to key and back again, you have forgotten what it was you looked
at the previous time, so you have been disconnected from the comparison
exercise that is the point of the chart in the first place.

Far better to put the key in a direct visual relationship with the
graphic in the first place (e.g. as labels adjacent to the bars or
sectors, or linked to them by line segments).

Ted
(already booted into rant mode, so "end rant" is an illegal operation)


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 08-Jun-10   Time: 19:27:28
-- XFMail --



Re: [R] POSIXlt objects

2010-06-08 Thread Erik Iverson



Luis Felipe Parra wrote:

Hello, I am using the POSIXlt date format and I am having the following problem:
I've got two dates called FechaIni and FechaFin, one in 2008 and the other
in 2009, but when I do FechaIni$year and FechaFin$year to extract the year I am
getting the same year for both.


FechaIni

[1] "2008-11-13 UTC"

FechaFin

[1] "2009-06-13 UTC"

FechaFin$year

[1] 108

FechaIni$year

[1] 108
Does somebody know what is happening? How can I solve this if I need to
compare just the years of the two dates? Thank you.


Why don't you tell us what the posting guide asks?  We need your 
sessionInfo(), OS, and a reproducible example.


Can you run dput() on your two objects and post the output? (See ?dput.)
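One detail worth checking while preparing that example: POSIXlt stores the year as years since 1900 (see ?DateTimeClasses), so a correctly constructed 2009 date reports 109, not 108. A minimal sketch:

```r
# POSIXlt components: $year counts from 1900, $mon from 0
FechaIni <- as.POSIXlt("2008-11-13", tz = "UTC")
FechaFin <- as.POSIXlt("2009-06-13", tz = "UTC")
FechaIni$year          # 108, i.e. 2008
FechaFin$year          # 109, i.e. 2009
FechaFin$year + 1900   # the calendar year itself
```

Getting 108 for both, as the poster reports, suggests the objects were not built the way the printout implies.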



[R] POSIXlt objects

2010-06-08 Thread Luis Felipe Parra
Hello, I am using the POSIXlt date format and I am having the following problem:
I've got two dates called FechaIni and FechaFin, one in 2008 and the other
in 2009, but when I do FechaIni$year and FechaFin$year to extract the year I am
getting the same year for both.

> FechaIni
[1] "2008-11-13 UTC"
> FechaFin
[1] "2009-06-13 UTC"
> FechaFin$year
[1] 108
> FechaIni$year
[1] 108
Does somebody know what is happening? How can I solve this if I need to
compare just the years of the two dates? Thank you.

Felipe Parra



Re: [R] Symbols in R

2010-06-08 Thread Bert Gunter
Mount soapbox; begin rant {

... However I think it should be added that rarely does this work with more
than about a half dozen different symbols: a viewer of a graphic simply
cannot keep the distinctions straight -- or often even decode them. Using
color to distinguish groups is typically more effective (there is actual
research to justify this), but of course one then runs up against the
subjectivity of color perception: aside from the obvious (and fairly common)
red-green color blindness issue, there's also the problem that people
perceive different colors differently, making some look more prominent than
others, which may interfere with the decoding of the graphical features.

In general, in the age of Excel graphics, I would argue that the use of
point/line characteristics like shape, fill, and color that are decoded
through legends are much overused and frequently result in visual puzzles
and illusions, which is the opposite of what a good data graphic should do.

An often better approach is the use of small multiples -- trellis graphics
-- for which the lattice and ggplot packages provide excellent
functionality.

} end rant; dismount soapbox

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
 
 
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Greg Snow
Sent: Tuesday, June 08, 2010 9:58 AM
To: wenjun zheng; R-help@r-project.org
Subject: Re: [R] Symbols in R

The my.symbols function (TeachingDemos package) allows for defining your own
symbols to use in plots using base graphics (see ms.filled.polygon for an
example), there is also panel.my.symbols which works with lattice (possibly
with general grid, but I have not tested it that way).

Those may give you a starting point.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of wenjun zheng
> Sent: Tuesday, June 08, 2010 7:45 AM
> To: R-help@r-project.org
> Subject: [R] Symbols in R
> 
> Hi R Users,
> 
> I want to distinguish different condition by different symbols by
> pch in
> function grid.points, but the symbols needed should be with solid or
> hollow,
> in this way only 21 to 25 in pch worked, is there any other symbols
> could be
> used like this? or does it exist any other way to draw symbols?
> 
> Any suggestions will be appreciated.
> 
> Best regards.
> 
> --
> Wenjun
> 



[R] hypergeometric series in R

2010-06-08 Thread Arnau Mir

Hello.

Somebody knows how to compute generalized hypergeometric series in R?
(see 
http://functions.wolfram.com/HypergeometricFunctions/HypergeometricPFQ/02/ 
to understand what I mean)


Thanks in advance,

Arnau.
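One option (an assumption on my part, not something mentioned in this thread) is the CRAN package hypergeo, whose genhypergeo() evaluates the generalized series pFq:

```r
# Sketch using the CRAN package 'hypergeo' (an assumed dependency);
# genhypergeo(U, L, z) evaluates pFq with upper parameters U and
# lower parameters L at the point z.
library(hypergeo)
genhypergeo(U = c(1, 2), L = 3, z = 0.5)   # 2F1(1, 2; 3; 1/2)
```

The gsl package also wraps several specific hypergeometric functions if the GSL library is available.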



Re: [R] Convert a data frame to a 2D array?

2010-06-08 Thread A Huang
Thank you. It works like a charm.

A. Huang





From: Joshua Wiley 

Cc: r-help@r-project.org
Sent: Tue, June 8, 2010 9:19:19 AM
Subject: Re: [R] Convert a data frame to a 2D array?

Hi,

Does this work?

array(data=unlist(yourdataframe), dim=c(n,m))


Josh


> Hi there,
>
> I've read a file into a data frame. The data is n rows by m columns, all 
> values are numbers.
> Is there a way to convert the data frame to a 2D array? I tried as.array(), 
> but got some
> error messages.
>
> Thanks
>
> A. Huang
>
>
>



-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/



  


[R] Need help in multi-dimensional clustering

2010-06-08 Thread kapil mahant
Hi All , 

For an academic project I am trying to do the following 

Step 1) Draw and cluster an N-column (let's say 3) dataset with the dbscan
algorithm from R's fpc package (let's say these are the "training
clusters"),
   using dbscan because the number of clusters is not known beforehand.
Step 2) Once that is done I want to spread some new
data points in the above plot space (let's say these are the "test points").
Step 3) Find out which "test points" lie within a boundary (a kind of
convex hull) of any of the training clusters discovered above.

I am done with Step 1 but pretty much stuck after that.
Does anyone know how to proceed on the above?

Below is the code i am using for Step 1 ) 
---
library('fpc')
x <- read.table("file.dat" , sep=",")
d <- dbscan(x, eps = 1.9, MinPts = 3, showplot = 2)
plot(d, x)  # note: plot(), not Plot() -- R is case-sensitive
---
  
I am fairly new to R (started a week back), but I have covered quite a few
online resources without success.
If anyone knows of a better approach to what I am trying to achieve, then
please let me know.

Thanks & Regards
Kapil 
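For steps 2 and 3, one possible sketch (my suggestion, not from the thread): take the convex hull of each training cluster with chull() and test membership with point.in.polygon() from the sp package (an assumed dependency; the data below are simulated stand-ins):

```r
# Convex-hull membership test with chull() and sp::point.in.polygon()
library(sp)
set.seed(42)
train <- cbind(rnorm(50), rnorm(50))        # one "training cluster"
tests <- rbind(c(0, 0), c(5, 5))            # two "test points"
hull  <- train[chull(train), ]              # hull vertices, in order
inside <- point.in.polygon(tests[, 1], tests[, 2],
                           hull[, 1], hull[, 2]) > 0
inside   # TRUE where a test point falls inside or on the hull
```

Applied once per training cluster (the labels live in d$cluster after the fpc fit; label 0 is noise), this answers step 3.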




Re: [R] Symbols in R

2010-06-08 Thread Greg Snow
The my.symbols function (TeachingDemos package) allows for defining your own 
symbols to use in plots using base graphics (see ms.filled.polygon for an 
example), there is also panel.my.symbols which works with lattice (possibly 
with general grid, but I have not tested it that way).

Those may give you a starting point.
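As a base-graphics sketch of the solid/hollow distinction the original question asks about (grid's grid.points accepts the same pch codes): pch 21 to 25 take a border colour (col) and a separate fill (bg), so the same five shapes can be drawn filled or hollow.

```r
# pch 21:25 have independent border (col) and fill (bg) colours
plot(1:5, rep(1.5, 5), pch = 21:25, col = "black", bg = "grey",
     cex = 2, xlim = c(0, 6), ylim = c(0, 2), axes = FALSE,
     xlab = "", ylab = "")
# same symbols, hollow: bg = NA leaves the interior empty
points(1:5, rep(0.5, 5), pch = 21:25, col = "black", bg = NA, cex = 2)
```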

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of wenjun zheng
> Sent: Tuesday, June 08, 2010 7:45 AM
> To: R-help@r-project.org
> Subject: [R] Symbols in R
> 
> Hi R Users,
> 
> I want to distinguish different condition by different symbols by
> pch in
> function grid.points, but the symbols needed should be with solid or
> hollow,
> in this way only 21 to 25 in pch worked, is there any other symbols
> could be
> used like this? or does it exist any other way to draw symbols?
> 
> Any suggestions will be appreciated.
> 
> Best regards.
> 
> --
> Wenjun
> 


Re: [R] Help with seting up comparison

2010-06-08 Thread Greg Snow
You really need to do some studying on mixed effects models.  Some resources 
are at: http://lme4.r-forge.r-project.org/

Your formula below is wrong, you fit only the intercept as a fixed effect and 
you are fitting a random slope on animal by day for the random effect, which 
does not make much sense, you want animal as the random piece, so it goes to 
the right of the '|'.  You also need to decide whether to fit only random 
intercepts (same slope/differences in each animal, just at a different 
intercept) or include the possibility of random slopes (each animal has 
different slopes, fixed effect is an average).

Your formula should probably be something more like: Count ~ Day + (1|Animal)
Or: Count ~ Day + (Day|Animal)

Further questions would probably be best on mixed effects SIG.
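The two candidate fits above can be sketched as follows, on simulated data shaped like the poster's (35 animals, 4 days, 4 counts each; requires lme4):

```r
library(lme4)
set.seed(1)
dat <- data.frame(Animal = factor(rep(1:35, each = 16)),
                  Day    = rep(rep(1:4, each = 4), 35),
                  Count  = rnorm(35 * 16, mean = 100))
fit1 <- lmer(Count ~ Day + (1 | Animal), data = dat)    # random intercepts
fit2 <- lmer(Count ~ Day + (Day | Animal), data = dat)  # random slopes too
anova(fit1, fit2)   # likelihood-ratio comparison of the two structures
```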

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

From: Bart Joosen [mailto:bartjoo...@hotmail.com]
Sent: Monday, June 07, 2010 11:37 PM
To: Greg Snow; r-help
Subject: RE: [R] Help with seting up comparison

Greg,


the animals are a sample of a larger population, as you guessed.

I used lmer to estimate the effects:


> lmer(Count~Animal | Day, dat)
Linear mixed model fit by REML
Formula: Count ~ Animal | Day
Data: dat
AIC BIC logLik deviance REMLdev
1554 1574 -772 1542 1544
Random effects:
Groups Name Variance Std.Dev. Corr
Day (Intercept) 1.7707e-02 0.1330678
Animal 4.0287e-05 0.0063472 1.000
Residual 2.0917e+00 1.4462790
Number of obs: 430, groups: Day, 4

Fixed effects:
Estimate Std. Error t value
(Intercept) 4.2423 0.1257 33.76


But how does this help me to state that there is no effect within an animal?
anova doesn't seems to work (gives an empty table)
I'm sorry, but I have no experience with lme models, only lm.


Thanks for your time


Bart



> From: greg.s...@imail.org
> To: bartjoo...@hotmail.com; r-help@r-project.org
> Date: Mon, 7 Jun 2010 14:54:47 -0600
> Subject: RE: [R] Help with seting up comparison
>
> Are you interested in only those 35 animals (not every going to look at any 
> other animals other than those 35, but you want to predict what will happen 
> for those 35)? Or are the 35 animals a sample of a larger population of 
> animals?
>
> If the later (seems the most likely case) then you probably want to use a 
> mixed effects model (nlme or lme4 packages) with animal as a random effect, 
> then just look at the fixed effect of day.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> > project.org] On Behalf Of Bart Joosen
> > Sent: Monday, June 07, 2010 9:14 AM
> > To: r-help@r-project.org
> > Subject: [R] Help with seting up comparison
> >
> >
> > Hi,
> >
> > I tried on this, but couldn't figure it out:
> > Suppose I have a dataframe as follows:
> > dat <- data.frame(Day=rep(rep(c(1,2), each=4),2), Animal = rep(c(1,2),
> > each=8), Count=c(rnorm(8, 100), rnorm(8,90)))
> >
> > 2 animals are being examined on 2 different days. Count is the result.
> >
> > Now I need to point out whether or not there is a difference between
> > the
> > days.
> > I did this by an ANOVA test, while first converting the animal and day
> > to a
> > factor variable:
> > dat$Animal <- as.factor(dat$Animal)
> > dat$Day <- as.factor(dat$Day)
> > mod <- lm(Count ~Animal * Day,dat)
> > anova(mod)
> >
> > Now I have to check for difference within the animal, to see if there
> > is a
> > difference in count for each day. (In my real data, I have 35 animals,
> > with
> > 4 days, and 4 results).
> > I thought about a Tukey HSD test, but this compares every day of every
> > animal with every other day of every other animal. (TukeyHSD(aov(mod)))
> >
> > Any idea about which function (or model for lm) to use to only compare
> > days
> > within every animal?
> >
> > Best regards
> >
> > Bart
> >
> >
> > --
> > View this message in context: http://r.789695.n4.nabble.com/Help-with-
> > seting-up-comparison-tp2246106p2246106.html
> > Sent from the R help mailing list archive at Nabble.com.
> >


Re: [R] add one point to contourplot()

2010-06-08 Thread Thomas Stewart
Because contourplot comes from the lattice package, I think you'll want to
look at these help pages:

+ help(trellis.focus)
+ help(lpoints)

Below, I've used the example from help(contourplot) to demonstrate how one
might add points and text to a lattice plot.

-tgs

#
# FROM help(contourplot)

require(stats)
attach(environmental)
ozo.m <- loess((ozone^(1/3)) ~ wind * temperature * radiation,
   parametric = c("radiation", "wind"), span = 1, degree = 2)
w.marginal <- seq(min(wind), max(wind), length.out = 50)
t.marginal <- seq(min(temperature), max(temperature), length.out = 50)
r.marginal <- seq(min(radiation), max(radiation), length.out = 4)
wtr.marginal <- list(wind = w.marginal, temperature = t.marginal,
radiation = r.marginal)
grid <- expand.grid(wtr.marginal)
grid[, "fit"] <- c(predict(ozo.m, grid))
contourplot(fit ~ wind * temperature | radiation, data = grid,
cuts = 10, region = TRUE,
xlab = "Wind Speed (mph)",
ylab = "Temperature (F)",
main = "Cube Root Ozone (cube root ppb)")
detach()

###
# Add text and points

trellis.focus("panel", 1, 1, highlight=T)

lpoints(c(10,15),c(70,70))
ltext(10,70,"New Point",pos=3)
ltext(15,70,"New Point",pos=1)

trellis.unfocus()

On Tue, Jun 8, 2010 at 7:57 AM, biostat book  wrote:

> Hi All,
> I want to add one point to contourplot().  I used contourplot() in my code
> like
> contourplot(z ~ a + b |c, data)
> I understand there is plot.axes argument for filled.contour(), but it  did
> not work for my code. I also tried plot() and text() for contourplot(), but
> got this error: "plot.new has not been called yet"
> Any suggestion for adding a point in contourplot()?
> Thank you very much!
> Lily D
>
>
>
>



Re: [R] Convert a data frame to a 2D array?

2010-06-08 Thread Joris Meys
test <- data.frame(x=1:10,y=1:10)
as.array(as.matrix(test))

does the job too
Cheers
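For purely numeric frames, data.matrix() is another base-R route (a sketch):

```r
# data.matrix() coerces an all-numeric data frame straight to a matrix
test <- data.frame(x = 1:10, y = 1:10)
m <- data.matrix(test)
is.matrix(m)   # TRUE
dim(m)         # 10 2
```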

On Tue, Jun 8, 2010 at 6:07 PM, A Huang  wrote:
> Hi there,
>
> I've read a file into a data frame. The data is n rows by m columns, all 
> values are numbers.
> Is there a way to convert the data frame to a 2D array? I tried as.array(), 
> but got some
> error messages.
>
> Thanks
>
> A. Huang
>
>
>



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [R] Convert a data frame to a 2D array?

2010-06-08 Thread Joshua Wiley
Hi,

Does this work?

array(data=unlist(yourdataframe), dim=c(n,m))


Josh

On Tue, Jun 8, 2010 at 9:07 AM, A Huang  wrote:
> Hi there,
>
> I've read a file into a data frame. The data is n rows by m columns, all 
> values are numbers.
> Is there a way to convert the data frame to a 2D array? I tried as.array(), 
> but got some
> error messages.
>
> Thanks
>
> A. Huang
>
>
>



-- 
Joshua Wiley
Senior in Psychology
University of California, Riverside
http://www.joshuawiley.com/



[R] Convert a data frame to a 2D array?

2010-06-08 Thread A Huang
Hi there,

I've read a file into a data frame. The data is n rows by m columns, all values 
are numbers.
Is there a way to convert the data frame to a 2D array? I tried as.array(), but 
got some 
error messages. 

Thanks

A. Huang


  


Re: [R] constructing a contingency table (ftable, table)

2010-06-08 Thread Joris Meys
I could get something close to what you asked using a little hack,
emulating a table using an array based on your two matrices :

test <- matrix(rpois(18,10),ncol=9,nrow=2)
colnames(test) <- paste("Dis",1:9,sep="")
rownames(test) <- c("2010","2020")

test2 <- matrix(rpois(18,10),ncol=9,nrow=2)
colnames(test2) <- paste("Dis",1:9,sep="")
rownames(test2) <- c("2010","2020")

tmp <- array(cbind(test,test2),
dim=c(2,9,2),
dimnames=list(rownames(test),colnames(test),c("Test","Test2")))

ftable(as.table(tmp))

Cheers
Joris

On Tue, Jun 8, 2010 at 3:42 PM, stefan.d...@gmail.com
 wrote:
> Dear all,
> a hopefully quick table question.
>
> I have the following data:
> Two objects that are 2*9 matrix with nine column names (Dis1, ...,
> Dis9) and the row names (2010,2020). The content are frequencies
> (numeric).
>
> I want to create a table that is along the lines of
> ftable(UCBAdmissions) and it should look like this:
> Dis1 | ...| Dis9
> 2010|2020||2010|2020
> (first row,first column is the value of Dis1 in 2010 from first
> object)| (first row,second column is the value of Dis1 in 2020 from
> first object)| 
> (second row,first column is the value of Dis1 in 2010 from second
> object)| (first row,second column is the value of Dis1 in 2020 from
> second object)| 
> and so on
>
> So basically what ftable does. But I do not understand how I can turn
> my two matrices (which already contain the frequencies) into the
> appropriate table object (if that's the way to go).
>
> Thanks and best,
> Stefan
>



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



[R] GMM: "The covariance matrix of the coefficients is singular"

2010-06-08 Thread Fernando Martinez

Hi All,

I'm trying to estimate some parameters in my model via GMM using the 
function gmm(), but I keep getting the message "The covariance matrix of 
the coefficients is singular". I've changed the moment conditions and 
the initial value of the parameters, and I still get this message. Are 
the results valid after receiving this message? Any ideas on how to get 
rid of it?


My code:

@@@

Function g, where E[g]=0:


g <- function(theta,x){
 N <- length(x)

 #Transform parameters in order to make them x>0, x<0 or 0
 m4 <- nt1^4 - 
(-2*prob*mu_s^4-12*mu_s^2*sigma_s^2*prob+3*sigma^4-6*prob*sigma_s^4)/(1-2*prob)


 m5 <- nt1^5

 m6 <- nt1^6 - 
(-2*prob*mu_s^6-30*prob*mu_s^4*sigma_s^2-90*prob*mu_s^2*sigma_s^4+15*sigma^6-30*prob*sigma_s^6)/(1-2*prob)




 g <- cbind(m1, m2, m3, m4, m5, m6)

 return(g)

}



Initial value for the parameters:

tet0
[1] 12.1824940 -0.7408182 20.0855369  0.5000250  2.0137527  2.2255409


Calling the function:

res <- gmm(g,x,tet0)
Warning message:
In FinRes.baseGmm.res(z, Model_info) :
 The covariance matrix of the coefficients is singular





Cheers,
Fernando



Re: [R] glm output for binomial family

2010-06-08 Thread Douglas Bates
On Tue, Jun 8, 2010 at 7:10 AM, Enrico Colosimo  wrote:
> Hello,
>  I am having some trouble running a very simple
> example. I am running a logistic regression entering the SAME data set
> in two different forms and getting different values for the deviance residual.
>
> Just look with this naive data set:
>
> 
> # 1- Entering as a Bernoulli data set
>  y<-c(1,0,1,1,0)
>  x<-c(2,2,5,5,8)
>  ajust1<-glm(y~x,family=binomial(link="logit"))
>  ajust1
> #
> Coefficients:
> (Intercept)            x
>     1.3107      -0.2017
>
> Degrees of Freedom: 4 Total (i.e. Null);  3 Residual
> Null Deviance:      6.73
> Residual Deviance: 6.491        AIC: 10.49
> #
> # 2- Entering as Binomial data set
> #
>  ysim<-c(1,2,0)
>  ynao<-c(1,0,1)
>  x<-c(2,5,8)
>  dados<-cbind(ysim,ynao,x)
>  dados<-as.data.frame(dados)
>  attach(dados)
>  ajust2<-glm(as.matrix(dados[,c(1,2)])~x,family=binomial, data=dados)
>  summary(ajust2)
> #
> Coefficients:
> (Intercept)            x
>     1.3107      -0.2017
>
> Degrees of Freedom: 2 Total (i.e. Null);  1 Residual
> Null Deviance:      3.958
> Residual Deviance: 3.718        AIC: 9.104
> =
>
> It seems that there is a problem with the first fitting!!!

In what way?  Notice that the estimates of the coefficients are the
same in the two fits and the difference between the null deviance and
the residual deviance is approximately the same.  If you are worried
about the deviance in the first fit being greater than the deviance in
the second fit it is because of the definition of the deviance used.
The deviance for the binomial fit is a shifted version of the deviance
from the Bernoulli fit, which is why the null deviance is also
reported.
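One way to see this numerically, using the toy data from the quoted message (a sketch):

```r
# Same data entered two ways; the coefficients and the deviance *drop* agree
y  <- c(1, 0, 1, 1, 0); x  <- c(2, 2, 5, 5, 8)
fit1 <- glm(y ~ x, family = binomial)                   # Bernoulli form
ysim <- c(1, 2, 0); ynao <- c(1, 0, 1); x2 <- c(2, 5, 8)
fit2 <- glm(cbind(ysim, ynao) ~ x2, family = binomial)  # grouped form
fit1$null.deviance - fit1$deviance   # the likelihood-ratio statistic for x
fit2$null.deviance - fit2$deviance   # identical: it ignores the grouping
```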



Re: [R] setting up zoo objects from a data frame

2010-06-08 Thread Gabor Grothendieck
The first time I posted this it got held up for approval so I am
trying it again.  Hopefully this time it gets through right away.

As with your prior post we can use read.zoo(..., split=...).
Alternatives are reshape and reshape::cast.


# read data into DF

Lines <- "V1   V2
1 1/1/2000  dog
2 1/1/2000  cat
3 1/1/2000 tree
4 1/2/2000  cat
5 1/2/2000 tree
6 1/3/2000  dog
7 1/3/2000 tree"

DF <- read.table(textConnection(Lines), header = TRUE)
DF$V1 <- as.Date(DF$V1, "%m/%d/%Y")

# 1. using zoo

library(zoo)
d <- read.zoo(cbind(DF, 1), split = 2)
do.call(merge, c(d, fill = 0))

# 2. using reshape

DF2 <- with(DF, data.frame(Date = V1, V2, 1))
r <- reshape(DF2, dir = "wide", idvar = "Date", timevar = "V2")
colnames(r) <- c("Date", sub("X1.", "", colnames(r))[-1])
r[is.na(r)] <- 0
r

#  3. using the reshape package (where DF2 is as in #2)

library(reshape)
cast(formula = Date ~ V2, data = DF2, value = "X1", fill = 0)



[R] R- Package - fdth

2010-06-08 Thread Enio Jelihovschi
We wish to announce the new package:
fdth - Frequency Distribution Table and Associated Histogram.

The package contains a set of high-level functions which easily allow the
user to make a frequency distribution table (fdt) and its associated plots.
The fdt can be formatted in many ways suited to publication. The (S3) plot
can be a histogram, polygon, or density, handled with the ease and
flexibility of a high-level function.

José Claudio Faria, DCET - Estatística, UESC, Brasil
Enio Jelihovschi, DCET - Estatística, UESC, Brasil



[R] TS model

2010-06-08 Thread Ted Byers
I am looking at a new project involving time series analysis.  I know I can
complete the tasks involving VARMA using either dse or mAr (and I think
there are a couple others that might serve).

However, there is one task that I am not sure of the best way to proceed.

A simple example illustrates what I am after.  If you think of a simple
ballistic problem, with a vector describing current position in 3
dimensions, the components of that vector are simple functions of initial
position, initial velocity (constants, for our purposes) and time.  It is
trivial calculus to compute these values at arbitrary time using only
initial conditions and time.  Of course, for such a simple problem, we know
the equations of motion that we can use for this purpose.

I want to use time series values to estimate a suitable vector valued
function of time in a case where we know neither the equations of change nor
the initial conditions (but where we have daily values going back many
years).  Actually, I don't really care much about the details of the
function nearly as much as the first and second derivatives of the function
with respect to time; and these derivatives have to be inferred from the
model of the measurements as 'simple' functions of time.  And as I do not
want to assume the system is autonomous, I want to be able to repeat the
analysis on a moving window wherein the current day is always designated as
having s = 0 (i.e. the time variable used in the estimated model slides
along that representing real time).  I figure that if that window is short
enough, a quadratic or cubic function of time will suffice.  Finally, if the
combination of first and second derivatives indicates that the first
derivative will take a value of 0 at some point in the future, I want to
estimate the number of days until that happens.  (yes, I know I will need
some sort of orthogonalization of the time variable in order to reduce
problems of multicollinearity, but that I'd expect in any multivariate
nonlinear regression).
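
For what it's worth, the moving-window part can be sketched in base R (everything below is invented illustration, not the poster's data):

```r
## Sketch (with made-up data): fit a quadratic in time over a moving window
## ending "today" (s = 0), then read off the derivatives at s = 0 and the
## projected time at which the first derivative reaches zero.
set.seed(1)
y <- cumsum(rnorm(200))              # stand-in for one component of the series
w <- 60                              # window length in days
s <- -(w - 1):0                      # window time, current day at s = 0
yw <- tail(y, w)
fit <- lm(yw ~ poly(s, 2, raw = TRUE))
b <- coef(fit)                       # y ~ b[1] + b[2]*s + b[3]*s^2
d1 <- b[2]                           # first derivative at s = 0
d2 <- 2 * b[3]                       # second derivative (constant here)
if (d2 != 0) -d1 / d2                # s at which dy/ds = 0, if it exists
```

Repeating this each day, with the window slid forward, gives the non-autonomous behaviour described above.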

I don't know if this could be recast as a VARMA problem, or if so, how and
how I'd get the answers to the questions of importance to me.  I would
welcome  being enlightened on this, if there is an answer.

The question is, "Is there a package that already provides support for this
'out of the box', as it were, and if so which one, or do I have to construct
code supporting it de novo?"

Thanks

Ted



Re: [R] setting up zoo objects from a data frame: solved.

2010-06-08 Thread Stefan Grosse
Am 08.06.2010 17:04, schrieb Erin Hodgess:
> Here is a particular way to solve the problem:
>   

If you solve your own problem, please reply to your own message;
otherwise things get confusing. Without your first e-mail nobody can
know what the problem was. If you reply to your own e-mail the thread
gets sorted as such; otherwise not. However,
>
>> cat.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,1],order=test3)
>> dog.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,1],order=test3)
>> 
that must be an error, one line too many, but you "heal" it:
>> dog.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,2],order=test3)
>> tree.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,3],order=test3)
>

This is much too complicated; you should put the series into one object,
then you can e.g. plot or analyse them together.

You can do so, after the code above, with:

all<-merge(cat.zoo,dog.zoo,tree.zoo)



[R] Symbols in R

2010-06-08 Thread wenjun zheng
Hi R Users,

I want to distinguish different conditions by different plotting symbols
via the pch argument of grid.points, but I need symbols that come in both
solid and hollow variants; only pch values 21 to 25 seem to work this way.
Are there any other symbols that can be used like this, or is there
another way to draw such symbols?
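
(A minimal sketch of why pch 21 to 25 behave this way in grid: these symbols carry a separate border colour and fill, so the same pch can be drawn solid or hollow.)

```r
library(grid)
## pch 21-25 have a border colour (col) and a separate fill;
## fill = NA yields the hollow variant of the same symbol.
grid.newpage()
grid.points(unit((1:5)/6, "npc"), unit(rep(0.6, 5), "npc"),
            pch = 21:25, gp = gpar(col = "black", fill = "grey"))
grid.points(unit((1:5)/6, "npc"), unit(rep(0.4, 5), "npc"),
            pch = 21:25, gp = gpar(col = "black", fill = NA))
```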

Any suggestions will be appreciated.

Best regards.

-- 
Wenjun



Re: [R] setting up zoo objects from a data frame

2010-06-08 Thread Stefan Grosse
Am 08.06.2010 16:52, schrieb Erin Hodgess:
>
> I would like to set up 3 time series; one for dog, one for cat, one
> for tree, such that each runs from 1/1/2000 to 1/3/2000, with 0 if
> there is no entry for the day.
>
>
>   

Before using zoo or zooreg you should reshape your data.frame so that
you have one column for each time series you want afterwards. Have a
look at
?reshape
to do so.
A little example:
test<-data.frame(date=c("2001/01/01","2001/01/01","2001/01/02","2001/01/02"),animal=c("dog","cat","dog","cat"),data=1:4)
test.w<-reshape(test,idvar="date",timevar="animal",direction="wide")
test.w

Then you can do the zoo stuff, e.g. with
?zooreg
if you have a "regular" series.
It will be transformed automatically into a multiple time series.
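
Continuing the example, the zoo step might look like this (a sketch; the column names are whatever reshape produced):

```r
library(zoo)
test <- data.frame(date = c("2001/01/01", "2001/01/01", "2001/01/02", "2001/01/02"),
                   animal = c("dog", "cat", "dog", "cat"), data = 1:4)
test.w <- reshape(test, idvar = "date", timevar = "animal", direction = "wide")
## one zoo object holding both series, indexed by date
z <- zoo(test.w[, -1], order.by = as.Date(test.w$date))
z   # columns data.dog and data.cat
```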

hth
Stefan



[R] setting up zoo objects from a data frame: solved.

2010-06-08 Thread Erin Hodgess
Here is a particular way to solve the problem:

> test3 <- seq(from=as.Date("1/1/2000","%m/%d/%Y"),to=as.Date("1/3/2000",
+ "%m/%d/%Y"),length=3)
> test3
[1] "2000-01-01" "2000-01-02" "2000-01-03"
> zoo(table(dog.df$V1,dog.df$V2)[,1],order=test3)
2000-01-01 2000-01-02 2000-01-03
 1  1  0
> cat.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,1],order=test3)
> dog.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,1],order=test3)
> dog.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,2],order=test3)
> tree.zoo <- zoo(table(dog.df$V1,dog.df$V2)[,3],order=test3)
> dog.zoo
2000-01-01 2000-01-02 2000-01-03
 1  0  1
> tree.zoo
2000-01-01 2000-01-02 2000-01-03


thanks though!
sincerely,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



[R] setting up zoo objects from a data frame

2010-06-08 Thread Erin Hodgess
Dear R People:

I have the following data frame:

> str(dog.df)
'data.frame':   7 obs. of  2 variables:
 $ V1: Factor w/ 3 levels "1/1/2000","1/2/2000",..: 1 1 1 2 2 3 3
 $ V2: Factor w/ 3 levels "cat","dog","tree": 2 1 3 1 3 2 3
> dog.df
V1   V2
1 1/1/2000  dog
2 1/1/2000  cat
3 1/1/2000 tree
4 1/2/2000  cat
5 1/2/2000 tree
6 1/3/2000  dog
7 1/3/2000 tree
>

I would like to set up 3 time series; one for dog, one for cat, one
for tree, such that each runs from 1/1/2000 to 1/3/2000, with 0 if
there is no entry for the day.

So for instance, I would have:

dog.zoo
1/1/2000 1
1/2/2000 0
1/3/2000 1

Any help would be much appreciated.

Thanks in advance,
Sincerely,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



Re: [R] Problem installing Rmpi

2010-06-08 Thread Uwe Ligges



On 08.06.2010 15:17, Paco Pastor wrote:

Hi everyone

I want to install Rmpi to use R in parallel mode in a Linux cluster
(Ubuntu, Hardy Heron). It seems to be properly installed but a problem
appears when loading Rmpi library.

R version 2.11.1 (2010-05-31)

 > library("Rmpi")
Error: package 'Rmpi' was built before R 2.10.0: please re-install it



Run update.packages(checkBuilt=TRUE)

and try again (but watch for error messages during the update process).

Uwe Ligges




Should I remove R-2.11 and install R-2.10? I have tried to reinstall
Rmpi but it gives the same error message.

Thanks in advance

Paco





Re: [R] Deleting duplicate values in a correlation matrix

2010-06-08 Thread Henrique Dallazuanna
Try this:

m <- matrix(Corr, ncol = 3, dimnames = list(unique(NodesRow),
unique(NodesCol)))
m[col(m) == row(m) | upper.tri(m)] <- NA
subset(as.data.frame.table(m), !is.na(Freq))
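
For completeness, running this on the example data from the question leaves exactly one row per undirected edge:

```r
NodesRow <- rep(c("a", "b", "c"), each = 3)
NodesCol <- rep(c("a", "b", "c"), times = 3)
Corr <- c(1, .8, .4, .8, 1, .3, .4, .3, 1)
m <- matrix(Corr, ncol = 3, dimnames = list(unique(NodesRow), unique(NodesCol)))
m[col(m) == row(m) | upper.tri(m)] <- NA   # drop self-edges and duplicates
subset(as.data.frame.table(m), !is.na(Freq))
#   Var1 Var2 Freq
# 2    b    a  0.8
# 3    c    a  0.4
# 6    c    b  0.3
```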



On Tue, Jun 8, 2010 at 10:29 AM, Matthew DiLeo  wrote:

> I have a large correlation matrix that I'm trying to convert to a list of
> every connection (edge) between every two nodes with its accompanying
> correlation value (for Cytoscape). I figured out how to do this and to
> remove the connections that nodes have to themselves but I can't figure out
> how to get rid of the duplicate pairs (e.g. the edge b-a is a duplicate of
> a-b).
>
>
>
> NodesRow<-rep(c("a","b","c"),each=3)
>
> NodesCol<-rep(c("a","b","c"),times=3)
>
> Corr<-c(1,.8,.4,.8,1,.3,.4,.3,1)
>
> data<-data.frame(NodesRow,NodesCol,Corr)
>
> # above is what my node and edge dataframe looks like
>
> data2<-subset(data,data[,1]!=data[,2])
>
> # above is how I clear out connections nodes have to themselves
>
>
> Thanks for any clues of a better way to do this!
>
>
> Matt
>
>
>
> Matthew DiLeo, Ph.D.
> Postdoctoral Associate
> Boyce Thompson Institute for Plant Research
> Robert W. Holley Center, Tower Road, Ithaca
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] Getting started

2010-06-08 Thread Gabor Grothendieck
The author's site is here:
http://www.bio.ic.ac.uk/research/mjcraw/therbook/

and you can read the file right off his site like this:

URL <- "http://www.bio.ic.ac.uk/research/mjcraw/therbook/data/Gain.txt"
gg <- read.table(URL, header = TRUE)

On Tue, Jun 8, 2010 at 9:37 AM, Andrew Kelly  wrote:
> Hi,
> I am just getting started with R but have hit an early snag. I am working 
> through Crawley (2008) The R Book and on page 6, 'Significance Stars', I am 
> trying to enter the commands given. However, 'Gain.txt' does not seem to have 
> been downloaded when I downloaded the R programme.
>
> I have searched for it in case it put it somewhere other than 'temp' but I 
> cannot find it.
>
> Can anybody tell me how to do this?
>
> Many thanks
> Andrew Kelly
>



[R] constructing a contingency table (ftable, table)

2010-06-08 Thread stefan.d...@gmail.com
Dear all,
an hopefully quick table question.

I have the following data:
Two objects that are 2*9 matrix with nine column names (Dis1, ...,
Dis9) and the row names (2010,2020). The content are frequencies
(numeric).

In want to create a table that is along the lines of
ftable(UCBAdmissions) and should looks like this:
         Dis1       | ... |     Dis9
      2010 | 2020   | ... | 2010 | 2020
(first row, first column is the value of Dis1 in 2010 from the first
object; first row, second column is the value of Dis1 in 2020 from the
first object;
second row, first column is the value of Dis1 in 2010 from the second
object; second row, second column is the value of Dis1 in 2020 from the
second object;
and so on)

So basically what ftable does. But I do not understand how I can turn
my two matrices (which already contain the frequencies) into the
appropriate table object (if that's the way to go).
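
One way to get there (a sketch with invented numbers; m1 and m2 stand in for the two matrices): stack them into a three-way array and hand that to ftable.

```r
## Sketch with invented frequencies: m1 and m2 stand for the two 2x9 matrices.
dn <- list(Year = c("2010", "2020"), Dis = paste0("Dis", 1:9))
m1 <- matrix(1:18, nrow = 2, dimnames = dn)
m2 <- matrix(19:36, nrow = 2, dimnames = dn)
## stack them into a 3-way array, then let ftable do the flattening
arr <- array(c(m1, m2), dim = c(2, 9, 2),
             dimnames = c(dn, list(Object = c("obj1", "obj2"))))
ftable(as.table(arr), row.vars = "Object", col.vars = c("Dis", "Year"))
```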

Thanks and best,
Stefan



[R] Getting started

2010-06-08 Thread Andrew Kelly
Hi,
I am just getting started with R but have hit an early snag. I am working 
through Crawley (2008) The R Book and on page 6, 'Significance Stars', I am 
trying to enter the commands given. However, 'Gain.txt' does not seem to have 
been downloaded when I downloaded the R programme.

I have searched for it in case it put it somewhere other than 'temp' but I 
cannot find it.

Can anybody tell me how to do this?

Many thanks
Andrew Kelly



[R] Deleting duplicate values in a correlation matrix

2010-06-08 Thread Matthew DiLeo
I have a large correlation matrix that I'm trying to convert to a list of
every connection (edge) between every two nodes with its accompanying
correlation value (for Cytoscape). I figured out how to do this and to
remove the connections that nodes have to themselves but I can't figure out
how to get rid of the duplicate pairs (e.g. the edge b-a is a duplicate of
a-b).



NodesRow<-rep(c("a","b","c"),each=3)

NodesCol<-rep(c("a","b","c"),times=3)

Corr<-c(1,.8,.4,.8,1,.3,.4,.3,1)

data<-data.frame(NodesRow,NodesCol,Corr)

# above is what my node and edge dataframe looks like

data2<-subset(data,data[,1]!=data[,2])

# above is how I clear out connections nodes have to themselves


Thanks for any clues of a better way to do this!


Matt



Matthew DiLeo, Ph.D.
Postdoctoral Associate
Boyce Thompson Institute for Plant Research
Robert W. Holley Center, Tower Road, Ithaca



[R] Problem installing Rmpi

2010-06-08 Thread Paco Pastor

Hi everyone

I want to install Rmpi to use R in parallel mode in a Linux cluster 
(Ubuntu, Hardy Heron). It seems to be properly installed but a problem 
appears when loading Rmpi library.


R version 2.11.1 (2010-05-31)

> library("Rmpi")
Error: package 'Rmpi' was built before R 2.10.0: please re-install it


Should I remove R-2.11 and install R-2.10? I have tried to reinstall 
Rmpi but it gives the same error message.


Thanks in advance

Paco

--
---
Francisco Pastor
Meteorology department
Fundación CEAM
p...@ceam.es
http://www.ceam.es/ceamet - http://www.ceam.es
Parque Tecnologico, C/ Charles R. Darwin, 14
46980 PATERNA (Valencia), Spain
Tlf. 96 131 82 27 - Fax. 96 131 81 90
---
Usuario Linux registrado: 363952



Re: [R] Sweave glm.fit

2010-06-08 Thread Jimmy Söderly
Thanks for the advice. I found the function summary(mcmc(x)) from the
coda package.


2010/6/4 Gavin Simpson 

> On Thu, 2010-06-03 at 23:44 +0200, Jimmy Söderly wrote:
> > Thanks for your help.
> >
> > Does it have something to do with the mcmc package, the coda package, or
> the
> > lattice package ?
>
> Is there something stopping you checking this for yourself? You have the
> code *you* are running, after all, and we don't.
>
> The coda package contains a great many functions and as you have yet to
> show us any code or your Sweave file, that would involve me looking
> through the entirety of those packages to find out.
>
> coda mentions glm() in the spectrum0() function. Are you using that?
>
> I doubt it will be lattice - it is primarily a graphics package.
>
> mcmc doesn't appear to have any obvious calls to glm().
>
> Why don't you Tangle your Sweave file to extract the code, then step
> through your code until it issues the warnings. Then you'll know exactly
> where the code is failing to converge. I mentioned this in my earlier
> reply. I'm afraid I'll be unable to provide further enlightenment unless
> you can work out what code is causing the error.
>
> HTH
>
> G
>
> >
> > Jimmy
> >
> > 2010/6/3 Gavin Simpson 
> >
> > On Wed, 2010-06-02 at 20:29 +0200, Jimmy Söderly wrote:
> > > > Dear R users,
> > > >
> > > > After running Sweave, this is what I get :
> > > >
> > > > Warning messages:
> > > > 1: glm.fit: algorithm did not converge
> > > > 2: glm.fit: algorithm did not converge
> > > > There is no glm.fit function in my code.
> > >
> > > glm() calls glm.fit(). Did you call glm() or any code that might use glm()
> > > (or glm.fit()). glm.fit() is the function that does the actual model
> > > fitting etc, with glm() being a convenience wrapper providing the
> > > formula interface etc.
> > >
> > > > Where does it come from ? From Sweave ? From system.time ?
> > >
> > > Your call to glm(), or to any function that calls glm(). This has nothing
> > > to do with Sweave or system.time.
> > >
> > > Your model fitting has failed to converge. There may be many reasons
> why
> > > glm.fit is failing to converge for the model you are fitting.
> > >
> > > Either extract the R code or tangle the sweave file to do the
> > > extraction, and then step through the code to see where the warning is
> > > being generated. Then figure out why glm.fit is not converging (you
> > > might need to up the number of iterations or provide better starting
> > > values or rethink the type of model you are trying to fit).
> > >
> > > HTH
> > >
> > > G
> > >
> > > --
> > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> > >  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> > >  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
> > >  Pearson Building, [e] 
> > > gavin.simpsonATNOSPAMucl.ac.uk
> 
> > >  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
> > >  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> > > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> > >
> > >
> >
>
> --
>  %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  Dr. Gavin Simpson [t] +44 (0)20 7679 0522
>  ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
>  Pearson Building, [e] 
> gavin.simpsonATNOSPAMucl.ac.uk
>  Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
>  UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
>



[R] glm output for binomial family

2010-06-08 Thread Enrico Colosimo
Hello,
 I am having some trouble running a very simple
example. I am running a logistic regression, entering the SAME data set
in two different forms, and getting different values for the residual deviance.

Just look with this naive data set:


# 1- Entering as a Bernoulli data set
 y<-c(1,0,1,1,0)
 x<-c(2,2,5,5,8)
 ajust1<-glm(y~x,family=binomial(link="logit"))
 ajust1
#
Coefficients:
(Intercept)            x
    1.3107      -0.2017

Degrees of Freedom: 4 Total (i.e. Null);  3 Residual
Null Deviance:      6.73
Residual Deviance: 6.491        AIC: 10.49
#
# 2- Entering as Binomial data set
#
 ysim<-c(1,2,0)
 ynao<-c(1,0,1)
 x<-c(2,5,8)
 dados<-cbind(ysim,ynao,x)
 dados<-as.data.frame(dados)
 attach(dados)
 ajust2<-glm(as.matrix(dados[,c(1,2)])~x,family=binomial, data=dados)
 summary(ajust2)
#
Coefficients:
(Intercept)            x
    1.3107      -0.2017

Degrees of Freedom: 2 Total (i.e. Null);  1 Residual
Null Deviance:      3.958
Residual Deviance: 3.718        AIC: 9.104
=

It seems that there is a problem with the first fit!
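
A side-by-side check (my reading, not the original poster's conclusion): both fits are correct. The residual deviances differ because the saturated model for ungrouped Bernoulli data is not the same as for grouped binomial data, while the coefficients coincide.

```r
## Same data both ways: coefficients agree; the deviances differ only
## because the saturated model differs for ungrouped vs grouped data.
y <- c(1, 0, 1, 1, 0); x <- c(2, 2, 5, 5, 8)
fit1 <- glm(y ~ x, family = binomial)
ysim <- c(1, 2, 0); ynao <- c(1, 0, 1); xg <- c(2, 5, 8)
fit2 <- glm(cbind(ysim, ynao) ~ xg, family = binomial)
all.equal(unname(coef(fit1)), unname(coef(fit2)), tolerance = 1e-6)  # TRUE
deviance(fit1); deviance(fit2)   # differ, but both fits are valid
```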

Best
Enrico Colosimo
Dept Statistics, UFMG
Brazil



Re: [R] Very OT: World Cup Statistics

2010-06-08 Thread R Heberto Ghezzo, Dr
The only goals I remember are the "Hand of God" at 6 min of the second half
and the "Goal of the Century" at 11 min of the second half.
The others don't count.
HG

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Paul [p...@paulhurley.co.uk]
Sent: Monday, June 07, 2010 4:50 PM
To: R-help@r-project.org
Subject: [R] Very OT: World Cup Statistics

Hello,

Sorry for the very Off Topic post, but I need some data on past football
(soccer) world cups.  I'd like to find (or calculate) the time to the
first goal of each match (where a goal was scored).

I've looked at the UEFA website and can't find what I want; maybe
someone here can help ?

Thanks

Paul.



[R] Odp: partial solutions to: if else statement problem

2010-06-08 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 08.06.2010 14:21:10:

> Thank you to all for your help!
> 
> I received
> two equal alternative solutions that bypassed elegantly the problem
> from peter.l.e.koni...@gmail.com
> and
> rafael.bj...@gmail.com
> 
> rr.dia2.corr <- rr.dia.2m
> rr.dia2.corr[which(med.hyper == 1)] <- rr.dia.2m[which(med.hyper == 1)] 
- 5

Try to compare it with my suggestion:

my.result <- rr.dia.2m - (med.hyper == 1)*5
all.equal(rr.dia2.corr, my.result)

which in my opinion will be the same.
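
A toy check (invented data, since the original variables were not posted) confirms the two constructions agree:

```r
## toy check with invented data (the original variables weren't posted)
med.hyper <- c(1, 0, 1, 0)
rr.dia.2m <- c(80, 85, 90, 95)
rr.dia2.corr <- rr.dia.2m
rr.dia2.corr[which(med.hyper == 1)] <- rr.dia.2m[which(med.hyper == 1)] - 5
my.result <- rr.dia.2m - (med.hyper == 1) * 5
all.equal(rr.dia2.corr, my.result)   # TRUE
```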

Regards
Petr


> 
>  From pbu...@pburns.seanet.com
> I received a hint to understand the problem why my code does not work.
> p 13 on
> http://lib.stat.cmu.edu/S/Spoetry/Tutor/R_inferno.pdf
> 
> I have to digest this yet
> 
> Peter
> 
> 



Re: [R] ols function in rms package

2010-06-08 Thread Frank E Harrell Jr

On 06/08/2010 05:29 AM, Mark Seeto wrote:



On 06/06/2010 10:49 PM, Mark Seeto wrote:

Hello,

I have a couple of questions about the ols function in Frank Harrell's
rms
package.

Is there any way to specify variables by their column number in the data
frame rather than by the variable name?

For example,

library(rms)
x1<- rnorm(100, 0, 1)
x2<- rnorm(100, 0, 1)
x3<- rnorm(100, 0, 1)
y<- x2 + x3 + rnorm(100, 0, 5)
d<- data.frame(x1, x2, x3, y)
rm(x1, x2, x3, y)
lm(y ~ d[,2] + d[,3], data = d)  # This works
ols(y ~ d[,2] + d[,3], data = d) # Gives error
Error in if (!length(fname) || !any(fname == zname)) { :
missing value where TRUE/FALSE needed

However, this works:
ols(y ~ x2 + d[,3], data = d)

The reason I want to do this is to program variable selection for
bootstrap model validation.

A related question: does ols allow "y ~ ." notation?

lm(y ~ ., data = d[, 2:4])  # This works
ols(y ~ ., data = d[, 2:4]) # Gives error
Error in terms.formula(formula) : '.' in formula and no 'data' argument

Thanks for any help you can give.

Regards,
Mark


Hi Mark,

It appears that you answered the questions yourself.  rms wants real
variables or transformations of them.  It makes certain assumptions
about names of terms.   The y ~ . should work though; sometime I'll have
a look at that.

But these are the small questions compared to what you really want.  Why
do you need variable selection, i.e., what is wrong with having
insignificant variables in a model?  If you indeed need variable
selection see if backwards stepdown works for you.  It is built-in to
rms bootstrap validation and calibration functions.

Frank



Thank you for your reply, Frank. I would have reached the conclusion
that rms only accepts real variables had this not worked:
ols(y ~ x2 + d[,3], data = d)


Hi Mark - that probably worked by accident.



The reason I want to program variable selection is so that I can use the
bootstrap to check the performance of a model-selection method. My
co-workers and I have used a variable selection method which combines
forward selection, backward elimination, and best subsets (the forward and
backward methods were run using different software).

I want to do bootstrap validation to (1) check the over-optimism in R^2,
and (2) justify using a different approach, if R^2 turns out to be very
over-optimistic. The different approach would probably be data reduction
using variable clustering, as you describe in your book.


The validate.ols function, which calls the predab.resample function, may
give you some code to start with.  Note however that the performance of
the approach you are suggesting has already been shown to be poor in
many cases.  You might run the following in parallel: full model fits
and penalized least squares using penalties selected by AIC (using
special arguments to ols along with the pentrace function).
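
A hedged sketch of what that might look like with rms (simulated data; the exact arguments are guesses on my part and may need tuning):

```r
library(rms)
## sketch: fit the full model, choose a penalty by AIC with pentrace,
## and bootstrap-validate the full fit
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- d$x2 + d$x3 + rnorm(100, 0, 5)
f <- ols(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)
p <- pentrace(f, seq(0, 20, by = 2))   # AIC-based penalty trace
f.pen <- update(f, penalty = p$penalty)
validate(f, B = 200)                   # bootstrap over-optimism of R^2 etc.
```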


Frank



Regards,
Mark



--
Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University



[R] partial solutions to: if else statement problem

2010-06-08 Thread Peter Lercher

Thank you to all for your help!

I received
two equal alternative solutions that bypassed elegantly the problem
from peter.l.e.koni...@gmail.com
and
rafael.bj...@gmail.com

rr.dia2.corr <- rr.dia.2m
rr.dia2.corr[which(med.hyper == 1)] <- rr.dia.2m[which(med.hyper == 1)] - 5

From pbu...@pburns.seanet.com
I received a hint to understand the problem why my code does not work.
p 13 on
http://lib.stat.cmu.edu/S/Spoetry/Tutor/R_inferno.pdf

I have to digest this yet

Peter




Re: [R] Desolve package: How to pass thousand of parameters to C compiled code?

2010-06-08 Thread Ben Bolker
  yahoo.com> writes:

> 
> Hi,
> 
> I have used DeSolve package for my ODE problem regarding
> infectious disease transmission and currently am
> trying to pass lots (roughly a thousand) of model parameters 
> to the C compiled model (I have to use C
> compiled code instead of R code purely because of the speed).
> 
> I can't define them one by one, as that is going to take ages to finish
> and would also be quite difficult to revise. I have read
> the instructions in "Writing Code in Compiled Languages", which
> is very well written, but still have no
> clue on how to proceed.
> 
> Please can anyone suggest the best way to go about this, and also the places
where I can find examples of
> DeSolve C compiled code.
> 

  There's another example of compiled code in 
http://www.jstatsoft.org/v33/i09/paper , but it probably
won't say anything the 'Writing Code' guide doesn't already
cover.  I found the guide pretty complete ...

  I would suggest that you pack your parameters into a single
numeric vector (or several, if that makes more sense) and pass
them that way.
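
On the R side, that suggestion might look like this (a sketch; the parameter names, dll name, and function names are placeholders):

```r
library(deSolve)
## pack the ~1000 parameters into one named numeric vector (names invented)
parms <- setNames(runif(1000), paste0("p", seq_len(1000)))
## the whole vector is then handed to the compiled model in one call, e.g.:
## out <- ode(y = state, times = times, func = "derivs", parms = parms,
##            dllname = "mymod", initfunc = "initmod")
```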

  What kind of infectious disease data do you have that will
allow you to fit a model with thousands of parameters ... ??
(Just curious.)



Re: [R] duplicated() and unique() problems

2010-06-08 Thread Felix Andrews
On Tuesday, June 8, 2010, christiaan pauw  wrote:
> Hi everybody
>
> I have found something (for me at least) strange with duplicated(). I will
> first provide a replicable example of a certain kind of behaviour that I
> find odd and then give a sample of unexpected results from my own data. I
> hope someone can help me understand this.
>
> Consider the following
>
> # this works as expected
>
> ex=sample(1:20, replace=TRUE)
>
> ex
>
> duplicated(ex)
>
> ex=sort(ex)
>
> ex
>
> duplicated(ex)
>
>
> # but why does duplicate not work after order() ?
>
> ex=sample(1:20, replace=TRUE)
>
> ex
>
> duplicated(ex)
>
> ex=order(ex)
>
> duplicated(ex)
>
> Why does duplicated() not work after order() has been applied but it works
> fine after sort()  ? Is this an error or is there something I don't
> understand.

The latter: order() returns the indexes into your vector, i.e. a
permutation, which select the values in a sorted order. Each element
is unique by definition.
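
A tiny illustration of the point:

```r
ex <- c(3, 1, 3, 2)
duplicated(ex)          # FALSE FALSE  TRUE FALSE
order(ex)               # 2 4 1 3 -- a permutation, so all elements unique
duplicated(order(ex))   # FALSE FALSE FALSE FALSE
duplicated(sort(ex))    # FALSE FALSE FALSE  TRUE
```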

>
> I have been getting very strange results from duplicated() and unique() in a
> dataset I am analysing. Here is a little sample of my real life problem

presumably this is a data.frame...

>
>> str(Masechaba$PROPDESC)
>  Factor w/ 24545 levels "     06","   71Hemilton str",..: 14527 8043 16113
> 16054 13875 15780 12522 7771 14824 12314 ...
>> # Create a indicator if the PROPDESC is unique. Default false
>> Masechaba$unique=FALSE
>> Masechaba$unique[which(is.na(unique(Masechaba$PROPDESC))==FALSE)]=TRUE

The statement above is in error. You are referring to elements of
unique(Masechaba$PROPDESC) which do not correspond to the rows of
Masechaba. They are different lengths. Use duplicated() instead.

>> # Check is something happended
>> length(which(Masechaba$unique==TRUE))
> [1] 2174
>> length(which(Masechaba$unique==FALSE))
> [1] 476
>> Masechaba$duplicate=FALSE
>> Masechaba$duplicate[which(duplicated(Masechaba$PROPDESC)==TRUE)]=TRUE

equivalent to
Masechaba$duplicate <-  duplicated(Masechaba$PROPDESC)

>> length(which(Masechaba$duplicate==TRUE))
> [1] 476
>> length(which(Masechaba$duplicate==FALSE))
> [1] 2174
>> # Looks OK so far
>> # Test on a known duplicate. I expect one to be true and one to be false
>> Masechaba[which(Masechaba$PROPDESC==2363),10:12]
>       PROPDESC unique duplicate
> 24874     2363   TRUE     FALSE
> 31280     2363   TRUE      TRUE
>
> # This is strange.  I expected that unique() and duplicate() would give the
> same results. The variable PROPDESC is clearly not unique in both cases.
> # The totals are the same but not the individual results
>> table(Masechaba$unique,Masechaba$duplicate)
>
>         FALSE TRUE
>   FALSE   342  134
>   TRUE   1832  342
>
> I don't understand this. Is there something I am missing?
>
> Best regards
> Christaan
>
>
> P.S
>> sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] splines   stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
> [1] plyr_0.1.9      maptools_0.7-34 lattice_0.18-8  foreign_0.8-40
>  Hmisc_3.8-0     survival_2.35-8 rgdal_0.6-26
> [8] sp_0.9-64
>
> loaded via a namespace (and not attached):
> [1] cluster_1.12.3 grid_2.11.1    tools_2.11.1
>

-- 
Felix Andrews / 安福立
Integrated Catchment Assessment and Management (iCAM) Centre
Fenner School of Environment and Society [Bldg 48a]
The Australian National University
Canberra ACT 0200 Australia
M: +61 410 400 963
T: + 61 2 6125 4670
E: felix.andr...@anu.edu.au
CRICOS Provider No. 00120C
-- 
http://www.neurofractal.org/felix/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] add one point to contourplot()

2010-06-08 Thread biostat book
Hi All,
I want to add one point to contourplot().  I used contourplot() in my code like
contourplot(z ~ a + b |c, data)
I understand there is a plot.axes argument for filled.contour(), but it did not
work for my code. I also tried plot() and text() with contourplot(), but got
this error: "plot.new has not been called yet"
Any suggestion for adding a point in contourplot()?
Thank you very much!
Lily D
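With lattice graphics such as contourplot(), one common approach is a custom panel function: draw the contours with panel.contourplot(), then add the point with panel.points(). A sketch on made-up data (the grid, grouping variable, and the point at (0.5, 0.5) are all hypothetical):

```r
library(lattice)
# Toy surface over a grid, split into two hypothetical panels.
d <- expand.grid(a = seq(0, 1, 0.1), b = seq(0, 1, 0.1), c = c("g1", "g2"))
d$z <- with(d, a^2 + b^2)
p <- contourplot(z ~ a + b | c, data = d,
                 panel = function(...) {
                   panel.contourplot(...)                # draw contours as usual
                   panel.points(0.5, 0.5, pch = 16,      # then add the point
                                col = "black")
                 })
print(p)  # lattice objects must be print()ed inside scripts/functions
```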



  



Re: [R] duplicated() and unique() problems

2010-06-08 Thread christiaan pauw
Thanks for your help Petr

I think I understand better now.


> > > Masechaba$unique[which(is.na(unique(Masechaba$PROPDESC))==FALSE)]=TRUE
>^^^
> This seems strange. At first sight I am puzzled about what result I should
> expect from such a construction.
>
About your comment above: I looked at the construction again. It was
completely unsuited to what I actually had in mind and was based on a wrong
understanding of how unique() works. is.na(unique(x)) is FALSE for every
value returned by unique() - and there will always be at least one such value
as long as there are any values at all. It therefore produces a vector of
FALSE the same length as the result of unique(), and which() then gives the
indices 1:length(unique(x)).

The desired result (a TRUE or FALSE value indicating whether a value is a
duplicate or not) is obtained by

Masechaba$unique=duplicated(match(Masechaba$PROPDESC,unique(Masechaba$PROPDESC)))
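On toy data one can check that the match() step is redundant: duplicated() applied directly to the vector gives the same result.

```r
# match(x, unique(x)) maps each element to the index of its first
# occurrence, so duplicated() on that is the same as duplicated(x).
x  <- c("b", "b", "a", "c", "a")
v1 <- duplicated(match(x, unique(x)))
v2 <- duplicated(x)   # simpler and equivalent
identical(v1, v2)     # TRUE
```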

Thanks again for your help

regards
Christiaan




[R] scatterplot function - double check: dashed lines

2010-06-08 Thread K F Pearce
Hello everyone,

This is just a quick double check.  It concerns the 'scatterplot function' in 
R. 

 I have 6 curves and I wish to represent each of them by a different kind of 
line (their colour must be black).

The curves are derived from the cuminc function...the coordinates of which are 
in 'xx'.

Upon reading the documentation in R, it looks like I can use the 'on/off' 
command for lty and I can merely run:

plot(xx,color="black",lty=c("11","13","1343","73","2262","35"))

according to the documentation, "13" (for example) means 1 'on' and 3 'off' .

Does the above look OK ?

Say, in another study, I wish to draw my 6 lines all in different colours 
(solid lines), I suppose that I could type:

plot(xx, color=c("red","black","purple","yellow","brown","violet"), lty=1)

Thanks so much for your help,
All the Best,
Kim
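A quick way to sanity-check those lty strings before committing to the cuminc plot - toy curves standing in for the real ones:

```r
# Six black curves told apart by "on/off" lty strings ("13" = 1 on, 3 off).
ltys <- c("11", "13", "1343", "73", "2262", "35")
x <- seq(0, 1, length.out = 50)
plot(range(x), c(0, 1), type = "n",
     xlab = "time", ylab = "cumulative incidence")
for (i in seq_along(ltys)) lines(x, x^i, lty = ltys[i], col = "black")
```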



Re: [R] Adding in Missing Data

2010-06-08 Thread Petr PIKAL
Hi

Hm, maybe you can first make a sequence of all required dates and ids,
construct an empty data frame with all possible date/id combinations, merge
your existing data frame with it just to fill in all dates, get rid of
duplicated dates and ids if necessary, and finally use na.locf from the zoo
package with split/lapply to carry totret forward from the previous date.

This is quite complicated, and if the data frame is big you could run into
memory problems.

library(zoo)  # for na.locf

ex <- sample(letters[1:5], 10, replace = TRUE)
DF <- data.frame(ex = ex, x = rnorm(10), date = 1:10)
DF[c(4, 8), ] <- NA
EDF <- expand.grid(ex = letters[1:5], date = 1:10)
MDF <- merge(DF, EDF, all = TRUE)
MDF$y <- unlist(lapply(split(MDF$x, MDF$ex), na.locf, na.rm = FALSE))

But without some working example it is not easy to find out what you want.

Regards
Petr


r-help-boun...@r-project.org napsal dne 08.06.2010 09:03:13:

> 
> Hey All,
> 
> I have just recently thought of a completely different way to accomplish 
my
> analysis (requiring different type of coding)
> 
> Instead of going in and filling in data, I could remove any dates not 
shared
> by ALL the id's.
> 
> I was thinking about accomplishing this using merge(~~), do you think 
this
> is feasible?
> 
> It might take up a bunch of memory at first, going through and 
subsetting
> the data.frame by id.
> 
> 
> 
> 
> 
> "Sample Data.Frame format
> 
> Name is Returns.names
> 
> X   id ticker  date_ adjClose totret RankStk
> 427225 427225 00174410AHS 2001-11-1321.661001235
> 
> 
> "id" uniquely defines a row
> 
> 
> What I am trying to do is add missing data for each ID.
> 
> Important Information: Date is not continuous, the data points are for
> trading days, so weekends/certain holidays are off
> 
> x<-unique(Returns.names$date_) gives me the list of all the possible 
trading
> days.
> 
> For days that are missing, I would like to add a row for that date & the
> same totret as the previous day.
> 
> I cant think of an easy way to do this"
> -- 
> View this message in context: 
http://r.789695.n4.nabble.com/Adding-in-Missing-
> Data-tp2246825p2246946.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Please help me

2010-06-08 Thread Joris Meys
First, read the posting guides.
Then, supply us with a bit more information, like the package you
used, example code that reproduces the error, information about the
data, the complete error message, and the traceback (call
traceback() right after the error occurs).

Otherwise we won't be able to help you.

Cheers
Joris

On Tue, Jun 8, 2010 at 9:42 AM, Xiongqing Zhang  wrote:
>
> Dear Mr. or Ms.,
>
> I used R to run the zero-inflation negative binomial model
> (zeroinfl()).
>
> First, I introduced one dummy variable to the model as an independent 
> variable and got the parameter estimates, but the results were not 
> satisfactory to me. So I introduced three dummy variables to the model, but 
> then I could not get any results: the error message mentions 
> "solve.default(as.matrix(fit$gaussian))". I do not fully understand the 
> reasons. I would also like to know what method zeroinfl() uses to estimate 
> the parameters.
>
> I would very much appreciate it if you can help me. Thank you.
>
> Best regards,
>
> Sincerely,
> Xiongqing Zhang
>
>
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [R] cross-validation

2010-06-08 Thread Max Kuhn
Install the caret package and see ?train. There is also:

   http://cran.r-project.org/web/packages/caret/vignettes/caretTrain.pdf
   http://www.jstatsoft.org/v28/i05/paper

Max
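A hedged sketch of what that looks like for Azam's problem, using iris in place of the real 7-level data, and fixing the tuning parameter to keep the run short (the decay = 0 choice is an assumption, not from the thread):

```r
library(caret)
# LOOCV resampling via caret; method = "multinom" wraps nnet's
# multinomial logistic regression.
ctrl <- trainControl(method = "LOOCV")
fit  <- train(Species ~ ., data = iris, method = "multinom",
              trControl = ctrl, tuneGrid = data.frame(decay = 0),
              trace = FALSE)
fit$results   # LOOCV accuracy and kappa
```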



On Tue, Jun 8, 2010 at 5:34 AM, azam jaafari  wrote:
> Hi
>
> I want to do leave-one-out cross-validation for multinomial logistic 
> regression in R. I did multinomial logistic reg. by package nnet in R. How I 
> do validation? by which function?
> response variable has 7 levels
>
> please help me
>
> Thanks alot
> Azam
>
>
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 

Max



Re: [R] how to read CSV file in R?

2010-06-08 Thread Joris Meys
That will be R 2.10.1 if I'm correct.

For reading in csv files, there's a function read.csv that does just that:
los <- read.csv("file.csv",header=T)
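For example, with a small hypothetical file written to a temporary path; declaring colClasses up front avoids type-guessing and can noticeably cut memory use on wide files:

```r
# Write a tiny CSV to a temp file, then read it back with read.csv(),
# declaring the class of each column explicitly.
tf <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3, b = c("x", "y", "z")), tf, row.names = FALSE)
los <- read.csv(tf, colClasses = c("integer", "character"))
str(los)  # 3 obs. of 2 variables: integer a, character b
```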

But that is just a detail. You have problems with your memory, but
that's not caused by the size of your dataframe. On my system, a
matrix with 100,000 rows and 75 columns takes only 28 Mb. So I guess
your workspace is cluttered with other stuff.

Check following help pages :
?Memory
?memory.size
?"Memory-limits"

It generally doesn't make a difference, but sometimes calling gc() can
free some memory again.

If none of this information helps, please provide us with a bit more
info regarding your system and the content of your current workspace.

Cheers
Joris

On Tue, Jun 8, 2010 at 8:46 AM, dhanush  wrote:
>
> I tried to read a CSV file in R. The file has about 100,000 records and 75
> columns. When used read.delim, I got this error. I am using R ver 10.1.
>
>> los<-read.delim("file.csv",header=T,sep=",")
> Warning message:
> In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>  Reached total allocation of 1535Mb: see help(memory.size)
>
> Thanks
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/how-to-read-CSV-file-in-R-tp2246930p2246930.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [R] cross-validation

2010-06-08 Thread Joris Meys
As far as I know, nnet doesn't have a built-in function for
cross-validation. Coding it yourself is not hard though. nnet is used
in this book : http://www.stats.ox.ac.uk/pub/MASS4/ , which contains
enough examples on how to do so.

See also the crossval function in the bootstrap package.
http://sekhon.berkeley.edu/library/bootstrap/html/crossval.html
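A minimal hand-rolled leave-one-out loop along those lines, using iris (3 response levels) in place of the real 7-level data - the loop is identical for any number of levels:

```r
library(nnet)
# Leave-one-out CV for multinom(): refit without row i, predict row i.
pred <- character(nrow(iris))
for (i in seq_len(nrow(iris))) {
  fit <- multinom(Species ~ ., data = iris[-i, ], trace = FALSE)
  pred[i] <- as.character(predict(fit, newdata = iris[i, , drop = FALSE]))
}
mean(pred == as.character(iris$Species))   # LOO classification accuracy
```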

Cheers
Joris

On Tue, Jun 8, 2010 at 11:34 AM, azam jaafari  wrote:
> Hi
>
> I want to do leave-one-out cross-validation for multinomial logistic 
> regression in R. I did multinomial logistic reg. by package nnet in R. How I 
> do validation? by which function?
> response variable has 7 levels
>
> please help me
>
> Thanks alot
> Azam
>
>
>
>        [[alternative HTML version deleted]]
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>



-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php



Re: [R] problem with if else statement

2010-06-08 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 08.06.2010 11:46:17:

> Hi,
> 
> You are using if/then/else which is a logical control statement and so 
doesn't
> return a value, see 
> ?if
> for details.
> 
> You are probably looking for the ifelse function.

Or use possibility of easy conversion logical (T/F) to numeric (1/0)

rr.dia2.corr <- rr.dia.2m - (med.hyper==1)*5

Regards
Petr
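Both vectorized forms can be checked on toy values (the numbers below are hypothetical):

```r
# Toy data standing in for the real columns.
med.hyper <- c(0, 1, 1, 0)
rr.dia.2m <- c(80, 90, 100, 70)

# if() looks only at the first element; these are vectorized alternatives:
rr.dia2.corr  <- ifelse(med.hyper == 1, rr.dia.2m - 5, rr.dia.2m)
rr.dia2.corr
# [1] 80 85 95 70

# Identical result via TRUE/FALSE -> 1/0 coercion, as suggested above:
rr.dia2.corr2 <- rr.dia.2m - (med.hyper == 1) * 5
```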

> 
> Martyn
> 
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] 
On 
> Behalf Of Peter Lercher
> Sent: 08 June 2010 10:18
> To: r-help@r-project.org
> Subject: [R] problem with if else statement
> 
> Dear colleagues,
> 
> What did I not understand ?
> 
> ->my intention
> I want to create a new variable:
> In plain language:
> If someone is taking antihypertensive treatment (med.hyper==1)
> table(med.hyper)
> med.hyper
>   0   1
> 472  97
> I want to subtract 5 mmHg (rr.dia.2m - 5) from the measured diastolic blood
> pressure (rr.dia.2m); if not treated, the value of the measured diastolic
> blood pressure should remain the same.
> 
> ->my code (data frame is attached !)
> rr.dia2.corr<-if(med.hyper==1)
> {
> rr.dia.2m-5
> } else
> {
> rr.dia2.corr==rr.dia.2m
> }
> 
> R warning (translated from German):
> In if (med.hyper == 1) { :
>   the condition has length > 1 and only the first element will be used
> which you can see from the result (5 mmHg was simply subtracted throughout 
> and the second condition was never taken into account)
> 
>  > describe(rr.dia.2m)
> rr.dia.2m
>       n missing  unique    Mean    .05    .10    .25    .50    .75    .90     .95
>     546      26     110   85.32  67.00  71.50  77.62  84.50  92.00  99.25  105.00
> 
> lowest :  45.0  57.0  60.0  61.0  62.0, highest: 127.5 128.5 129.5 142.5 146.5
>  > describe(rr.dia2.corr)
> rr.dia2.corr
>       n missing  unique    Mean    .05    .10    .25    .50    .75    .90     .95
>     546      26     110   80.32  62.00  66.50  72.62  79.50  87.00  94.25  100.00
> 
> lowest :  40.0  52.0  55.0  56.0  57.0, highest: 122.5 123.5 124.5 137.5 141.5
> 
> OS windows
> R version 2.10.1 (2009-12-14)
> attached R modules: design+hmisc
> --
> Thank you so much
> 
> Peter
> 
> 
> 
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



[R] Odp: duplicated() and unique() problems

2010-06-08 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org napsal dne 08.06.2010 08:44:39:

> Hi everybody
> 
> I have found something (for me at least) strange with duplicated(). I 
will
> first provide a replicable example of a certain kind of behaviour that I
> find odd and then give a sample of unexpected results from my own data. 
I
> hope someone can help me understand this.
> 
> Consider the following
> 
> # this works as expected
> 
> ex=sample(1:20, replace=TRUE)
> 
> ex
> 
> duplicated(ex)
> 
> ex=sort(ex)

This is OK, as sort() sorts your data


> 
> ex
> 
> duplicated(ex)
> 
> 
> # but why does duplicate not work after order() ?
> 
> ex=sample(1:20, replace=TRUE)
> 
> ex
> 
> duplicated(ex)
> 
> ex=order(ex)

This is not OK, as order() gives you positions, not your data

> ex=sample(letters[1:5],20, replace=TRUE)
> ex
 [1] "b" "b" "b" "e" "d" "c" "e" "a" "a" "d" "d" "d" "a" "e" "b" "c" "e" 
"d" "a"
[20] "a"
> ex<-order(ex)
> ex
 [1]  8  9 13 19 20  1  2  3 15  6 16  5 10 11 12 18  4  7 14 17
>

ex=ex[order(ex)]

will give you the same result as sort, with the possible exception of ties.
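A tiny demonstration of the difference - and of why duplicated() is useless on order()'s output, which is a permutation of the indices and so never contains duplicates:

```r
x <- c(4, 1, 3, 1)
order(x)       # 2 4 1 3 : positions of the sorted values, not the values
x[order(x)]    # 1 1 3 4 : the sorted data, same as sort(x)
# order(x) is a permutation of 1:length(x), so duplicated(order(x))
# is all FALSE even though x itself has duplicates.
```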

> 
> duplicated(ex)
> 
> Why does duplicated() not work after order() has been applied but it 
works
> fine after sort()  ? Is this an error or is there something I don't
> understand.
> 
> I have been getting very strage results from duplicated() and unique() 
in a
> dataset I am analysing. Her is a little sample of my real life problem
> 
> > str(Masechaba$PROPDESC)
>  Factor w/ 24545 levels " 06","   71Hemilton str",..: 14527 8043 
16113
> 16054 13875 15780 12522 7771 14824 12314 ...
> > # Create a indicator if the PROPDESC is unique. Default false
> > Masechaba$unique=FALSE
> > Masechaba$unique[which(is.na(unique(Masechaba$PROPDESC))==FALSE)]=TRUE
> > # Check if something happened
> > length(which(Masechaba$unique==TRUE))
> [1] 2174
> > length(which(Masechaba$unique==FALSE))
> [1] 476
> > Masechaba$duplicate=FALSE
> > Masechaba$duplicate[which(duplicated(Masechaba$PROPDESC)==TRUE)]=TRUE
> > length(which(Masechaba$duplicate==TRUE))
> [1] 476
> > length(which(Masechaba$duplicate==FALSE))
> [1] 2174
> > # Looks OK so far
> > # Test on a known duplicate. I expect one to be true and one to be 
false
> > Masechaba[which(Masechaba$PROPDESC==2363),10:12]
>   PROPDESC unique duplicate
> 24874 2363   TRUE FALSE
> 31280 2363   TRUE  TRUE
> 
> # This is strange.  I expected that unique() and duplicate() would give 
the
> same results. The variable PROPDESC is clearly not unique in both cases.

No.

ex=sample(letters[1:5],10, replace=TRUE)
ex
 [1] "b" "d" "d" "b" "a" "c" "b" "c" "d" "d"
unique(ex)
[1] "b" "d" "a" "c"
duplicated(ex)
 [1] FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE  TRUE  TRUE  TRUE

Functions give you different answers about your data as you ask different 
questions.

> > Masechaba$unique[which(is.na(unique(Masechaba$PROPDESC))==FALSE)]=TRUE
^^^ 
This seems strange. At first sight I am puzzled about what result I should 
expect from such a construction.

Regards
Petr

> # The totals are the same but not the individual results
> > table(Masechaba$unique,Masechaba$duplicate)
> 
> FALSE TRUE
>   FALSE   342  134
>   TRUE   1832  342
> 
> I don't understand this. Is there something I am missing?
> 
> Best regards
> Christaan
> 
> 
> P.S
> > sessionInfo()
> R version 2.11.1 (2010-05-31)
> x86_64-apple-darwin9.8.0
> 
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] splines   stats graphics  grDevices utils datasets  methods
> base
> 
> other attached packages:
> [1] plyr_0.1.9  maptools_0.7-34 lattice_0.18-8  foreign_0.8-40
>  Hmisc_3.8-0 survival_2.35-8 rgdal_0.6-26
> [8] sp_0.9-64
> 
> loaded via a namespace (and not attached):
> [1] cluster_1.12.3 grid_2.11.1tools_2.11.1
> 
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Polar coordinate

2010-06-08 Thread Jim Lemon

On 06/07/2010 10:05 PM, ogbos okike wrote:

Greetings to you all.
  I have two datasets - Time and magnitude. For a particular location, the
magnitude of the parameter varies with time. I wish to obtain a polar
coordinate distribution of time (0-24h) and magnitudes so as to visualize
how magnitude varies with different times of the day (e.g., morning,
midnight hours).
I have searched for "polar coordinates in R" but could not get helpful
hints. I would be most grateful if anyone could advise me on where to start.


Hi Ogbos,
Have a look at clock24.plot in the plotrix package.
Jim
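A hedged sketch of clock24.plot() on made-up hourly magnitudes (the data and the styling arguments are illustrative only):

```r
library(plotrix)
# Hypothetical half-hourly magnitudes over a 24 h cycle:
hours <- seq(0, 23.5, by = 0.5)
mag   <- 10 + 5 * sin(hours * pi / 12)
# First argument: radial lengths (magnitudes); second: clock positions (0-24 h).
clock24.plot(mag, hours, lwd = 2)
```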



Re: [R] ols function in rms package

2010-06-08 Thread Mark Seeto

> On 06/06/2010 10:49 PM, Mark Seeto wrote:
>> Hello,
>>
>> I have a couple of questions about the ols function in Frank Harrell's
>> rms
>> package.
>>
>> Is there any way to specify variables by their column number in the data
>> frame rather than by the variable name?
>>
>> For example,
>>
>> library(rms)
>> x1<- rnorm(100, 0, 1)
>> x2<- rnorm(100, 0, 1)
>> x3<- rnorm(100, 0, 1)
>> y<- x2 + x3 + rnorm(100, 0, 5)
>> d<- data.frame(x1, x2, x3, y)
>> rm(x1, x2, x3, y)
>> lm(y ~ d[,2] + d[,3], data = d)  # This works
>> ols(y ~ d[,2] + d[,3], data = d) # Gives error
>> Error in if (!length(fname) || !any(fname == zname)) { :
>>missing value where TRUE/FALSE needed
>>
>> However, this works:
>> ols(y ~ x2 + d[,3], data = d)
>>
>> The reason I want to do this is to program variable selection for
>> bootstrap model validation.
>>
>> A related question: does ols allow "y ~ ." notation?
>>
>> lm(y ~ ., data = d[, 2:4])  # This works
>> ols(y ~ ., data = d[, 2:4]) # Gives error
>> Error in terms.formula(formula) : '.' in formula and no 'data' argument
>>
>> Thanks for any help you can give.
>>
>> Regards,
>> Mark
>
> Hi Mark,
>
> It appears that you answered the questions yourself.  rms wants real
> variables or transformations of them.  It makes certain assumptions
> about names of terms.   The y ~ . should work though; sometime I'll have
> a look at that.
>
> But these are the small questions compared to what you really want.  Why
> do you need variable selection, i.e., what is wrong with having
> insignificant variables in a model?  If you indeed need variable
> selection see if backwards stepdown works for you.  It is built-in to
> rms bootstrap validation and calibration functions.
>
> Frank
>

Thank you for your reply, Frank. I would have reached the conclusion
that rms only accepts real variables had this not worked:
ols(y ~ x2 + d[,3], data = d)

The reason I want to program variable selection is so that I can use the
bootstrap to check the performance of a model-selection method. My
co-workers and I have used a variable selection method which combines
forward selection, backward elimination, and best subsets (the forward and
backward methods were run using different software).

I want to do bootstrap validation to (1) check the over-optimism in R^2,
and (2) justify using a different approach, if R^2 turns out to be very
over-optimistic. The different approach would probably be data reduction
using variable clustering, as you describe in your book.
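One dependency-free way to select predictors by column number while still giving rms a formula of real variable names is to build the formula with paste()/as.formula(); the same formula object should then be acceptable to ols(). A sketch with lm() standing in (ols() itself is not run here):

```r
# Mirror the data set from the original post.
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
d$y <- d$x2 + d$x3 + rnorm(100, 0, 5)

vars <- names(d)[2:3]          # pick predictors by column number
f <- as.formula(paste("y ~", paste(vars, collapse = " + ")))
f                              # y ~ x2 + x3
fit <- lm(f, data = d)         # the same f should also work with ols(f, data = d)
```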

Regards,
Mark
-- 
Mark Seeto
Statistician

National Acoustic Laboratories 
A Division of Australian Hearing

126 Greville Street
Chatswood NSW 2067 Australia


