Re: [R] using subset() in data frame

2008-02-23 Thread Chuck Cleland
On 2/23/2008 6:09 AM, Chuck Cleland wrote:
> On 2/22/2008 8:01 PM, Robert Walters wrote:
>> R folks,
>> As an R novice, I struggle with the mystery of subsetting. Textbook 
>> and online examples of this seem quite straightforward yet I cannot 
>> get my mind around it. For practice, I'm using the code in MASS Ch. 6, 
>> "whiteside data" to analyze a different data set with similar 
>> variables and structure.
>> Here is my data frame:
>>
>> ###subset one of three cases for the variable 'position'
>>  >data.b<-data.a[data.a$position=="inrow",]
>>  > print(data.b)
>>   position  porosityx   y
>> 1 inrow macro 1.40   16.5
>> 2 inrow macro  .   .
>> .  ..   .
>> .  ..   .
>> 7 inrow micro
>> 8 inrow micro
>>
>> Now I want to do separate lm's for each case of porosity, macro and 
>> micro. The code as given in MASS, p.141, slightly modified would be:
>>
>> fit1 <- lm(y ~ x, data=data.b, subset = porosity == "macro")
>> fit2 <- update(fit1, subset = porosity == "micro")
>>
>> ###simplest code with subscripting
>> fit1 <- lm(y ~ x, data.b[porosity=="macro"])
> 
>   Assuming data.b has two dimensions, you need a comma after 
> porosity=="macro" to indicate that you are selecting a subset of rows of 
> the data frame:
> 
> fit1 <- lm(y ~ x, data.b[porosity=="macro",])

   Actually, that should be:

fit1 <- lm(y ~ x, data.b[data.b$porosity=="macro",])

   because [.data.frame needs to know where to find porosity, and it 
won't know to look inside of data.b unless you direct it to look there.

>> ###following example in ?subset
>> fit1 <- lm(y ~ x, data.b, subset(data.b, porosity, select=macro))
> 
>   The select argument to subset is meant to select variables (i.e., it 
> indicates "columns to select from a data frame") and you are misusing it 
> by specifying the level of a factor.  If you make your call to subset by 
> itself (a good idea when you are learning how a function works), you 
> should get an error like this:
> 
>  > subset(whiteside, Insul, select=Before)
> Error in subset.data.frame(whiteside, Insul, select = Before) :
>   'subset' must evaluate to logical
> 
>  What I think you intended was this:
> 
> subset(data.b, porosity == "macro")
> 
>   Even with the correct call to subset, you also don't want both data.b 
> and the subset piece, because subset returns a data frame.  In other 
> words, you would be passing lm() two different data frames.  So try this 
> instead:
> 
> fit1 <- lm(y ~ x, subset(data.b, porosity == "macro"))
> 
>> None of th above, plus many permutations thereof, works.
>> Can anyone educate me?
>>
>> Thanks,
>>
>> Robert Walters
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.  

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] using subset() in data frame

2008-02-23 Thread Chuck Cleland
On 2/22/2008 8:01 PM, Robert Walters wrote:
> R folks,
> As an R novice, I struggle with the mystery of subsetting. Textbook and 
> online examples of this seem quite straightforward yet I cannot get my 
> mind around it. For practice, I'm using the code in MASS Ch. 6, 
> "whiteside data" to analyze a different data set with similar variables 
> and structure.
> Here is my data frame:
> 
> ###subset one of three cases for the variable 'position'
>  >data.b<-data.a[data.a$position=="inrow",]
>  > print(data.b)
>   position  porosityx   y
> 1 inrow macro 1.40   16.5
> 2 inrow macro  .   .
> .  ..   .
> .  . .   .
> 7 inrow micro
> 8 inrow micro
> 
> Now I want to do separate lm's for each case of porosity, macro and 
> micro. The code as given in MASS, p.141, slightly modified would be:
> 
> fit1 <- lm(y ~ x, data=data.b, subset = porosity == "macro")
> fit2 <- update(fit1, subset = porosity == "micro")
> 
> ###simplest code with subscripting
> fit1 <- lm(y ~ x, data.b[porosity=="macro"])

   Assuming data.b has two dimensions, you need a comma after 
porosity=="macro" to indicate that you are selecting a subset of rows of 
the data frame:

fit1 <- lm(y ~ x, data.b[porosity=="macro",])

> ###following example in ?subset
> fit1 <- lm(y ~ x, data.b, subset(data.b, porosity, select=macro))

   The select argument to subset is meant to select variables (i.e., it 
indicates "columns to select from a data frame") and you are misusing it 
by specifying the level of a factor.  If you make your call to subset by 
itself (a good idea when you are learning how a function works), you 
should get an error like this:

 > subset(whiteside, Insul, select=Before)
Error in subset.data.frame(whiteside, Insul, select = Before) :
   'subset' must evaluate to logical

  What I think you intended was this:

subset(data.b, porosity == "macro")

   Even with the correct call to subset, you also don't want both data.b 
and the subset piece, because subset returns a data frame.  In other 
words, you would be passing lm() two different data frames.  So try this 
instead:

fit1 <- lm(y ~ x, subset(data.b, porosity == "macro"))

> None of th above, plus many permutations thereof, works.
> Can anyone educate me?
> 
> Thanks,
> 
> Robert Walters
> 
> ______
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshaping data frame

2008-02-20 Thread Chuck Cleland
On 2/20/2008 1:14 PM, ahimsa campos-arceiz wrote:
> Dear all,
> 
> I'm having a few problems trying to reshape a data frame. I tried with
> reshape{stats} and melt{reshape} but I was missing something. Any help is
> very welcome. Please find details below:
> 
> #
> # data in its original shape:
> 
> indiv <- rep(c("A","B"),c(10,10))
> level.1 <- rpois(20, lambda=3)
> covar.1 <- rlnorm(20, 3, 1)
> level.2 <- rpois(20, lambda=3)
> covar.2 <- rlnorm(20, 3, 1)
> my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)
> 
> # the values of level.1 and level.2 represent the number of cases for the
> particular
> # combination of indiv*level*covar value
> 
> # I would like to do two things:
> # 1. reshape to long reducing my.dat[,2:5] into two colums "factor" (levels=
> level.1 & level.2)
> # and the covariate
> # 2. create one new row for each case in level.1 and level.2
> 
> # the new reshaped data.frame would should look like this:
> 
> # indiv  factorcovar   case.id
> #   A   level.1   4.6141051
> #   A   level.1   4.6141052
> #   A   level.2  31.0644051
> #   A   level.2  31.0644052
> #   A   level.2  31.0644053
> #   A   level.2  31.0644054
> #   A   level.1  19.1857841
> #   A   level.2  48.4559291
> #   A   level.2  48.4559292
> #   A   level.2  48.4559293
> # etc...
> 
> #

   Maybe there is a better way, but this seems to do what you want:

#
# data in its original shape:

indiv <- rep(c("A","B"),c(10,10))
level.1 <- rpois(20, lambda=3)
covar.1 <- rlnorm(20, 3, 1)
level.2 <- rpois(20, lambda=3)
covar.2 <- rlnorm(20, 3, 1)
my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)

long <- reshape(my.dat, varying = list(c("level.1","level.2"),
c("covar.1","covar.2")),
 timevar="level", idvar="case.id",
 v.names=c("ncases","covar"),
 direction="long")

newdf <- with(long, data.frame(indiv = rep(  indiv, ncases),
level = rep(  level, ncases),
covar = rep(  covar, ncases),
  case.id = rep(case.id, ncases)))

   The idea is to first reshape() and then rep() each variable ncases 
times.  You can then convert newdf$level into a factor if you like.

> Thank you very much!!
> 
> Ahimsa 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lmList, tapply() and lm()

2008-02-15 Thread Chuck Cleland
On 2/15/2008 11:00 AM, Marc Belisle wrote:
> Howdee,
> 
> *** I know that the lmList() function exists, yet I don't want to use it.
> ***
> 
> Would anyone be kind enough to tell how I can apply the function lm() to
> each level of a given factor so to obtain the intercept and slope for each
> factor level within a matrix?
> 
> For instance, suppose a dataframe containing 3 variables: id, x and y.
> 
> I want to compute the function lm() for each level contained in id, as
> lmList would do...

   Something like this?

t(sapply(split(df, list(df$id)),
function(subd){coef(lm(y ~ x, data = subd))}))

> Thanks for your time,
> 
> Marc
> 
> ===
> Marc Bélisle
> Professeur adjoint
> Chaire de recherche du Canada en écologie spatiale et en écologie du paysage
> Département de biologie
> Université de Sherbrooke
> 2500 Boul. de l'Université
> Sherbrooke, Québec
> J1K 2R1 Canada
> 
> Tél: +1-819-821-8000 poste 61313
> Fax: +1-819-821-8049
> Courriél: [EMAIL PROTECTED]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Levene's test for homogeneity of variances (befor using ANOVA)

2008-02-14 Thread Chuck Cleland
On 2/14/2008 9:47 AM, Kes Knave wrote:
> Dear all
> 
> I have tried to find this function in R, but don't find it by searching in
> the help function.
> 
> Anybody who knows if R has the function "Levene's test for homogeneity of
> variances"?
> 
> Note: Im a "R-begginer"

   I wonder how you searched for a function.  For example, when I do the 
following:

RSiteSearch("levene", restrict="function")

   levene.test() functions in the car and lawstat packages are the first 
two hits.

> Regards Kes
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Removing columns that are all NA from a matrix

2008-02-14 Thread Chuck Cleland
On 2/14/2008 7:59 AM, Martin Waller wrote:
> Hi,
> 
> I guess this might be a FAQ or something, and there's probably a nice 
> simple way to do it, but I can't think of it:
> 
> Given a matrix, I want to remove columns that are _entirely_ filled with 
> NAs (partial NAs are fine).
> 
> How please?

mymat[,which(colMeans(is.na(mymat)) < 1)]

> Thanks,
> 
> Martin
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a data.frame

2008-02-13 Thread Chuck Cleland
On 2/13/2008 5:17 PM, Joe Trubisz wrote:
> OK...newbie question here.
> Either I'm reading the docs wrong, or I'm totally confused.
> 
> Given the following:
> 
> x<-c("aaa","bbb","ccc")
> y<-rep(0,3)
> z<-rep(0,3)
> 
> is.character(x)
> [1] TRUE
> 
> is.numeric(y)
> [1] TRUE
> 
> Now...I want to create a data frame, but keep the data types.
> In reading the docs, I assume you do it this way:
> 
> d<-data.frame(cbind(x=I(x),y=y,z=z)
> 
> But, when I do str(d), I get the following:
> 
> 'data.frame': 3 obs. of  3 variables:
>   $ x: Factor w/ 3 levels "aaa","bbb","ccc": 1 2 3
>   $ y: Factor w/ 1 level "0": 1 1 1
>   $ z: Factor w/ 1 level "0": 1 1 1
> 
> I thought the I() prevents character from becoming factors, right?
> Secondly, how do I force y and z in the data frame to become numeric?

   Don't use cbind() inside of data.frame().  Using cbind() coerces the 
variables into a matrix where all variables share a common type.  I 
think you want this:

d <- data.frame(x=I(x), y=y, z=z)

> Thanks in advance
> Joe
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to run one-way anova R?

2008-02-12 Thread Chuck Cleland
On 2/12/2008 8:54 AM, Kes Knave wrote:
> Dear all,
> 
> How do I run a basic one-way anova in R?

   Have you read the relevant sections of An Introduction to R, as the 
posting guide requests?  Have you tried searching for documentation 
using, for example:

RSiteSearch("oneway ANOVA")

or

help.search("oneway")

??

> Regards Kes
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reverse vector elements

2008-02-12 Thread Chuck Cleland
On 2/12/2008 7:47 AM, mohamed nur anisah wrote:
> Dear lists,
>
>   I want to write a function of a vector and reverse the order of its 
> elements. Here is my code:
>
>   revector<-function(n){
>  y=vector(length=n)
>   for(i in n:1){
> y[i]=i
> }
> return(y)
> }
>
>   i want my output to be like this:
>   y
>[1] 10  9  8  7  6  5  4  3  2  1
>
>   Any suggestion?? Thanks!!
>   Cheers,
>   Anisah

?rev

rev(1:10)
  [1] 10  9  8  7  6  5  4  3  2  1

> -
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshape question

2008-02-08 Thread Chuck Cleland
On 2/8/2008 9:15 AM, Ista Zahn wrote:
> I know there are a lot of reshape questions on the mailing list, but I  
> haven't been able to find an answer to this particular issue.
> 
> I am trying to get a datafame structured like this:
> 
>  > sub <- rep(1:5)
>  > ta1 <- rep(1,5)
>  > ta2 <- rep(2,5)
>  > tb1<- rep(3,5)
>  > tb2 <- rep(4,5)
>  > DF <- data.frame(sub,ta1,ta2,tb1,tb2)
>  > DF
>sub ta1 ta2 tb1 tb2
> 1   1   1   2   3   4
> 2   2   1   2   3   4
> 3   3   1   2   3   4
> 4   4   1   2   3   4
> 5   5   1   2   3   4
> 
> into a form like this:
> 
>  sub time x1 x2
> 1.1   11  1  3
> 1.2   12  2  4
> 2.1   21  1  3
> 2.2   22  2  4
> 3.1   31  1  3
> 3.2   32  2  4
> 4.1   41  1  3
> 4.2   42  2  4
> 5.1   51  1  3
> 5.2   52  2  4
> 
> using the "reshape" command. But when I try reshaping I don't get the  
> desired structure:
> 
>  > DF.L <- reshape(DF, varying = 2:5, idvar="sub", v.names = c("x1",  
> "x2"), times=c(1,2), direction="long")
>  > library(doBy)
>  > orderBy(~sub, data=DF.L)
>  sub time x1 x2
> 1.1   11  1  2
> 1.2   12  3  4
> 2.1   21  1  2
> 2.2   22  3  4
> 3.1   31  1  2
> 3.2   32  3  4
> 4.1   41  1  2
> 4.2   42  3  4
> 5.1   51  1  2
> 5.2   52  3  4

   The varying argument to reshape() can be a list.  For example:

DF.long <- reshape(DF, varying = list(c("ta1","ta2"),
   c("tb1","tb2")),
idvar="sub",
v.names = c("x1","x2"),
times=c(1,2),
direction="long")

DF.long[order(DF.long$sub),]

 sub time x1 x2
1.1   11  1  3
1.2   12  2  4
2.1   21  1  3
2.2   22  2  4
3.1   31  1  3
3.2   32  2  4
4.1   41  1  3
4.2   42  2  4
5.1   51  1  3
5.2   52  2  4

> I can get the desired result by rearranging the original dataframe, like
>  > DF2 <- data.frame(sub,ta1,tb1,ta2,tb2)
> 
> before running the reshape command, but I'm hoping someone knows a way  
> to do the desired reshaping without this step, as it becomes very time  
> consuming with large numbers of repeated measurements.
> 
> Thanks,
> Ista
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Row percentages for a table object

2008-02-07 Thread Chuck Cleland
On 2/7/2008 2:03 PM, Tom Backer Johnsen wrote:
> I an stumbling on something that is probably very simple, but I cannot 
> see the solution.  I have an object generated by the table () function 
> and want to recompute this table so each cell represents the 
> percentage of the corresponding row sum.
> 
> Of course a dedicated function can be written (which I have done), 
> containing the necessary loops etc., but there should be a simpler 
> way.  I'd prefer something simple and as transparent as possible since 
> it is for use in a text I am writing for my students.  I have fiddled 
> around with the apply () function but have so far been unable to find 
> something that works.
> 
> Any suggestions?

   See ?prop.table and also possibly ?CrossTable in the gmodels package.

> Tom
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with package reshape, wide to long

2008-02-07 Thread Chuck Cleland
On 2/7/2008 11:40 AM, Don MacQueen wrote:
> Hello,
> 
> I am having difficulty figuring out how to use functions in the 
> reshape package to perform a wide to long transformation
> 
> I have a "wide" dataframe whose columns are like this example:
> 
>id1 id2 subject treat height weight age
> 
> id1 and id2 are unique for each row
> subject and treat are not unique for each row
> height, weight, and age are different types of measurements made on 
> each unique combination of subject and treatment
> 
> I want to reshape to a long format which will look like this:
> 
>   id1 id2 subject treat measurement.type value
> 
> where
> 
> measurement.type identifies the type of measurement, i.e. 'height', 
> 'weight', 'age'
> value contains the values of those measurements
> and the other variables are replicated as necessary

   Does something like this work for you?

df <- data.frame(id1 = c(1,2,3),
  id2 = c(4,5,6),
  subject = runif(3),
  treat = runif(3),
  height = runif(3),
  weight = runif(3),
  age = runif(3))

library(reshape)

melt(df, measure.var = c("height","weight","age"))

   id1 id2   subject  treat variable  value
1   1   4 0.4736760 0.02221288   height 0.09290354
2   2   5 0.1708840 0.84465476   height 0.70660286
3   3   6 0.6955085 0.56084004   height 0.52761841
4   1   4 0.4736760 0.02221288   weight 0.08853412
5   2   5 0.1708840 0.84465476   weight 0.85745906
6   3   6 0.6955085 0.56084004   weight 0.29902097
7   1   4 0.4736760 0.02221288  age 0.91842595
8   2   5 0.1708840 0.84465476  age 0.93592277
9   3   6 0.6955085 0.56084004  age 0.01587108

> To put it another way, can I use reshape() to transform my original 
> dataframe, which has 45 rows, into a "long" form that has 3*45  = 135 
> rows:  45 rows for height, 45 for weight, 45 for age, with the other 
> variables carried along as is within each set of 45, and a new 
> variable that identifies the type of measurement in each row of the 
> long form, i.e., 'height' in 45 rows, 'weight' in 45 rows, and 'age' 
> in 45 rows.
> 
> I know it's not difficult to do this with explicit looping, using 
> rbind(), but it seems like reshape() is a natural tool. But I'm not 
> getting it, so I'd appreciate some help.
> 
> (in case anyone is wondering whether it makes sense to do this, 
> height, weight, and age are just examples; it makes more sense with 
> my actual measurements)
> 
> Thanks
> -Don

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sampling

2008-02-05 Thread Chuck Cleland
On 2/5/2008 1:52 PM, Judith Flores wrote:
> Hi there,
> 
>I want to generate different samples using the
> followindg code:
> 
> 
> g<-sample(LETTERS[1:2], 24, replace=T)
> 
>How can I specify that I need 12 "A"s and 12 "B"s?
> 
> Thank you,
> 
> Judith

x <- rep(c("A","B"), each=12)

x
  [1] "A" "A" "A" "A" "A" "A" "A" "A" "A" "A" "A"
[12] "A" "B" "B" "B" "B" "B" "B" "B" "B" "B" "B"
[23] "B" "B"

# "sample(x) generates a random permutation of the elements of x"

g <- sample(x)

g
  [1] "A" "A" "B" "A" "B" "B" "B" "B" "A" "B" "A"
[12] "A" "B" "A" "A" "A" "B" "B" "B" "A" "A" "B"
[23] "A" "B"

>   
> 
> Be a better friend, newshound, and
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Effect size of comparison of two levels of a factor in multiple linear regression

2008-02-04 Thread Chuck Cleland
(here
> # its mean value after standardization: zero)
> library(MCMCpack)
> no.of.sims <- 1
> sims.model <- MCMCregress(model, mcmc=no.of.sims)
> sims.model[1:2,]
>  (Intercept) treatment1 treatment2 scale(covariate)  sigma2
> [1,]  -0.1780735  0.2024111  0.33952330.8682119 0.002617449
> [2,]  -0.1521623  0.1773623  0.29560530.8764573 0.003529013
> sims.treat0 <- rnorm(no.of.sims, sims.model[,"(Intercept)"], 
> sqrt(sims.model[,"sigma2"]))
> sims.treat1 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + 
> sims.model[,"treatment1"], sqrt(sims.model[,"sigma2"]))
> sims.treat2 <- rnorm(no.of.sims, sims.model[,"(Intercept)"] + 
> sims.model[,"treatment2"], sqrt(sims.model[,"sigma2"]))
> 
> # Calculate Cohen's d for simulated values
> cohens.d(sims.treat1, sims.treat0)
> [1] 3.683093
> cohens.d(sims.treat2, sims.treat0)
> [1] 5.782622
> 
> These values are reasonably close to the ones (4 and 6) I plugged in at
> the beginning. It would be even nicer to have a confidence interval for
> them, but if I bootstrap one out of the simulated outcomes its width
> depends on the number of simulations and is therefore arbitrary. If
> anyone knew a better way to get at the effect sizes I'm looking for or
> how I could also get confidence intervals for them, that would be
> greatly appreciated.
> 
> Thanks,
> 
> Christoph
> 
> --
> Christoph Mathys, M.S.
> Music and Neuroimaging Laboratory
> Beth Israel Deaconess Medical Center
> and Harvard Medical School
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] boxplot and number of observations per box

2008-01-30 Thread Chuck Cleland
On 1/30/2008 6:42 AM, Marc Bernard wrote:
> Dear all,
>
>   How can I plot the number of observations per box in a boxplot. Many thanks,
>
>   Bernard

   Something like this?

X <- boxplot(count ~ spray, data = InsectSprays, col = "lightgray")

mtext(side=1, line = 2, at = 1:nlevels(InsectSprays$spray), paste("n=", 
X$n, sep=""))

> -
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] accessing the indices of outliers in a data frame boxplot

2008-01-25 Thread Chuck Cleland
On 1/25/2008 11:39 AM, Karin Lagesen wrote:
> I have a data frame containing columns which are factors. I use this
> to make boxplots for the data, with one box per factor. I would now
> like to get at the data in the data frame which corresponds to the
> outliers. I have so far found the $out, which gives "the values of any
> data points which lie beyond the extremes of the whiskers", but I
> haven't found anything which will let me get at the indices in the
> original data frame for these outliers. 
> 
> I think there might be a chance that I could simply compare the values
> I am plotting from my data frame with the values for the whiskers and
> use that as a criteria, but I am unsertain of how to do this withhout
> doing it manually. The factor I am plotting against contains 17
> levels, and I'd thus like to see if there is a somewhat more general
> solution available.
> 
> Thanks for your help!
> 
> Karin

   You can use the %in% operator (is.element) to see which data values 
in your data frame match an outlier value.  Then use which() to return 
the TRUE indices.  For example:

set.seed(245)

df <- data.frame(GRP = rep(LETTERS[1:4], each=25), Y = rchisq(100, 2))

mybp <- boxplot(Y ~ GRP, data=df)

which(df$Y %in% mybp$out)
[1]  8 12 47 66 88 93

mybp$out
[1] 5.919915 9.135578 5.723714 8.758584 8.502147 4.920513

df$Y[which(df$Y %in% mybp$out)]
[1] 5.919915 9.135578 5.723714 8.758584 8.502147 4.920513

   See ?is.element and ?which.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting gridlines

2008-01-25 Thread Chuck Cleland
On 1/25/2008 8:14 AM, John Lande wrote:
> dear all,
> 
> I have a very simple question but I could not figure out.
> 
> I need to make plots with grid in the background.
> 
> something like I old retrive like this
> 
> a=runif(100)*10
> b=runif(100)*10
> plot(a,b, pch=20, xlim=c(0, round(max(a))), ylim=c(0, round(max(b
> vs=seq(0, max(a), 0.5)
> for(i in 1:length(vs)){
>abline(v=vs[i], col="lightgrey")
> }
> hs=seq(0, max(b), 0.5)
> for(i in 1:length(hs)){
>abline(h=hs[i], col="lightgrey")
> }
> points(a, b, pch=20)
> 
> as you can see it is not very convenient

   How about using xyplot() in the lattice package with a panel function 
to draw the gridlines?  For example:

library(lattice)

a <- runif(100)*10
b <- runif(100)*10

xyplot(a ~ b, panel = function(x, y, ...){
   panel.xyplot(x, y, ...)
   panel.grid(h=20,v=20)})

> --
> john
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] predict from a multiple regression model

2008-01-22 Thread Chuck Cleland
On 1/22/2008 9:50 AM, Fränzi Korner wrote:
> Hello
> 
>  
> 
> how can I predict from a lm-object over a range of values of one explanatory
> variable without having to specify values for all the other explanatory
> variables? 
> 
>  
> 
> e.g. 
> 
>  
> 
> mod<-lm(y~x1+x2+x3+x4)
> 
>  
> 
> x1.new<-seq(0, 100)
> 
> predict(mod, new=list(x1=x1.new))
> 
>  
> 
>  
> 
> Here, predict() does not work, since values for x2, x3 and x4 are missing.
> Is there a function or argument that, in such a case, averages or weights
> over the other explanatory variables, how it is done in Genstat? 

   You might consider effect() in the effects package by John Fox.  For 
example:

library(effects)

mod.pres <- lm(prestige ~ log(income, 10) + poly(education, 3) +
poly(women, 2), data=Prestige)

effect("poly(women,2)", mod.pres, xlevels=list(women = seq(0,100,20)))

  poly(women,2) effect
women
0   20   40   60   80  100
47.71028 45.12385 44.53749 45.95121 49.36499 54.77883

> Thanks
> 
> Fränzi
> 
>  
> 
>  
> 
> 
> 
> Dr. Fränzi Korner-Nievergelt
> 
> oikostat - Statistische Analysen und Beratung
> 
> Ausserdorf 43
> 
> CH - 6218 Ettiswil
> 
> Tel.: +41 (0) 41 980 49 22
> 
> www.oikostat.ch
> 
> 
> 
> *
> 
> 
> 
> Schweizerische Vogelwarte
> 
> CH - 6204 Sempach
> 
> 
> 
> www.vogelwarte.ch
> 
> *
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> 
> 
> 
> 
> ______
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] summary of categorical variables

2008-01-21 Thread Chuck Cleland
On 1/21/2008 8:08 AM, [EMAIL PROTECTED] wrote:
> Dear list,
> 
> I have a data.frame with nine categorical variables (0,1,2 and NAs) 
> that I would like to get the number of events for each of them. I can 
> extract this using summary() for each variable at a time with the 
> as.factor()argument (otherwise it will get me the mean value):
> 
>> summary(as.factor(mydf[,3]))
>  012 NA's 
>  194   6742
> 
> Trying to use apply() to get this for all the columns of the 
> data.frame I cannot use as.factor because it works with vectors and 
> not data.frames.

   How did you try to use apply()?  Try something like this:

df <- as.data.frame(matrix(sample(c(0,1,2,NA), 9*267, replace=TRUE), 
ncol=9))

apply(df, 2, function(x){summary(as.factor(x))})
  V1 V2 V3 V4 V5 V6 V7 V8 V9
070 69 75 72 71 75 66 58 66
170 67 64 64 64 55 62 65 72
263 60 56 79 57 73 75 69 67
NA's 64 71 72 52 75 64 64 75 62

> Can anyone help me? Maybe a for loop would be good in this case 
> instead, although I learnt that for loops are not recommended? 
> 
> Thanks in advance,
> 
> David
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subsetting a data frame using string matching

2008-01-21 Thread Chuck Cleland
On 1/21/2008 5:18 AM, Karin Lagesen wrote:
> Example data frame: 
> 
> 
> a = c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
> b = c(1:6)
> example = data.frame("Title" = a, "Vals" = b)
> 
> 
>> example
>   Title Vals
> 1 Alpha1
> 2  Beta2
> 3 Gamma3
> 4 Beeta4
> 5 Alpha5
> 6  beta6
> 
> I would like to be able to get a new data frame from this data frame
> containing only rows that match a certain string. In this case it
> could for instance be the string "eta". I have tried various ways of
> using agrep and grep, but so far I have not found anything that
> worked.
> 
> Thankyou in advance for your help!

a <- c("Alpha", "Beta", "Gamma", "Beeta", "Alpha", "beta")
b <- c(1:6)
df <- data.frame(Title = a, Vals = b)
df[grep("eta", df$Title),]
   Title Vals
2  Beta2
4 Beeta4
6  beta6

> Karin

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Assigning into each of a list of dataframes

2008-01-18 Thread Chuck Cleland
   What is the right way to assign a new variable into each a of list of 
data frames?  Here is my failed attempt:

mylist <- list(df1 = data.frame(A = runif(5), B = runif(5)),
df2 = data.frame(A = runif(5), B= runif(5)))

lapply(mylist, function(x){x$Y <- x$A * x$B})

$df1
[1] 0.25589928 0.03446026 0.94992362 0.21388326 0.08668821

$df2
[1] 0.08771839 0.05643553 0.09036894 0.09179378 0.37394748

mylist
$df1
   A  B
1 0.3293760 0.77692140
2 0.1283307 0.26852710
3 0.9865388 0.96288517
4 0.5087024 0.42044870
5 0.9175345 0.09447951

$df2
   A B
1 0.3887178 0.2256608
2 0.2642189 0.2135938
3 0.3881635 0.2328115
4 0.9060760 0.1013091
5 0.4578424 0.8167602

   I want the variable Y to be added to each data frame in mylist.

thanks,

Chuck

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] product of vector elements

2008-01-12 Thread Chuck Cleland
On 1/12/2008 7:38 PM, Gerard Smits wrote:
> All,
> 
> I would like to find a simper method that I now have to find the 
> product of all elements  in a vector:
> 
> #get product of vector elements: (1,2,3,4,5)
> vec.product <- exp(sum(log(c(1,2,3,4,5
> 
> I have not found a vector product function, if one has been written.

 > cumprod(1:5)
[1]   1   2   6  24 120

?cumprod

> Thanks,
> 
> Gerard 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] choose the intercept from glm and lm

2008-01-08 Thread Chuck Cleland
On 1/8/2008 2:48 PM, Knut Krueger wrote:
> Hi to all,
> Is there a possibility to choose which value is used for the intercept 
> or does in not make any sense?
> f.e I would like to use outcome2 for the intercept
> 
> ## Dobson (1990) Page 93: Randomized Controlled Trial :
> counts <- c(18,17,15,20,10,20,25,13,12)
> outcome <- gl(3,1,9)
> treatment <- gl(3,3)
> print(d.AD <- data.frame(treatment, outcome, counts))
> glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
> summary(glm.D93
> 
> 
> Coefficients:
>   Estimate Std. Error  z value Pr(>|z|)
> (Intercept)  3.045e+00  1.709e-01   17.815   <2e-16 ***
> outcome2-4.543e-01  2.022e-01   -2.247   0.0246 *  
> outcome3-2.930e-01  1.927e-01   -1.520   0.1285
> treatment2   8.717e-16  2.000e-01 4.36e-15   1.
> treatment3   4.557e-16  2.000e-01 2.28e-15   1.
> 
> 
> Regards Knut

   Here is one way:

outcome <- relevel(outcome, ref="2")

?relevel

> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recode() function results in logical output, not factor output

2008-01-08 Thread Chuck Cleland
On 1/7/2008 3:32 PM, Thomas MacFarland wrote:
> Dear R Users:
>  
> I have race-ethnicity groups identified in the factor variable Ethnic_G.
>  
> I need to collapse Ethnic_G into a new variable with only two factors, 1 
> (White, non-Hispanic) and 2 (Minority).
>  
> As seen in the code and output below, the recoded race-ethnicity variable is 
> put into logical format, not factor format.
>  
> I've used library(car) and the package was updated.
>  
> Any ideas on how to fix this problem would be apprecidated.

   In your call to recode(), use 'W' rather than 'c(W)' to specify the 
White-non-Hispanic category:

 > recode(Ethnic_G.Character, "'W'=1; else=2", as.factor.result=TRUE)
  [1] 2 2 2 2 1 1 2 1 2 1 2 1 1 2 2 2 2 1 2 2 1 2 2
[24] 2 2
Levels: 1 2

OR

 > recode(Ethnic_G.Character, "'W'='White-non-Hispanic'; 
else='Minority'", as.factor.result=TRUE)
  [1] Minority   Minority
  [3] Minority   Minority
  [5] White-non-Hispanic White-non-Hispanic
  [7] Minority   White-non-Hispanic
  [9] Minority   White-non-Hispanic
[11] Minority   White-non-Hispanic
[13] White-non-Hispanic Minority
[15] Minority   Minority
[17] Minority   White-non-Hispanic
[19] Minority   Minority
[21] White-non-Hispanic Minority
[23] Minority   Minority
[25] Minority
Levels: Minority White-non-Hispanic

   Also, you can recode the factor directly without converting it to 
character:

 > recode(Ethnic_G, "'W'='White-non-Hispanic'; else='Minority'", 
as.factor.result=TRUE)
  [1] Minority   Minority
  [3] Minority   Minority
  [5] White-non-Hispanic White-non-Hispanic
  [7] Minority   White-non-Hispanic
  [9] Minority   White-non-Hispanic
[11] Minority   White-non-Hispanic
[13] White-non-Hispanic Minority
[15] Minority   Minority
[17] Minority   White-non-Hispanic
[19] Minority   Minority
[21] White-non-Hispanic Minority
[23] Minority   Minority
[25] Minority
Levels: Minority White-non-Hispanic

> Thanks in advance!
>  
> Best wishes.
>  
> Tom 
>> #> # 
>> Task - recode Ethnic_G into two breakout groups, White-non-Hispanic> #   
>>  and Other> class(Ethnic_G)[1] "factor"> print(Ethnic_G) [1] B A I U W W B W 
>> H W N W W B A B H W A H W B I N ILevels: A B H I N U W> Ethnic_G.Character 
>> <- as.character(Ethnic_G)> class(Ethnic_G.Character)[1] "character"> 
>> print(Ethnic_G.Character) [1] "B" "A" "I" "U" "W" "W" "B" "W" "H" "W" "N" 
>> "W" "W" "B" "A" "B" "H" "W" "A" "H" "W" "B" "I" "N" "I"> Ethnic_G.Recode <- 
>> recode(Ethnic_G.Character, "c(W)='1'; else='2'",+   as.factor.result=TRUE)> 
>> class(Ethnic_G.Recode)[1] "logical"> print(Ethnic_G.Recode) [1] NA NA NA NA 
>> NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA> 
>> #########
> 
> _
> Watch “Cause Effect,” a show about real people making a real difference.
> 
>   [[alternative HTML version deleted]] 
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R procedure similar to STATA heckprob?

2008-01-03 Thread Chuck Cleland
On 1/3/2008 2:20 PM, Richard Saba wrote:
> Is anyone aware of an R procedure similar to STATA's "heckprob" procedure?
> "Heckprob" fits maximum likelihood probit models correcting for sample
> selection bias.
> 
> Thanks,

RSiteSearch("Heckman selection", restrict="functions") points to the
micEcon package:

http://cran.at.r-project.org/src/contrib/Descriptions/micEcon.html

> Richard Saba
> 
> Department of Economics
> 
> Auburn University
> 
> Email:  [EMAIL PROTECTED] 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bootstrap Confidence Intervals

2008-01-02 Thread Chuck Cleland
_Fede_ wrote:
> Sorry for the previous message. I have been reading the help page about boot
> library and I have already understood what the arguments made reference
> (original data and vector of indices). Now everything is ok.Thanks for your
> help. 
> 
> But I have another doubt. How I can make a histogram of the bootstrap
> samples? How can I access them when I use boot function?
> 
> Thank you in advance.

  To see the structure of x.boot use str(x.boot).  Also, the help page
for boot() describes the following:

t A matrix with R rows each of which is a bootstrap replicate of statistic.

  So you want something like this:

hist(x.boot$t)

> _Fede_
> 
> Chuck Cleland wrote:
>> _Fede_ wrote:
>>> Hi again.
>>>
>>> Watching this example that appears in the help page
>>>
>>> ratio <- function(d, w) sum(d$x * w)/sum(d$u * w)
>>> city.boot <- boot(city, ratio, R = 999, stype = "w",sim = "ordinary")
>>> boot.ci(city.boot, conf = c(0.90,0.95),type =
>>> c("norm","basic","perc","bca"))
>>>
>>> I have tried to do the following (calling boot() to create an object to
>>> pass
>>> to boot.ci):
>>>
>>> x <- rnorm(20)
>>> kurtosis <- function(x) (mean((x-mean(x))^4))/(sd(x)^4)
>>> x.boot <- boot(x, kurtosis, R = 999, sim = "ordinary")
>>> boot.ci(x.boot, conf = 0.95,type = c("norm","basic","perc","bca"))
>>>
>>> But I don't know why this don't work. The editor window shows the
>>> following
>>> error message:
>>>
>>> Error in statistic(data, original, ...) : unused argument(s) (1:20)
>>>
>>> I suppose that something is wrong with my data but I don't know what is.
>>>
>>> Thanks in advance and wishing everybody a happy new year.
>>   Check the statistic argument to boot() very carefully.  When sim =
>> "ordinary" the statistic function must have at least two arguments.  Try
>> something like this:
>>
>> library(boot)
>>
>> kurtosis <- function(x) (mean((x-mean(x))^4))/(sd(x)^4)
>>
>> x <- rnorm(20)
>>
>> x.boot <- boot(x,
>>statistic = function(d, ind){kurtosis(d[ind])},
>>R = 999,
>>sim = "ordinary")
>>
>> boot.ci(x.boot, conf = 0.95, type = c("norm","basic","perc","bca"))
>>
>> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
>> Based on 999 bootstrap replicates
>>
>> CALL :
>> boot.ci(boot.out = x.boot, conf = 0.95, type = c("norm", "basic",
>> "perc", "bca"))
>>
>> Intervals :
>> Level  Normal  Basic
>> 95%   ( 1.060,  2.430 )   ( 0.899,  2.233 )
>>
>> Level PercentileBCa
>> 95%   ( 1.394,  2.728 )   ( 1.373,  2.690 )
>> Calculations and Intervals on Original Scale
>>
>>   Also, note that the e1071 package contains a kurtosis function.
>>
>>> Regards 
>>>
>>> _Fede_
>>>
>>> Prof Brian Ripley wrote:
>>>> You need to call boot() to create an object to pass to boot.ci().
>>>>
>>>> There are lots of examples in the help pages and in the book that
>>>> package 
>>>> 'boot' supports. From the help:
>>>>
>>>> Usage:
>>>>
>>>>   boot.ci(boot.out, conf = 0.95, type = "all",
>>>>   index = 1:min(2,length(boot.out$t0)), var.t0 = NULL,
>>>>   var.t = NULL, t0 = NULL, t = NULL, L = NULL, h =
>>>> function(t)
>>>> t,
>>>>   hdot = function(t) rep(1,length(t)), hinv = function(t) t,
>>>> ...)
>>>>
>>>> Arguments:
>>>>
>>>> boot.out: An object of class '"boot"' containing the output of a
>>>>bootstrap calculation.
>>>>
>>>> and try class(z) .
>>>>
>>>>
>>>> On Sun, 30 Dec 2007, _Fede_ wrote:
>>>>
>>>>> Hi all.
>>>>>
>>>>> This is my first post in this forum. Finally I find a forum in the web
>>>>> about
>>>>> R, although is not in my language.
>>>>>
>>>>> Now I'm working with Bootstrap CI. I'd like to know how I can calculate
>>>>> a
>>>>>

Re: [R] Bootstrap Confidence Intervals

2008-01-01 Thread Chuck Cleland
_Fede_ wrote:
> Hi again.
> 
> Watching this example that appears in the help page
> 
> ratio <- function(d, w) sum(d$x * w)/sum(d$u * w)
> city.boot <- boot(city, ratio, R = 999, stype = "w",sim = "ordinary")
> boot.ci(city.boot, conf = c(0.90,0.95),type =
> c("norm","basic","perc","bca"))
> 
> I have tried to do the following (calling boot() to create an object to pass
> to boot.ci):
> 
> x <- rnorm(20)
> kurtosis <- function(x) (mean((x-mean(x))^4))/(sd(x)^4)
> x.boot <- boot(x, kurtosis, R = 999, sim = "ordinary")
> boot.ci(x.boot, conf = 0.95,type = c("norm","basic","perc","bca"))
> 
> But I don't know why this don't work. The editor window shows the following
> error message:
> 
> Error in statistic(data, original, ...) : unused argument(s) (1:20)
> 
> I suppose that something is wrong with my data but I don't know what is.
> 
> Thanks in advance and wishing everybody a happy new year.

  Check the statistic argument to boot() very carefully.  When sim =
"ordinary" the statistic function must have at least two arguments.  Try
something like this:

library(boot)

kurtosis <- function(x) (mean((x-mean(x))^4))/(sd(x)^4)

x <- rnorm(20)

x.boot <- boot(x,
   statistic = function(d, ind){kurtosis(d[ind])},
   R = 999,
   sim = "ordinary")

boot.ci(x.boot, conf = 0.95, type = c("norm","basic","perc","bca"))

BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 999 bootstrap replicates

CALL :
boot.ci(boot.out = x.boot, conf = 0.95, type = c("norm", "basic",
"perc", "bca"))

Intervals :
Level  Normal  Basic
95%   ( 1.060,  2.430 )   ( 0.899,  2.233 )

Level PercentileBCa
95%   ( 1.394,  2.728 )   ( 1.373,  2.690 )
Calculations and Intervals on Original Scale

  Also, note that the e1071 package contains a kurtosis function.

> Regards 
> 
> _Fede_
> 
> Prof Brian Ripley wrote:
>> You need to call boot() to create an object to pass to boot.ci().
>>
>> There are lots of examples in the help pages and in the book that package 
>> 'boot' supports. From the help:
>>
>> Usage:
>>
>>   boot.ci(boot.out, conf = 0.95, type = "all",
>>   index = 1:min(2,length(boot.out$t0)), var.t0 = NULL,
>>   var.t = NULL, t0 = NULL, t = NULL, L = NULL, h = function(t)
>> t,
>>   hdot = function(t) rep(1,length(t)), hinv = function(t) t,
>> ...)
>>
>> Arguments:
>>
>> boot.out: An object of class '"boot"' containing the output of a
>>bootstrap calculation.
>>
>> and try class(z) .
>>
>>
>> On Sun, 30 Dec 2007, _Fede_ wrote:
>>
>>>
>>> Hi all.
>>>
>>> This is my first post in this forum. Finally I find a forum in the web
>>> about
>>> R, although is not in my language.
>>>
>>> Now I'm working with Bootstrap CI. I'd like to know how I can calculate a
>>> Bootstrap CI for any statistic, in particular, for Kurtosis Coeficient. I
>>> have done the following code lines:
>>>
>>>> library(boot)
>>>> x=rnorm(20)
>>>> kurtosis=function(x) (mean((x-mean(x))^4))/(sd(x)^4)
>>>> z <- numeric(1)
>>>> for(i in 1:1)
>>>> z[i]=kurtosis(sample(x, replace=TRUE))
>>>> boot.ci(z, conf = 0.95,type = c("norm","basic","perc","bca"))
>>> But the output shows the next error:
>>>
>>> Error en if (ncol(boot.out$t) < max(index)) { :
>>>argumento tiene longitud cero
>>>
>>> I don't know what is wrong.
>>>
>>> I hope that somebody can help me. Sorry for my english.
>>>
>>> All have a nice new year.
>>>
>>> _Fede_
>>>
>> -- 
>> Brian D. Ripley,  [EMAIL PROTECTED]
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UKFax:  +44 1865 272595
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] median of binned values

2007-12-19 Thread Chuck Cleland
Martin Tomko wrote:
> Thank you, Chuck,
> would you mind commenting a bit on the code, it is not all clear... HOw 
> would you go to retrieve only the numeric value (not the category name)?
> I am just starting with R, and the functionality of replicate and levels 
> is not quite clear. I tried the documentation, but am not any wiser. 
> What if I had a vector v <- vector(c(1,10,100,1000,1)) and wanted to 
> perform it on that?
> 
> Thanks a lot
> Martin

  Retrieve the numeric value rather than the category name as follows:

with(df, freq[median(rep(as.numeric(binname), freq))])
[1] 1

  To do essentially the same thing with a vector:

myvec <- c(1,10,100,1000,1)

myvec[median(rep(1:length(myvec), myvec))]
[1] 1

  I'm sure I cannot explain levels() and rep() any better than the help
pages for those functions.

> Chuck Cleland wrote:
>> Martin Tomko wrote:
>>> Dear list,
>>> I have a vector (array, table row, whatever is best) of frequency values 
>>> for categories (or bins), and I need to find the median category. 
>>> Trivial to do by hand, but I was wondering if there is a means to  do it 
>>> in R in an elegant way.
>>>
>>> The obvious medioan(vector) returns the median frequency for the binns, 
>>> and that is not what I want. i.e,:
>>>  freq
>>> cat11
>>> cat2   10  
>>> cat3   100  
>>> cat4   1000
>>> cat5   1
>>>
>>> I want it to return cat5, instead of cat3.
>> df <- data.frame(binname = as.factor(paste("cat", 1:5, sep="")),
>>  freq = c(1,10,100,1000,1))
>>
>> df
>>   binname  freq
>> 1cat1 1
>> 2cat210
>> 3cat3   100
>> 4cat4  1000
>> 5cat5 1
>>
>> with(df, levels(binname)[median(rep(as.numeric(binname), freq))])
>> [1] "cat5"
>>
>>> Thanks a lot
>>> Martin
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.  

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculate remainer

2007-12-19 Thread Chuck Cleland
livia wrote:
> Hello everyone,
> 
> I have got a question about a simple calculation. If I would like to
> calculate 50/12 and return the result as 4 and the remainer 2. Is there a
> function of doing this?
> 
> Many thanks.

?"%%" to see how to get the remainder.  You might put the "result" and
remainder together in a function like this:

myfunc <- function(x,y){list(result = floor(x / y), remainder = x %% y)}

myfunc(x=50, y=12)
$result
[1] 4

$remainder
[1] 2

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] median of binned values

2007-12-19 Thread Chuck Cleland
Martin Tomko wrote:
> Dear list,
> I have a vector (array, table row, whatever is best) of frequency values 
> for categories (or bins), and I need to find the median category. 
> Trivial to do by hand, but I was wondering if there is a means to  do it 
> in R in an elegant way.
> 
> The obvious medioan(vector) returns the median frequency for the binns, 
> and that is not what I want. i.e,:
>  freq
> cat11
> cat2   10  
> cat3   100  
> cat4   1000
> cat5   1
> 
> I want it to return cat5, instead of cat3.

df <- data.frame(binname = as.factor(paste("cat", 1:5, sep="")),
 freq = c(1,10,100,1000,1))

df
  binname  freq
1cat1 1
2cat210
3cat3   100
4cat4  1000
5cat5 1

with(df, levels(binname)[median(rep(as.numeric(binname), freq))])
[1] "cat5"

> Thanks a lot
> Martin
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cor.test formula

2007-12-17 Thread Chuck Cleland
Monica Pisica wrote:
> Hi everybody,
>  
> I am interested in seeing how the p value is calculated for a t test for a 
> correlation coefficient. I know that cor.test delivers the correlation 
> coefficient and the t-test, p-value and the 95 confidence interval. I am 
> interested  in how the p-value is calculated.
>  
> Usually if i type the name of the function i get explicitly the coding of 
> that function, but if i type
>  
>> cor.testfunction (x, ...) UseMethod("cor.test")
>  
> So  How can i get the coding to find out how the p-value is calculated 
> for this function?

  The following Google search finds coding for cor.test:

cor.test site:https://svn.r-project.org/R/trunk/src/

  From those 6 hits, it is a short trip to the following link:

https://svn.r-project.org/R/trunk/src/library/stats/R/cor.test.R

> Thanks,
>  
> Monica
> _
> [[replacing trailing spam]]
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ragged array with append

2007-11-24 Thread Chuck Cleland
Alexy Khrabrov wrote:
> On Nov 24, 2007, at 2:07 PM, Chuck Cleland wrote:
> 
>> with(d, split(word, kind))
>>
>> # OR
>>
>> with(d, split(as.character(word), kind))
> 
> Awesome!  Scoping: how do I get the result back to the top level?
> 
>  > with(d, wk <- split(word,kind))
>  > wk
> Error: object "wk" not found
> 
> -- trying to create it at top level first:
>  > wk <- c()
>  > with(d, wk <- split(word,kind))
>  > wk
> NULL

  Like this:

wk <- with(d, split(word, kind))

> wk
$`1`
[1] a b c
Levels: a b c d e f g h i j

$`2`
[1] d
Levels: a b c d e f g h i j

$`3`
[1] e
Levels: a b c d e f g h i j

$`4`
[1] f
Levels: a b c d e f g h i j

$`5`
[1] g h
Levels: a b c d e f g h i j

$`7`
[1] i j
Levels: a b c d e f g h i j

> Cheers,
> Alexy
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ragged array with append

2007-11-24 Thread Chuck Cleland
Alexy Khrabrov wrote:
> I wonder what's the right way in R to do the following -- placing  
> objects of the same kind together in subarrays of varying length.   
> Here's what I mean:
> 
>  > word <- c("a","b","c","d","e","f","g","h","i","j")
>  > kind <- c(1,1,1,2,3,4,5,5,7,7)
>  > d <- data.frame(word,kind)
>  > d
> word kind
> 1 a1
> 2 b1
> 3 c1
> 4 d2
> 5 e3
> 6 f4
> 7 g5
> 8 h5
> 9 i7
> 10j7
> 
> Now from this data frame, I want to assemble words of the same kind  
> into lists.  The result should look like (not R syntax):
> 
> 1 => [a,b,c]
> 2 => [d]
> 3 => [e]
> 4 => [f]
> 5 => [g,h]
> 7 => [i,j]
> 
> What is the most appropriate data structure in R for this result and  
> growing these sublists most effectively with append?

  It would make sense to use a list as the data structure, and here is
how you might do it:

with(d, split(word, kind))

# OR

with(d, split(as.character(word), kind))

?split

> Cheers,
> Alexy
> 
> __________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] looking for function like rollmean()

2007-11-24 Thread Chuck Cleland
Jonas Stein wrote:
> Hi,
> 
> i have some data, that has 1-5 % noise. 
> I want to smooth this data without loosing rows.
> 
> rollmean() would be great, but it returns a vector of different size as the
> initial vector.

  You might consider one of the smoothers in smooth() from the stats
package.

?smooth

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [:]

2007-11-24 Thread Chuck Cleland
Alexy Khrabrov wrote:
> What are idioms for taking a head or a tail of a vector, either up to  
> an index, or from an index to the end?  Also -- is it necessary to  
> use length(v) to refer to the last element? E.g., Python has
> 
> v[:3] # indices 0,1,2
> v[3:] # indices 3,4,...
> v[-1] # the last element of v
> v[:-1] # all but last

?head

  For example:

> x <- runif(10)

> x
 [1] 0.60253459 0.51895186 0.01080359 0.68675829
 [5] 0.58588120 0.41019752 0.25345482 0.84928989
 [9] 0.78826981 0.01696331

> head(x, 3)
[1] 0.60253459 0.51895186 0.01080359

> tail(x, -3)
[1] 0.68675829 0.58588120 0.41019752 0.25345482
[5] 0.84928989 0.78826981 0.01696331

> tail(x, 1)
[1] 0.01696331

> head(x, -1)
[1] 0.60253459 0.51895186 0.01080359 0.68675829
[5] 0.58588120 0.41019752 0.25345482 0.84928989
[9] 0.78826981

> Cheers,
> Alexy
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple comparison (glht) problem

2007-11-21 Thread Chuck Cleland
Anqi Qiu wrote:
> I am not sure whether there is a bug. When I tested the example given for 
> "glht"
> in the help, I entered the following error:
> 
> Running commands:
> 
> amod <- aov(minutes ~ blanket, data = recovery)
> rht <- glht(amod, linfct = mcp(blanket = "Dunnett"),
>   alternative = "less")
> 
> Errors are:
> 
> Error in try(coef.(model)) : could not find function "coef."
> 
> Error in modelparm.default(model, ...) : no 'coef' method for 'model'
> found! 
> 
> Thanks!

  It works for me.

> library(multcomp)
Loading required package: mvtnorm

> amod <- aov(minutes ~ blanket, data = recovery)
> rht <- glht(amod, linfct = mcp(blanket = "Dunnett"),
+   alternative = "less")

> summary(rht)

 Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Dunnett Contrasts

Fit: aov(formula = minutes ~ blanket, data = recovery)

Linear Hypotheses:
 Estimate Std. Error t value p value
b1 - b0 >= 0  -2.1333 1.6038  -1.330  0.2412
b2 - b0 >= 0  -7.4667 1.6038  -4.656  <0.001 ***
b3 - b0 >= 0  -1.6667 0.8848  -1.884  0.0924 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported)

> sessionInfo()
R version 2.6.1 RC (2007-11-20 r43507)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] multcomp_0.992-6 mvtnorm_0.8-1

loaded via a namespace (and not attached):
[1] tools_2.6.1

> Regards,
> 
> Anqi Qiu
> 
> Anqi Qiu
> Center for Imaging Science
> Johns Hopkins University
> Email: [EMAIL PROTECTED]
> tel: 410-516-8103
> 
> Anqi Qiu
> Center for Imaging Science
> Johns Hopkins University
> Email: [EMAIL PROTECTED]
> tel: 410-516-8103
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] factors levels ?

2007-11-13 Thread Chuck Cleland
W Eryk Wolski wrote:
> Hi,
> 
> It's just some example code.. The application is uninteresting. I am
> searching for some functionality.
> 
> X <- rnorm(100) //my data
> 
> Y <- seq(-3,3,by=0.1) // bin boundaries.
> 
> Now I would like to generate a - list of factors,  length as X...
> i.e.: all values in the range [-3,-2.9) have the same factor... [-3,-2.9) etc.
> 
> I would assume R has such a function but I cant recall which one it is.

?cut

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] producing output as *.spo (spss output format)

2007-11-10 Thread Chuck Cleland
Ivan Uemlianin wrote:
> Dear All
> 
> I am considering moving from SPSS to R as my stats environment of 
> choice.  I have read around and everything looks favourable.  There is 
> just one issue on which I have been unable to find information.
> 
> Many clients ask me to send them output (tables, graphs, etc) as an spss 
> output file (ie .spo).  I haven't asked them why, I've just said yes.  I 
> know R can produce graphics as nice as SPSS, and presumably they can be 
> output in some portable format for pasting into a word-processor 
> document.  I need to find out why the client wants spo.  In the meantime 
> let's assume they have a good reason.
> 
> Can R write .spo files?  Failing that does any one know of any spo 
> writers that I could wire up to R (eg with some python gluecode)? 
> Failing that any suggestions for overcoming the output hurdle would be 
> welcome, as R looks very attractive (platform independent, easy to use 
> and to automate, fast).

  RSiteSearch("SPSS", restrict="function") shows nothing relevant to
*.spo files.  I don't think you will ever see an R *.spo writer.  You
might look into one or more of the following to produce accessible and
attractive output for clients:

Sweave
OdfWeave
R2HTML

> Best wishes
> 
> Ivan
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] running sum of a vector

2007-11-07 Thread Chuck Cleland
Alexy Khrabrov wrote:
> I need a vector with sums of vectors up to each position in the  
> original.  The imperative version is simple:
> 
> # running sum: the traditional imperative way
> sumr.1 <- function(x) {
>s <- c()
>ss <- 0
>for (i in 1:length(x)) {
>   ss <- ss + x[i]
>   s[i] <- ss
>}
>s
> }
> 
> Yet I want a functional way, which is shorter:
> 
> # running sum: functional way, but inefficient one!
> sumr.2 <- function(x) {
>   sapply(1:length(x), function(i) sum(x[1:i]))
> }
> 
> -- the problem with the latter is, we need to create indices to run  
> over them, and the sum is recomputed anew for each position, while  
> the imperative version iterates without recomputing.  Is there a  
> better functional solution?

?cumsum

X <- runif(20)

sumr.1(X)
 [1] 0.6359909 0.9435293 1.2167988 1.6229179
 [5] 2.2816672 3.2687057 4.1973724 4.4421475
 [9] 4.5601287 4.7500524 5.0639924 5.5831643
[13] 6.5071247 6.9861566 7.0352500 7.6723079
[17] 7.8560394 7.9281423 8.4757938 8.9985340

sumr.2(X)
 [1] 0.6359909 0.9435293 1.2167988 1.6229179
 [5] 2.2816672 3.2687057 4.1973724 4.4421475
 [9] 4.5601287 4.7500524 5.0639924 5.5831643
[13] 6.5071247 6.9861566 7.0352500 7.6723079
[17] 7.8560394 7.9281423 8.4757938 8.9985340

cumsum(X)
 [1] 0.6359909 0.9435293 1.2167988 1.6229179
 [5] 2.2816672 3.2687057 4.1973724 4.4421475
 [9] 4.5601287 4.7500524 5.0639924 5.5831643
[13] 6.5071247 6.9861566 7.0352500 7.6723079
[17] 7.8560394 7.9281423 8.4757938 8.9985340

all(cumsum(X) == sumr.1(X))
[1] TRUE

> Cheers,
> Alexy
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unique Pairs based on a Minimum Value

2007-11-05 Thread Chuck Cleland
Bret Collier wrote:
> RUsers,
> 
> I am trying (with little success) to determine how to combine unique and 
> min to do some data frame manipulations and I could use a little advice. 
>   I have tried various combinations of 'unique', 'min' and 'apply' with 
> no luck.
> 
> Given this simple example data (x, y)
> 
> x<-c(10, 10, 12, 13, 15, 16 ,17, 17, 17)
> y<-c(112, 117, 111, 117, 114, 113, 119, 121, 130)
> 
> as.data.frame(cbind(x, y))
> 
> x   y
> 1 10 112
> 2 10 117
> 3 12 111
> 4 13 117
> 5 15 114
> 6 16 113
> 7 17 119
> 8 17 121
> 9 17 130
> 
> I have been attempting to get all those unique 'x-y' combinations for 
> which the y column is the minimum of the values y takes for each unique 
> x (ID variable), such as below.
> 
> x   y
> 1 10 112
> 3 12 111
> 4 13 117
> 5 15 114
> 6 16 113
> 7 17 119
> 
> Any advice/directions I could look would be appreciated.

x<-c(10, 10, 12, 13, 15, 16 ,17, 17, 17)
y<-c(112, 117, 111, 117, 114, 113, 119, 121, 130)
df <- data.frame(x,y)

aggregate(df, list(df$x), min)[,c("x","y")]
   x   y
1 10 112
2 12 111
3 13 117
4 15 114
5 16 113
6 17 119

> TIA,
> Bret
> R 2.6.0; platform i386-pc-mingw32
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simple averaging question?

2007-11-01 Thread Chuck Cleland
Jeff Miller wrote:
> Hi all,

> Suppose I have a column vector of 600 measurements taken in 1s intervals.
> 
> What I want is a new vector with the averages for each min (so there would
> be 10 entries).
> 
> Is there an efficient way to do this? I’ve been doing it with a ‘for’ loop
> but something tells me there is a simple command that is more efficient.

  How about something like this?

y <- runif(600)

df <- data.frame(minute = rep(1:10, each=60), y = y)

with(df, tapply(y, minute, mean))
1 2 3 4 5
0.4664301 0.4622071 0.5159511 0.4744836 0.5282750
6 7 8 910
0.4670941 0.5091410 0.4648349 0.5227221 0.5251926

> Jeff
> 
>  
> 
> 
> Internal Virus Database is out-of-date.
> Checked by AVG Free Edition. 
> 
> 3:09 PM
> 
>   [[alternative HTML version deleted]]
> 
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset data.frame with constraint of many values?

2007-10-24 Thread Chuck Cleland
Dong-hyun Oh wrote:
> Dear UseRs,
> 
> Let us assume that I have data.frame named as dt.
> 
> dt is as follows:
> 
> abcd
> 1345
> 2332
> 1342
> 3245
> 4536
> 3257
> 2578
> .
> .
> 
> I want to subset dt with fileds a having 2 or 3 or 4, and I wrote  
> following code.
> 
> dt[dt$a == 2 | dt$a == 3 | dt$a == 4,]
> 
> Is there more efficient way for subset?

subset(dt, a %in% 2:4)

> Thanks in advance.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sum variable as long condition is true

2007-10-23 Thread Chuck Cleland
marcg wrote:
> Hello R
> 
> For expierienced user, the following problem will be easy to solve:
> 
> a<-c(0,1,0,1,0,2,3,4,3,2)
> b<-c(3,3,3,4,4,4,7,7,7,10)
> c<-data.frame(a,b)
> 
> Data Frame c contains tow colums. I would like to sum up all values in a as 
> long as b stays the same:
> 
> sum(a[which(b==1)])
> 
> does this, but i have to manually put in b
> 
> then i tryied st like this, but i canno't save it properly
> 
> for (i in 0:max(b)){
>   i<-sum(a[which(b==i)])
> }
> 
> i tried to figure out, how tapply works, but neither

> with(c, tapply(a, list(b), sum))
 3  4  7 10
 1  3 10  2

> thanks alot
> 
> marc
> --
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] save lm output into vectors

2007-10-10 Thread Chuck Cleland
Henrique Dallazuanna wrote:
> Hi,
> 
> mods <- lapply(lapply(df[,which(sapply(df, is.factor))],
> function(reg)lm(df$value~reg)), summary)
> res <- lapply(mods, "[", c(4,10))
> 
> And for each model adjusted:
> 1-pf(res[[2]][[2]][1], res[[2]][[2]][2], res[[2]][[2]][3])
> 1-pf(res[[1]][[2]][1], res[[1]][[2]][2], res[[1]][[2]][3]) 

Jiong:
  Here is another approach:

mymodels <- lapply(split(iris, list(iris$Species)),
  function(x){lm(Sepal.Length ~ Sepal.Width +
Petal.Length +
Petal.Width, data = x)})

lapply(mymodels, function(x){coefficients(summary(x))})
$setosa
  Estimate Std. Error   t value Pr(>|t|)
(Intercept)  2.3518898 0.39286751 5.9864707 3.034183e-07
Sepal.Width  0.6548350 0.09244742 7.0833236 6.834434e-09
Petal.Length 0.2375602 0.20801921 1.1420107 2.593594e-01
Petal.Width  0.2521257 0.34686362 0.7268727 4.709870e-01

$versicolor
   Estimate Std. Error   t value Pr(>|t|)
(Intercept)   1.8955395  0.5070552  3.738329 5.112246e-04
Sepal.Width   0.3868576  0.2045449  1.891309 6.488965e-02
Petal.Length  0.9083370  0.1654325  5.490681 1.95e-06
Petal.Width  -0.6792238  0.4353821 -1.560064 1.255990e-01

$virginica
   Estimate Std. Errort value Pr(>|t|)
(Intercept)   0.6998830 0.53360089  1.3116227 1.961563e-01
Sepal.Width   0.3303370 0.17432873  1.8949086 6.439972e-02
Petal.Length  0.9455356 0.09072204 10.4223360 1.074269e-13
Petal.Width  -0.1697527 0.19807243 -0.8570233 3.958750e-01

lapply(mymodels, function(x){coefficients(summary(x))[,4]})
$setosa
 (Intercept)  Sepal.Width Petal.Length  Petal.Width
3.034183e-07 6.834434e-09 2.593594e-01 4.709870e-01

$versicolor
 (Intercept)  Sepal.Width Petal.Length  Petal.Width
5.112246e-04 6.488965e-02 1.95e-06 1.255990e-01

$virginica
 (Intercept)  Sepal.Width Petal.Length  Petal.Width
1.961563e-01 6.439972e-02 1.074269e-13 3.958750e-01

  Once you have a list of fitted models and can see how to extract parts
of the summaries, you can reorganize what you extract to meet your needs.

hope this helps,

Chuck

> On 09/10/2007, Jiong Zhang, PhD <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> May I ask how I can save the coefficients and the p values into a table?
>>
>> thanks.
>>
>> jiong
>> The email message (and any attachments) is for the sol...{{dropped:20}}
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regression by groups

2007-10-09 Thread Chuck Cleland
Jiong Zhang, PhD wrote:
> Hi All,
> 
> I want to run regression (lm) on my dependant variable by gender and race. 
> How do I integrate the "by" function in lm?
> 
> thanks.

  Here is an example using the iris data:

by(iris, iris$Species, function(x){
summary(lm(Sepal.Length ~ Sepal.Width +
  Petal.Length +
  Petal.Width, data=x))})

> jiong
>  The email message (and any attachments) is for the sole...{{dropped:11}}
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SEM - singularity error

2007-09-20 Thread Chuck Cleland
[EMAIL PROTECTED] wrote:
> Good morning,
> 
> I am trying to develop a structural equation model of snake abundance using
> habitat variables. In attempting to estimate the model using the "sem" package
> in R version 2.4.0, I receive the following error message:
> 
> "Error in solve.default(C) : system is computationally singular: reciprocal
> condition number = 1.75349e-16"
> 
> MAIN PROBLEM: I am hoping to discover why I am receiving the aforementioned
> error message and how to successfully estimate the model.
> 
> OTHER INFORMATION:
> 1. I believe the model is over-identified rather than under-identified (based 
> on
> my understanding of the t-rule). I have observed data for 10 variables (9
> exogenous, 1 endogenous).
> 
> 2. I am not certain that I have used the proper tool to estimate the 
> covariance
> matrix. In this case, I used the "VAR" function.
> 
> 3. I am most concerned that I have improperly coded the RAM file. For example,
> in a case where I have three exogenous indicators of one exogenous latent
> variable, I specify a start value of 1 for one of the exogenous indicators. I
> am not sure if this is proper or necessary.
> 
> 4. I am new to SEM; this is the first model I have ever tried to estimate.
> 
> R CODE: Below is the r-code I have used to estimate the structural equation
> model --
> 
> # LOADING R PACKAGES
> library(sem)
> 
> # READING IN THE CSV FILES
> thsi.2006<-read.csv("thsi_ab_env_space_sem.csv")
> thsi<-thsi.2006
> 
> # MAKING "RAM" FILE 2
> model2.nlc <-specify.model()
> Moist->slope, NA, 1
> Moist->sand, lamda21, NA
> Moist->clay, lamda31, NA
> Hab->isol, NA, 1
> Hab->edgedist_a, lamda52, NA
> Hab->ag10, lamda62, NA
> Hab->urb10, lamda72, NA
> Hab->rd10, lamda82, NA
> Hab->y, lamda92, NA
> Moist->this, gamma11, NA
> Hab->this, gamma12, NA
> slope<->slope, theta11, NA
> sand<->sand, theta22, NA
> clay<->clay, theta33, NA
> isol<->isol, theta44, NA
> edgedist_a<->edgedist_a, theta55, NA
> ag10<->ag10, theta66, NA
> urb10<->urb10, theta77, NA
> rd10<->rd10, theta88, NA
> y<->y, the99, NA
> Moist<->Moist, phi11, NA
> Hab<->Hab, phi22, NA
> this<->this, theps11, NA
> 
> model2.nlc
> end
> 
> # MAKING S (COVARIANCE MATRIX)
> thsi.var <- var(thsi)
> 
> # MAKING UNSCALED SEM MODEL
> sem2<-sem(ram=model2.nlc, S=thsi.var, N=22)
> 
> I am also attaching a jpeg diagram of the model I am trying to estimate. 
> Please
> let me know if there is any additional information that I should add to this
> posting.
> 
> Thank you so much for your time.
> Nicolette Cagle

  Your specification of the model seems OK and it is over-identified (21
free parameters and 34 df).  I suspect the problem is that one or more
of your 10 variables is a linear function of the remaining variables.
If that is the case, then the following should give the same singularity
error:

factanal(thsi, factors=1)

  You may be able to drop one or more of the 10 variables from
consideration and successfully estimate a conceptually similar model.

hope this helps,

Chuck Cleland

> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Row-by-row regression on matrix

2007-09-19 Thread Chuck Cleland
[EMAIL PROTECTED] wrote:
> Folks,
> 
> I have a 3000 x 4 matrix (y), which I need to regress row-by-row against a 
> 4-vector (x) to create a
> matrix lm.y of intercepts and slopes. To illustrate:
> 
> y <- matrix(rnorm(12000), ncol = 4)
> x <- c(1/12, 3/12, 6/12, 1)
> 
> system.time(lm.y <- t(apply(y, 1, function(z) lm(z ~ x)$coefficient)))
> [1] 44.72 18.00 69.52NANA
> 
> Takes more than a minute to do (and I need to do many similar regressions 
> a day). 
> 
> Is there a more efficient way of handling this?
> 
> I'm running R 2.4.1 on Windows XP Service Pack 2 on a Intel Xeon dual-core 
> 2.66GHz with 3GB RAM.
> 
> Thanks very much,
> 
> Murali

 y <- matrix(rnorm(12000), ncol = 4)
 x <- c(1/12, 3/12, 6/12, 1)

 system.time(lm.y1 <- t(coef(lm(t(y) ~ x
   user  system elapsed
   0.030.000.04

 system.time(lm.y2 <- t(apply(y, 1, function(z) lm(z ~ x)$coefficient)))
   user  system elapsed
  19.700.05   20.45

 all.equal(lm.y1, lm.y2)
[1] TRUE

> 
> ---
> This message (including any attachments) is confidential and...{{dropped}}
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] By() with method = spearman

2007-09-19 Thread Chuck Cleland
Doran, Harold wrote:
> I have a data set where I want the correlations between 2 variables
> conditional on a students grade level.
> 
> This code works just fine.
> 
> by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor,
> use='complete', method='pearson')
> 
> However, this generates an error
> 
> by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor,
> use='complete', method='spearman')
> Error in FUN(data[x, ], ...) : 'x' is empty
> 
> I can subset the data by grade and compute spearman rho as
> 
> tmp5 <- subset(tmp, Grade == 5)
> cor(tmp5[,c('mtsc07', 'DCBASmathcountSPRING')], use='complete',
> method='spearman')
> 
> But doing this iteratively is inefficient.
> 
> I don't see anything in the help man for by() or cor() that tells me
> what the problem is. I might be missing it though. Any thoughts?

  It works as expected using the iris data:

by(iris[,c('Sepal.Length', 'Sepal.Width')], iris$Species, cor,
   use='complete', method='spearman')

iris$Species: setosa
 Sepal.Length Sepal.Width
Sepal.Length1.000   0.7553375
Sepal.Width 0.7553375   1.000
-

iris$Species: versicolor
 Sepal.Length Sepal.Width
Sepal.Length 1.000.517606
Sepal.Width  0.5176061.00
-

iris$Species: virginica
 Sepal.Length Sepal.Width
Sepal.Length1.000   0.4265165
Sepal.Width 0.4265165   1.000

> sessionInfo()
R version 2.5.1 Patched (2007-09-16 r42884)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] "stats" "graphics"  "grDevices" "utils" "datasets"
"methods"   "base"

other attached packages:
 lattice
"0.16-5"

> Thanks,
> Harold
> 
> 
>> sessionInfo()
> R version 2.5.0 (2007-04-23) 
> i386-pc-mingw32 
> 
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> States.1252;LC_MONETARY=English_United
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] "stats" "graphics"  "grDevices" "utils" "datasets"
> "methods"   "base" 
> 
> other attached packages:
>  lattice 
> "0.15-4" 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] locate word in vector

2007-09-14 Thread Chuck Cleland
kevinchang wrote:
> Hey All,
> 
> 
> I am wondering if there is a built-in function allowing us to locate a
> particular word in a character vector.
> 
> ex: vector a
> 
> a
> [1] "superman"  "xamn"  "spiderman" "superman"  "superman"  "xman" 
> [7] "spiderman"
> 
> Is there any built-in function that can show "superman" are the first,
> fourth and fifith element in "a"? Please help me out. Thanks. 

a <- c("superman", "xamn", "spiderman", "superman",
   "superman", "xman", "spiderman")

grep("^superman$", a)
[1] 1 4 5

?grep

OR

which(a %in% "superman")
[1] 1 4 5

?which
?is.element

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshape help

2007-09-12 Thread Chuck Cleland
juli pausas wrote:
> Hi,
> I'm trying to use reshape but I cannot quite understand how it works.
> Could somebody help me on this? Example, my data is something like:
> 
> mydat <- data.frame(tree= 1:10, serra=rep(1:2, c(5,5)), bt01= 101:110,
> bt02= 201:210, bt03= 301:310,  mm01= 9101:9110, mm02= 9201:9210, mm03=
> 9301:9310)
> 
>> mydat
>tree serra bt01 bt02 bt03 mm01 mm02 mm03
> 1 1 1  101  201  301  9101  9201  9301
> 2 2 1  102  202  302  9102  9202  9302
> 3 3 1  103  203  303  9103  9203  9303
> 4 4 1  104  204  304  9104  9204  9304
> 5 5 1  105  205  305  9105  9205  9305
> 6 6 2  106  206  306  9106  9206  9306
> 7 7 2  107  207  307  9107  9207  9307
> 8 8 2  108  208  308  9108  9208  9308
> 9 9 2  109  209  309  9109  9209  9309
> 10   10 2  110  210  310  9110  9210  9310
> 
> that is, in the "wide form" with 2 constant variables (tree, serra)
> and 6 variables that correspond to two variables (bt mm) measured in 3
> years 01, 02, 03
> 
> I would like to reshaped the data to the long format as follows:
> 
>   tree serra   YEAR   bt   mm
>  1 12001   101   9101
>  2 12001   102   9102
>  3 12001   103   9103
>  4 12001   104   9104
>  5 12001   105   9105
>  6 22001   106   9106
>  7 22001   107   9107
>  8 22001   108   9108
>  9 22001   109   9109
> 10 22001   110   9110
>  1 12002   201   9201
>  2 12002   202   9202
>  3 12002   203   9203
>  4 12002   204   9204
>  5 12002   205   9205
>  6 22002   206   9206
>  7 22002   207   9207
>  8 22002   208   9208
>  9 22002   209   9209
> 10 22002   210   9210
>  1 12003   301   9301
>  2 12003   302   9302
>  3 12003   303   9303
>  4 12003   304   9304
>  5 12003   305   9305
>  6 22003   306   9306
>  7 22003   307   9307
>  8 22003   308   9308
>  9 22003   309   9309
> 10 22003   310   9310
> 
> I would appreciate if somebody let me know how could I do this with reshape

reshape(mydat, varying = list(c("bt01","bt02","bt03"),
  c("mm01","mm02","mm03")),
   v.names=c("bt","mm"),
   timevar = "YEAR",
   times = c(2001, 2002, 2003),
   idvar = "tree",
   direction = "long")

tree serra YEAR  bt   mm
1.2001 1 1 2001 101 9101
2.2001 2 1 2001 102 9102
3.2001 3 1 2001 103 9103
4.2001 4 1 2001 104 9104
5.2001 5 1 2001 105 9105
6.2001 6 2 2001 106 9106
7.2001 7 2 2001 107 9107
8.2001 8 2 2001 108 9108
9.2001 9 2 2001 109 9109
10.2001   10 2 2001 110 9110
1.2002 1 1 2002 201 9201
2.2002 2 1 2002 202 9202
3.2002 3 1 2002 203 9203
4.2002 4 1 2002 204 9204
5.2002 5 1 2002 205 9205
6.2002 6 2 2002 206 9206
7.2002 7 2 2002 207 9207
8.2002 8 2 2002 208 9208
9.2002 9 2 2002 209 9209
10.2002   10 2 2002 210 9210
1.2003 1 1 2003 301 9301
2.2003 2 1 2003 302 9302
3.2003 3 1 2003 303 9303
4.2003 4 1 2003 304 9304
5.2003 5 1 2003 305 9305
6.2003 6 2 2003 306 9306
7.2003 7 2 2003 307 9307
8.2003 8 2 2003 308 9308
9.2003 9 2 2003 309 9309
10.2003   10 2 2003 310 9310

> Thanks in advance
> 
> Juli 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


<    1   2   3