Re: [R] Extrapolating values from a glm fit

2011-01-26 Thread Ahnate Lim
Even when I try to predict y values from x, let's say I want to predict y at
x=0. Looking at the graph from the provided syntax, I would expect y to be
about 0.85. Is this right:

predict(mylogit,newdata=as.data.frame(0),type="response")

# I get:

Warning message:
'newdata' had 1 rows but variable(s) found have 34 rows

# Why do I need 34 rows? So I try:

v=vector()
v[1:34]=0
predict(mylogit,newdata=as.data.frame(v),type="response")

# And I get a matrix of 34 values that appear to fit a logistic function,
instead of 0.85..




On Wed, Jan 26, 2011 at 6:59 PM, David Winsemius wrote:

>
> On Jan 26, 2011, at 10:52 PM, Ahnate Lim wrote:
>
>  Dear R-help,
>>
>> I have fitted a glm logistic function to dichotomous forced choices
>> responses varying according to time interval between two stimulus. x
>> values
>> are time separation in miliseconds, and the y values are proportion
>> responses for one of the stimulus. Now I am trying to extrapolate x values
>> for the y value (proportion) at .25, .5, and .75. I have tried several
>> predict parameters, and they don't appear to be working. Is this correct
>> use
>> and understanding of the predict function? It would be nice to know the
>> parameters for the glm best fit, but all I really need are the
>> extrapolated
>> x values for those proportions. Thank you for your help. Here is the code:
>>
>> x <-
>> c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7, -167,
>> -150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1, -33.4, -16.7,
>> 16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3, 167,
>> 183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9)
>>
>> y <-
>> c(0, 0.333, 0, 0, 0, 0, 0, 0, 0, 0.333,
>> 0, 0.133, 0.238095238095238, 0.528,
>> 0.567,
>> 0.845238095238095, 0.55, 1, 0.889, 1, 1, 1, 1, 1,
>> 1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5)
>>
>> weight <-
>> c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16, 11, 7,
>> 14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2)
>>
>> mylogit <- glm(y~x,weights=weight, family = binomial)
>>
>> # now I try plotting the predicted value, and it looks like a good fit,
>> hopefully I can access what the glm is doing
>>
>> ypred <- predict(mylogit,newdata=as.data.frame(x),type="response")
>> plot(x, ypred,type="l")
>> points(x,y)
>>
>> # so I try to predict the x value when y (proportion) is at .5, but
>> something is wrong..
>>
>
> Right. Predict goes in the other direction ... x predicts y.
>
> Perhaps if you created a function of y w.r.t. x and then inverted it.
>
> ?approxfun  # and see the posting earlier this week "Inverse Prediction
> with splines" where it was demonstrated how to reverse the roles of x and y.
>
>>
>> predict(mylogit,newdata=as.data.frame(0.5))
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Factor rotation (e.g., oblimin, varimax) and PCA

2011-01-26 Thread Mark Difford

Finn,

>> But when I use 'principal' I do not seem to be able to get the same
>> results 
>> from prcomp and princomp  and a 'raw' use of eigen:
< ...snip... >
>> So what is wrong with the rotations and what is wrong with 'principal'?

I would say that nothing is wrong. Right at the top of the help file for
principal() [3rd line down] Professor Revelle notes that

"The eigen vectors are rescaled by the sqrt of the eigen values to produce
the component loadings more typical in factor analysis."

You talk about differences and results not "quite" corresponding. But what
actually are these differences? and what are their magnitudes? In cases like
this, where you are questioning well-tested functions you would be well
advised to give concrete examples, accompanied by code that users can run
without having to grub around your questions.

Regards, Mark.

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Factor-rotation-e-g-oblimin-varimax-and-PCA-tp3239099p3241553.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identifying when one element of a row has a positive number

2011-01-26 Thread Peter Langfelder
Here's a solution., maybe not the most elegant but works.

df.r = df1[, c(3:5)]; # restricted data

nNonZero = apply(df.r!=0, 1, sum);
one = nNonZero==1;
oneZero = nNonZero==2;

whichOne = apply(df.r[one, ]!=0, 1, which);
whichZero = apply(df.r[oneZero, ]==0, 1, which);

colNames = colnames(df.r);

one_presence = one_absence = rep(NA, nrow(df1))

one_presence[one] = colNames[whichOne];
one_absence[oneZero] = colNames[whichZero];

Peter

On Wed, Jan 26, 2011 at 9:36 PM, Daisy Englert Duursma
 wrote:
> Hello,
>
> I am not sure where to begin with this problem or what to search for
> in r-help. I just don't know what to call this.
>
> If I have 5 columns, the first 2 are the x,y, locations and the last
> three are variables about those locations.
>
> x<-seq(1860,1950,by=10)
> y<-seq(-290,-200,by=10)
> ANN<-c(3,0,0,0,1,0,1,1,0,0)
> CTA<-c(0,1,0,0,0,0,1,0,0,2)
> GLM<-c(0,0,2,0,0,0,0,1,0,0)
> df1<-as.data.frame(cbind(x,y,ANN,CTA,GLM))
>
> What I would like to produce is an additional column that tells when
> only 1 of the three variables has a value greater than 0. I would like
> this new column to give the name of the variable. Likewise, I would
> like a column that tells one only one of the three variables for a
> given row has a value of 0. For my example the new columns would be:
>
> one_presence<-c("ANN","CTA","GLM","NA","ANN","NA","NA","NA","NA","CTA")
> one_absence<-c("NA","NA","NA","NA","NA","NA","GLM","CTA","NA","NA")
>
> The end result should look like
>
> df2<-(cbind(df1,one_presence,one_absence))
>
> I am sure I can do this with a loop or maybe grep but I am out of ideas.
>
> Any help would be appreciated.
>
> Cheers,
> Daisy
>
> --
> Daisy Englert Duursma
>
> Room E8C156
> Dept. Biological Sciences
> Macquarie University  NSW  2109
> Australia
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] identifying when one element of a row has a positive number

2011-01-26 Thread Daisy Englert Duursma
Hello,

I am not sure where to begin with this problem or what to search for
in r-help. I just don't know what to call this.

If I have 5 columns, the first 2 are the x,y, locations and the last
three are variables about those locations.

x<-seq(1860,1950,by=10)
y<-seq(-290,-200,by=10)
ANN<-c(3,0,0,0,1,0,1,1,0,0)
CTA<-c(0,1,0,0,0,0,1,0,0,2)
GLM<-c(0,0,2,0,0,0,0,1,0,0)
df1<-as.data.frame(cbind(x,y,ANN,CTA,GLM))

What I would like to produce is an additional column that tells when
only 1 of the three variables has a value greater than 0. I would like
this new column to give the name of the variable. Likewise, I would
like a column that tells one only one of the three variables for a
given row has a value of 0. For my example the new columns would be:

one_presence<-c("ANN","CTA","GLM","NA","ANN","NA","NA","NA","NA","CTA")
one_absence<-c("NA","NA","NA","NA","NA","NA","GLM","CTA","NA","NA")

The end result should look like

df2<-(cbind(df1,one_presence,one_absence))

I am sure I can do this with a loop or maybe grep but I am out of ideas.

Any help would be appreciated.

Cheers,
Daisy

-- 
Daisy Englert Duursma

Room E8C156
Dept. Biological Sciences
Macquarie University  NSW  2109
Australia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extrapolating values from a glm fit

2011-01-26 Thread David Winsemius


On Jan 26, 2011, at 10:52 PM, Ahnate Lim wrote:


Dear R-help,

I have fitted a glm logistic function to dichotomous forced choices
responses varying according to time interval between two stimulus. x  
values

are time separation in miliseconds, and the y values are proportion
responses for one of the stimulus. Now I am trying to extrapolate x  
values

for the y value (proportion) at .25, .5, and .75. I have tried several
predict parameters, and they don't appear to be working. Is this  
correct use
and understanding of the predict function? It would be nice to know  
the
parameters for the glm best fit, but all I really need are the  
extrapolated
x values for those proportions. Thank you for your help. Here is the  
code:


x <-
c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7, -167,
-150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1, -33.4, -16.7,
16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3, 167,
183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9)

y <-
c(0, 0.333, 0, 0, 0, 0, 0, 0, 0, 0.333,
0, 0.133, 0.238095238095238, 0.528,
0.567,
0.845238095238095, 0.55, 1, 0.889, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5)

weight <-
c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16, 11, 7,
14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2)

mylogit <- glm(y~x,weights=weight, family = binomial)

# now I try plotting the predicted value, and it looks like a good  
fit,

hopefully I can access what the glm is doing

ypred <- predict(mylogit,newdata=as.data.frame(x),type="response")
plot(x, ypred,type="l")
points(x,y)

# so I try to predict the x value when y (proportion) is at .5, but
something is wrong..


Right. Predict goes in the other direction ... x predicts y.

Perhaps if you created a function of y w.r.t. x and then inverted it.

?approxfun  # and see the posting earlier this week "Inverse  
Prediction with splines" where it was demonstrated how to reverse the  
roles of x and y.


predict(mylogit,newdata=as.data.frame(0.5))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Need help with my homework

2011-01-26 Thread gaja

 Hello to all again :)

Again, I have an issue with my homework. I was troubling myself for two
days, and I still can't get it :(

So, if someone can help me, I would be so grateful... 


HOMEWORK 1.

I need to draw function,

random walk take the length of walk, and its in 2D. As a result, I have to
take matrix in which are i-lines of matrix, written coordinates on i-steps 

radnom.walk <- function (length, dim = 2){

}

HOMEWORK 2

I need to draw function, which draws self-avoiding walk of given length in
plane level


selfavoiding <- function (length)
{

  }



That would be all. I'm sure, there is an expert which can help me, hehe. I
wish to learn with those two examples for other 5, I have to deal with, for
that homework.

Cheers, Gaja*
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Need-help-with-my-homework-tp3238794p3238794.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do I fix this ?

2011-01-26 Thread eric

Just when I think I'm starting to learn 

Statement z1 works, statement z doesn't. Why doesn't z work and what do I do
to fix it ? Clearly the problem is with the first NA, but I would think it's
handled through the loop vectorization.


y1 <- rnorm(20, 0, .013)

y1
 [1] -0.0068630836 -0.0101106230 -0.0169663344 -0.0066314769  0.0075063818
 [6] -0.0033548024  0.0015647863  0.0119815982 -0.0021430336  0.0044617167
[11]  0.0053447708 -0.0005590323  0.0063195781  0.0073059640 -0.0181872678
[16] -0.0098094568  0.0013679040 -0.0028490887 -0.0131129191  0.0126610358

z1 <- ifelse(is.na(y1), 1, 1*cumprod(1+y1))

z1
 [1] 9931.369 9830.957 9664.162 9600.074 9672.136 9639.688 9654.772 9770.451
 [9] 9749.513 9793.012 9845.354 9839.850 9902.034 9974.378 9792.971 9696.907
[17] 9710.172 9682.506 9555.541 9676.524



y <-c(NA, rnorm(19,0, .013))

y
 [1]NA  0.0056258152 -0.0117690116  0.0163961630  0.0007818773
 [6]  0.0007761957  0.0139769376  0.0041086982 -0.0049545337  0.0059587216
[11] -0.0079022056  0.0083076357 -0.0075823658  0.0173806814 -0.0034915869
[16] -0.0045480358  0.0168642491  0.0038681635 -0.0123010077  0.0087494624

z <-ifelse(is.na(y), 1, 1*cumprod(1+y))

z
 [1] 1NANANANANANANANANANANA
[13]NANANANANANANANA

-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-do-I-fix-this-tp3239239p3239239.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extrapolating values from a glm fit

2011-01-26 Thread Ahnate Lim
Dear R-help,

I have fitted a glm logistic function to dichotomous forced choices
responses varying according to time interval between two stimulus. x values
are time separation in miliseconds, and the y values are proportion
responses for one of the stimulus. Now I am trying to extrapolate x values
for the y value (proportion) at .25, .5, and .75. I have tried several
predict parameters, and they don't appear to be working. Is this correct use
and understanding of the predict function? It would be nice to know the
parameters for the glm best fit, but all I really need are the extrapolated
x values for those proportions. Thank you for your help. Here is the code:

x <-
c(-283.9, -267.2, -250.5, -233.8, -217.1, -200.4, -183.7, -167,
-150.3, -133.6, -116.9, -100.2, -83.5, -66.8, -50.1, -33.4, -16.7,
16.7, 33.4, 50.1, 66.8, 83.5, 100.2, 116.9, 133.6, 150.3, 167,
183.7, 200.4, 217.1, 233.8, 250.5, 267.2, 283.9)

y <-
c(0, 0.333, 0, 0, 0, 0, 0, 0, 0, 0.333,
0, 0.133, 0.238095238095238, 0.528,
0.567,
0.845238095238095, 0.55, 1, 0.889, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 0.5)

weight <-
c(1, 3, 2, 5, 4, 4, 3, 5, 5, 4, 5, 11, 22, 11, 15, 16, 11, 7,
14, 10, 16, 19, 11, 5, 4, 5, 6, 9, 4, 2, 5, 5, 2, 2)

mylogit <- glm(y~x,weights=weight, family = binomial)

# now I try plotting the predicted value, and it looks like a good fit,
hopefully I can access what the glm is doing

ypred <- predict(mylogit,newdata=as.data.frame(x),type="response")
plot(x, ypred,type="l")
points(x,y)

# so I try to predict the x value when y (proportion) is at .5, but
something is wrong..

predict(mylogit,newdata=as.data.frame(0.5))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] agnes clustering and NAs

2011-01-26 Thread Dario Strbenac
Hello,

In the documentation for agnes in the package 'cluster', it says that NAs are 
allowed, and sure enough it works for a small example like :

> m <- matrix(c(
1, 1, 1, 2,
1, NA, 1, 1,
1, 2, 2, 2), nrow = 3, byrow = TRUE)
> agnes(m)
Call:agnes(x = m) 
Agglomerative coefficient:  0.1614168 
Order of objects:
[1] 1 2 3
Height (summary):
   Min. 1st Qu.  MedianMean 3rd Qu.Max. 
  1.155   1.247   1.339   1.339   1.431   1.524 

Available components:
[1] "order"  "height" "ac" "merge"  "diss"   "call"   "method" "data"

But I have a large matrix (23371 rows, 50 columns) with some NAs in it and it 
runs for about a minute, then gives an error :

> agnes(iMatrix)
Error in agnes(iMatrix) : 
  No clustering performed, NA-values in the dissimilarity matrix.

I've also tried getting rid of rows with all NAs in them, and it still gave me 
the same error. Is this a bug in agnes() ? It doesn't seem to fulfil the claim 
made by its documentation.

The matrix I'm using can be obtained here :
http://129.94.136.7/file_dump/dario/iMatrix.obj

--
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I fix this ?

2011-01-26 Thread Pete Brecknock

Eric

Your problem lies in the way cumprod deals with NAs

If you look at ?cumprod you will see 
"An NA value in x causes the corresponding and following elements of the
return value to be NA"

Not sure what behaviour you want to see on encountering an NA (ignore it,
restart the cumprod process, .). Making things easy for myself (it's
late), if you wish to simply ignore an NA the following would work

# sample data
y2=c(NA,1,2,3,4,5)

# ignore NA
ave(y2,is.na(y2),FUN=cumprod)

HTH

Pete

 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-do-I-fix-this-tp3239239p3241432.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pdf greek letter typos

2011-01-26 Thread Eduardo de Oliveira Horta
Hi there,

yet on the topic of greek letters and pdf plotting: when I run the
following code

pdf(file="temp.pdf")
mu=seq(from=-pi, to=pi, length=100)
plot(mu, sin(mu^2),
 type="l",
 xlab=expression(mu%in%(list(-pi,pi))),
 ylab=expression(sin(mu^2)),
 main=expression((list(mu,sin(mu^2)
dev.off()

I get a "proportional to" symbol in place of a "mu" and a "not equal
to" in place of a "pi" (see attached file). If I run only

mu=seq(from=-pi, to=pi, length=100)
plot(mu, sin(mu^2),
 type="l",
 xlab=expression(mu%in%(list(-pi,pi))),
 ylab=expression(sin(mu^2)),
 main=expression((list(mu,sin(mu^2)

then the characters are displayed correctly.

I would like to know if there is any sort of fixes to this problem,
such as specifying the symbols font or embedding fonts into the pdf
file. Any help would be welcome.

Thanks in advance and best regards,

Eduardo Horta


temp.pdf
Description: Adobe PDF document
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hilbert Huang Transformation

2011-01-26 Thread Jumlong Vongprasert
Dear All
  I try to use Hilbert Huang Transformation in R.
  How I can do.
Many Thanks.

-- 
Jumlong Vongprasert Assist, Prof.
Institute of Research and Development
Ubon Ratchathani Rajabhat University
Ubon Ratchathani
THAILAND
34000

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] binomial dist: obtaining probability of success on each trial

2011-01-26 Thread David Winsemius


On Jan 26, 2011, at 8:07 PM, Folkes, Michael wrote:

I'm trying to fathom how to answer two example problems (3.3.2 &  
3.3.3) in:
Krishnamoorthy. 2006. "handbook of statistical distributions with  
applications"


The first requires calculating single trial probability of success  
for a binomial distribution when we know:

trial size=20, successes k=4, P(x<=k)=0.7
Appreciably all the binomial functions are requiring "prob", which  
I'm trying to estimate.


Yi might try via successive approximation. Here's a start.

> dbinom(0:20, 20, prob=0.7)
 [1] 3.486784e-11 1.627166e-09 3.606885e-08 5.049639e-07 5.007558e-06  
3.738977e-05 2.181070e-04 1.017833e-03
 [9] 3.859282e-03 1.200665e-02 3.081708e-02 6.536957e-02 1.143967e-01  
1.642620e-01 1.916390e-01 1.788631e-01

[17] 1.304210e-01 7.160367e-02 2.784587e-02 6.839337e-03 7.979227e-04
> sum(dbinom(0:20, 20, prob=0.7))
[1] 1
> sum(dbinom(0:4, 20, prob=0.7))
[1] 5.550253e-06
> sum(dbinom(0:4, 20, prob=0.2))
[1] 0.6296483
> sum(dbinom(0:4, 20, prob=0.2))
[1] 0.6296483
> sum(dbinom(0:4, 20, prob=0.18))
[1] 0.7151181

So you have an interval in which the answer lies and  should be able  
to set up a call to optimize or other solving algorithm.


> fnbinom <- function(x) abs( sum(dbinom(0:4, 20, prob=x)) - 0.7)
> optimize(fnbinom, interval=c(0.15, 0.2))
$minimum
[1] 0.1836066

$objective
[1] 6.063185e-05



The second problem is similar:
prob=0.2, successes k=6, P(x<=k)=0.4, need to calc # trials ('size'  
in R).


Same sort of approach ought to work.


I'm sure it'll be obvious once somebody explains, but google and R- 
help archive searches have failed me.  :(

thanks in advance!
Michael
___
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] binomial dist: obtaining probability of success on each trial

2011-01-26 Thread Folkes, Michael
I'm trying to fathom how to answer two example problems (3.3.2 & 3.3.3) in:
Krishnamoorthy. 2006. "handbook of statistical distributions with applications"

The first requires calculating single trial probability of success for a 
binomial distribution when we know:
trial size=20, successes k=4, P(x<=k)=0.7
Appreciably all the binomial functions are requiring "prob", which I'm trying 
to estimate.

The second problem is similar:
prob=0.2, successes k=6, P(x<=k)=0.4, need to calc # trials ('size' in R).

I'm sure it'll be obvious once somebody explains, but google and R-help archive 
searches have failed me.  :(
thanks in advance!
Michael
___
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans 
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada
V9T-6N7


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Colour area under density curve

2011-01-26 Thread Peter Alspach
Tena koe David

?polygon

should help you.

HTH 

Peter Alspach

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of David Hervas Marin
> Sent: Thursday, 27 January 2011 12:02 p.m.
> To: R-Help
> Subject: [R] Colour area under density curve
> 
> Hello,
> 
> I have this code to plot a certain normal distribution and represent
> the pnorm value for a certain "x":
> 
> 
> x<-300
> xx <- seq(2.5,7.5, by=0.1)
> yy <- dnorm(xx,5.01,0.77)
> d<-signif(pnorm(log(x), 5.01,0.77),4)
> xpts <- round(exp(0:8))
> par(bg = "antiquewhite")
> plot(xx,yy, type="l", col="blue", lwd=2, xaxt="n",
> xlab=expression(ufc/m^3),
> ylab="Densidad")
> axis(1, at=log(xpts, base=exp(1)), lab=xpts)
> n<-0
> for (i in 1:(dnorm(log(x), 5.01,0.77)/0.005)){
> n<-n+0.005
> points(log(x), n, pch=16, cex=0.3)}
> points(log(x),-0.01,pch=24,cex=2,bg='RED')
> text(log(x)+0.4, 0.004, paste(d*100, "%"))
> 
> I'd like to colour the area under the curve left to my "x". Is there
> any way?
> 
> Thanks in advance
> 
> 
> _
> David Hervás Marín
> 
> 
> 
>   [[alternative HTML version deleted]]


The contents of this e-mail are confidential and may be subject to legal 
privilege.
 If you are not the intended recipient you must not use, disseminate, 
distribute or
 reproduce all or any part of this e-mail or attachments.  If you have received 
this
 e-mail in error, please notify the sender and delete all material pertaining 
to this
 e-mail.  Any opinion or views expressed in this e-mail are those of the 
individual
 sender and may not represent those of The New Zealand Institute for Plant and
 Food Research Limited.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Find the empty element in a factor array

2011-01-26 Thread Peter Alspach
Tena koe Wendy

which(test=='')

Should do it.

HTH 

Peter Alspach

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Wendy
> Sent: Thursday, 27 January 2011 10:59 a.m.
> To: r-help@r-project.org
> Subject: [R] Find the empty element in a factor array
> 
> 
> Hi all,
> 
> I have a factor array, and some of the elements are empty. How would I
> return the index number of the empty elements. For example,
> 
> test<-factor(c('A','','B','C','E'))
> > test
> [1] A   B C E
> Levels:  A B C E
> 
> I would like the result equal to 2.
> 
> Thank you,
> Wendy
> --
> View this message in context: http://r.789695.n4.nabble.com/Find-the-
> empty-element-in-a-factor-array-tp3238894p3238894.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

The contents of this e-mail are confidential and may be subject to legal 
privilege.
 If you are not the intended recipient you must not use, disseminate, 
distribute or
 reproduce all or any part of this e-mail or attachments.  If you have received 
this
 e-mail in error, please notify the sender and delete all material pertaining 
to this
 e-mail.  Any opinion or views expressed in this e-mail are those of the 
individual
 sender and may not represent those of The New Zealand Institute for Plant and
 Food Research Limited.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] removing outlier function / dataset update

2011-01-26 Thread kirtau

First off, thank you for the help with the global environment.

I have however attempted to run the code and am now presented with a new
error which is 
"Error in formula.default(eval(parse(text = x)[[1L]])) : invalid formula"
and am not sure what to make of it. I have tried a few different work around
with no luck.

Any help will continue to be appreciated! 

-
- AK
-- 
View this message in context: 
http://r.789695.n4.nabble.com/removing-outlier-function-dataset-update-tp3238394p3239080.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Factor rotation (e.g., oblimin, varimax) and PCA

2011-01-26 Thread Finn Aarup Nielsen



A bit of a newbee to R and factor rotation I am trying to understand 
factor rotations and their implementation in R, particularly the 
GPArotation library.


I have tried to reproduce some of the examples that I have found, e.g., I 
have taken the values from Jacksons example in "Oblimin Rotation", 
Encyclopedia of Biostatistics

http://onlinelibrary.wiley.com/doi/10.1002/0470011815.b2a13060/abstract
and run it through R:

library(GPArotation)
data <- matrix(c(0.6, 0.39, 0.77, 0.70, 0.64, 0.35, 0.52, 0.72, 0.34, 
0.58, 0.15, 0.07, -0.13, -0.23, -0.23, 0.67, -0.27, -0.23, 0.72, 
0.20, 0.41, 0.55, -0.10, -0.06, -0.21, -0.33, -0.27, -0.20, -0.22, 0.47), 10, 3)

oblimin(data)

The values I get out do not quite correspond to the values given in the 
table. What could this difference be due to? Rounding in the initial data? 
Or implementation details of the R oblimin function in GPArotation? 
Jackson writes about 'raw oblimin', 'normal oblimin' and 'direct oblimin' 
and I do not know how that relates to the R oblimin implementation.


I have also tried varimax on data and results given by Mardia in his 
'Multivariate analysis' book Table 9.4.1. Mardia uses the communalities 
from the factor analysis in the expression for the varimax rotation. I 
dont see how the R varimax function can handle the communalities. I dont 
have the book right at hand, but I believe this R code represents

Mardia examples in R:

lambda <- matrix(c(0.628, 0.696, 0.899, 0.779, 0.728,
   0.372, 0.313, -0.050, -0.201, -0.200), 5, 2)
varimax(lambda)

I do not get the result that Mardia presents.


I was about to use the factor rotation on the loadings from a principal 
component analysis and I saw that the 'principal' from the 'psych' library 
has a (some kind of) PCA with rotation. But when I use 'principal' I do 
not seem to be able to get the same results from prcomp and princomp and a 
'raw' use of eigen:


library(GPArotation)
library(psych)

# These 3 lines gives the same result
prcomp(answers)$r[1:2,1:3]
princomp(answers, cor=FALSE)$l[1:2,1:3]
eigen(cov(answers))$ve[1:2,1:3]

# These 3 lines gives the same result
prcomp(answers, center=TRUE, scale=TRUE)$r[1:2,1:3]
princomp(answers, cor=TRUE)$l[1:2,1:3]
eigen(cor(answers))$ve[1:2,1:3]

# This gives another result
principal(answers, nfactors=3, rotate="none")$l[1:2,1:3]


Furthermore, I tried to use oblimin on the PCA loadings via prcomp and 
'principal', but they give different results:


# These 2 lines give different results
oblimin(prcomp(answers, center=TRUE, scale=TRUE)$r[,1:3])$l[1:2,]
principal(answers, nfactors=3, rotate="oblimin")$l[1:2,1:3]


So what is wrong with the rotations and what is wrong with 'principal'? 
How do the different oblimin methods relate to the implementation in R 
GPArotation?



/Finn
___

 Finn Aarup Nielsen, DTU Informatics, Denmark
 Lundbeck Foundation Center for Integrated Molecular Brain Imaging
   http://www.imm.dtu.dk/~fn/  http://nru.dk/staff/fnielsen/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Quantile regression (rq) and complex samples

2011-01-26 Thread James Shaw
I am new to R and am interested in using the program to fit quantile
regression models to data collected from a multi-stage probability
sample of the US population.  The quantile regression package, rq, can
accommodate person weights.  However, it is not clear to me that
boot.rq is appropriate for use with multi-stage samples (i.e., is
capable of sampling primary sampling units instead of survey
respondents).  I would like to apply Rao's rescaling bootstrap
procedure and poststratify the weights to population control totals in
each bootstrap replicate.  I know how to do all of this in Stata but
have not yet seen any means of doing so in R.  I  presume I could do
what is needed using batch processing but was hoping that there might
be a way to pass the rq parameter estimates to a package that performs
resampling variance estimation in order to simplify the task.  Any
programming suggestions or directions to informational resources would
be greatly appreciated.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Find the empty element in a factor array

2011-01-26 Thread Wendy

Hi all,

I have a factor array, and some of the elements are empty. How would I
return the index number of the empty elements. For example,

test<-factor(c('A','','B','C','E'))
> test
[1] A   B C E
Levels:  A B C E

I would like the result equal to 2.

Thank you,
Wendy
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Find-the-empty-element-in-a-factor-array-tp3238894p3238894.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Colour area under density curve

2011-01-26 Thread David Hervas Marin
Hello, 

I have this code to plot a certain normal distribution and represent the pnorm 
value for a certain "x":


x<-300
xx <- seq(2.5,7.5, by=0.1)
yy <- dnorm(xx,5.01,0.77)
d<-signif(pnorm(log(x), 5.01,0.77),4)
xpts <- round(exp(0:8))
par(bg = "antiquewhite")
plot(xx,yy, type="l", col="blue", lwd=2, xaxt="n", xlab=expression(ufc/m^3), 
ylab="Densidad")
axis(1, at=log(xpts, base=exp(1)), lab=xpts)
n<-0
for (i in 1:(dnorm(log(x), 5.01,0.77)/0.005)){
n<-n+0.005
points(log(x), n, pch=16, cex=0.3)}
points(log(x),-0.01,pch=24,cex=2,bg='RED')
text(log(x)+0.4, 0.004, paste(d*100, "%"))

I'd like to colour the area under the curve left to my "x". Is there any way?

Thanks in advance


_
David Hervás Marín


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merge tables in a loop

2011-01-26 Thread MacQueen, Don
Not sure exactly what you mean by, “ writing a table with several rows per 
file”.

If what you want to do is write output to an external file, adding to it as 
your loop progresses, then look at the functions
  sink()
  cat()
And their ‘file’ and ‘append’ arguments.

If what you want to do is append rows to a table within R, as your loop 
progresses, then see the
   rbind()
function.

-Don

On 1/26/11 2:51 AM, "clemens karwautz"  wrote:

I’m running a loop opening one file after another. >setwd("D:/Documents and 
Settings/trflp") >a<-list.files() 
>results.diversity<-data.frame(matrix(0,ncol=7,nrow=length(a))) 
>names(results.diversity)<-c("file","simpson","shannon","eveness") 
>x<-length(a) >for (i in 1:x){ > trflp<-read.table(a[i],header=T,sep="\t") … I 
was able to make a table with the results of my calculations for each file. 
>results.diversity$simpson[i]<-simpson >results.diversity$shannon[i]<-shannon 
>results.diversity$eveness[i]<-eveness 
>write.table(results.diversity,"diversity.txt",row.names=F,sep="\t") Now, I 
would be interested in writing a table with several rows per file. e.g.: file1: 
size abundance 37 0.0117 43 0.1566 218 0.0682 253 0.0508 412 0.0874 ... file2: 
size abundance 37 0.0117 45 0.1876 218 0.0682 255 0.0808 417 0.0374 ... Final 
table: size abundance filename 37 0.0117  file1 43 0.1566  file1 218 0.0682  
file1 253 0.0508  file1 412 0.0874  file1 37 0.0117  file2 45 0.1876  file2 218 
0.0682  file2 255 0.0808  file2 417 0.0374  file2 Could you give me some advise 
how to manage this problem? Thank you very much Clemens -- GMX DSL Doppel-Flat 
ab 19,99 Euro/mtl.! Jetzt mit gratis Handy-Flat! 
http://portal.gmx.net/de/go/dsl __ 
R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
925 423-1062

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] boxplot - code for labeling outliers - any suggestions for improvements?

2011-01-26 Thread Greg Snow
For the last point (cluttered text), look at spread.labels in the plotrix 
package and spread.labs in the TeachingDemos package (I favor the later, but 
could be slightly biased as well).  Doing more than what those 2 functions do 
becomes really complicated really fast.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Tal Galili
> Sent: Wednesday, January 26, 2011 4:05 PM
> To: r-help@r-project.org
> Subject: [R] boxplot - code for labeling outliers - any suggestions for
> improvements?
> 
> Hello all,
> I wrote a small function to add labels for outliers in a boxplot.
> This function will only work on a simple boxplot/formula command (e.g:
> something like boxplot(y~x)).
> 
> Code + example follows in this e-mail.
> 
> I'd be happy for any suggestions on how to improve this code, for
> example:
> 
>- Handle boxplot.matrix (which shouldn't be too hard to do)
>- Handle cases of complex functions (e.g: boxplot(y~a*b))
>- Handle cases where there are many outliers leading to a clutter of
> text
>(to this I have no idea how to systematically solve)
> 
> 
> Best,
> Tal
> --
> 
> 
> # the function
> boxplot.add.outlier.text <- function(DATA, x_name, y_name, label_name)
> {
> 
> 
> boxplot.outlier.data <- function(xx, y_name)
> {
>  y <- xx[,y_name]
> boxplot_range <- range(boxplot.stats(y)$stats)
> ss <- (y < boxplot_range[1]) | (y > boxplot_range[2])
>  return(xx[ss,])
> }
> 
> require(plyr)
> txt_to_run <- paste("ddply(DATA, .(",x_name,"), boxplot.outlier.data,
> y_name
> = y_name)", sep = "")
>  ourlier_df <- eval(parse(text = txt_to_run))
> # head(ourlier_df)
>  txt_to_run <- paste("formula(",y_name,"~", x_name,")")
>  formu <- eval(parse(text = txt_to_run))
> boxdata <- boxplot(formu , data = DATA, plot = F)
>  boxdata_group_name <- boxdata$names[boxdata$group]
> boxdata_outlier_df <- data.frame(group = boxdata_group_name, y =
> boxdata$out, x = boxdata$group)
>  for(i in seq_len(dim(boxdata_outlier_df)[1]))
> {
>  ss <- (ourlier_df[,x_name]  %in% boxdata_outlier_df[i,]$group) &
> (ourlier_df[,y_name] %in% boxdata_outlier_df[i,]$y)
> current_label <- ourlier_df[ss,label_name]
>  temp_x <- boxdata_outlier_df[i,"x"]
> temp_y <- boxdata_outlier_df[i,"y"]
>  text(temp_x, temp_y, current_label,pos=4)
> }
> 
> list(boxdata_outlier_df = boxdata_outlier_df, ourlier_df=ourlier_df)
> }
> 
> # example:
> boxplot(decrease ~ treatment, data = OrchardSprays, log = "y", col =
> "bisque")
> boxplot.add.outlier.text(OrchardSprays, "treatment", "decrease",
> "colpos")
> 
> 
> 
> 
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew)
> |
> www.r-statistics.com (English)
> ---
> ---
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] boxplot - code for labeling outliers - any suggestions for improvements?

2011-01-26 Thread Tal Galili
Hello all,
I wrote a small function to add labels for outliers in a boxplot.
This function will only work on a simple boxplot/formula command (e.g:
something like boxplot(y~x)).

Code + example follows in this e-mail.

I'd be happy for any suggestions on how to improve this code, for example:

   - Handle boxplot.matrix (which shouldn't be too hard to do)
   - Handle cases of complex functions (e.g: boxplot(y~a*b))
   - Handle cases where there are many outliers leading to a clutter of text
   (to this I have no idea how to systematically solve)


Best,
Tal
--


# the function
boxplot.add.outlier.text <- function(DATA, x_name, y_name, label_name)
{


boxplot.outlier.data <- function(xx, y_name)
{
 y <- xx[,y_name]
boxplot_range <- range(boxplot.stats(y)$stats)
ss <- (y < boxplot_range[1]) | (y > boxplot_range[2])
 return(xx[ss,])
}

require(plyr)
txt_to_run <- paste("ddply(DATA, .(",x_name,"), boxplot.outlier.data, y_name
= y_name)", sep = "")
 ourlier_df <- eval(parse(text = txt_to_run))
# head(ourlier_df)
 txt_to_run <- paste("formula(",y_name,"~", x_name,")")
 formu <- eval(parse(text = txt_to_run))
boxdata <- boxplot(formu , data = DATA, plot = F)
 boxdata_group_name <- boxdata$names[boxdata$group]
boxdata_outlier_df <- data.frame(group = boxdata_group_name, y =
boxdata$out, x = boxdata$group)
 for(i in seq_len(dim(boxdata_outlier_df)[1]))
{
 ss <- (ourlier_df[,x_name]  %in% boxdata_outlier_df[i,]$group) &
(ourlier_df[,y_name] %in% boxdata_outlier_df[i,]$y)
current_label <- ourlier_df[ss,label_name]
 temp_x <- boxdata_outlier_df[i,"x"]
temp_y <- boxdata_outlier_df[i,"y"]
 text(temp_x, temp_y, current_label,pos=4)
}

list(boxdata_outlier_df = boxdata_outlier_df, ourlier_df=ourlier_df)
}

# example:
boxplot(decrease ~ treatment, data = OrchardSprays, log = "y", col =
"bisque")
boxplot.add.outlier.text(OrchardSprays, "treatment", "decrease", "colpos")




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Trouble variable scoping a function writing with get()

2011-01-26 Thread Bert Gunter
It's looking for an object named "b" in the frame of the function.
There is none. You need to specify

get(response, pos = parent.frame())

## or maybe even

get(response, pos= globalenv())

HOWEVER, this is **exactly** why you shouldn't do this! Instead of
passing in the name of the object, pass in the object itself (response
= b, not response = "b") and forget about using get(). Then it
**would** work as writted without getting tangled up in scoping. The
whole point local (to the function) scope and call by value is to
avoid exactly the sort of mess that you've gotten yourself into. Stick
with R's programming paradigms in R rather than trying to impose
others and life will be simpler and sweeter.

-- Bert



On Wed, Jan 26, 2011 at 2:21 PM, Pat Schmitz  wrote:
> I am having trouble with variable scoping inside/outside functions.  I
> want to use get() to grab a named and quoted variable as an input to a
> function, however the function can't find the variable when it is entered
> into the function call, only when it is named in the main environment.
>
> I obviously am not so clear on variable scoping, and ?get really doesn't
> clear up my confusion.
>
> I am sure that there are better, more appropriate ways to write a
> function...
>
> Please enlighten me,
> Pat
>
> # Example code
>
> dat <- expand.grid(a = factor(c("a", "b")), b=1:10)
>
> # Function that requires get()
> ex <- function(data, response){
>  library(plyr)
>  output <- ddply(data,
> .(a),
> summarize,
>  res = sum(get(response))
> )
> return(output)
> }
>
>
> out <- ex(data = dat, response = "b")
> # Error in get(response) : object 'response' not found
>
> # However if I name reponse outside of the function, it is found by the
> function
> response = "b"
> out <- ex(data = dat, response = "b")
> out
> #> out
> #  a res
> #1 a  55
> #2 b  55
>
>
> --
> Patrick Schmitz
> Graduate Student
> Plant Biology
> 1206 West Gregory Drive
> RM 1500
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Trouble variable scoping a function writing with get()

2011-01-26 Thread Pat Schmitz
I am having trouble with variable scoping inside/outside functions.  I
want to use get() to grab a named and quoted variable as an input to a
function, however the function can't find the variable when it is entered
into the function call, only when it is named in the main environment.

I obviously am not so clear on variable scoping, and ?get really doesn't
clear up my confusion.

I am sure that there are better, more appropriate ways to write a
function...

Please enlighten me,
Pat

# Example code

dat <- expand.grid(a = factor(c("a", "b")), b=1:10)

# Function that requires get()
ex <- function(data, response){
 library(plyr)
 output <- ddply(data,
.(a),
summarize,
 res = sum(get(response))
)
return(output)
}


out <- ex(data = dat, response = "b")
# Error in get(response) : object 'response' not found

# However if I name reponse outside of the function, it is found by the
function
response = "b"
out <- ex(data = dat, response = "b")
out
#> out
#  a res
#1 a  55
#2 b  55


-- 
Patrick Schmitz
Graduate Student
Plant Biology
1206 West Gregory Drive
RM 1500

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Making up a graph and its equation which better fits two groups of data

2011-01-26 Thread Greg Snow
Your faith in our ability to read your mind is apparently much higher than our 
actual ability to do so.  What is the nature of your data? what question are 
you trying to answer? What type of equation do you want? What do you mean by 
better?  Better than what?

Maybe the esp package (pre-alpha) can give some additional insights:

> source('g:/R/esp/esp.R')
> esp()
[1] "dogie perm syllable brunt wore majestic Messiah luxuriously meretricious"
>

OK, that doesn't help me much, maybe someone else can get something out of 
that, but in the meantime I would suggest reading the posting guide and adding 
the info it suggests to make a question that we have a chance of giving a 
meaningful answer to.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Jimmy Martina
> Sent: Wednesday, January 26, 2011 9:05 AM
> To: R
> Subject: [R] Making up a graph and its equation which better fits two
> groups of data
> 
> 
> Dear R-folks:
> 
> I have a group of data ('x' and 'y' axis), but I'd like to know how to
> draw a graph which would fits my data in a better way, I also need its
> equation. Could you give me any R-rutine ideas?.
> 
> I thank you in advance for your kind support.
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave: \Sexpr{} inside <<>>?

2011-01-26 Thread Duncan Murdoch

On 11-01-26 3:43 PM, zerfetzen wrote:


Thanks Duncan, that helps.  It successfully displays what I'm looking for,
but it is not executing it.  In a previous code chunk, it notes the time it
took to run something, and in the successive code chunk, it runs something
else where the previous time is now a parameter, but I'd like it to
numerically display that previous time in the new chunk, rather than a
variable name I create for it behind the scenes.  Is this possible?  Many
thanks.


Of course it's possible, but it'll need ugly trickery.  Just do every 
calculation in a hidden chunk, then use code like I did to fake a 
display of it.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] RGtk2Extras package for dataframe editing and easy dialog creation

2011-01-26 Thread Taverner, Thomas
Dear useRs,

This is to announce the RGtk2Extras R package is available on CRAN in version 
0.5.1.

This package provides useful extras for R programmers who wish to create 
graphic user interfaces. It is based on GTK, using Michael Lawrence's RGtk2 
package and John Verzani's gWidgets, and some ideas from John Verzani's traitr 
package.

The first major feature of RGtk2Extras, the run.dialog function, is an 
interface for creating simple front-ends to R functions using a terse markup 
scheme. No GUI knowledge is required; if you can write an R function, you can 
create a dialog for it.

An example:

# A function with one argument, N

  Histogram = function(N) hist(rnorm(N))

# Dialog markup list for the function Histogram
# Create the main dialog label with the first "label=" (optional)
# Then specify N.integerItem=50, where 
#   (1) N is the function argument
#   (2) integerItem is the type of widget to use, see ?run.dialog
#   (3) the value 50 is the default
# Then with the second "label=" add a label for the "N" item (optional)

  Histogram.dialog = list(label = "Histogram of N points",
N.integerItem = 50, label = "Value of N")
  
# Run the dialog.
# The returned list has elements "args" for dialog arguments
# and "retval" for the return value.
# With auto.assign=TRUE, the return value is stored in "Histogram_output".
# run.dialog also does error handling, interrupts and an optional 
# progress bar dialog for long running tasks, see ?run.dialog

  a = run.dialog(Histogram)

A more complex demo example, thanks to Graham Williams:

  demo(MakeAngle)


The second feature of RGtk2Extras is gtkDfEdit, an editable spreadsheet widget 
designed for editing data matrices and data frames. It's also based on RGtk2. 
This was released earlier as RGtk2DfEdit which was then folded into RGtk2Extras.

The gtkDfEdit spreadsheet is quite full featured and has been designed to be 
familiar to Excel users, while allowing most of the data frames and factor 
operations that are possible from the R command line.

Data frame columns are type-aware and there is a factor editor. Most functions 
can be accessed from right clicking on data columns or cells. There is undo, 
cross platform copy/paste, sorting, cell filling, and data loading 
capabilities. See ?gtkDfEdit.

The widget provided by gtkDfEdit() can be integrated into larger RGtk2 based 
user interfaces, or its standalone wrapper dfedit() can be used as a straight 
replacement for edit.data.frame().

There is also a basic API for manipulating and binding event handlers to the 
spreadsheet. See ?gtkDfEditDoTask and ?gtkDfEditSetActionHandler.

# Edit the iris data frame
# Note the blank row at the bottom to allow pasting into rows below.
x = dfedit(iris)

# Example user interactions
1. Right click the "Species" column or column header to see or change the data 
type. Click "Edit Factors" to change factor levels or ordering.

2. Right click a column header and then click "Sort" to sort columns

3. Right click column header and change data type by selecting 
"Character"/"Integer" etc. Factors can be changed to integer levels ("To Factor 
Levels") or integers ("To Factor Ordering")

4. Right click column header to insert or delete columns or change column name.

5. Select a submatrix of numbers within "iris" and press the "=" key to bring 
up the Command Editor, then type "hist" in the command field and "OK" to create 
a histogram of the submatrix; "function(x) hist(x)" works as well.

6. Select a submatrix within "iris" and press Ctrl-V or right-click and select 
Copy to copy into an external spreadsheet; Ctrl-C works to paste into the data 
frame. Ctrl-Shift-C copies submatrix and row and column names. (The equivalent 
Mac keys should also work.)

7. Select a range of cells within "Species" column and right-click to select 
"Fill in Cycles" to fill cyclic factors.

8. Right click left hand corner cell to open CSV file into editor or save as 
file.

9. Right click left hand corner cell and "Default Columns" to set columns to 
default Excel-style column headers.

10. Ctrl-Z to undo previous edits.
# End of examples

RGtk2Extras is still in a beta stage of development. It is hosted on 
r-forge.r-project.org/R/?group_id=924. Comments and suggestions appreciated; 
email thomas.taverner _AT_ pnl.gov

Thanks to the entire R community, particularly the R Development Core Team, 
Michael Lawrence and John Verzani, and also Graham Williams and Iago Conde for 
comments.

Tom

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] baseline hazard function

2011-01-26 Thread Simon Kiss
Dear colleagues, I have the following dataset.  It is modelled on the data 
included in Box-Seteffenheiser and Jones "Event History Modelling"
Using the following code, I try to find the baseline hazard function 

haz_1<-muhaz(bpa$time, bpa$censored, subset=(bpa$year=="2010" | bpa$ban=="1"), 
min.time=1, max.time=3)

I think I'm doing everything right, but what I don't understand is how to 
derive a duration dependency coefficient rom the values contained in the muhaz 
object as per Box-Steffenheiser and Jones' recommendations in Ch. 5 of Event 
History Modelling.

I get the following summary(haz_1)
Number of Observations .. 50
Censored Observations ... 43
Method used . Local
Boundary Correction Type  Left and Right
Kernel type . Epanechnikov
Minimum Time  1
Maximum Time  3
Number of minimization points ... 51
Number of estimation points . 101
Pilot Bandwidth . 0.25
Smoothing Bandwidth . 1.27
Minimum IMSE  6716.9


Can anyone provide any advice?
Yours, Simon Kiss

'data.frame':   147 obs. of  7 variables:
 $ state   : Factor w/ 50 levels "Alabama","Alaska",..: 1 1 1 2 2 2 3 3 3 4 ...
 $ partisan: Factor w/ 3 levels "democrat","mixed",..: 1 1 1 2 2 2 3 3 3 1 ...
 $ ban : num  0 0 0 0 0 0 0 0 0 0 ...
 $ year: num  2008 2009 2010 2008 2009 ...
 $ news: num  1.67 1.67 0 2 0 ...
 $ time: num  1 2 3 1 2 3 1 2 3 1 ...
 $ censored: num  0 0 0 0 0 0 0 0 0 0 ...


*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave: \Sexpr{} inside <<>>?

2011-01-26 Thread zerfetzen

Thanks Duncan, that helps.  It successfully displays what I'm looking for,
but it is not executing it.  In a previous code chunk, it notes the time it
took to run something, and in the successive code chunk, it runs something
else where the previous time is now a parameter, but I'd like it to
numerically display that previous time in the new chunk, rather than a
variable name I create for it behind the scenes.  Is this possible?  Many
thanks.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Sweave-Sexpr-inside-tp3237313p3238764.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] applying a set of rules to each row

2011-01-26 Thread Bert Gunter
... or perhaps just break things up with assignments and do it in stages.

-- Bert

On Wed, Jan 26, 2011 at 12:52 PM, David Winsemius
 wrote:
> I remember something about the degree of nesting of ifelse calls being
> limited to 7 deep (???)  that makes me worry about this approach. You may
> want to look at the arules package or the data.table package or the sqldf
> package for approaches that are specifically constructed with this sort of
> processing in mind.
>
> --
> David.
>
> On Jan 26, 2011, at 3:42 PM, KATSCHKE, ADRIAN CIV DFAS wrote:
>
>> Yes. That is exactly what I would like to have running. Here is the first
>> attempt I made at using a nested ?ifelse statement for one of the retirement
>> plans. The variables are all there but with different names. ageYOSstart is
>> ageFedStart, SCDCivLeave is srvCompDT. I haven't gotten this working. I am
>> not sure that it is the correct way to do what I would like.
>>
>> ## Regular retirement eligibility date for FERS employees
>> retData.All$regRetireDT2[retData.All$retireSystem == "FERS"] <-
>> with(retData.All[retData.All$retireSystem == "FERS",],
>>                                ifelse(DOB < "01/01/53", ## Born before
>> 1953 minimum retirement age of 55
>>                                       ifelse(ageYOSstart < 26,
>> dates(DOB+(365.25*55)),
>>                                              ifelse((ageYOSstart >= 26 &
>> ageYOSstart < 31), dates(SCDCivLeave*(365.25*30)),
>>                                                     ifelse((ageYOSstart >=
>> 31 & ageYOSstart < 41), dates(DOB+(365.25*60)),
>>
>>  ifelse((ageYOSstart >= 41 & ageYOSstart < 43),
>>
>> dates(SCDCivLeave+(365.25*20)),
>>
>> ifelse((ageYOSstart >= 43 & ageYOSstart < 58),
>>
>>  dates(DOB+(365.25*62)),
>>
>>  ifelse(ageYOSstart >= 58,
>>
>>       dates(SCDCivLeave+(365.25*5)), NA)),
>>                                        ifelse((DOB < "12/31/69" & DOB >
>> "01/01/53"), ## Born between 1953 and 1969 MRA of 56
>>                                                ifelse(ageYOSstart < 27,
>> dates(DOB+(365.25*56)),
>>                                                       ifelse((ageYOSstart
>> >= 27 & ageYOSstart < 31),
>>
>> dates(SCDCivLeave+(365.25*30)),
>>
>> ifelse((ageYOSstart >= 31 & ageYOSstart < 41),
>>
>>  dates(DOB+(365.25*60)),
>>
>>  ifelse((ageYOSstart >= 41 & ageYOSstart < 43),
>>
>>   dates(SCDCivLeave+(365.25*20)),
>>
>>   ifelse((ageYOSstart >= 43 & ageYOSstart < 58),
>>
>>          dates(DOB+(365.25*62)),
>>
>>          ifelse(ageYOSstart >= 58,
>>
>>                 dates(SCDCivLeave+(365.25*5)),
>>
>>                 NA))),
>>                                        ifelse(DOB >= "01/01/69", ## Born
>> after 1969 Min Retire Age of 57
>>                                               ifelse(ageYOSstart < 28,
>> dates(DOB+(365.25*57)),
>>                                                      ifelse((ageYOSstart
>> >= 28 & ageYOSstart < 31),
>>
>> dates(SCDCivLeave+(365.25*30)),
>>
>> ifelse((ageYOSstart >= 31 & ageYOSstart < 41),
>>
>>  dates(DOB+(365.25*20)),
>>
>>  ifelse((ageYOSstart >= 41 & ageYOSstart < 43),
>>
>> dates(SCDCivLeave+(365.25*20)),
>>
>> ifelse((ageYOSstart >= 43 & ageYOSstart < 57),
>>
>>        dates(DOB+(365.25*62)),
>>
>>        ifelse(ageYOSstart >= 58,
>>
>>               dates(SCDCivLeave+(365.25*5)),
>>
>>               NA))), NA))
>>
>> Adrian
>>
>>
>>> If I understand you correctly, you want ?ifelse, which works on the
>>> full logical vectors of rules applied to the variables, not
>>> ifelse, which works on only a single logical.
>>
>>> -- Bert Gunter
>>
>> On Wed, Jan 26, 2011 at 12:18 PM, KATSCHKE, ADRIAN CIV DFAS
>>  wrote:
>>>
>>> All,
>>>
>>> I would like to apply a set of rules to each row of the sample data set
>>> below. The rule sets are the guidelines for determining an individual's
>>> date for retirement eligibility. The rules are found in this document,
>>> http://www.opm.gov/feddata/RetirementPaperFinal_v4.pdf. I am only
>>> interested in the top two categories for retirement eligibility, the
>>> CSRS and FERS plans.
>>>
>>> The data set has four variables Date of Birth (DOB), service computation
>>> date (srvCompDT), retirement plan (retirePlan), and the age at which the
>>> employee entered federal service (ageFedStart). The service computation
>>> date is used to compute the date eligible for retirement. The retirement
>>> plan indicates what system the employee is enrolled under.
>>>
>>> The data does contain a few other retirement plans, for now I want to
>>> just ignore those plans. I have labeled plans as 1-CSRS and 2-FERS, and
>>> 3-Other. My first attempt at applying the rules was through a complex
>>> nesting of ifelse statements, this was not very successful and quite
>>> difficult to follow. I then wrote a function and tried using "apply"
>>> unsuccessfully. The function is shown below.
>>>
>>> I would like to put a short script or function together that would allow
>>> for an efficient application o

[R] Can not invoke maxent() of library(dismo) in Mac OSX

2011-01-26 Thread Mao Jianfeng
Dear R-helpers,

I can not invoke maxent() in Mac OSX. Could you give me any directions
on that? Thank you in advance.

Here is my info:

# (1) the error
>  me <- maxent(predictors, occtrain, factors='biome')
me <- maxent(predictors, occtrain, factors='biome')
Error in .jcall(mxe, "S", "fit", c("autorun", "-e", afn, "-o", dirout,  :
  java.lang.NoClassDefFoundError: Could not initialize class
density.DirectorySelect

# (2) the variables for maxent: predictors, and occtrain
> summary(predictors)
summary(predictors)
Cells:  35712
NAs  :  0 0 0 0 0 0 0 0 0

1  2  3  4 56 7 8  9
Min.-23.00.00.00.0  97.0 -207  64.0 -66.0  1.000
1st Qu. 171.0  715.5  317.0   30.0 306.0   29 132.0 213.0  1.000
Median  230.0 1271.0  513.0   87.0 322.0  131 183.0 249.0  4.000
Mean209.3 1364.0  570.3  136.3 314.4  103 210.4 225.9  5.135
3rd Qu. 257.0 1879.0  834.0  216.0 336.0  188 275.0 262.0  8.000
Max.289.0 7091.0 2458.0 1496.0 422.0  242 449.0 322.0 14.000
summary based on a sample of 5000 cells, which is 14.0008960573477 %
of all cells
> summary(occtrain)
summary(occtrain)
  lon  lat
 Min.   :-85.93   Min.   :-23.450
 1st Qu.:-79.47   1st Qu.: -2.750
 Median :-75.58   Median :  5.300
 Mean   :-72.22   Mean   :  2.823
 3rd Qu.:-65.40   3rd Qu.:  9.150
 Max.   :-46.73   Max.   : 13.950

# (3) my OS and R version
> version
version
   _
platform   x86_64-apple-darwin9.8.0
arch   x86_64
os darwin9.8.0
system x86_64, darwin9.8.0
status
major  2
minor  12.1
year   2010
month  12
day16
svn rev53855
language   R
version.string R version 2.12.1 (2010-12-16)
> sessionInfo()
sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US/en_US/en_US/C/en_US/en_US

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] evaluate_0.3  dismo_0.5-14  rJava_0.8-8   raster_1.7-29 sp_0.9-76

loaded via a namespace (and not attached):
[1] grid_2.12.1 lattice_0.19-17 plyr_1.4stringr_0.4
[5] tools_2.12.1

-- 
Jian-Feng, Mao

the Institute of Botany,
Chinese Academy of Botany,

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NA replacing

2011-01-26 Thread andrija djurovic
Thanks Dennis this is what I was looking for.



On Wed, Jan 26, 2011 at 4:14 AM, Dennis Murphy  wrote:

> Hi:
>
> Here's one approach:
>
> f <- function(df) {
>  rs <- with(na.exclude(df), tapply(y, strata, sum)/tapply(x, strata,
> sum))
>  u <- transform(subset(df, is.na(y)), y = x * rs[strata])
>  transform(df, y = replace(y, u$id, u$y))
>   }
>
> f(df)
>
> The function works as follows:
>
> (1) With the rows of the data frame where y is not missing,
>  find the sum(y)/sum(x) ratio in each stratum. rs is a vector whose
> length is
>  the number of strata. (Hopefully, all of your x-sums are nonzero...)
> If you have
>  missing x values in your real data, you need to think about how you
> want to
>  handle them.
> (2) In a sub-data frame u containing the missing y's, replace them with the
> value of
>  x times the value of rs corresponding to its stratum.
> (3) Replace the missing y's in df with the y's from u, matching on id
> numbers. (This
>  is a by-product of subset(), BTW.)
>
> HTH,
> Dennis
>
> On Tue, Jan 25, 2011 at 9:40 AM, andrija djurovic wrote:
>
>> Hello R user,
>>
>> I have following data frame:
>>
>> df=data.frame(id=c(1:10),strata=rep(c(1,2),c(5,5)),y=c(
>> 10,12,10,NA,15,70,NA,NA,55,100),x=c(3,4,5,7,4,10,12,8,3,15))
>>
>> and I would like to replace NA's with:
>>
>> instead of first NA
>>  tapply(na.exclude(df)$y,na.exclude(df)$strata,sum)[1]*
>> *7 */tapply(na.exclude(df)$x,na.exclude(df)$strata,sum)[1]
>> where 7 is the value of x (id=4) in strata 1 where y=NA
>>
>> instead of second NA
>> tapply(na.exclude(df)$y,na.exclude(df)$strata,sum)[2]*
>> *12 */tapply(na.exclude(df)$x,na.exclude(df)$strata,sum)[2]
>> where 12 is the value of x (id=7) in strata 2 where y=NA
>>
>> instead of third NA tapply(na.exclude(df)$y,na.exclude(df)$strata,sum)[2]*
>> *
>> 8 */tapply(na.exclude(df)$x,na.exclude(df)$strata,sum)[2]
>> where 8 is the value of x(id=8) in strata 2 where y=NA.
>>
>> So, I would like to replace NA inside the stratas on above explained way.
>>
>> Does anyone know how to do this?
>>
>> thanks in advance
>>
>> Andrija
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] applying a set of rules to each row

2011-01-26 Thread David Winsemius
I remember something about the degree of nesting of ifelse calls being  
limited to 7 deep (???)  that makes me worry about this approach. You  
may want to look at the arules package or the data.table package or  
the sqldf package for approaches that are specifically constructed  
with this sort of processing in mind.


--
David.

On Jan 26, 2011, at 3:42 PM, KATSCHKE, ADRIAN CIV DFAS wrote:

Yes. That is exactly what I would like to have running. Here is the  
first attempt I made at using a nested ?ifelse statement for one of  
the retirement plans. The variables are all there but with different  
names. ageYOSstart is ageFedStart, SCDCivLeave is srvCompDT. I  
haven't gotten this working. I am not sure that it is the correct  
way to do what I would like.


## Regular retirement eligibility date for FERS employees
retData.All$regRetireDT2[retData.All$retireSystem == "FERS"] <-  
with(retData.All[retData.All$retireSystem == "FERS",],
ifelse(DOB < "01/01/53", ## Born  
before 1953 minimum retirement age of 55
   ifelse(ageYOSstart < 26,  
dates(DOB+(365.25*55)),
  ifelse((ageYOSstart >=  
26 & ageYOSstart < 31), dates(SCDCivLeave*(365.25*30)),
  
ifelse((ageYOSstart >= 31 & ageYOSstart < 41), dates(DOB+(365.25*60)),
 
ifelse((ageYOSstart >= 41 & ageYOSstart < 43),

dates(SCDCivLeave+(365.25*20)),

ifelse((ageYOSstart >= 43 & ageYOSstart < 58),
  dates 
(DOB+(365.25*62)),
  ifelse 
(ageYOSstart >= 58,
 dates 
(SCDCivLeave+(365.25*5)), NA)),
ifelse((DOB < "12/31/69" &  
DOB > "01/01/53"), ## Born between 1953 and 1969 MRA of 56
ifelse(ageYOSstart <  
27, dates(DOB+(365.25*56)),

ifelse((ageYOSstart >= 27 & ageYOSstart < 31),

dates(SCDCivLeave+(365.25*30)),

ifelse((ageYOSstart >= 31 & ageYOSstart < 41),
  dates 
(DOB+(365.25*60)),
  ifelse 
((ageYOSstart >= 41 & ageYOSstart < 43),
 dates 
(SCDCivLeave+(365.25*20)),
 ifelse 
((ageYOSstart >= 43 & ageYOSstart < 58),
dates 
(DOB+(365.25*62)),
ifelse 
(ageYOSstart >= 58,
   dates 
(SCDCivLeave+(365.25*5)),
   NA 
))),
ifelse(DOB >= "01/01/69", ##  
Born after 1969 Min Retire Age of 57
   ifelse(ageYOSstart <  
28, dates(DOB+(365.25*57)),
   
ifelse((ageYOSstart >= 28 & ageYOSstart < 31),
  
dates(SCDCivLeave+(365.25*30)),
  
ifelse((ageYOSstart >= 31 & ageYOSstart < 41),
 
dates(DOB+(365.25*20)),
 
ifelse((ageYOSstart >= 41 & ageYOSstart < 43),
   dates 
(SCDCivLeave+(365.25*20)),
   ifelse 
((ageYOSstart >= 43 & ageYOSstart < 57),
  dates 
(DOB+(365.25*62)),
  ifelse 
(ageYOSstart >= 58,
 dates 
(SCDCivLeave+(365.25*5)),
 NA 
))), NA))


Adrian



If I understand you correctly, you want ?ifelse, which works on the
full logical vect

Re: [R] applying a set of rules to each row

2011-01-26 Thread KATSCHKE, ADRIAN CIV DFAS
Yes. That is exactly what I would like to have running. Here is the first 
attempt I made at using a nested ?ifelse statement for one of the retirement 
plans. The variables are all there but with different names. ageYOSstart is 
ageFedStart, SCDCivLeave is srvCompDT. I haven't gotten this working. I am not 
sure that it is the correct way to do what I would like.

## Regular retirement eligibility date for FERS employees
retData.All$regRetireDT2[retData.All$retireSystem == "FERS"] <- 
with(retData.All[retData.All$retireSystem == "FERS",],
 ifelse(DOB < "01/01/53", ## Born before 1953 
minimum retirement age of 55
ifelse(ageYOSstart < 26, 
dates(DOB+(365.25*55)),
   ifelse((ageYOSstart >= 26 & 
ageYOSstart < 31), dates(SCDCivLeave*(365.25*30)),
  ifelse((ageYOSstart >= 31 
& ageYOSstart < 41), dates(DOB+(365.25*60)),
 
ifelse((ageYOSstart >= 41 & ageYOSstart < 43),

dates(SCDCivLeave+(365.25*20)),

ifelse((ageYOSstart >= 43 & ageYOSstart < 58),
   
dates(DOB+(365.25*62)),
   
ifelse(ageYOSstart >= 58,

  dates(SCDCivLeave+(365.25*5)), NA)),
 ifelse((DOB < "12/31/69" & DOB > 
"01/01/53"), ## Born between 1953 and 1969 MRA of 56
 ifelse(ageYOSstart < 27, 
dates(DOB+(365.25*56)),
ifelse((ageYOSstart >= 
27 & ageYOSstart < 31),

dates(SCDCivLeave+(365.25*30)),

ifelse((ageYOSstart >= 31 & ageYOSstart < 41),
   
dates(DOB+(365.25*60)),
   
ifelse((ageYOSstart >= 41 & ageYOSstart < 43),
  
dates(SCDCivLeave+(365.25*20)),
  
ifelse((ageYOSstart >= 43 & ageYOSstart < 58),

 dates(DOB+(365.25*62)),

 ifelse(ageYOSstart >= 58,

dates(SCDCivLeave+(365.25*5)),

NA))),
 ifelse(DOB >= "01/01/69", ## Born 
after 1969 Min Retire Age of 57
ifelse(ageYOSstart < 28, 
dates(DOB+(365.25*57)),
   ifelse((ageYOSstart >= 
28 & ageYOSstart < 31),
  
dates(SCDCivLeave+(365.25*30)),
  
ifelse((ageYOSstart >= 31 & ageYOSstart < 41),
 
dates(DOB+(365.25*20)),
 
ifelse((ageYOSstart >= 41 & ageYOSstart < 43),

dates(SCDCivLeave+(365.25*20)),

ifelse((ageYOSstart >= 43 & ageYOSstart < 57),

   dates(DOB+(365.25*62)),

   ifelse(ageYOSstart >= 58,

  dates(SCDCivLeave+(365.25*5)),

  NA))), NA))

Adrian 


>If I understand you correctly, you want ?ifelse, which works on the
>full logical vectors of rules applied to the variables, not
>ifelse, which works on only a single logical.

>-- Bert Gunter

On Wed, Jan 26, 2011 at 12:18 PM, KATSCHKE, ADRIAN CIV DFAS
 wrote:
> All,
>
> I would like to apply a set of rules to each row of the sample data set
> below. The rule sets are the guidelines for determining an individual's
> date for retirement eligibility. The rules are found in this docum

Re: [R] print() required sometimes in interactive use of console

2011-01-26 Thread Duncan Murdoch

On 26/01/2011 3:27 PM, Giles Crane wrote:

Surprising behavior:
Most R users are aware that print()
must be used inside functions to gain
output on the console.

Apparently, print() is sometimes required
when interactively using the console.
For example, the followingmay be
entered into the R console with different results.

sample(1:8,8)  #prints a permutation of 1 to 8

for(i in 1:5)   #does not
  sample(1:8,8)


What you say is right, but I think your mental model of what is going on 
is wrong.  The rule for printing is simple:  the result of an expression 
entered at the console is automatically printed unless it is marked 
invisible.


The result of for() {} is marked invisible, so it doesn't print.

The result of x <- 1 is marked invisible, so it doesn't print.

Expressions in functions are not entered at the console, so they don't 
print.


You can explicitly mark something as invisible, and it won't print.  For 
example,


invisible(sample(1:8, 8))

won't print.  Though it doesn't use the R function invisible(), that's 
essentially what happens in a for loop:  it returns NULL, but marked 
invisible, so it doesn't print.  (I seem to recall that in older 
versions it would return the last value of the loop, but that was 
changed some time ago, or I'm thinking of some other language.)


There's a function called withVisible() that returns a value and its 
visibility; you can experiment with that to see what's going on.  For 
example,


> withVisible(x <- 1)
$value
[1] 1

$visible
[1] FALSE

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] applying a set of rules to each row

2011-01-26 Thread Bert Gunter
If I understand you correctly, you want ?ifelse, which works on the
full logical vectors of rules applied to the variables, not
ifelse, which works on only a single logical.

-- Bert Gunter

On Wed, Jan 26, 2011 at 12:18 PM, KATSCHKE, ADRIAN CIV DFAS
 wrote:
> All,
>
> I would like to apply a set of rules to each row of the sample data set
> below. The rule sets are the guidelines for determining an individual's
> date for retirement eligibility. The rules are found in this document,
> http://www.opm.gov/feddata/RetirementPaperFinal_v4.pdf. I am only
> interested in the top two categories for retirement eligibility, the
> CSRS and FERS plans.
>
> The data set has four variables Date of Birth (DOB), service computation
> date (srvCompDT), retirement plan (retirePlan), and the age at which the
> employee entered federal service (ageFedStart). The service computation
> date is used to compute the date eligible for retirement. The retirement
> plan indicates what system the employee is enrolled under.
>
> The data does contain a few other retirement plans, for now I want to
> just ignore those plans. I have labeled plans as 1-CSRS and 2-FERS, and
> 3-Other. My first attempt at applying the rules was through a complex
> nesting of ifelse statements, this was not very successful and quite
> difficult to follow. I then wrote a function and tried using "apply"
> unsuccessfully. The function is shown below.
>
> I would like to put a short script or function together that would allow
> for an efficient application of the rules to each of the employees. I am
> trying to avoid a loop, because my data set is quite large, and I may
> need to update my data set regularly and re-run the analysis and reports
> that will come from this work.
>
> Any advice or guidance on building the function or code to apply the
> rules would be quite helpful.
>
> retireHelp <-
> structure(list(DOB = structure(c(-6642, -5134, -3444, -5598,
> -4356, 5737, -4894, -1951, -2950, 2467, 6945, 4908, -7930, -7236,
> -7727, -77, 4158, -7892, -6028, -7132, -5959, 2309, -2494, -3513,
> -383, -216, -3369, -5861, 3674, -10265, -8986, -5023, -4862,
> 1526, -1022, 2175, -11790, -278, -7275, -5084, -1842, 430, -2220,
> -7444, 440, 4285, -7812, 3335, -7271, -6825, -1098, -1670, -10219,
> -7131, 5963, 704, -7662, 4219, -2813, 5147, -7334, -8223, -5922,
> -7497, -9276, -1291, -11640, -5631, 518, -7268, -2105, -5901,
> -690, -8146, -7059, 133, 1176, -6091, -2895, -6020, -4724, -3616,
> -5059, -8253, -2604, -12400, -4776, -3671, -9326, -7000, -5574,
> -3248, 4255, -1358, -6255, 8, -7115, -1701, -5227, 9, -517, -8674,
> -2554, -4069, -2077, -9872, -6534, 2970, -8307, -3020, -1343,
> -8897, -2304, -7424, 2078, -8274, -5559, -, -9262, -8473,
> -4088, -2429, -8006, -1091, 5015, 2765, 4036, 3101, -3743, 5103,
> -10018, -12095, -7646, -5966, -6208, -5784, -1325, -4288, -1665,
> -1409, 4685, -7881, -3413, 2738, -2201, 1217, -5113, 206, -1292,
> -1725, 10, -2978, -1895, -830, -105, -2395, -3496, -8244, -9956,
> -6494, -4678, -4077, 575, 2013, -3411, 3824, -4356, 4523, -5836,
> -6350, -5337, -41, -2001, -6632, -970, -6790, -2828, -4061, 476,
> 5854, -9648, -4227, 850, 2619, -7747, -2672, 4069, -12618, -6898,
> -4178, -1772, -1643, -2064, -157, 4551, -8688, -6087, -2040,
> -7239, -783), format = "m/d/y", origin = structure(c(1, 1, 1970
> ), .Names = c("month", "day", "year")), class = c("dates", "times"
> )), srvCompDT = structure(c(743, 12429, 3585, 4364, 13227, 13578,
> 13591, 8585, 9587, 13913, 14753, 13247, 2246, 1439, 8845, 7018,
> 12625, -552, 5688, 7080, 13255, 13549, 12709, 13969, 13997, 9532,
> 13689, 1226, 13549, 4093, 13423, 13801, 3181, 14809, 13353, 9457,
> 7745, 8986, 4759, 4486, 6449, 11172, 8669, 3344, 13745, 12275,
> 5081, 13605, 8006, 3048, 6330, 13521, 5254, 1733, 14095, 8516,
> 4848, 13521, 5970, 14697, 8291, 139, 11435, 3567, 8961, 5775,
> 3602, 1409, 11577, 12163, 12258, 13156, 9472, 7963, 1362, 10332,
> 9557, 3997, 7509, 4691, 3133, 5877, 6782, 11449, 13283, 8040,
> 11565, 3425, 7860, 1790, 10778, 13199, 12625, 5889, 3317, 9831,
> 1068, 8040, 7123, 9104, 12836, 7928, 12764, 8922, 5324, -1004,
> 1806, 10263, 5635, 10310, 5625, 8861, 14613, 3896, 10316, 5725,
> 12751, 6113, 2997, 112, 5707, 4987, -1018, 8055, 13885, 13073,
> 14585, 14865, 14935, 14390, 9735, 7654, 4557, 661, 1638, 1112,
> 14011, 3086, 7032, 13942, 13325, 6735, 13900, 12673, 10148, 14193,
> 14767, 8447, 6114, 10688, 13544, 7106, 8587, 14753, 7886, 12280,
> 11946, 13662, 3332, 2108, 13977, 6203, 8369, 13857, 8369, 11486,
> 8306, 12466, 12639, 7270, 4325, 13843, 14026, 14039, 6147, 7676,
> 5781, 7038, 9187, 14640, 6174, 11491, 13913, 13787, 13465, 8854,
> 13152, 1826, 1412, 4317, 5794, 5548, 8951, 12947, 12639, 5345,
> 5961, 4637, 6465, 13717), format = "m/d/y", origin = structure(c(1,
> 1, 1970), .Names = c("month", "day", "year")), class = c("dates",
> "times")), retirePlan = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 1, 3, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1

[R] print() required sometimes in interactive use of console

2011-01-26 Thread Giles Crane

Surprising behavior:
Most R users are aware that print()
must be used inside functions to gain
output on the console.

Apparently, print() is sometimes required
when interactively using the console.
For example, the followingmay be
entered into the R console with different results.

sample(1:8,8)  #prints a permutation of 1 to 8

for(i in 1:5)   #does not
sample(1:8,8)



Cordially,
Giles Crane

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] applying a set of rules to each row

2011-01-26 Thread KATSCHKE, ADRIAN CIV DFAS
All,

I would like to apply a set of rules to each row of the sample data set
below. The rule sets are the guidelines for determining an individual's
date for retirement eligibility. The rules are found in this document,
http://www.opm.gov/feddata/RetirementPaperFinal_v4.pdf. I am only
interested in the top two categories for retirement eligibility, the
CSRS and FERS plans. 

The data set has four variables Date of Birth (DOB), service computation
date (srvCompDT), retirement plan (retirePlan), and the age at which the
employee entered federal service (ageFedStart). The service computation
date is used to compute the date eligible for retirement. The retirement
plan indicates what system the employee is enrolled under.

The data does contain a few other retirement plans, for now I want to
just ignore those plans. I have labeled plans as 1-CSRS and 2-FERS, and
3-Other. My first attempt at applying the rules was through a complex
nesting of ifelse statements, this was not very successful and quite
difficult to follow. I then wrote a function and tried using "apply"
unsuccessfully. The function is shown below.

I would like to put a short script or function together that would allow
for an efficient application of the rules to each of the employees. I am
trying to avoid a loop, because my data set is quite large, and I may
need to update my data set regularly and re-run the analysis and reports
that will come from this work.

Any advice or guidance on building the function or code to apply the
rules would be quite helpful.

retireHelp <-
structure(list(DOB = structure(c(-6642, -5134, -3444, -5598, 
-4356, 5737, -4894, -1951, -2950, 2467, 6945, 4908, -7930, -7236, 
-7727, -77, 4158, -7892, -6028, -7132, -5959, 2309, -2494, -3513, 
-383, -216, -3369, -5861, 3674, -10265, -8986, -5023, -4862, 
1526, -1022, 2175, -11790, -278, -7275, -5084, -1842, 430, -2220, 
-7444, 440, 4285, -7812, 3335, -7271, -6825, -1098, -1670, -10219, 
-7131, 5963, 704, -7662, 4219, -2813, 5147, -7334, -8223, -5922, 
-7497, -9276, -1291, -11640, -5631, 518, -7268, -2105, -5901, 
-690, -8146, -7059, 133, 1176, -6091, -2895, -6020, -4724, -3616, 
-5059, -8253, -2604, -12400, -4776, -3671, -9326, -7000, -5574, 
-3248, 4255, -1358, -6255, 8, -7115, -1701, -5227, 9, -517, -8674, 
-2554, -4069, -2077, -9872, -6534, 2970, -8307, -3020, -1343, 
-8897, -2304, -7424, 2078, -8274, -5559, -, -9262, -8473, 
-4088, -2429, -8006, -1091, 5015, 2765, 4036, 3101, -3743, 5103, 
-10018, -12095, -7646, -5966, -6208, -5784, -1325, -4288, -1665, 
-1409, 4685, -7881, -3413, 2738, -2201, 1217, -5113, 206, -1292, 
-1725, 10, -2978, -1895, -830, -105, -2395, -3496, -8244, -9956, 
-6494, -4678, -4077, 575, 2013, -3411, 3824, -4356, 4523, -5836, 
-6350, -5337, -41, -2001, -6632, -970, -6790, -2828, -4061, 476, 
5854, -9648, -4227, 850, 2619, -7747, -2672, 4069, -12618, -6898, 
-4178, -1772, -1643, -2064, -157, 4551, -8688, -6087, -2040, 
-7239, -783), format = "m/d/y", origin = structure(c(1, 1, 1970
), .Names = c("month", "day", "year")), class = c("dates", "times"
)), srvCompDT = structure(c(743, 12429, 3585, 4364, 13227, 13578, 
13591, 8585, 9587, 13913, 14753, 13247, 2246, 1439, 8845, 7018, 
12625, -552, 5688, 7080, 13255, 13549, 12709, 13969, 13997, 9532, 
13689, 1226, 13549, 4093, 13423, 13801, 3181, 14809, 13353, 9457, 
7745, 8986, 4759, 4486, 6449, 11172, 8669, 3344, 13745, 12275, 
5081, 13605, 8006, 3048, 6330, 13521, 5254, 1733, 14095, 8516, 
4848, 13521, 5970, 14697, 8291, 139, 11435, 3567, 8961, 5775, 
3602, 1409, 11577, 12163, 12258, 13156, 9472, 7963, 1362, 10332, 
9557, 3997, 7509, 4691, 3133, 5877, 6782, 11449, 13283, 8040, 
11565, 3425, 7860, 1790, 10778, 13199, 12625, 5889, 3317, 9831, 
1068, 8040, 7123, 9104, 12836, 7928, 12764, 8922, 5324, -1004, 
1806, 10263, 5635, 10310, 5625, 8861, 14613, 3896, 10316, 5725, 
12751, 6113, 2997, 112, 5707, 4987, -1018, 8055, 13885, 13073, 
14585, 14865, 14935, 14390, 9735, 7654, 4557, 661, 1638, 1112, 
14011, 3086, 7032, 13942, 13325, 6735, 13900, 12673, 10148, 14193, 
14767, 8447, 6114, 10688, 13544, 7106, 8587, 14753, 7886, 12280, 
11946, 13662, 3332, 2108, 13977, 6203, 8369, 13857, 8369, 11486, 
8306, 12466, 12639, 7270, 4325, 13843, 14026, 14039, 6147, 7676, 
5781, 7038, 9187, 14640, 6174, 11491, 13913, 13787, 13465, 8854, 
13152, 1826, 1412, 4317, 5794, 5548, 8951, 12947, 12639, 5345, 
5961, 4637, 6465, 13717), format = "m/d/y", origin = structure(c(1, 
1, 1970), .Names = c("month", "day", "year")), class = c("dates", 
"times")), retirePlan = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
1, 3, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 1, 
2, 2, 2, 2, 3, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 1, 
2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 3, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 
3, 2, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 
1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 
2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,

Re: [R] how could I choose subvector of a vector?

2011-01-26 Thread David Winsemius


On Jan 26, 2011, at 2:48 PM, David Winsemius wrote:



On Jan 26, 2011, at 2:42 PM, Akram Khaleghei Ghosheh balagh wrote:


Hello ;
How could I choose subvector of a vctor. for example if  
v=c(1,2,5,0,1), how

could I chose the (1,2) or (1,2,5).


?"["

v[ c(1,2,5) ]


I didn't notice at first that you wanted the values 1,2,5 ...(thought  
you wanted the 1st 2nd and 5th (obviously not correct with only a 4  
element vector)... so your values would be accessed with


v[1:3]





thanks;

[[alternative HTML version deleted]]


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a new variable from existing ones

2011-01-26 Thread David Winsemius


On Jan 26, 2011, at 2:06 PM, Melanie Zoelck wrote:


Hi,

I am relatively new to R and have a question regarding code. I have  
a data set which has data organised by location (site names, which  
are factors). I now want to add a new variable Region (this will be  
non numerical, as it will be names). Each region will contain  
locations. So for example:


Region: WestCoast
Locations in Region WestCoast: Tralee, Carradale and so on...


Presumably you have some sort of table with "location" and "region"  
and another table with perhaps "persons" and "location". You would  
merge these two tables with ... wait for it  "merge":


?merge


In the future it is considered more helpful to offer a tiny data  
example constructed with R code to allow more complete illustrations  
to be offered. Please read the Posting Guide.




So each Region will contain different Locations, so that location  
can be nested within Region if needed.


I have found info on how to create different Age categories for  
example but my input is not numeric, but it is names or rather  
factors with a number of levels (which are place names).


I would be grateful for any assistance you may be able to provide.

Thank you!

Melanie Zoelck
86 Gleann Na Ri
Renmore
Galway
Ireland



David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how could I choose subvector of a vector?

2011-01-26 Thread David Winsemius


On Jan 26, 2011, at 2:42 PM, Akram Khaleghei Ghosheh balagh wrote:


Hello ;
How could I choose subvector of a vctor. for example if  
v=c(1,2,5,0,1), how

could I chose the (1,2) or (1,2,5).


?"["

v[ c(1,2,5) ]


thanks;

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Creating a new variable from existing ones

2011-01-26 Thread Melanie Zoelck
Hi,

I am relatively new to R and have a question regarding code. I have a data set 
which has data organised by location (site names, which are factors). I now 
want to add a new variable Region (this will be non numerical, as it will be 
names). Each region will contain locations. So for example:

Region: WestCoast
Locations in Region WestCoast: Tralee, Carradale and so on...

So each Region will contain different Locations, so that location can be nested 
within Region if needed.

I have found info on how to create different Age categories for example but my 
input is not numeric, but it is names or rather factors with a number of levels 
(which are place names).

I would be grateful for any assistance you may be able to provide.

Thank you!

Melanie Zoelck
86 Gleann Na Ri 
Renmore
Galway
Ireland

Tel.: +353 91 746086
Movil/Mobile: +353 857246196
E-Mail: mzoe...@mail.infocanarias.com
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how could I choose subvector of a vector?

2011-01-26 Thread Akram Khaleghei Ghosheh balagh
Hello ;
How could I choose subvector of a vctor. for example if v=c(1,2,5,0,1), how
could I chose the (1,2) or (1,2,5).
thanks;

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread William Dunlap
I would leave off the as.character().  With it you get
things like:
  > f <- rpart(Kyphosis ~ ., kyphosis[-3])
  > as.list(f$call)$data
  kyphosis[-3]
  > as.character(as.list(f$call)$data)
  [1] "[""kyphosis" "-3"
Expressions like quote(kyphosis[-3]) are much
easier to analyze as expressions that as vectors
of character strings.  E.g., you can use all.vars
on an expression to see what variables are mentioned
in it.
 
When it is time to print the expression you may
need to use deparse() to turn it into a readable
vector of strings.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 




From: Tal Galili [mailto:tal.gal...@gmail.com] 
Sent: Wednesday, January 26, 2011 11:35 AM
To: Henrique Dallazuanna
Cc: r-help@r-project.org; William Dunlap
Subject: Re: [R] Extracting the terms from an rpart object


Exactly what I needed Henrique, 
Thank you.


Contact 
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) 
| www.r-statistics.com (English)

--





On Wed, Jan 26, 2011 at 8:26 PM, Henrique Dallazuanna 
 wrote:


Try this:

as.character(as.list(fit1$call)$data) 


On Wed, Jan 26, 2011 at 4:12 PM, Tal Galili 
 wrote:


Another (similar) question, 

If I now want to know the name of the "data" argument 
used, is there an easy way for me to access it?

I'm trying to use something like:
eval(parse(text = all.vars(terms(fit1))[1]))

Which (of course) wouldn't work, since the response 
variable is only available in the data used by rpart (specifically the 
"kyphosis" dataset)

Thanks upfront.


Tal




Contact 
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | 
www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)

--





On Wed, Jan 26, 2011 at 7:40 PM, Henrique Dallazuanna 
 wrote:


Try this:

all.vars(terms(fit1))
all.vars(terms(fit2))



On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili 
 wrote:


Hello all,

I wish to extract the terms from an 
rpart object.
Specifically, I would like to be able 
to know what is the response variable
(so I could do some manipulation on it).
But in general, such a method for rpart 
will also need to handle a "." case
(see fit2)

Here are two simple examples:

fit1 <- rpart(Kyphosis ~ Age + Number + 
Start, data=kyphosis)
fit1$call
fit2 <- rpart(Kyphosis ~ ., 
data=kyphosis)
fit2$call


Is there anything "prettier" then using 
string manipulation?


Thanks.





Contact

Details:---
 

Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread Tal Galili
Exactly what I needed Henrique,
Thank you.


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Jan 26, 2011 at 8:26 PM, Henrique Dallazuanna wrote:

> Try this:
>
> as.character(as.list(fit1$call)$data)
>
>
> On Wed, Jan 26, 2011 at 4:12 PM, Tal Galili  wrote:
>
>> Another (similar) question,
>>
>> If I now want to know the name of the "data" argument used, is there an
>> easy way for me to access it?
>>
>> I'm trying to use something like:
>> eval(parse(text = all.vars(terms(fit1))[1]))
>>
>> Which (of course) wouldn't work, since the response variable is only
>> available in the data used by rpart (specifically the "kyphosis" dataset)
>>
>> Thanks upfront.
>>
>> Tal
>>
>>
>>
>>
>> Contact
>> Details:---
>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>>
>> --
>>
>>
>>
>>
>> On Wed, Jan 26, 2011 at 7:40 PM, Henrique Dallazuanna 
>> wrote:
>>
>>> Try this:
>>>
>>> all.vars(terms(fit1))
>>> all.vars(terms(fit2))
>>>
>>>
>>> On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili wrote:
>>>
 Hello all,

 I wish to extract the terms from an rpart object.
 Specifically, I would like to be able to know what is the response
 variable
 (so I could do some manipulation on it).
 But in general, such a method for rpart will also need to handle a "."
 case
 (see fit2)

 Here are two simple examples:

 fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
 fit1$call
 fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
 fit2$call


 Is there anything "prettier" then using string manipulation?


 Thanks.





 Contact
 Details:---
 Contact me: tal.gal...@gmail.com |  972-52-7275845
 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew)
 |
 www.r-statistics.com (English)

 --

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

>>>
>>>
>>>
>>> --
>>> Henrique Dallazuanna
>>> Curitiba-Paraná-Brasil
>>> 25° 25' 40" S 49° 16' 22" O
>>>
>>
>>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] hmm.discnp hidden markov model

2011-01-26 Thread Qasim Javed
Hi all,

I am using a discrete Hidden Markov Model with discrete observations in
order to detect a sequence of integers. I am using the "hmm.discnp" package.

I am using the following code:

signature <- c(-89, -98, -90, -84, -77, -75, -64, -60, -58, -55, -56, -57,
-57, -63, -77, -81, -82, -91, -85, -89, -93)

quant <- length(-110:-6)

# Initialize and train the hmm with the observed sequence mentioned above.
# "yval" lists the possible values for observations and K is the number of
hidden states.
my_hmm <- hmm(y=signature, yval=c(-110:-6), K=5)

print(my_hmm)


The above shows that the HMM was trained using "signature" and the values
seem to be intuitive.

My question is more a fundamental  one in regards to understanding HMMs. I
know I should use more examples of the above sequences to train the HMM in
order to make it more robust. Assuming, that the HMM is trained good enough,
I can use the viterbi algorithm to find the most probable sequence of hidden
states. However, what I really want to find out is whether a particular
observed sequence is modeled by my HMM (created above). There seems to be a
viterbi() function in hmm.discnp and also mps() but both of them give them
most probable hidden state sequence, whereas, I want the probability of a
particular observed sequence, that is, the likelihood for an arbitrary
observed sequence. This is typically solved using the solution to
"Evaluation Problem" in HMMs, but I do not see a function in hmm.discnp for
calculating this.

Am I missing something?

Thanks for the help.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] using popbio, reduce number of digits in image2 plot

2011-01-26 Thread Tali

Hello All.

I am using the image2 plot function in the popbio package to create 6
elasticity analyses. I am trying to reduce the number of significant digits
that displays -- from 3 digits to 2. I tried rounding my original matrices,
but one is comprised primarily of zeros, and the cut / breaks options
returns an error message. (see code and error message below)

Here are the elasticity values that I'm working with. Matrix 2 "2005-2006"
is the one that is creating the problem. I tried fiddling with cut, breaks,
digit, rounding...to no avail.

Further, is there an easy way to turn the top side labels 90 degrees? 

Thank you Chris Stubbens for this wonderful code!

TV, PhD Candidate
Center for Marine Biodiversity and Conservation
Scripps Institution of Oceanography

elast:

, , 2004-2005

1   2   3   4
1 0.017558594 0.003972943 0.002701352 0.006443492
2 0.011770698 0.106806350 0.020465566 0.022990744
3 0.001347089 0.047278966 0.258967762 0.061338978
4 0.0 0.003975099 0.086798115 0.347584256

, , 2005-2006

  1234
1 0 0.00e+00 0.00e+00 0.00e+00
2 0 6.872613e-17 2.641284e-17 3.763301e-17
3 0 0.00e+00 0.00e+00 0.00e+00
4 0 0.00e+00 0.00e+00 1.00e+00

, , 2006.1-2007

1   23   4
1 0.006634579 0.002471288 0.0001750284 0.001891018
2 0.004537334 0.040648910 0.0076802263 0.014082075
3 0.0 0.023828347 0.1431152824 0.015278433
4 0.0 0.0 0.0312515256 0.708405954

, , 2007-2008

1   2   3  4
1 0.022147389 0.007198642 0.001271820 0.0264
2 0.015296663 0.065937859 0.00541 0.01685541
3 0.004286438 0.030397868 0.157415674 0.04280021
4 0.0 0.0 0.070768252 0.54906670

, , 2008-2009

   1  2   3   4
1 0.02359622 0.00377114 0.001521911 0.007169308
2 0.01246236 0.05323351 0.009987896 0.014909244
3 0. 0.03358836 0.060984960 0.016624312
4 0. 0. 0.038702865 0.723447911

, , 2009-2010

1   23   4
1 0.005384555 0.002185377 0.0006226765 0.004539465
2 0.007347518 0.032320771 0.0065127002 0.010417766
3 0.0 0.022092608 0.0947136415 0.059574250
4 0.0 0.0 0.0745314819 0.679757189


layout(matrix(1:6,2,3,byrow=TRUE)); 
for (i in 1:6) {
image2(round(elast[,,i],2),
col = gray(seq(1,.4,-.1)))
}

Error in cut.default(log10(x), breaks) : 'breaks' are not unique


-- 
View this message in context: 
http://r.789695.n4.nabble.com/using-popbio-reduce-number-of-digits-in-image2-plot-tp3238622p3238622.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread Henrique Dallazuanna
Try this:

as.character(as.list(fit1$call)$data)

On Wed, Jan 26, 2011 at 4:12 PM, Tal Galili  wrote:

> Another (similar) question,
>
> If I now want to know the name of the "data" argument used, is there an
> easy way for me to access it?
>
> I'm trying to use something like:
> eval(parse(text = all.vars(terms(fit1))[1]))
>
> Which (of course) wouldn't work, since the response variable is only
> available in the data used by rpart (specifically the "kyphosis" dataset)
>
> Thanks upfront.
>
> Tal
>
>
>
>
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
>
> --
>
>
>
>
> On Wed, Jan 26, 2011 at 7:40 PM, Henrique Dallazuanna wrote:
>
>> Try this:
>>
>> all.vars(terms(fit1))
>> all.vars(terms(fit2))
>>
>>
>> On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili  wrote:
>>
>>> Hello all,
>>>
>>> I wish to extract the terms from an rpart object.
>>> Specifically, I would like to be able to know what is the response
>>> variable
>>> (so I could do some manipulation on it).
>>> But in general, such a method for rpart will also need to handle a "."
>>> case
>>> (see fit2)
>>>
>>> Here are two simple examples:
>>>
>>> fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
>>> fit1$call
>>> fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
>>> fit2$call
>>>
>>>
>>> Is there anything "prettier" then using string manipulation?
>>>
>>>
>>> Thanks.
>>>
>>>
>>>
>>>
>>>
>>> Contact
>>> Details:---
>>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>>> www.r-statistics.com (English)
>>>
>>> --
>>>
>>>[[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Henrique Dallazuanna
>> Curitiba-Paraná-Brasil
>> 25° 25' 40" S 49° 16' 22" O
>>
>
>


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding error bars

2011-01-26 Thread David Winsemius

Harrell's wiki/website has material on so-called "dynamite plots"

http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/DynamitePlots


Ben Bolker has a page on them as well:

http://emdbolker.wikidot.com/blog:dynamite

--  
David.



On Jan 26, 2011, at 1:26 PM, Joshua Wiley wrote:


Hi,

You will find package "sos" has some handy functions for searching for
functions/packages:

###
install.packages("sos")
require(sos)
findFn("error.bars")
###

Perhaps the "psych" package has the error.bars() function you are  
thinking of?


As a side note, I am not sure it makes much sense to add error bars to
a boxplot---it already includes 25th/75th percentiles (at least
approximately, some use hinges, etc. but same general idea), and
typically minimum and maximum scores.  What would the error bars add
to that picture that would aid a viewer in understanding the
distribution of your data or the difference between different sets of
data?  In other words, is the information conveyed over and above the
boxplots large enough to warrant the additional clutter, possible
confusion, and necessary key/description in your figure caption so
viewers know what they are looking at?  Just something to think about.

Cheers,

Josh

On Wed, Jan 26, 2011 at 10:04 AM, ogbos okike  
 wrote:


Dear all,
I am trying to add error bars on a boxplot but have encountered an  
error as
indicated below. Is there a package I need to install or a library  
I have to

load before this goes please.
Thanks for any idea.
Ogbos

x<-replicate(20,rnorm(50))
 boxplot(x,notch=TRUE,main="Notched boxplot with error bars")
 error.bars(x,add=TRUE)
Error: could not find function "error.bars"

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to get loglik parameter from splm package?

2011-01-26 Thread Millo Giovanni
;^)

you're right: we'll add it soon. Keep an eye on R-forge.

All the best,
Giovanni

-- original message 

Message: 89
Date: Tue, 25 Jan 2011 17:14:00 -0800 (PST)
From: zhaowei 
To: r-help@r-project.org
Subject: Re: [R] how to get loglik parameter from splm package?
Message-ID: <1296004440124-3237303.p...@n4.nabble.com>
Content-Type: text/plain; charset=us-ascii


thank  Millo  for your very valuable reply

The political situation in Italy may be better than in  china.
And wish logik function could be supported in FE models too.
If then ,Splm will be more welcome.

thanks for your all works for useR.



Millo Giovanni wrote:
> 
> Dear useR,
> 
> although I admit that getting the log likelihood is important, you
must
> concede that obtaining the parameter estimates is not bad either.
> Regarding "craze", well there are crazier things in the world than
this,
> just look at the political situation in Italy.
> 
> Anyway, the loglik has always been there, although it wasn't exported
> (hence the NULL value). In the most recent versions of 'splm' we have
> made it available, at least for RE models, through the standard way: a
> logLik() method. Usage:
> 
>> logLik(yourmodel)
> 
> You can download it from R-forge, as usual.
> 
> Best,
> Giovanni
> 
>  original message --
> 
> Message: 42
> Date: Mon, 24 Jan 2011 06:59:39 -0800 (PST)
> From: zhaowei 
> To: r-help@r-project.org
> Subject: [R] how to get loglik parameter from splm package?
> Message-ID: <1295881179014-3234185.p...@n4.nabble.com>
> Content-Type: text/plain; charset=us-ascii
> 
> 
> splm package is a r  implemention of spatial panel data models.
> and the loglik paremeter is most important infomation for splm
methods.
> but  i found the loglik always been null ,it's craze to get right
> estimation 
> in splm with null  loglik.
> Any one knows the splm package and can get the right loglik ? please
> help
> me.
> 
> thanks
> -- 
> View this message in context:
>
http://r.789695.n4.nabble.com/how-to-get-loglik-parameter-from-splm-pack
> age-tp3234185p3234185.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> --- end original message -
> 
> Giovanni Millo
> Research Dept.,
> Assicurazioni Generali SpA
> Via Machiavelli 4, 
> 34132 Trieste (Italy)
> tel. +39 040 671184 
> fax  +39 040 671160 
> 
>  
> Ai sensi del D.Lgs. 196/2003 si precisa che le
informazi...{{dropped:13}}
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context:
http://r.789695.n4.nabble.com/how-to-get-loglik-parameter-from-splm-pack
age-tp3234192p3237303.html
Sent from the R help mailing list archive at Nabble.com.
-- end original message -

Giovanni Millo
Research Dept.,
Assicurazioni Generali SpA
Via Machiavelli 4, 
34132 Trieste (Italy)
tel. +39 040 671184 
fax  +39 040 671160 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding error bars

2011-01-26 Thread Joshua Wiley
Hi,

You will find package "sos" has some handy functions for searching for
functions/packages:

###
install.packages("sos")
require(sos)
findFn("error.bars")
###

Perhaps the "psych" package has the error.bars() function you are thinking of?

As a side note, I am not sure it makes much sense to add error bars to
a boxplot---it already includes 25th/75th percentiles (at least
approximately, some use hinges, etc. but same general idea), and
typically minimum and maximum scores.  What would the error bars add
to that picture that would aid a viewer in understanding the
distribution of your data or the difference between different sets of
data?  In other words, is the information conveyed over and above the
boxplots large enough to warrant the additional clutter, possible
confusion, and necessary key/description in your figure caption so
viewers know what they are looking at?  Just something to think about.

Cheers,

Josh

On Wed, Jan 26, 2011 at 10:04 AM, ogbos okike  wrote:
>
> Dear all,
> I am trying to add error bars on a boxplot but have encountered an error as
> indicated below. Is there a package I need to install or a library I have to
> load before this goes please.
> Thanks for any idea.
> Ogbos
>
> x<-replicate(20,rnorm(50))
>  boxplot(x,notch=TRUE,main="Notched boxplot with error bars")
>  error.bars(x,add=TRUE)
> Error: could not find function "error.bars"
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread William Dunlap
Note that all.vars(terms(fit)) only looks at the
formula in the terms object and throws away all
the analysis done by rpart's call to terms(formula,data).
Here is a contrived example of that approach failing:

  > ageThreshold <- 50
  > fit3 <- rpart(Kyphosis=="present" ~ (Age>ageThreshold) + log(Number, 
base=2) + Start, data=kyphosis)
  > all.vars(terms(fit3))
  [1] "Kyphosis" "Age"  "ageThreshold" "Number"   "Start"

Looking at the attributes of the terms object
tells you what I think you want:

  > attr(terms(fit3), "response") # 1=>there is a response variable, 0=>no 
response
  [1] 1
  > as.list(attr(terms(fit3), "variables"))[-1]
  [[1]]
  Kyphosis == "present"

  [[2]]
  Age > ageThreshold

  [[3]]
  log(Number, base = 2)

  [[4]]
  Start

rpart doesn't allow interaction terms (x:y), but if it did
you would want to look at the "factors" attribute to see
which items in the "variables" lists are in each term of
the expanded formula.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Tal Galili
> Sent: Wednesday, January 26, 2011 10:07 AM
> To: Henrique Dallazuanna
> Cc: r-help@r-project.org
> Subject: Re: [R] Extracting the terms from an rpart object
> 
> Thanks Henrique, exactly what I was looking for.
> 
> 
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il 
> (Hebrew) |
> www.r-statistics.com (English)
> --
> 
> 
> 
> 
> 
> On Wed, Jan 26, 2011 at 7:40 PM, Henrique Dallazuanna 
> wrote:
> 
> > Try this:
> >
> > all.vars(terms(fit1))
> > all.vars(terms(fit2))
> >
> >
> > On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili 
>  wrote:
> >
> >> Hello all,
> >>
> >> I wish to extract the terms from an rpart object.
> >> Specifically, I would like to be able to know what is the response
> >> variable
> >> (so I could do some manipulation on it).
> >> But in general, such a method for rpart will also need to 
> handle a "."
> >> case
> >> (see fit2)
> >>
> >> Here are two simple examples:
> >>
> >> fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
> >> fit1$call
> >> fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
> >> fit2$call
> >>
> >>
> >> Is there anything "prettier" then using string manipulation?
> >>
> >>
> >> Thanks.
> >>
> >>
> >>
> >>
> >>
> >> Contact
> >> Details:---
> >> Contact me: tal.gal...@gmail.com |  972-52-7275845
> >> Read me: www.talgalili.com (Hebrew) | 
> www.biostatistics.co.il (Hebrew) |
> >> www.r-statistics.com (English)
> >>
> >> 
> --
> 
> >>
> >>[[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> >
> > --
> > Henrique Dallazuanna
> > Curitiba-Paraná-Brasil
> > 25° 25' 40" S 49° 16' 22" O
> >
> 
>   [[alternative HTML version deleted]]
> 
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] adding error bars

2011-01-26 Thread ogbos okike
Dear all,
I am trying to add error bars on a boxplot but have encountered an error as
indicated below. Is there a package I need to install or a library I have to
load before this goes please.
Thanks for any idea.
Ogbos

x<-replicate(20,rnorm(50))
 boxplot(x,notch=TRUE,main="Notched boxplot with error bars")
 error.bars(x,add=TRUE)
Error: could not find function "error.bars"

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread Tal Galili
Another (similar) question,

If I now want to know the name of the "data" argument used, is there an easy
way for me to access it?

I'm trying to use something like:
eval(parse(text = all.vars(terms(fit1))[1]))

Which (of course) wouldn't work, since the response variable is only
available in the data used by rpart (specifically the "kyphosis" dataset)

Thanks upfront.

Tal




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Jan 26, 2011 at 7:40 PM, Henrique Dallazuanna wrote:

> Try this:
>
> all.vars(terms(fit1))
> all.vars(terms(fit2))
>
>
> On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili  wrote:
>
>> Hello all,
>>
>> I wish to extract the terms from an rpart object.
>> Specifically, I would like to be able to know what is the response
>> variable
>> (so I could do some manipulation on it).
>> But in general, such a method for rpart will also need to handle a "."
>> case
>> (see fit2)
>>
>> Here are two simple examples:
>>
>> fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
>> fit1$call
>> fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
>> fit2$call
>>
>>
>> Is there anything "prettier" then using string manipulation?
>>
>>
>> Thanks.
>>
>>
>>
>>
>>
>> Contact
>> Details:---
>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>>
>> --
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Train error:: subscript out of bonds

2011-01-26 Thread Max Kuhn
No. Any valid seed should work. In this case, train() should on;y be
using it to determine which training set samples are in the CV or
bootstrap data sets.

Max

On Wed, Jan 26, 2011 at 9:56 AM, Neeti  wrote:
>
> Thank you so much for your reply. In my case it is giving error in some seed
> value for example if I set seed value to 357 this gives an error. Does train
> have some specific seed range?
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Train-error-subscript-out-of-bonds-tp3234510p3238197.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread Tal Galili
Thanks Henrique, exactly what I was looking for.


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Jan 26, 2011 at 7:40 PM, Henrique Dallazuanna wrote:

> Try this:
>
> all.vars(terms(fit1))
> all.vars(terms(fit2))
>
>
> On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili  wrote:
>
>> Hello all,
>>
>> I wish to extract the terms from an rpart object.
>> Specifically, I would like to be able to know what is the response
>> variable
>> (so I could do some manipulation on it).
>> But in general, such a method for rpart will also need to handle a "."
>> case
>> (see fit2)
>>
>> Here are two simple examples:
>>
>> fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
>> fit1$call
>> fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
>> fit2$call
>>
>>
>> Is there anything "prettier" then using string manipulation?
>>
>>
>> Thanks.
>>
>>
>>
>>
>>
>> Contact
>> Details:---
>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>>
>> --
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] removing outlier function / dataset update

2011-01-26 Thread Ista Zahn
Hi,
x and y are being picked up from your global environment, not from the
x and y in dataset. Here is a version that seems to work:

rm.outliers = function(dataset,var1, var2) {

dataset$varpredicted = predict(lm(as.formula(paste(var1, var2,
sep=" ~ ")), data=dataset))
dataset$varstdres = rstudent(lm(as.formula(paste(var1, var2, sep="
~ ")), data=dataset))
i = length(which(dataset$varstdres > 3 | dataset$varstdres < -3))
while(i >= 1){
removed = which(dataset$varstdres > 3 | dataset$varstdres < -3)
print(dataset[removed,])
dataset = dataset[-removed,]
dataset$varpredicted = predict(lm(as.formula(paste(var1, var2,
sep=" ~ ")), data=dataset))
dataset$varstdres = rstudent(lm(as.formula(paste(var1, var2,
sep=" ~ ")), data=dataset))
i = with(dataset,length(varstdres > 3 | varstdres < -3))
}
}


Best,
Ista

On Wed, Jan 26, 2011 at 11:36 AM, kirtau  wrote:
>
> Hi,
>
> I have a few lines of code that will remove outliers for a regression test
> based on the studentized residuals being above or below 3, -3. I have to do
> this multiple times and have attempted to create a function to lessen the
> amount of copying, pasting and replacing.
>
> I run into trouble with the function and receiving the error "Error in
> `$<-.data.frame`(`*tmp*`, "varpredicted", value = c(0.114285714285714,  :
>  replacement has 20 rows, data has 19
> "
>
> any help would be appreciated. a list of code is listed below.
>
> Thank you for your time!
>
> x = c(1:20)
> y = c(1,3,4,2,5,6,18,8,10,8,11,13,14,14,15,85,17,19,19,20)
> data1 = data.frame(x,y)
>
> # remove outliers for regression by studentized residuals being greater than
> 3
> data1$predicted = predict(lm(data1$y~data1$x))
> data1$stdres = rstudent(lm(data1$y~data1$x));
> i=length(which(data1$stdres>3|data1$stdres< -3))
> while(i >= 1){
>        remove<-which(data1$stdres>3|data1$stdres< -3)
>        print(data1[remove,])
>        data1 = data1[-remove,]
>        data1$predicted = predict(lm(data1$y~data1$x))
>        data1$stdres = rstudent(lm(data1$y~data1$x))
>        i = with(data1,length(which(stdres>3|stdres< -3)))
>  }
>
> # attemp to create a function to perfom same idea as above
> rm.outliers = function(dataset,var1, var2) {
>
>  dataset$varpredicted = predict(lm(var1~var2))
>  dataset$varstdres = rstudent(lm(var1~var2))
>  i = length(which(dataset$varstdres > 3 | dataset$varstdres < -3))
>  while(i >= 1){
>         removed = which(dataset$varstdres > 3 | dataset$varstdres < -3)
>         print(dataset[removed,])
>         dataset = dataset[-removed,]
>         dataset$varpredicted = predict(lm(var1~var2))
>   dataset$varstdres = rstudent(lm(var1~var2))
>         i = with(dataset,length(varstdres > 3 | varstdres < -3))
>   }
> }
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/removing-outlier-function-dataset-update-tp3238394p3238394.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread Henrique Dallazuanna
Try this:

all.vars(terms(fit1))
all.vars(terms(fit2))


On Wed, Jan 26, 2011 at 3:33 PM, Tal Galili  wrote:

> Hello all,
>
> I wish to extract the terms from an rpart object.
> Specifically, I would like to be able to know what is the response variable
> (so I could do some manipulation on it).
> But in general, such a method for rpart will also need to handle a "." case
> (see fit2)
>
> Here are two simple examples:
>
> fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
> fit1$call
> fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
> fit2$call
>
>
> Is there anything "prettier" then using string manipulation?
>
>
> Thanks.
>
>
>
>
>
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
>
> --
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extracting the terms from an rpart object

2011-01-26 Thread William Dunlap
Take a look at the output of
   terms(fit2)
In particular
   tm <- terms(fit2)
   attr(tm, "response")
is 1 if there is a response and
   variables <- as.list(attr(tm, "variables"))[-1]
   variables[[1]]
gives the response expression if there is one.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Tal Galili
> Sent: Wednesday, January 26, 2011 9:33 AM
> To: r-help@r-project.org
> Subject: [R] Extracting the terms from an rpart object
> 
> Hello all,
> 
> I wish to extract the terms from an rpart object.
> Specifically, I would like to be able to know what is the 
> response variable
> (so I could do some manipulation on it).
> But in general, such a method for rpart will also need to 
> handle a "." case
> (see fit2)
> 
> Here are two simple examples:
> 
> fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
> fit1$call
> fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
> fit2$call
> 
> 
> Is there anything "prettier" then using string manipulation?
> 
> 
> Thanks.
> 
> 
> 
> 
> 
> Contact
> Details:---
> Contact me: tal.gal...@gmail.com |  972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il 
> (Hebrew) |
> www.r-statistics.com (English)
> --
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to calculate p-value for Kolmogorov Smirnov test statistics?

2011-01-26 Thread David Winsemius
The answers to these questions can be found either by looking at the  
code or by reviewing what has been said about this in prior postings  
to R-help.


--
David.

On Jan 26, 2011, at 10:40 AM, saray wrote:



Although I saw this issue being discussed many times before, I still
did not find the answer to:
why does R can not calculate p-values for data with ties (i.e. -
sample with two or more values the same)?

Can anyone elaborate some details about how does R calculate the p-
values for the Kolmogorov Smirnov test statistics?

I can understand the theoretical problem that continuous distributions
do not generate ties, but again - why isn't it possible to calculate
accurate p-values with ties? how does it work? what is the procedure
to calculate p-values for the Kolmogorov Smirnov test statistics?

Thank you advance,
Saray
--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-p-value-for-Kolmogorov-Smirnov-test-statistics-tp3238293p3238293.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Inconsistencies in the rpart.object help file?

2011-01-26 Thread Tal Galili
Hello all,

I'm was going through the help for
?rpart.object
And noticed some inconsistencies, Some might be a mistake in the help file
and some might be my misunderstanding.

The help in the section:
value -> frame (first paragraph), states that:

>  yval, the fitted value of the response at each node, *and splits, a two
> column matrix of left and right split labels for each node. *


But from looking at the object, for example

fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
fit1$frame

  var  n wt dev yval complexity ncompete nsurrogateyval2.1
 yval2.2yval2.3yval2.4yval2.5

1   Start 81 81  171 0.176470592  1  1.000
64.000 17.000  0.7901235  0.2098765

2   Start 62 62   61 0.019607842  2  1.000
56.000  6.000  0.9032258  0.0967742

4   29 29   01 0.01000  0  1.000
29.000  0.000  1.000  0.000

I can't see any "splits" column.

I'm also not sure I understand what the yval2 columns signify (even that I
read what it says in the help).

p.s: I hope I sent it to the correct e-mail.

Best,
Tal





Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Extracting the terms from an rpart object

2011-01-26 Thread Tal Galili
Hello all,

I wish to extract the terms from an rpart object.
Specifically, I would like to be able to know what is the response variable
(so I could do some manipulation on it).
But in general, such a method for rpart will also need to handle a "." case
(see fit2)

Here are two simple examples:

fit1 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
fit1$call
fit2 <- rpart(Kyphosis ~ ., data=kyphosis)
fit2$call


Is there anything "prettier" then using string manipulation?


Thanks.





Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combing forest plots

2011-01-26 Thread Ross, Stephanie
Hi All,

I am trying to combine two forest plots on the same page using the "forestplot" 
function in the rmeta package. Once I use the par() function to combine my 
plots on the same page, I find that my two plots are overlaying each other. 
Does anyone have any suggestions on how to  fix this?

Thanks!



PHRI DISCLAIMER
This information is directed in confidence solely to the person named above and 
may not otherwise be distributed, copied or disclosed. Therefore, this 
information should be considered strictly confidential. If you have received 
this email in error, please notify the sender immediately via a return email 
for further direction. Thank you for your assistance.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to calculate p-value for Kolmogorov Smirnov test statistics?

2011-01-26 Thread saray

Although I saw this issue being discussed many times before, I still
did not find the answer to:
why does R can not calculate p-values for data with ties (i.e. -
sample with two or more values the same)?

Can anyone elaborate some details about how does R calculate the p-
values for the Kolmogorov Smirnov test statistics?

I can understand the theoretical problem that continuous distributions
do not generate ties, but again - why isn't it possible to calculate
accurate p-values with ties? how does it work? what is the procedure
to calculate p-values for the Kolmogorov Smirnov test statistics?

Thank you advance,
Saray
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-p-value-for-Kolmogorov-Smirnov-test-statistics-tp3238293p3238293.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] removing outlier function / dataset update

2011-01-26 Thread kirtau

Hi,

I have a few lines of code that will remove outliers for a regression test
based on the studentized residuals being above or below 3, -3. I have to do
this multiple times and have attempted to create a function to lessen the
amount of copying, pasting and replacing. 

I run into trouble with the function and receiving the error "Error in
`$<-.data.frame`(`*tmp*`, "varpredicted", value = c(0.114285714285714,  : 
  replacement has 20 rows, data has 19
"

any help would be appreciated. a list of code is listed below. 

Thank you for your time!

x = c(1:20)
y = c(1,3,4,2,5,6,18,8,10,8,11,13,14,14,15,85,17,19,19,20)
data1 = data.frame(x,y)

# remove outliers for regression by studentized residuals being greater than
3
data1$predicted = predict(lm(data1$y~data1$x))
data1$stdres = rstudent(lm(data1$y~data1$x));
i=length(which(data1$stdres>3|data1$stdres< -3))
while(i >= 1){
remove<-which(data1$stdres>3|data1$stdres< -3)
print(data1[remove,])
data1 = data1[-remove,]
data1$predicted = predict(lm(data1$y~data1$x))
data1$stdres = rstudent(lm(data1$y~data1$x))
i = with(data1,length(which(stdres>3|stdres< -3)))
 }

# attemp to create a function to perfom same idea as above
rm.outliers = function(dataset,var1, var2) {

  dataset$varpredicted = predict(lm(var1~var2))
  dataset$varstdres = rstudent(lm(var1~var2))
  i = length(which(dataset$varstdres > 3 | dataset$varstdres < -3))
  while(i >= 1){
 removed = which(dataset$varstdres > 3 | dataset$varstdres < -3)
 print(dataset[removed,])
 dataset = dataset[-removed,]
 dataset$varpredicted = predict(lm(var1~var2))
   dataset$varstdres = rstudent(lm(var1~var2))
 i = with(dataset,length(varstdres > 3 | varstdres < -3))
   }
}
-- 
View this message in context: 
http://r.789695.n4.nabble.com/removing-outlier-function-dataset-update-tp3238394p3238394.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] build interval

2011-01-26 Thread Rustamali Manesiya
Hello,

I have some question on chron

I currently doing this

t1 <- chron(,"11:30:00")
t2 <- chron(,"11:45:00")
tt <- seq(t1,t2,by=times("00:00:01"))

tt has 901 values (15 minutes * 60 secs) and then

x1 <- rnorm(1:901)
x2 <- rnorm(1:901)
x3 <- rnorm(1:901)


df <-  data.frame(tt, x1, x2, x3)

I would like to write a function such that I can divide df vector into any
interval,
For e.g 10 secs, 5 secs, 15 secs etc..

How can I achieve this. Is there a way to subtract seconds on a chron object
For e.g.

#This should subtract 20 seconds, but doesn't work, looks like it is
subtracting days
newtime <- df$tt - 20

Please help.

Rusty

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot with 2 y axes

2011-01-26 Thread kirtau

I hope this is what you are looking for. you will have to add your own colors
and such.

year = c(1966:2008)
tempur =
c(2.9,4.5,1.9,1,2.9,4.3,3.9,4.3,4.9,4.4,4.5,2,2.8,-.4,2.3,3,.3,1.7,3.3,.8,-1.1,.8,4.9,5.2,4.9,1.5,3.7,3.6,3.2,4.8,2.3,2.5,5.2,5.3,4.9,3.2,3.6,3.9,4.8,4.3,3.7,5.8,4.9)
indx =
c(1,1.24,1.46,1.37,1.87,2.66,3.07,3.91,4.16,4.32,2.52,2.44,2.18,1.18,1.93,2.13,1.92,2.24,2.01,1.89,.66,1.01,1.5,2.11,2.02,.7,.75,1.28,1.37,2.01,1.54,2,2.07,2.11,2.42,2.29,2.15,2.21,2.14,2.33,1.89,2.03,2.58)
data1 = data.frame(year,tempur,indx)

# create new graphing window
windows()
# create margins for variable names
par(mar = c(5,5,5,5))
# create bar plot
barplot(data1$tempur, names = data1$year, xlab = "year", ylab = "Temp")
# Allows for plotting on same charts (kinda like overlay)
par(new=T)
# plot line points
plot(data1$year,data1$indx, xaxt = "n", yaxt = "n", xlab = "", ylab = "")
# add lines
lines(data1$year,data1$indx)
# adds axis for second plot to right hand side
axis(side = 4)
# adds second y axis variable name to right hand side
mtext("Index",side = 4, line = 3)
# quits plotting on the current plotting window
par(new = F)
-- 
View this message in context: 
http://r.789695.n4.nabble.com/plot-with-2-y-axes-tp3237418p3238368.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Making up a graph and its equation which better fits two groups of data

2011-01-26 Thread Jimmy Martina

Dear R-folks:
 
I have a group of data ('x' and 'y' axis), but I'd like to know how to draw a 
graph which would fits my data in a better way, I also need its equation. Could 
you give me any R-rutine ideas?.
 
I thank you in advance for your kind support.   
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Counting number of rows with two criteria in dataframe

2011-01-26 Thread Ryan Utz
Hadley and Dennis:

THANK YOU THANK YOU! This is exactly what I was looking for.

Ryan


On Wed, Jan 26, 2011 at 5:27 AM, Dennis Murphy  wrote:
> > Hi:
> >
> > Here are two more candidates, using the plyr and data.table packages:
> >
> > library(plyr)
> > ddply(X, .(x, y), function(d) length(unique(d$z)))
> >  x y V1
> > 1 1 1  2
> > 2 1 2  2
> > 3 2 3  2
> > 4 2 4  2
> > 5 3 5  2
> > 6 3 6  2
> >
> > The function counts the number of unique z values in each sub-data frame
> > with the same x and y values. The argument d in the anonymous function is
> a
> > data frame object.
>
> Another approach is to use the much faster count function:
>
> count(unique(X))
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>



-- 
Ryan Utz
Postdoctoral research scholar
University of California, Santa Barbara
(724) 272 7769

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] write.table -- maintain decimal places

2011-01-26 Thread Jim Moon
Great.  Thank you, Peter!

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Tuesday, January 25, 2011 7:26 PM
To: Jim Moon
Cc: r-help@r-project.org
Subject: Re: [R] write.table -- maintain decimal places

On 2011-01-25 17:22, Jim Moon wrote:
> Thank you for the response, Peter.
>
> The approach:
> write.table(format(df, 
> drop0trailing=FALSE),file='df.txt',quote=F,sep='\t',row.names=F)
> surprisingly still results in some loss of trailing 0's.

Here are a couple more (essentially identical) ways:

# 1.
  dfm <- within(df, {
EFFECT2 <- sprintf("%6.3f", EFFECT2)
PVALUE  <- sprintf("%7.5f", PVALUE)
})

# 2.
  dfm <- within(df, {
   EFFECT2 <- formatC(EFFECT2, format="f", digits=3)
   PVALUE  <- formatC(PVALUE,  format="f", digits=5)
   })

write.table(dfm, file='dfm.txt', quote=FALSE, sep='\t', row.names=FALSE)

Peter Ehlers

>
> df:
> EFFECT2  PVALUE
> 1 0.0230.88080
> 2 -0.260  0.08641
> 3 -0.114  0.45200
>
> df.txt:
> EFFECT2PVALUE
> 0.023  8.808e-01
> -0.26  8.641e-02
> -0.114 4.520e-01
>
>
> -Original Message-
> From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
> Sent: Tuesday, January 25, 2011 5:09 PM
> To: Jim Moon
> Cc: r-help@r-project.org
> Subject: Re: [R] write.table -- maintain decimal places
>
> On 2011-01-25 16:16, Jim Moon wrote:
>> Hello, All,
>>
>> How can I maintain the decimal places when using write.table()?
>>
>> Jim
>>
>> e.g.
>>
>> df:
>> EFFECT2  PVALUE
>> 1 0.0230.88080
>> 2 -0.260  0.08641
>> 3 -0.114  0.45200
>>
>> write.table(df,file='df.txt',quote=F,sep='\t',row.names=F)
>
>write.table(format(df, drop0trailing=FALSE), )
>
> Peter Ehlers
>
>>
>>
>> df.txt:
>> EFFECT2PVALUE
>> 0.023  0.8808
>> -0.26  0.08641
>> -0.114 0.452
>>
>>  [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] write.table -- maintain decimal places

2011-01-26 Thread Jim Moon
I am using:
"R version 2.11.1 (2010-05-31)"
It is good to know that it works in 2.12.1

Jim

-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Tuesday, January 25, 2011 5:57 PM
To: Jim Moon
Cc: r-help@r-project.org
Subject: Re: [R] write.table -- maintain decimal places

On 2011-01-25 17:22, Jim Moon wrote:
> Thank you for the response, Peter.
>
> The approach:
> write.table(format(df, 
> drop0trailing=FALSE),file='df.txt',quote=F,sep='\t',row.names=F)
> surprisingly still results in some loss of trailing 0's.
>

What version of R?
I'm using R version 2.12.1 Patched (2010-12-27 r53883)
and it works for me.

Peter Ehlers

> df:
> EFFECT2  PVALUE
> 1 0.0230.88080
> 2 -0.260  0.08641
> 3 -0.114  0.45200
>
> df.txt:
> EFFECT2PVALUE
> 0.023  8.808e-01
> -0.26  8.641e-02
> -0.114 4.520e-01
>
>
> -Original Message-
> From: Peter Ehlers [mailto:ehl...@ucalgary.ca]
> Sent: Tuesday, January 25, 2011 5:09 PM
> To: Jim Moon
> Cc: r-help@r-project.org
> Subject: Re: [R] write.table -- maintain decimal places
>
> On 2011-01-25 16:16, Jim Moon wrote:
>> Hello, All,
>>
>> How can I maintain the decimal places when using write.table()?
>>
>> Jim
>>
>> e.g.
>>
>> df:
>> EFFECT2  PVALUE
>> 1 0.0230.88080
>> 2 -0.260  0.08641
>> 3 -0.114  0.45200
>>
>> write.table(df,file='df.txt',quote=F,sep='\t',row.names=F)
>
>write.table(format(df, drop0trailing=FALSE), )
>
> Peter Ehlers
>
>>
>>
>> df.txt:
>> EFFECT2PVALUE
>> 0.023  0.8808
>> -0.26  0.08641
>> -0.114 0.452
>>
>>  [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Greek letters in CairoPDF

2011-01-26 Thread Eduardo de Oliveira Horta
Hello there,

Straight to the point: it seems that CairoPDF from package "Cairo"
cannot handle greek letters from expression(). For example,

> eta = seq(from=-pi, to=pi, length=100)
> f = sin(eta)^2
> pdf(file = "temp_pdf.pdf")
> plot(eta, f, type="l", main=expression(f(eta)==sin(eta)^2), 
> xlab=expression(eta), ylab=expression(f(eta)))
> dev.off()

gives the expected result, but

> require("Cairo")
> CairoPDF(file = "temp_CairoPDF.pdf")
> plot(eta, f, type="l", main=expression(f(eta)==sin(eta)^2), 
> xlab=expression(eta), ylab=expression(f(eta)))
> dev.off()

leaves a blank where it should display the etas. Any ideas here?
(session info below)

Thanks in advance and best regards,

Eduardo



> sessionInfo()
R version 2.11.1 (2010-05-31)
i386-pc-mingw32

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
[3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
[5] LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Cairo_1.4-5

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot with 2 y axes

2011-01-26 Thread Greg Snow
Why do you need the line to overlay the bars?  Which bars are touched by the 
line is just a quirk of scaling and could easily change with the scales.  All 
the overlay does is to make it harder to read, why not jut have 2 panels 
aligned on the x-axis but with the line plot above the bar plot?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of Mario Beolco
> Sent: Tuesday, January 25, 2011 5:31 PM
> To: r-help@r-project.org
> Subject: [R] plot with 2 y axes
> 
> Dear R users,
> 
> apologies for the total beginner's question. I would like to create a
> barchart for some temperature values with the y axis on the right hand
> side of the plot. On this plot would like to overlay some time series
> data  (in the form of a line) for some other variable called Index.
> The y axis for this latter variable should be on the left hand side of
> the plot.
> 
> An example of what I would like to obtain:
> 
> https://sites.google.com/site/graphtests1/
> 
> I have tried to do this using ggplot2 and this where I have got (for
> data see at the bottom of the e-mail):
> 
> none<-theme_blank()
> p<-ggplot(tmp3,aes(x=year,y=Temperature))
> p1<-p+geom_bar(stat="identity",fill="#9ACD32",colour="#00")
> p1 + geom_line(data=tmp3, aes(x=year, y=Index),
> colour="black",size=1)+opts(legend.position="none",panel.grid.major=non
> e,panel.grid.minor=none)+opts(panel.border=none)+theme_bw(base_size=20)
> 
> This code does not do what I want because the Temperature y axis
> should be on the left hand side and the the y axis for the other
> variable called Index is not even there (should in theory be on the
> left hand side). I also get the following warning message when I run
> that code "I get Warning message:Stacking not well defined when ymin
> != 0". (Should I worry about this?).
> 
> I do not know whether ggplot2 can is the best package for creating the
> type of plot that I want. I would, however, be very grateful for any
> suggestions on to improve the above code or on how I could use other
> packages to create the plot I want.
> 
> thanks!
> 
> Mario
> 
> 
> 
> "year","Temperature","Index"
> 1966,2.9,1
> 1967,4.5,1.24
> 1968,1.9,1.46
> 1969,1,1.37
> 1970,2.9,1.87
> 1971,4.3,2.66
> 1972,3.9,3.07
> 1973,4.3,3.91
> 1974,4.9,4.16
> 1975,4.4,4.32
> 1976,4.5,2.52
> 1977,2,2.44
> 1978,2.8,2.18
> 1979,-0.4,1.18
> 1980,2.3,1.93
> 1981,3,2.13
> 1982,0.3,1.92
> 1983,1.7,2.24
> 1984,3.3,2.01
> 1985,0.8,1.89
> 1986,-1.1,0.66
> 1987,0.8,1.01
> 1988,4.9,1.5
> 1989,5.2,2.11
> 1990,4.9,2.02
> 1991,1.5,0.7
> 1992,3.7,0.75
> 1993,3.6,1.28
> 1994,3.2,1.37
> 1995,4.8,2.01
> 1996,2.3,1.54
> 1997,2.5,2
> 1998,5.2,2.07
> 1999,5.3,2.11
> 2000,4.9,2.42
> 2001,3.2,2.29
> 2002,3.6,2.15
> 2003,3.9,2.21
> 2004,4.8,2.14
> 2005,4.3,2.33
> 2006,3.7,1.89
> 2007,5.8,2.03
> 2008,4.9,2.58
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] barplot with varaible-width bars

2011-01-26 Thread Bill Pikounis
Hi Larry,
If I understand correctly, your barplot() call dispatches to the
method function barplot.default()  to do the work. Looking at the
definition of that function and your specific call, it seems that
around line 51 of barplot.default(),  the value of the width argument
is truncated:

width <- rep(width, length.out = NR)

where NR <- nrow(height)  is defined a bit earlier around line 44. So
in the execution width takes on the value

[1] 417 153

which seems to explain the same width pieces across pairs.

Just quick and dirty I copied the function barplot.default in the
workspace to an editor, renamed it as mod.barplot.default() and then
commented out line 51 and added the line there: width <- width which
seems at though it could actually be left out as long as beside=TRUE
is kept in the call. Then I created mod.barplot.default() as a working
function, and this call

mod.barplot.default(yy[,2*1:5], las=1, width=yy[,(2*1:5)-1],
space=c(.1,.5) ,beside=TRUE)

looks like it might provide what you wanted.

Hope that helps.

Bill


-
Bill Pikounis
http://billpikounis.net/



On Tue, Jan 25, 2011 at 10:47, Gould, A. Lawrence  wrote:
> I would like to produce a bar plot with varying-width bars.  Here is an 
> example to illustrate:
>
> ww <- c(417,153,0.0216,0.0065,556,256,0.0162,0.0117,
> +  726,379,0.0358,0.0501,786,502,0.0496,0.0837,
> +  892,591,0.0785,0.0795)
> yy<-t(t(array(ww,c(2,10
>
> barplot(yy[,2*1:5],las=1,space=c(.1,.5),beside=T)
>
> produces a barplot of 5 pairs of bars that are of equal width
>
> barplot(yy[,2*1:5],las=1,width=c(yy[,(2*1:5)-1]),space=c(.1,.5),beside=T)
>
> makes the bars in each pair of unequal width, but the two widths do not vary 
> from pair to pair
>
> I would like the width of each bar to be proportional to its corresponding 
> value in the width statement of this last call of barplot, like what I think 
> could be done with the mulbar function of SPlus.  Can I do this with barplot 
> itself, or is this something for which lattice or ggplot 2 is needed?  And, 
> if so, what would typical code look like?
>
> Thanks for your help.
>
> Larry Gould
>
>
> Notice:  This e-mail message, together with any attachme...{{dropped:14}}
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Train error:: subscript out of bonds

2011-01-26 Thread Neeti

Thank you so much for your reply. In my case it is giving error in some seed
value for example if I set seed value to 357 this gives an error. Does train
have some specific seed range?
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Train-error-subscript-out-of-bonds-tp3234510p3238197.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Failing to install {rggobi} on win-7 R 2.12.0

2011-01-26 Thread Tal Galili
I checked it using:

Sys.getenv("PATH")

And the output includes the PATH to the GTK2 installation (it's the last
item in the following list):

"C:\\Windows\\system32;C:\\Windows;C:\\Windows\\System32\\Wbem;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Program
Files (x86)\\Common Files\\Ulead Systems\\MPEG;C:\\Program
Files\\TortoiseGit\\bin;C:\\Program Files
(x86)\\QuickTime\\QTSystem\\;C:\\Program Files (x86)\\ggobi;C:\\Program
Files (x86)\\GTK2-Runtime\\bin"


What else might I try?


(Thanks)


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Jan 26, 2011 at 4:23 PM, Prof Brian Ripley wrote:

> Your GTK+ installation is not being found: check your PATH.
>
>
> On Wed, 26 Jan 2011, Tal Galili wrote:
>
>  Hello Prof Brian Ripley, Yihui and Tom,
>>
>> Thank you for your suggestions.  It seemed to have made some differences
>> in
>> the error massages - but rggobi still fails to load.
>>
>> Steps taken:
>> 1) I removed the old GTK (through the uninstall interface)
>> 2) I ran  library(RGtk2) which downloaded the new GTK-runtime
>> version 2.22.0-2010-10-21 (instead of the one I got from ggobi, which
>> was 2.12.9-2).
>> 3) I downloaded both
>> ftp://ftp.zlatkovic.com/libxml/libxml2-2.7.7.win32.zip and
>> ftp://ftp.zlatkovic.com/libxml/iconv-1.9.2.win32.zip
>> Unzipped them, and moved their dll's (from their bin directory), into -
>> C:\Program Files (x86)\GTK2-Runtime\bin
>> 4) I then tried starting rggobi:  library(rggobi)  and got the following
>> error massages:
>>
>> Error 1:
>>  the program can't start because
>> libgdk-win32-2.0-0.dll
>> is missing from your computer.
>> Try reinstalling the program to fix this problem.
>>
>> It then tried to reinstall GTK, and after I refused to, it sent the second
>> Error massage:
>>  the program can't start because
>> libfreetype-6.dll
>> is missing from your computer.
>> Try reinstalling the program to fix this problem.
>>
>>
>>
>> Any suggestions what else I should try?
>>
>> Many thanks for helping,
>> Tal
>>
>>
>>
>>
>> Contact
>> Details:---
>> Contact me: tal.gal...@gmail.com |  972-52-7275845
>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
>> www.r-statistics.com (English)
>>
>> ---
>> ---
>>
>>
>>
>>
>> On Wed, Jan 26, 2011 at 9:17 AM, Prof Brian Ripley > >
>> wrote:
>>  On Tue, 25 Jan 2011, Tom La Bone wrote:
>>
>>I recall that my problem on Windows was related to
>>having a number of stray
>>versions of GTK+ installed. I went back and deleted
>>all versions and
>>reinstalled the latest GTK+ and that seemed to fix
>>things. However, when I
>>went to do any work of substance ggobi locked up and
>>became unresponsive.
>>Never did get it working right on Windows. Had much
>>more luck with R/ggobi
>>on Ubuntu 10.10.
>>
>>
>> I've just been setting rggobi up for our classroom.  It seems that on
>> Windows we now need to use Rgui in SDI mode to run rggobi without
>> lookups.  (That was not the case last year, so it might be due to the
>> change in GTK+ version or it might be due to the change from XP to x64
>> Windows 7 on those machines.)
>>
>> The rggobi binary on CRAN extras is statically linked against
>> everything except GTK+, but the www.ggobi.org ggobi DLL needs both
>> GTK+ DLLs and libxml2.dll (which needs iconv.dll and zlib1.dll). Late
>> last year there was a problem in that GTK+ and libxml2.dll needed
>> different zlib1.dll's, but AFAICS this is now resolved by using
>> ftp://ftp.zlatkovic.com/libxml/libxml2-2.7.7.win32.zip and
>> ftp://ftp.zlatkovic.com/libxml/iconv-1.9.2.win32.zip.  (Unpack those
>> and drop the DLLs into somewhere on your path, e.g. the GTK+ bin
>> directory.)
>>
>> We've had a lot of trouble over zlib1.dll: those prepared from zLib
>> 1.2.3 and 1.2.5 are incompatible.  The whole point of the '1' in the
>> name is to change the name in that case!  I suspect very few of those
>> benefitting from Windows binary packages have any idea how much work
>> goes into circumventing such issues.
>>
>>
>> --
>> Brian D. Ripley,  rip...@stats.ox.ac.uk
>> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
>> University of Oxford, Tel:  +44 1865 272861 (self)
>> 1 South Parks Road, +44 1865 272866 (PA)
>> Oxford OX1 3TG, UKFax:  +44 1865 272595
>>
>>
>> __
>> R-help@r-project.org mailing list
>> 

Re: [R] lattice draw.key(): position of key in panels

2011-01-26 Thread Boris.Vasiliev
Good morning,

This problem was already addressed in a previous post:

https://stat.ethz.ch/pipermail/r-help/2009-February/187244.html

In the call to draw.key() use
'vp=viewport(x=unit(0.1,"npc"),y=unit(0.1,"npc"))'.
Prior to calling viewport() make sure grid package is loaded.

Apologies for cluttering the mail list.
Regards,
Boris Vasiliev.


> -Original Message-
> From: Vasiliev B@CEFCOM HQ@Ottawa-Hull 
> Sent: Tuesday, 25, January, 2011 16:22 PM
> To: 'r-help@r-project.org'
> Subject: lattice draw.key(): position of key in panels
> 
> Good afternoon,
> 
> I am working on a plot that requires custom legends to be 
> placed in some panels of the plot; other panels do not 
> contain legends.  The problem that I run into is positioning 
> of the legend in individual panels. In particular, the 'x' 
> and 'y' elements of the key-list are ignored by draw.key() 
> when it is called from inside a panel function.  As a result, 
> the legend is placed in the middle of the panel.  The example 
> below illustrates this problem.
> 
> df <- data.frame(x=c(1,1),y=c(1,2),type=c("A","B"))
> 
> panel.xyplot.x <- function(...) {
>   # draw data
>   panel.xyplot(...)
> 
>   # create key-list
>   pnlid <- panel.number()
>   lbl <- ifelse(pnlid==1,"AA","BB")
>   pts <- Rows(trellis.par.get("superpose.symbol"),pnlid)
>   key <- list(points=pts,text=list(lbl),x=0.1,y=0.9,corner=c(0,1))
> 
>   # draw key
>   draw.key(key,draw=TRUE)
> }
> 
> oltc <- xyplot(y~x|type,data=df,panel=panel.xyplot.x)
> print(oltc)
> 
> I tried using 'vp=current.viewport()' in the call to 
> draw.key() but it did not help.  Can anybody suggest the 
> proper way to specify position of the key-list so that it is 
> respected by draw.key() when called within a panel function?
> 
> Sincerely,
> Boris Vasiliev.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coxme and random factors

2011-01-26 Thread Terry Therneau
 --- begin inclusion ---
Response variable:  survival (death)
Factor 1:   treatment (4 levels)
Factor 2:   sex (male / female)
Random effects 1:   person nested within day (2 people did the  
experiment over 2   days)
Random effects 2:   box nested within treatment (animals were kept in  
boxes in groups of 6, and there were multiple boxes per treatment)

I've read the introductions to coxme by Terry Therneau, and something  
like the following is what I think I should use:

model1<-coxme(Surv(death,censor)~treatment*sex+(1|day/person)+(1| 
treatment/box))

--- End inclusion ---

That looks right to me.  Your questions:
1: How to test: As you guessed, fit the model without one of the random
effects and compare the integrated likelihood for the two fits.  The
usual "is it a chisq or sum of chisquares" question from random effects
models applies -- the simple chisq test will be conservative.

2: (treatment |box) term.  For a factor variable such as treatment a
term (1 | treatment/box) specifies a (random) coefficient for each
treatment by box combination.  The term (treatment|box) is asking for
exactly the same thing, but coxme currently does not support asking for
it in that way.

3. I do not have an extension of cox.zph to the mixed effects model, in
either theory or code.  Residuals methods for coxme would be an
important addition and is on my to-do list. (But as my wife would point
out, so is a bathroom remodel and she isn't holding her breath.)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to call subset in a for loop?

2011-01-26 Thread David Reiner
you were caught by the '=' versus '==' error ;-)
and Henrique's elegant one-liner avoids the problem altogether.

-- David

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Henrique Dallazuanna
Sent: Wednesday, January 26, 2011 6:00 AM
To: Aditya Bhagwat
Cc: r-help@r-project.org
Subject: Re: [R] How to call subset in a for loop?

Try this:

split(myDF, myDF$myField)

On Wed, Jan 26, 2011 at 8:18 AM, Aditya Bhagwat wrote:

> Dear all,
>
> I have a data frame 'myDf', in which one of the fields 'myField' can have
> several possible values. To extract the observations for which it has value
> "A", I can do:
>
> subset(myDf, myField="A")
>
> However, when I try to do this within a loop, it doesn't work, it returns
> everything, and not a subset
>
> for (currField in c("A", "B", "C")){
>   subset(myDf, myField=currField)
> }
>
> How should I modify the call of subset in the loop to make it work?
>
> Thanks for your help!
>
> Adi
>
> --
> Aditya Bhagwat
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]



This e-mail and any materials attached hereto, including, without limitation, 
all content hereof and thereof (collectively, "XR Content") are confidential 
and proprietary to XR Trading, LLC ("XR") and/or its affiliates, and are 
protected by intellectual property laws.  Without the prior written consent of 
XR, the XR Content may not (i) be disclosed to any third party or (ii) be 
reproduced or otherwise used by anyone other than current employees of XR or 
its affiliates, on behalf of XR or its affiliates.

THE XR CONTENT IS PROVIDED AS IS, WITHOUT REPRESENTATIONS OR WARRANTIES OF ANY 
KIND.  TO THE MAXIMUM EXTENT PERMISSIBLE UNDER APPLICABLE LAW, XR HEREBY 
DISCLAIMS ANY AND ALL WARRANTIES, EXPRESS AND IMPLIED, RELATING TO THE XR 
CONTENT, AND NEITHER XR NOR ANY OF ITS AFFILIATES SHALL IN ANY EVENT BE LIABLE 
FOR ANY DAMAGES OF ANY NATURE WHATSOEVER, INCLUDING, BUT NOT LIMITED TO, 
DIRECT, INDIRECT, CONSEQUENTIAL, SPECIAL AND PUNITIVE DAMAGES, LOSS OF PROFITS 
AND TRADING LOSSES, RESULTING FROM ANY PERSON'S USE OR RELIANCE UPON, OR 
INABILITY TO USE, ANY XR CONTENT, EVEN IF XR IS ADVISED OF THE POSSIBILITY OF 
SUCH DAMAGES OR IF SUCH DAMAGES WERE FORESEEABLE.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Failing to install {rggobi} on win-7 R 2.12.0

2011-01-26 Thread Prof Brian Ripley

Your GTK+ installation is not being found: check your PATH.

On Wed, 26 Jan 2011, Tal Galili wrote:


Hello Prof Brian Ripley, Yihui and Tom,

Thank you for your suggestions.  It seemed to have made some differences in
the error massages - but rggobi still fails to load.

Steps taken:
1) I removed the old GTK (through the uninstall interface)
2) I ran  library(RGtk2) which downloaded the new GTK-runtime
version 2.22.0-2010-10-21 (instead of the one I got from ggobi, which
was 2.12.9-2).
3) I downloaded both
ftp://ftp.zlatkovic.com/libxml/libxml2-2.7.7.win32.zip and 
ftp://ftp.zlatkovic.com/libxml/iconv-1.9.2.win32.zip
Unzipped them, and moved their dll's (from their bin directory), into - 
C:\Program Files (x86)\GTK2-Runtime\bin
4) I then tried starting rggobi:  library(rggobi)  and got the following
error massages:

Error 1:
  the program can't start because 
libgdk-win32-2.0-0.dll
is missing from your computer.
Try reinstalling the program to fix this problem.

It then tried to reinstall GTK, and after I refused to, it sent the second
Error massage:
  the program can't start because 
libfreetype-6.dll
is missing from your computer.
Try reinstalling the program to fix this problem.



Any suggestions what else I should try?

Many thanks for helping,
Tal




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
---
---




On Wed, Jan 26, 2011 at 9:17 AM, Prof Brian Ripley 
wrote:
  On Tue, 25 Jan 2011, Tom La Bone wrote:

I recall that my problem on Windows was related to
having a number of stray
versions of GTK+ installed. I went back and deleted
all versions and
reinstalled the latest GTK+ and that seemed to fix
things. However, when I
went to do any work of substance ggobi locked up and
became unresponsive.
Never did get it working right on Windows. Had much
more luck with R/ggobi
on Ubuntu 10.10.


I've just been setting rggobi up for our classroom.  It seems that on
Windows we now need to use Rgui in SDI mode to run rggobi without
lookups.  (That was not the case last year, so it might be due to the
change in GTK+ version or it might be due to the change from XP to x64
Windows 7 on those machines.)

The rggobi binary on CRAN extras is statically linked against
everything except GTK+, but the www.ggobi.org ggobi DLL needs both
GTK+ DLLs and libxml2.dll (which needs iconv.dll and zlib1.dll). Late
last year there was a problem in that GTK+ and libxml2.dll needed
different zlib1.dll's, but AFAICS this is now resolved by using
ftp://ftp.zlatkovic.com/libxml/libxml2-2.7.7.win32.zip and
ftp://ftp.zlatkovic.com/libxml/iconv-1.9.2.win32.zip.  (Unpack those
and drop the DLLs into somewhere on your path, e.g. the GTK+ bin
directory.)

We've had a lot of trouble over zlib1.dll: those prepared from zLib
1.2.3 and 1.2.5 are incompatible.  The whole point of the '1' in the
name is to change the name in that case!  I suspect very few of those
benefitting from Windows binary packages have any idea how much work
goes into circumventing such issues.


--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Train error:: subscript out of bonds

2011-01-26 Thread Max Kuhn
Sort of. It lets you define a grid of candidate values to test and to
define the rule to choose the best. For some models, it is each to
come up with default values that work well (e.g. RBF SVM's, PLS, KNN)
while others are more data dependent. In the latter case, the defaults
may not work well.

MAx

On Wed, Jan 26, 2011 at 5:45 AM, Neeti  wrote:
>
> What I have understood in CARET train() method is that train() itself does
> the model selection and tune the parameter. (please correct me if I am
> wrong). That was my first motivation to select this package and method for
> fitting the model. And use the parameter to e1071 svm() method and compare
> the result.
>
> fit1<-train(train1,as.factor(trainset[,ncol(trainset)]),"svmpoly",trControl
> = trainControl((method = "cv"),10,verboseIter = F),tuneLength=3)
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Train-error-subscript-out-of-bonds-tp3234510p3237800.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 

Max

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help Derivate for Nonlinear Growth Models

2011-01-26 Thread Ben Bolker
acocac  gmail.com> writes:

> 
> 
> Hi!! Im doing my graduated work in Onion Curves Growth with Nonlinear Models,
> I'm amateur in R so i have  doubt how i put or program next models,
> 
> http://r.789695.n4.nabble.com/file/n3236748/96629508.png 
> 
> Also, i cant derivate for Gauss Model, and Richard Model dont have funtion,
> If someone could help me, i was so grate,
> 

  You need more help than we can give you here ... You need to 
use the ?nls function. I think if you are going to be doing a serious
project fitting nonlinear growth models, you should probably learn or
refresh enough calculus so that you can compute the derivatives yourself
(R can do some small bit of analytic differentiation -- see ?D -- but
it doesn't simplify at all, so the answers are quite often uglier than
if you did the computation by hand). 
  You don't absolutely need the derivative to use nls(), although it
helps a lot.

  I would also suggest going to Google scholar and
searching for 'nls "growth curve" Bates" to find some papers that
have used this approach.

  If you need to post again, please read the posting guide and
show us how far you have managed to get on your own.

  good luck,
   Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with "setMethod" in R package

2011-01-26 Thread Martin Morgan
On 01/26/2011 05:08 AM, Pascal A. Niklaus wrote:
> Dear all,
> 
> My apologies for re-posting this question, but I have not found any
> solution to my problem so far. I also think that my post may have been
> overseen due to the posting time and high traffic on this list.
> 
> I experience a problem in implementing a S4 class in a package and would
> appreciate your help in this matter.
> 
> The class is implemented using Rcpp, which works well. I then extend the
> class with R functions, adding them as methods with a call to
> setMethod(...). When I do this outside the package (after loading the
> package with "library(...)"), everything works smoothly. However, when I
> try to incorporate the call to setMethod(...) in the package code
> itself, the only way I get it to work is to include the call in
> .onLoad(...) or .onAttach(...). This works, but when loading the library
> I get the messages "creating new generic function for "plot" in
> ".GlobalEnv", as well as "no definition for class "Rcpp_rothC""...
> 
> Again, the code works, but the messages let me presume that I add the
> methods in the wrong way or miss-specify name spaces, class names, or
> something else.
> 
> Thank you for your advice.
> 
> Pascal
> 
> (The error message and the relevant parts of the package files are
> pasted below, together with my related questions as "comments".)
> 
> 
>> showMethods("plot")
> Function "plot":
># I am surprised about this --
># why isn't plot a generic function?
> 
>> library("RrothC2")
> Loading required package: Rcpp
> Loading required package: pascal
> Creating a new generic function for "plot" in ".GlobalEnv"
> in method for ‘plot’ with signature ‘x="Rcpp_rothC",y="missing"’: no
> definition for class "Rcpp_rothC"
># class seems not to be known
># at this stage.
># but where should I put setMethod()?
>> r1 <- new(rothC$rothC);
>> class(r1)# class is known by now
> [1] "Rcpp_rothC"
> attr(,"package")
> [1] ".GlobalEnv"
>> plot(r1) # "plot" is dispatched correctly
> 
>> showMethods("plot")
> Function: plot (package graphics) # now "plot" from graphics shows up
> x="ANY", y="ANY"  # why was it missing before?
> x="Rcpp_rothC", y="missing"
> 
> 
> # FILE rcpp_rothC_module.h #
> 
> class rothC { ... }
> 
> # FILE rcpp_rothC_module.cpp #
> 
> using namespace Rcpp ;
> 
> RCPP_MODULE(rothC) {
>   class_("rothC")
> .constructor()
>...
> }
> 
> # FILE rcpp_rothC.R #
> 
> Rcpp_rothC.plot <- function(x,y,...) { ... }
> 
> ## FILE zzz.R ##
> 
> .NAMESPACE <- environment()
> 
> rothC <- new( "Module" )
> 
> .onLoad <- function(libname, pkgname) {
>   unlockBinding( "rothC" , .NAMESPACE )
>   assign( "rothC",  Module( "rothC" ), .NAMESPACE )
> 
> setMethod(f="plot",signature(x="Rcpp_rothC",y="missing"),definition=Rcpp_rothC.plot,where=.GlobalEnv);
> 
>   lockBinding( "rothC", .NAMESPACE )
> }
> 
> .onAttach <- function(libname, pkgname) {}
> 
> .onUnload <- function(libname, pkgname) { gc(); }

Hi Pascal --

Normally one would arrange code (e.g., using the Collate: field of the
DESCRIPTION file to define S4 classes, generics, then methods, equivalent to

setClass("Foo")
setGeneric("plot")  ## promote plot to S4 generic
setMethod("plot, c(x="Foo", y="missing"), Rcpp_rothC.plot)

One would not normally specify a 'where' argument to setMethod; by
default the method is created in the 'top' environment at the time the
package is installed, which is the package name space. It is not usually
necessary to include setMethod and friends in .onLoad.

I do not use Rcpp, but one would normally have to write R code to wrap
the C++ object, e.g., an S4 class with an 'externalptr' slot to contain
the address of the object, plus methods to interact with the object.
This part of the problem sounds like a question for the Rcpp help list.

Martin
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: MAtrix addressing

2011-01-26 Thread David Winsemius



Begin forwarded message:


From: David Winsemius 
Date: January 26, 2011 8:32:30 AM EST
To: Alaios 
Subject: Re: [R] MAtrix addressing


On Jan 26, 2011, at 7:58 AM, Alaios wrote:

Unfortunately right now is convoluted... by I was trying to find  
some solution.

Bring again this picture in front of u
http://img545.imageshack.us/i/maptoregion.jpg/

Consider f a function that gets as input the coords
so
f(-1,-1) should return the value of the bottom left point
f(1,1) should return the value of the top right point.


A.For me an area is approximated by a matrix so each cell of the  
matrix corresponds to a fixed value in a small sub-area.
So When my function gets coords like f(0,0.3) should find the  
corresponding sub-area.


B. This is want to do
I also have a data structure called matrix that in every cell has  
the appropriate values of that area.


A+B => Combine these two and have a function that returns the  
appropriate value of a subarea given its coords. Attention my area  
spans from -1 to 1 in y plane and from -1 to 1 in the x plane.


> mtx <- matrix(seq(1:36), nrow=6, byrow=TRUE,  
dimnames=list(x=seq(-1, 1, length=7)[-7], y=seq(-1, 1, length=7)[-7]) )

> mtx
   y
x-1 -0.667 -0.333  0  
0.333 0.667
 -1  1  2  3   
4 5 6
 -0.667  7  8  9  
101112
 -0.333 13 14 15  
161718
 0  19 20 21  
222324
 0.333  25 26 27  
282930
 0.667  31 32 33  
343536


> fnfind <- function(x,y) mtx[ findInterval(x,  
c(as.numeric(rownames(mtx)), 1)),
+  findInterval(y,  
c(as.numeric(colnames(mtx)), 1))]

> fnfind(.5,.5)
[1] 29
> fnfind(-.5,-.5)
[1] 8

This could obviously be made more compact, but the current form allows  
simple modification of the length and endpoints of x and y.





Was it clearer this way? (Why is always so hard to me to explain  
even simple tasks?)


Regards
Alex

--- On Wed, 1/26/11, David Winsemius  wrote:


From: David Winsemius 
Subject: Re: [R] MAtrix addressing
To: "Alaios" 
Cc: R-help@r-project.org
Date: Wednesday, January 26, 2011, 12:49 PM

On Jan 26, 2011, at 2:47 AM, Alaios wrote:


The reason is the following image
http://img545.imageshack.us/i/maptoregion.jpg/
In the picture above you will find the indexes for

each cell.


Also you will see that I place that matrix inside a

x,y region that spans from -1 to 1. I am trying to write one
function that will get as argument a (x,y) value x e[-1,1] y
e[-1,1] and will return the indexes of that cell tha x,y
value correspond to.



I really do not have a clue how I should try to

approach that to solve it. So based on some version I had
for 1-d vector I tried to extend it for 2-d. I used
findInterval as a core to get results.

Unfortunately my code fails to produce accurate

results as my approach 'assumes' (this is something
inhereted by the find Interval function) that the numbering
starts bottom left and goes high top right.

You will find my code below


If one wants to take an ordinary r matrix and reorder it in
the manner you describe:

mtx2 <- mtx[ nrow(mtx):1, ]

Whether that is an efficient way to get at the sokution
your you programming task I cannot say. It sounds as though
it has gotten too convoluted. I was not able to comprehend
the overall goal from your problem description.






sr.map <- function(sr){
# This function converts the s(x,y) matrix into a

function x that spans #from -1 to 1 and y spans from -1 to
1.

# Input: sr a x,y matrix containing the shadowing

values of a Region

breaksX <- seq(from=-1, to

= 1, length = nrow(sr) +1L )

breaksY <- seq(from=-1, to

= 1, length = ncol(sr) + 1L)

function(x,y){ # SPAGGETI CODE

FOR EVER

indx <-

findInterval(x, breaksX,rightmost.closed=TRUE)

indy <-

findInterval(y, breaksY,rightmost.closed=TRUE)

c(indx,indy)
}
}





sr<-matrix(data=seq(from=1,to=36),nrow=6,ncol=6,byrow=TRUE)

f.sr.map<-sr.map((sr))
f.sr.map(-0.1,-0.1)
f.sr.map(0.1,0.1)



Best Regards
Alex





--- On Wed, 1/26/11, David Winsemius 

wrote:



From: David Winsemius 
Subject: Re: [R] MAtrix addressing
To: "Alaios" 
Cc: R-help@r-project.org
Date: Wednesday, January 26, 2011, 2:54 AM

On Jan 25, 2011, at 4:50 PM, Alaios wrote:


Hello
I would like to ask you if it is possible In R

Cran to

change the default way of addressing a matrix.

for example
matrix(data=seq(from=1,to=4,nrow=2,ncol=2, by

row

numbering) # not having R at this pc


will create something like the following
1 2
3 4

the way R addre

Re: [R] Failing to install {rggobi} on win-7 R 2.12.0

2011-01-26 Thread Tal Galili
Hello Prof Brian Ripley, Yihui and Tom,

Thank you for your suggestions.  It seemed to have made some differences in
the error massages - but rggobi still fails to load.

Steps taken:
1) I removed the old GTK (through the uninstall interface)
2) I ran  library(RGtk2) which downloaded the new GTK-runtime
version 2.22.0-2010-10-21 (instead of the one I got from ggobi, which
was 2.12.9-2).
3) I downloaded both
ftp://ftp.zlatkovic.com/libxml/libxml2-2.7.7.win32.zip and
ftp://ftp.zlatkovic.com/libxml/iconv-1.9.2.win32.zip
Unzipped them, and moved their dll's (from their bin directory), into -
C:\Program Files (x86)\GTK2-Runtime\bin
4) I then tried starting rggobi:  library(rggobi)  and got the following
error massages:

Error 1:

the program can't start because
libgdk-win32-2.0-0.dll
is missing from your computer.
Try reinstalling the program to fix this problem.

It then tried to reinstall GTK, and after I refused to, it sent the second
Error massage:

the program can't start because
libfreetype-6.dll
is missing from your computer.
Try reinstalling the program to fix this problem.



Any suggestions what else I should try?

Many thanks for helping,
Tal




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Jan 26, 2011 at 9:17 AM, Prof Brian Ripley wrote:

> On Tue, 25 Jan 2011, Tom La Bone wrote:
>
>  I recall that my problem on Windows was related to having a number of
>> stray
>> versions of GTK+ installed. I went back and deleted all versions and
>> reinstalled the latest GTK+ and that seemed to fix things. However, when I
>> went to do any work of substance ggobi locked up and became unresponsive.
>> Never did get it working right on Windows. Had much more luck with R/ggobi
>> on Ubuntu 10.10.
>>
>
> I've just been setting rggobi up for our classroom.  It seems that on
> Windows we now need to use Rgui in SDI mode to run rggobi without lookups.
>  (That was not the case last year, so it might be due to the change in GTK+
> version or it might be due to the change from XP to x64 Windows 7 on those
> machines.)
>
> The rggobi binary on CRAN extras is statically linked against everything
> except GTK+, but the www.ggobi.org ggobi DLL needs both GTK+ DLLs and
> libxml2.dll (which needs iconv.dll and zlib1.dll). Late last year there was
> a problem in that GTK+ and libxml2.dll needed different zlib1.dll's, but
> AFAICS this is now resolved by using
> ftp://ftp.zlatkovic.com/libxml/libxml2-2.7.7.win32.zip and
> ftp://ftp.zlatkovic.com/libxml/iconv-1.9.2.win32.zip.  (Unpack those and
> drop the DLLs into somewhere on your path, e.g. the GTK+ bin directory.)
>
> We've had a lot of trouble over zlib1.dll: those prepared from zLib 1.2.3
> and 1.2.5 are incompatible.  The whole point of the '1' in the name is to
> change the name in that case!  I suspect very few of those benefitting from
> Windows binary packages have any idea how much work goes into circumventing
> such issues.
>
>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel:  +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UKFax:  +44 1865 272595
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Counting number of rows with two criteria in dataframe

2011-01-26 Thread Hadley Wickham
On Wed, Jan 26, 2011 at 5:27 AM, Dennis Murphy  wrote:
> Hi:
>
> Here are two more candidates, using the plyr and data.table packages:
>
> library(plyr)
> ddply(X, .(x, y), function(d) length(unique(d$z)))
>  x y V1
> 1 1 1  2
> 2 1 2  2
> 3 2 3  2
> 4 2 4  2
> 5 3 5  2
> 6 3 6  2
>
> The function counts the number of unique z values in each sub-data frame
> with the same x and y values. The argument d in the anonymous function is a
> data frame object.

Another approach is to use the much faster count function:

count(unique(X))

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with "setMethod" in R package

2011-01-26 Thread Pascal A. Niklaus

Dear all,

My apologies for re-posting this question, but I have not found any 
solution to my problem so far. I also think that my post may have been 
overseen due to the posting time and high traffic on this list.


I experience a problem in implementing a S4 class in a package and would 
appreciate your help in this matter.


The class is implemented using Rcpp, which works well. I then extend the 
class with R functions, adding them as methods with a call to 
setMethod(...). When I do this outside the package (after loading the 
package with "library(...)"), everything works smoothly. However, when I 
try to incorporate the call to setMethod(...) in the package code 
itself, the only way I get it to work is to include the call in 
.onLoad(...) or .onAttach(...). This works, but when loading the library 
I get the messages "creating new generic function for "plot" in 
".GlobalEnv", as well as "no definition for class "Rcpp_rothC""...


Again, the code works, but the messages let me presume that I add the 
methods in the wrong way or miss-specify name spaces, class names, or 
something else.


Thank you for your advice.

Pascal

(The error message and the relevant parts of the package files are 
pasted below, together with my related questions as "comments".)




showMethods("plot")

Function "plot":
   # I am surprised about this --
   # why isn't plot a generic function?


library("RrothC2")

Loading required package: Rcpp
Loading required package: pascal
Creating a new generic function for "plot" in ".GlobalEnv"
in method for ‘plot’ with signature ‘x="Rcpp_rothC",y="missing"’: no 
definition for class "Rcpp_rothC"

   # class seems not to be known
   # at this stage.
   # but where should I put setMethod()?

r1 <- new(rothC$rothC);
class(r1)# class is known by now

[1] "Rcpp_rothC"
attr(,"package")
[1] ".GlobalEnv"

plot(r1) # "plot" is dispatched correctly



showMethods("plot")

Function: plot (package graphics) # now "plot" from graphics shows up
x="ANY", y="ANY"  # why was it missing before?
x="Rcpp_rothC", y="missing"


# FILE rcpp_rothC_module.h #

class rothC { ... }

# FILE rcpp_rothC_module.cpp #

using namespace Rcpp ;

RCPP_MODULE(rothC) {
  class_("rothC")
.constructor()
   ...
}

# FILE rcpp_rothC.R #

Rcpp_rothC.plot <- function(x,y,...) { ... }

## FILE zzz.R ##

.NAMESPACE <- environment()

rothC <- new( "Module" )

.onLoad <- function(libname, pkgname) {
  unlockBinding( "rothC" , .NAMESPACE )
  assign( "rothC",  Module( "rothC" ), .NAMESPACE )

setMethod(f="plot",signature(x="Rcpp_rothC",y="missing"),definition=Rcpp_rothC.plot,where=.GlobalEnv);
  lockBinding( "rothC", .NAMESPACE )
}

.onAttach <- function(libname, pkgname) {}

.onUnload <- function(libname, pkgname) { gc(); }

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Projecting data onto a NMS plot

2011-01-26 Thread cystis

Hello all and thanks for the help.

I am analyzing a dataset using MetaMDS and I would like to project some
extra samples onto the plot such that  the extra samples do not play a role
in defining the axes. I have been thinking of different ways of doing this
and I was wondering if anyone had a suggestion for an easy way to do this 
(possibly a way to weight samples in the analysis?)

Thanks
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Projecting-data-onto-a-NMS-plot-tp3238006p3238006.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mvoutlier

2011-01-26 Thread stephen sefick
I would look into ggplot2.  I use this quite frequently to do what you  
are talking about, and also for most of my plotting.  Hadley has done  
a wonderful job with this package.

kindest regards,

Stephen

On Jan 26, 2011, at 3:48 AM, Claudia Paladini wrote:



  Dear R-users,
  I used x.out=sign1(data,makeplot=TRUE) from the package mvoutlier  
to detect

  multivariate outliers.
  I would like to label the points in the resulting plot with the  
row names of
  my data set. But none of my attempts does lead to a result. Can  
anybody help

  me, please?
  Best regards
  Claudia


  Neu: WEB.DE De-Mail - Einfach wie E-Mail, sicher wie ein Brief!
  Jetzt De-Mail-Adresse reservieren: [1]https://produkte.web.de/go/demail02

References

  1. https://produkte.web.de/go/demail02
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 2 functions with same name - what to do to get the one I want

2011-01-26 Thread David Winsemius


On Jan 26, 2011, at 5:38 AM, pdb wrote:



There seems to be 2 functions call ecdf...

http://lib.stat.cmu.edu/S/Harrell/help/Hmisc/html/ecdf.html


Apparently that used to be its name and that some time in the  
intervening 10 years the name was changed to Ecdf. The Statlib  
repository has rather ancient version of the Hmisc documentation.  
Notice that it gives Harrell's UVa address. You should not use that  
repository for documentation. Use the RSiteSearch:


http://search.r-project.org/cgi-bin/namazu.cgi?query=ecdf&max=100&result=normal&sort=score&idxname=functions

--
David.



http://127.0.0.1:11885/library/stats/html/ecdf.html

How do I get the one ecdf {Hmisc} to run instead of the ecdf {stats}

A pointer in the right direction would be greatly appreciated.


Tried to instal Hmisc but got this message, so I assume I have it


utils:::menuInstallPkgs()

Warning: package 'Hmisc' is in use and will not be installed



ran the demo from Hmisc with no luck...


set.seed(1)
ch <- rnorm(1000, 200, 40)
ecdf(ch, xlab="Serum Cholesterol")

Error in ecdf(ch, xlab = "Serum Cholesterol") :
 unused argument(s) (xlab = "Serum Cholesterol")


ran the sample code from stats and it worked...


x <- rnorm(12)
Fn <- ecdf(x)
Fn # a *function*

Empirical CDF
Call: ecdf(x)
x[1:12] = -1.9123, -1.6626, -1.2468,  ..., 1.1119,  1.135

Fn(x)  # returns the percentiles for x

[1] 1. 0.9167 0. 0.6667 0.5833 0.1667
0.7500 0.0833 0.2500 0.8333 0.4167 0.5000

tt <- seq(-2,2, by = 0.1)
12 * Fn(tt) # Fn is a 'simple' function {with values k/12}
[1]  0  1  1  1  2  2  2  2  3  3  3  3  4  4  4  5  5  5  6  6  6   
7  7  8

8  8  8  8  8  9 10 10 12 12 12 12 12 12 12 12 12




--
View this message in context: 
http://r.789695.n4.nabble.com/2-functions-with-same-name-what-to-do-to-get-the-one-I-want-tp3237788p3237788.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] barchart panel.text add label value and percent

2011-01-26 Thread Christophe Bouffioux
Hello everybody,

i need some help to display text as label in my barchart
the label is the combination of x value + text
text= calculated percentage => per
it display properly the x value
but, wrongly repeats the text of the fisrt level LangueTXT factor on the
second

any solution?
Thanx very much
Christophe


 here is the code ##
library(lattice)
Langue <- c(1, 1, 1, 2, 2, 2, 2)
n03interessantscore <- c(1, 2, 3, 1, 2, 3, 4)
count <- c(89, 148, 16, 88, 192, 28, 7)
sumcount <- c(253, 253, 253, 315, 315, 315, 315)
per <- c('35.2%','58.5%','6.3%','27.9%','61%','8.9%','2.2%')
LangueTXT <- c('Nl','Nl','Nl','Fr','Fr','Fr','Fr')


databar <- data.frame(Langue, n03interessantscore, count, sumcount, per,
LangueTXT)

barchart(n03interessantscore ~ count| LangueTXT, data=databar,
 layout=c(1,max(databar$Langue)), stack=TRUE, rectangles=TRUE,
horizontal=TRUE,
 ylab="Score",
 xlab="Count",
 col="grey",
 main="3.0z Trouvez-vous ce rapport intéressant?",
 border="NA",
 panel= function(y,x,...){panel.grid(h=0, v=-1, col="gray")
 Y <- tapply(y, y, unique)
  panel.barchart(x,y,...)
  panel.text((x-0.1*x), Y, label= paste(round(x,0),'-',
databar$per), cex=0.9)}
 )

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] MAtrix addressing

2011-01-26 Thread David Winsemius


On Jan 26, 2011, at 2:47 AM, Alaios wrote:


The reason is the following image
http://img545.imageshack.us/i/maptoregion.jpg/
In the picture above you will find the indexes for each cell.

Also you will see that I place that matrix inside a x,y region that  
spans from -1 to 1. I am trying to write one function that will get  
as argument a (x,y) value x e[-1,1] y e[-1,1] and will return the  
indexes of that cell tha x,y value correspond to.




I really do not have a clue how I should try to approach that to  
solve it. So based on some version I had for 1-d vector I tried to  
extend it for 2-d. I used findInterval as a core to get results.
Unfortunately my code fails to produce accurate results as my  
approach 'assumes' (this is something inhereted by the find Interval  
function) that the numbering starts bottom left and goes high top  
right.

You will find my code below


If one wants to take an ordinary r matrix and reorder it in the manner  
you describe:


mtx2 <- mtx[ nrow(mtx):1, ]

Whether that is an efficient way to get at the sokution your you  
programming task I cannot say. It sounds as though it has gotten too  
convoluted. I was not able to comprehend the overall goal from your  
problem description.







sr.map <- function(sr){
# This function converts the s(x,y) matrix into a function x that  
spans #from -1 to 1 and y spans from -1 to 1.

# Input: sr a x,y matrix containing the shadowing values of a Region
breaksX <- seq(from=-1, to = 1, length = nrow(sr) +1L )
breaksY <- seq(from=-1, to = 1, length = ncol(sr) + 1L)
function(x,y){ # SPAGGETI CODE FOR EVER
indx <- findInterval(x, breaksX,rightmost.closed=TRUE)
 indy <- findInterval(y, breaksY,rightmost.closed=TRUE)
 c(indx,indy)
}
}



sr<-matrix(data=seq(from=1,to=36),nrow=6,ncol=6,byrow=TRUE)
f.sr.map<-sr.map((sr))
f.sr.map(-0.1,-0.1)
f.sr.map(0.1,0.1)



Best Regards
Alex





--- On Wed, 1/26/11, David Winsemius  wrote:


From: David Winsemius 
Subject: Re: [R] MAtrix addressing
To: "Alaios" 
Cc: R-help@r-project.org
Date: Wednesday, January 26, 2011, 2:54 AM

On Jan 25, 2011, at 4:50 PM, Alaios wrote:


Hello
I would like to ask you if it is possible In R Cran to

change the default way of addressing a matrix.

for example
matrix(data=seq(from=1,to=4,nrow=2,ncol=2, by row

numbering) # not having R at this pc


will create something like the following
1 2
3 4

the way R address this matrix is from top left corner

moving to bottom right.

The cell numbers in that way are
1 2
3 4

IS it possible to change this default addresing number

to something that goes bottom left to top right? In this
simple case I want to have

3 4
1 2

Would that be possible?


Yes. it's possible but ... why?



I would like to thank y for your help
Regards
Alex

__
R-help@r-project.org

mailing list

https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained,

reproducible code.

David Winsemius, MD
West Hartford, CT








David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave: \Sexpr{} inside <<>>?

2011-01-26 Thread Duncan Murdoch

On 11-01-25 8:22 PM, zerfetzen wrote:
>
> Hi,
> Is it possible in Sweave to put \Sexpr{} inside<<>>?  This is a bad
> example, but here goes:
>
> <>
> Age<- 5
> @
>
> <<>>
> x<- \Sexpr{Age}
> @
>
> I'm trying to get it to display x<- 5, rather than x<- Age.  It's 
probably

> so obvious I'm going to feel sorry for having to ask, just the same, I'm
> stumped.  Any ideas?  Thanks.

No, you can't do that.  There are a couple of ways to do what you want. 
 Probably the easiest is to do what Sweave would do:


<>=
Age<- 5
@

\begin{Schunk}
\begin{Sinput}
> x<- \Sexpr{Age}
\end{Sinput}
\end{Schunk}

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] a problem with is.na

2011-01-26 Thread Martyn Byng
Hi,

This isn't an issue with is.na, you get the same if you use

aa = c(1,1,0,1,1,1,1,1,1)
bb = abs(aa - 1)
xtabs(aa~x1+x2)
xtabs(bb~x1+x2)

it is because you do not have any data in (1,1), i.e. there is no case where x1 
= 1 and x2 = 1 so xtabs is putting a zero in that cell

Hope this helps

Martyn

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of René Holst
Sent: 26 January 2011 11:05
To: r-help@r-project.org
Subject: [R] a problem with is.na

Hello,

I have observed the following odd behavior of "is.na( )" and hope someone
can give me an explanation
Example:
X1=rep(1:2,5)[-1]
X2=rep(1:5,rep(2,5))[-1]
y= runif(9)
y[3]=NA
xtabs(y~x1+x2)

Now

xtabs(is.na(y)~x1+x2) says that cell 2,2 is NA

   x2
x1  1 2
  1 0 0
  2 0 1
  3 0 0
  4 0 0
  5 0 0

Whereas  

xtabs(!is.na(y)~x1+x2) says that all but cell 1,1 and 2,2 are not NA
   x2
x1  1 2
  1 0 1
  2 1 0
  3 1 1
  4 1 1
  5 1 1 

An explanation will be much appriciated

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


This e-mail has been scanned for all viruses by Star.\ _...{{dropped:12}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hwo to speed up "aggregate"

2011-01-26 Thread Henrique Dallazuanna
Try this:

 unique(transform(df, quantity = ave(quantity, client, date, name, FUN =
sum), branch = NULL))

On Wed, Jan 26, 2011 at 8:39 AM, analys...@hotmail.com <
analys...@hotmail.com> wrote:

> I have
>
> > df
>   quantity branch client   date  name
> 110  1  1 2010-01-01   one
> 220  2  1 2010-01-01   one
> 330  3  2 2010-01-01   two
> 415  4  1 2010-01-01   one
> 510  5  2 2010-01-01   two
> 620  6  3 2010-01-01 three
> 7  1000  1  1 2011-01-01   one
> 8  2000  2  1 2011-01-01   one
> 9  3000  3  2 2011-01-01   two
> 10 1500  4  1 2011-01-01   one
> 11 1000  5  2 2011-01-01   two
> 12 2000  6  3 2011-01-01 three
>
> I want to aggregate away the branch. I followed a suggestion by Gabor
> (thanks) and did
>
> >
> aggregate(list(quantity=df$quantity),list(client=df$client,date=df$date),sum)
>  client   date quantity
> 1  1 2010-01-01   45
> 2  2 2010-01-01   40
> 3  3 2010-01-01   20
> 4  1 2011-01-01 4500
> 5  2 2011-01-01 4000
> 6  3 2011-01-01 2000
>
> I want df$name also in the output and did what looked obvious:
>
> >
> aggregate(list(quantity=df$quantity),list(client=df$client,date=df$date,name=df$name),sum)
>  client   date  name quantity
> 1  1 2010-01-01   one   45
> 2  1 2011-01-01   one 4500
> 3  3 2010-01-01 three   20
> 4  3 2011-01-01 three 2000
> 5  2 2010-01-01   two   40
> 6  2 2011-01-01   two 4000
>
> It seems to work, but slows down tremendously for a dataframe with
> around a 1000 rows.
>
> Could anyone explain what is going on and suggest a way out?
>
> Thanks.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] mvoutlier

2011-01-26 Thread Claudia Paladini

   Dear R-users,
   I used x.out=sign1(data,makeplot=TRUE) from the package mvoutlier to detect
   multivariate outliers.
   I would like to label the points in the resulting plot with the row names of
   my data set. But none of my attempts does lead to a result. Can anybody help
   me, please?
   Best regards
   Claudia


   Neu: WEB.DE De-Mail - Einfach wie E-Mail, sicher wie ein Brief!
   Jetzt De-Mail-Adresse reservieren: [1]https://produkte.web.de/go/demail02

References

   1. https://produkte.web.de/go/demail02
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] a problem with is.na

2011-01-26 Thread Henrique Dallazuanna
There isn't combination of c(1, 1), so is NA:

tapply(y, list(X1, X2), sum)

On Wed, Jan 26, 2011 at 9:04 AM, René Holst  wrote:

> Hello,
>
> I have observed the following odd behavior of "is.na( )" and hope someone
> can give me an explanation
> Example:
> X1=rep(1:2,5)[-1]
> X2=rep(1:5,rep(2,5))[-1]
> y= runif(9)
> y[3]=NA
> xtabs(y~x1+x2)
>
> Now
>
> xtabs(is.na(y)~x1+x2) says that cell 2,2 is NA
>
>   x2
> x1  1 2
>  1 0 0
>  2 0 1
>  3 0 0
>  4 0 0
>  5 0 0
>
> Whereas
>
> xtabs(!is.na(y)~x1+x2) says that all but cell 1,1 and 2,2 are not NA
>   x2
> x1  1 2
>  1 0 1
>  2 1 0
>  3 1 1
>  4 1 1
>  5 1 1
>
> An explanation will be much appriciated
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hwo to speed up "aggregate"

2011-01-26 Thread Dennis Murphy
If you have a large data frame, one option is package data.table. Try the
following:

library(data.table)
dt <- data.table(df)
dt[, list(qsum = sum(quantity)), by ='client, date, name']
 client   date  name qsum
[1,]  1 2010-01-01   one   45
[2,]  1 2011-01-01   one 4500
[3,]  2 2010-01-01   two   40
[4,]  2 2011-01-01   two 4000
[5,]  3 2010-01-01 three   20
[6,]  3 2011-01-01 three 2000

BTW, the leading comma after the opening bracket is not a typo :)

For R versions >= 2.11.x, aggregate() has a formula interface that saves a
fair bit of typing:

aggregate(quantity ~ client + date + name, data = df, FUN = sum)
  client   date  name quantity
1  1 2010-01-01   one   45
2  1 2011-01-01   one 4500
3  3 2010-01-01 three   20
4  3 2011-01-01 three 2000
5  2 2010-01-01   two   40
6  2 2011-01-01   two 4000

A third option is package plyr and function ddply():

library(plyr)
ddply(df, .(client, date, name), summarise, qsum = sum(quantity))
# same output as data.table

Hopefully one or more of these will improve your processing time.

Dennis

On Wed, Jan 26, 2011 at 2:39 AM, analys...@hotmail.com <
analys...@hotmail.com> wrote:

> I have
>
> > df
>   quantity branch client   date  name
> 110  1  1 2010-01-01   one
> 220  2  1 2010-01-01   one
> 330  3  2 2010-01-01   two
> 415  4  1 2010-01-01   one
> 510  5  2 2010-01-01   two
> 620  6  3 2010-01-01 three
> 7  1000  1  1 2011-01-01   one
> 8  2000  2  1 2011-01-01   one
> 9  3000  3  2 2011-01-01   two
> 10 1500  4  1 2011-01-01   one
> 11 1000  5  2 2011-01-01   two
> 12 2000  6  3 2011-01-01 three
>
> I want to aggregate away the branch. I followed a suggestion by Gabor
> (thanks) and did
>
> >
> aggregate(list(quantity=df$quantity),list(client=df$client,date=df$date),sum)
>  client   date quantity
> 1  1 2010-01-01   45
> 2  2 2010-01-01   40
> 3  3 2010-01-01   20
> 4  1 2011-01-01 4500
> 5  2 2011-01-01 4000
> 6  3 2011-01-01 2000
>
> I want df$name also in the output and did what looked obvious:
>
> >
> aggregate(list(quantity=df$quantity),list(client=df$client,date=df$date,name=df$name),sum)
>  client   date  name quantity
> 1  1 2010-01-01   one   45
> 2  1 2011-01-01   one 4500
> 3  3 2010-01-01 three   20
> 4  3 2011-01-01 three 2000
> 5  2 2010-01-01   two   40
> 6  2 2011-01-01   two 4000
>
> It seems to work, but slows down tremendously for a dataframe with
> around a 1000 rows.
>
> Could anyone explain what is going on and suggest a way out?
>
> Thanks.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >