Re: [R] How to use bash command in R script?

2007-02-27 Thread Peter Dalgaard
Guo Wei-Wei wrote:
> Thank you all! I solved my problem with your help.
>   
Come to think of it, it might be more to the point to use scan() on a
pipe():

con <- pipe("mxresult.sh ABC.mx", "r")
mynum <- scan(con)
close(con)
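
An equivalent route, if the pipe() idiom is inconvenient, is to capture the command's standard output with system() and convert it. A sketch: mxresult.sh is the script from the thread, and that its output is purely numeric is an assumption.

```r
## Sketch: capture stdout as character lines, then convert.
## Assumes mxresult.sh prints nothing but numbers.
out <- system("mxresult.sh ABC.mx", intern = TRUE)
mynum <- as.numeric(out)
```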

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: RFA

2007-02-27 Thread amna khan
-- Forwarded message --
From: amna khan <[EMAIL PROTECTED]>
Date: Feb 25, 2007 9:51 PM
Subject: RFA
To: R-help@stat.math.ethz.ch, [EMAIL PROTECTED]

Dear Sir, in the following example, is the vector lmom a vector of
L-moment ratios? What is meant by size = northCascades[,1]? And what are
the values in c(0.0104, 0.0399, 0.0405)?

Please help me; I am unable to understand these from the help manual.

Best Regards
AMINA



data(northCascades)
lmom <- c(1, 0.1103, 0.0279, 0.1366)
kappaParam <- kappalmom(lmom)
heterogeneity(500, 19, size = northCascades[,1],
              kappaParam, c(0.0104, .0339, .0405))
##The heterogeneity statistics given by Hosking for this case
##study are H1 = 0.62, H2 = -1.49 and H3 = -2.37
##Taking into account sample variability, results should be
##consistent


-- 
AMINA SHAHZADI
Department of Statistics
GC University Lahore, Pakistan.
Email:
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]






[R] Fwd: RFA and nsRFA

2007-02-27 Thread amna khan
-- Forwarded message --
From: amna khan <[EMAIL PROTECTED]>
Date: Feb 25, 2007 9:37 PM
Subject: RFA and nsRFA
To: [EMAIL PROTECTED], R-help@stat.math.ethz.ch

Dear Sir,
There are two packages for regional frequency analysis, RFA and nsRFA. Do
both give the same results? If not, which one would you suggest? I am
confused about this.
Please guide me in this regard.
AMINA

-- 
AMINA SHAHZADI
Department of Statistics
GC University Lahore, Pakistan.
Email:
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]






[R] Fwd: nsRFA

2007-02-27 Thread amna khan
-- Forwarded message --
From: amna khan <[EMAIL PROTECTED]>
Date: Feb 25, 2007 9:44 PM
Subject: nsRFA
To: R-help@stat.math.ethz.ch, [EMAIL PROTECTED]

Dear Sir,
I do not understand the HOMTESTS in package nsRFA. Is the vector x the
data from all sites combined in one vector? How do I assign "cod"?

Your help is much appreciated.
Regards
AMINA
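
For reference, a sketch of how the arguments fit together, assuming the HW.tests(x, cod, ...) interface documented under ?HOMTESTS in nsRFA; the site data below are made up.

```r
## Sketch: x holds all observations stacked in one vector; cod is a
## parallel vector giving the site each observation belongs to.
library(nsRFA)
site1 <- rnorm(30, 100, 10)            # hypothetical site records
site2 <- rnorm(25, 120, 15)
x   <- c(site1, site2)
cod <- rep(c("s1", "s2"), c(30, 25))
HW.tests(x, cod)                       # Hosking-Wallis heterogeneity tests
```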

-- 
AMINA SHAHZADI
Department of Statistics
GC University Lahore, Pakistan.
Email:
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]






Re: [R] How to use bash command in R script?

2007-02-27 Thread Guo Wei-Wei
That's a great way to do it. You answered a lot of things that I was
about to ask. Thank you very much!

Best wishes,
Wei-Wei

2007/2/27, Peter Dalgaard <[EMAIL PROTECTED]>:
> Guo Wei-Wei wrote:
> > Thank you all! I solved my problem with your help.
> >
> Come to think of it, it might be more to the point to  use scan() on a
> pipe():
>
> con <- pipe("mxresult.sh ABC.mx", "r")
> mynum <- scan(con)
> close(con)
>



[R] Additional args to fun in integrate() not found?

2007-02-27 Thread Sergey Goriatchev
Hello, fellow Rdicts,

I have the code for the program below. I need to integrate a function
of "x" and "p". I use integrate() to integrate over "x" and pass "p" as
an additional argument. "p" is specified and given a default value in
the argument list. Still, integrate() cannot read "p" unless I
explicitly insert a numeric value in the integrate() argument list.
And when I do that, I get the right result, but still some warnings.

Please, help me with these problems:
1) why is "p" not recognized?
2) what are these warning messages?

PROGRAM CODE:
---

#THIS LIBRARY IS NEEDED FOR THE INCOMPLETE GAMMA FUNCTION

library(zipfR)

#

gedCDF = function(yvec, p=2, mu=0, scale=1, numint=0)
{

#-
#Setting k to sqrt(2) and the GED with p=2 coincides with standard normal.
#Set k=1 and GED with p=1 coincides with Laplace.

k<-sqrt(2)
#k<-1
scale<-scale*k

zvec<-(yvec-mu)/scale
cdf<-matrix(0, length(zvec),1)

for(i in 1:length(zvec))
{
z<-zvec[i]

if(numint==0)
{
if(z<=0)
{
t<-0.5*(1-1/gamma(1/p)*Igamma((1/p),(-z)^p,lower=TRUE))
}
else
{
t<-1-(0.5*(1-1/gamma(1/p)*Igamma((1/p),(z)^p,lower=TRUE )))
}
}
else
{
t<-integrate(geddenstandard, -35, z, subdivisions=1000,
rel.tol=100*.Machine$double.eps, abs.tol=rel.tol,
stop.on.error=TRUE, keep.xy=FALSE, aux=NULL,p)
}
cdf[i]<-t
}
cdf
}

#-
geddenstandard = function(z,p)
{
f<-p/(2*gamma(1/p))*exp(-abs(z)^p)
}
---

If I run with this definition I get the following error message and abort:
> gedCDF(c(1,2,3,4,5), numint=1)
Error in eval(expr, envir, enclos) : ..1 used in an incorrect context,
no ... to look in

If I replace "p" in integrate() with 2, I get correct answers, but
still some warning messages:

> gedCDF(c(1,2,3,4,5), numint=1)
[[1]]
[1] 0.8413447

[[2]]
[1] 0.9772499

[[3]]
[1] 0.9986501

[[4]]
[1] 0.683

[[5]]
[1] 0.997

Warning messages:
1: number of items to replace is not a multiple of replacement length
2: number of items to replace is not a multiple of replacement length
3: number of items to replace is not a multiple of replacement length
4: number of items to replace is not a multiple of replacement length
5: number of items to replace is not a multiple of replacement length

---
Do I get these warnings because I define cdf as a matrix and the
output-cdf is a list?

Please, help me with these!
Email to my gmail account, please: [EMAIL PROTECTED]

Thanks in advance
Sergey



[R] Additional: to integrate()

2007-02-27 Thread Sergey Goriatchev
Hi again, people

I found that there is some problem with the rel.tol argument in the
integrate() function: it is not found. Maybe because it is a call to
.Machine$double.eps? If I specify rel.tol as an argument to gedCDF,
everything works, except for those warning messages.

Strange.

-- 
Laziness is nothing more than the habit of resting before you get tired.
- Jules Renard (writer)

Experience is one thing you can't get for nothing.
- Oscar Wilde (writer)

When you are finished changing, you're finished.
- Benjamin Franklin (statesman)



Re: [R] Additional args to fun in integrate() not found?

2007-02-27 Thread Dimitris Rizopoulos
you need 'p = p' in the integrate() call; try this version:

gedCDF <- function (yvec, p = 2, mu = 0, scale = 1, numint = 0) {
    k <- sqrt(2)
    # k <- 1
    scale <- scale * k
    zvec <- (yvec - mu) / scale
    cdf <- numeric(length(zvec))
    for (i in seq(zvec)) {
        z <- zvec[i]
        cdf[i] <- if (numint == 0) {
            if (z <= 0) {
                0.5 * (1 - 1/gamma(1/p) * Igamma((1/p), (-z)^p, lower = TRUE))
            } else {
                1 - (0.5 * (1 - 1/gamma(1/p) * Igamma((1/p), (z)^p, lower = TRUE)))
            }
        } else {
            integrate(geddenstandard, lower = -35, upper = z,
                      rel.tol = 100 * .Machine$double.eps, p = p)$value
        }
    }
    cdf
}

geddenstandard <- function (z, p) {
p / (2 * gamma(1/p)) * exp(-abs(z)^p)
}

###

gedCDF(c(1,2,3,4,5), numint=1)
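
The underlying rule, reduced to a minimal sketch (the integrand here is made up): extra parameters for the integrand travel through integrate()'s '...', and are safest passed by name, exactly as 'p = p' above.

```r
## Extra arguments for f must be named in the integrate() call.
f <- function(x, p) x^p
integrate(f, lower = 0, upper = 1, p = 2)$value  # integral of x^2 on [0,1], ~1/3
```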


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm





Re: [R] Coxph and ordered factors

2007-02-27 Thread Florent Baty
Dear Terry,

Thanks for your answer. I realised that the output of coxph() changes
substantially depending on whether or not I use an ordered factor:

Here is the output I obtain when using variable 'stage2' as an ordered 
factor:

Call:
coxph(formula = Surv(survTime, status) ~ stage2, data = survData)


             coef exp(coef) se(coef)       z       p
stage2.L   2.3430    10.413    0.709  3.3033 0.00096
stage2.Q  -0.0498     0.951    0.629 -0.0792 0.94000
stage2.C  -0.0468     0.954    0.541 -0.0866 0.93000

Likelihood ratio test=30.2  on 3 df, p=1.25e-06  n= 61

Here is the output when using 'stage2' without ordering:

Call:
coxph(formula = Surv(survTime, status) ~ stage2, data = survData)


coef exp(coef) se(coef) z  p
stage22 1.06  2.87 1.16 0.914 0.3600
stage23 2.17  8.73 1.10 1.975 0.0480
stage24 3.12 22.70 1.03 3.037 0.0024

Likelihood ratio test=30.2  on 3 df, p=1.25e-06  n= 61

Finally here is the output when coding 'stage2' as a numerical variable:

Call:
coxph(formula = Surv(survTime, status) ~ stage2, data = survData)


        coef exp(coef) se(coef)    z       p
stage2  1.03      2.79    0.226 4.53 5.8e-06

Likelihood ratio test=30.2  on 1 df, p=3.94e-08  n= 61


It seems that when using ordered factors a model with linear, quadratic 
and cubic terms is fitted. Is that expected? Can we restrict attention to 
a simpler model including only the linear term? I am not sure I fully 
understand what is actually done when factors are ordered.
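
For reference: an ordered factor is coded with polynomial contrasts, so a four-level factor contributes linear (.L), quadratic (.Q) and cubic (.C) columns to the design matrix, while an unordered factor gets treatment dummies. A quick way to inspect the two codings:

```r
## The design columns R builds for the two codings of a 4-level factor.
f <- factor(c("1", "2", "3", "4"))
contrasts(f)              # treatment contrasts: dummy columns for levels 2-4
contrasts(as.ordered(f))  # polynomial contrasts: .L, .Q, .C columns
```

To fit only a linear effect, use the numeric coding (as in the third model above); the .L term alone is not equivalent, since the three polynomial terms are fitted jointly.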

Best wishes,

Florent

-- 
--
Dr Florent BATY
Pulmonary Gene Research, Universitätsspital Basel
   Petersgraben 4, CH-4031 Basel, Switzerland
 tel: +41 61 265 57 27 - fax: +41 61 265 45 87



[R] merge two colums

2007-02-27 Thread Aimin Yan
I have a data set like this

 > head(data.1A24_aa_model)
 V1V2 V3  V4 V5 V6
1 1A24 MODEL  1 ALA  1  84.47
2 1A24 MODEL  1 GLN  2  63.06
3 1A24 MODEL  1 TYR  3 107.72
4 1A24 MODEL  1 GLU  4  54.36
5 1A24 MODEL  1 ASP  5  67.01
6 1A24 MODEL  1 GLY  6 999.00

I want to change this to the following format:

V1 V2 V3  V4 V5
1 1A24 MODEL  1 ALA _1  84.47
2 1A24 MODEL  1 GLN _2  63.06
3 1A24 MODEL  1 TYR _3 107.72
4 1A24 MODEL  1 GLU _4  54.36
5 1A24 MODEL  1 ASP _5  67.01
6 1A24 MODEL  1 GLY _6 999.00

Does anyone know how to do this?

Aimin



Re: [R] merge two colums

2007-02-27 Thread Robin Hankin
Hi.

Use paste(..., sep="_") to create a new variable, then
data.frame() to join them.

Note that you have to name the columns explicitly; the
default names are too long for my tastes.


HTH

rksh



 > a <- data.frame(V1=1:3,V2=letters[1:3],V3=3:1,V4=100:102)
 > a
   V1 V2 V3  V4
1  1  a  3 100
2  2  b  2 101
3  3  c  1 102
 > data.frame(n1=a$V1,n2=paste(a$V2,a$V3,sep="_"),n4=a$V4)
   n1  n2  n4
1  1 a_3 100
2  2 b_2 101
3  3 c_1 102
 >



On 27 Feb 2007, at 11:31, Aimin Yan wrote:

> I have a data set like this
>
>> head(data.1A24_aa_model)
>  V1V2 V3  V4 V5 V6
> 1 1A24 MODEL  1 ALA  1  84.47
> 2 1A24 MODEL  1 GLN  2  63.06
> 3 1A24 MODEL  1 TYR  3 107.72
> 4 1A24 MODEL  1 GLU  4  54.36
> 5 1A24 MODEL  1 ASP  5  67.01
> 6 1A24 MODEL  1 GLY  6 999.00
>
> I want to change this to the following format:
>
> V1 V2 V3  V4 V5
> 1 1A24 MODEL  1 ALA _1  84.47
> 2 1A24 MODEL  1 GLN _2  63.06
> 3 1A24 MODEL  1 TYR _3 107.72
> 4 1A24 MODEL  1 GLU _4  54.36
> 5 1A24 MODEL  1 ASP _5  67.01
> 6 1A24 MODEL  1 GLY _6 999.00
>
> anyone know how to do this?
>
> Aimin
>

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



[R] training svm

2007-02-27 Thread David Meyer
Hello (whoever you are),

your data looks problematic. What does

head(ne_span_data)

reveal?

BTW, svm() will not handle NA values.

Best
David

-

Hello. I'm new to R and I'm trying to solve a classification problem. I have
a training dataset of about 40,000 rows and 50 columns. When I try to train a
support vector machine, it gives me this error after a few seconds:

  Error in predict.svm(ret, xhold) : Model is empty!

This is the code I use:

  ne_span_data <- as.matrix(read.table('ne_span.data.R.txt', header=TRUE,
row.names='id'))
  library('e1071')
  svm_ne_span_model <- svm(NE_type ~ . , ne_span_data)

it gives me:
Error in predict.svm(ret, xhold) : Model is empty!

A line from the ne_span.data.R.txt file:
  svt OTHER N N I S 2 NA NA NA NA NA A NA NA 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 train-s1m2
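
One plausible culprit (an assumption, not verified against the actual file): as.matrix() on a data frame that mixes character and numeric columns coerces everything to character, and svm() cannot use rows with NAs, which together can leave the model empty. A sketch that keeps the data frame, makes the label a factor, and drops incomplete rows first:

```r
library(e1071)
## Sketch: keep column types by staying with a data frame,
## make the class label a factor, and drop rows containing NAs.
ne_span_data <- read.table("ne_span.data.R.txt", header = TRUE,
                           row.names = "id")
ne_span_data$NE_type <- factor(ne_span_data$NE_type)
ne_span_data <- na.omit(ne_span_data)
svm_ne_span_model <- svm(NE_type ~ ., data = ne_span_data)
```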



Re: [R] fitting of all possible models

2007-02-27 Thread Frank E Harrell Jr
Indermaur Lukas wrote:
> Hi,
> Fitting all possible models (GLM) with 10 predictors results in loads of 
> (2^10 - 1) models. I want to do that in order to get the importance of the 
> variables (having an unbalanced variable design) by summing up the AIC 
> weights of the models that include a given variable, for every variable 
> separately. It's time-consuming and annoying to define all possible models 
> by hand. 
>  
> Is there a command, or easy solution to let R define the set of all possible 
> models itself? I defined models in the following way to process them with a 
> batch job:
>  
> # e.g. model 1
> preference<- formula(Y~Lwd + N + Sex + YY)
> 
> # e.g. model 2
> preference_heterogeneity<- formula(Y~Ri + Lwd + N + Sex + YY)  
> etc.
> etc.
>  
>  
> I appreciate any hint
> Cheers
> Lukas

If you choose the model from among the 2^10 - 1 that has the best AIC, that 
model will be badly biased.  Why look at so many?  Pre-specification of 
models, fitting full models with penalization, or using data 
reduction (masked to Y) may work better.
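
For the mechanical part of the question (the caveat about inference after search stands), the 2^10 - 1 formulas can be generated rather than written by hand. A sketch, assuming a data frame dat with response Y and the predictor names below (all hypothetical):

```r
## Enumerate every non-empty predictor subset, fit each GLM, and sum
## Akaike weights per variable.
vars <- c("Ri", "Lwd", "N", "Sex", "YY")
subsets <- unlist(lapply(seq_along(vars),
                         function(k) combn(vars, k, simplify = FALSE)),
                  recursive = FALSE)
fits <- lapply(subsets,
               function(v) glm(reformulate(v, response = "Y"), data = dat))
aic <- sapply(fits, AIC)
w <- exp(-(aic - min(aic)) / 2)
w <- w / sum(w)                                   # Akaike weights
importance <- sapply(vars, function(v)
    sum(w[sapply(subsets, function(s) v %in% s)]))
```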

Frank

>  
>  
>  
>  
>  
> °°° 
> Lukas Indermaur, PhD student 
> eawag / Swiss Federal Institute of Aquatic Science and Technology 
> ECO - Department of Aquatic Ecology
> Überlandstrasse 133
> CH-8600 Dübendorf
> Switzerland
>  
> Phone: +41 (0) 71 220 38 25
> Fax: +41 (0) 44 823 53 15 
> Email: [EMAIL PROTECTED]
> www.lukasindermaur.ch
> 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] fitting of all possible models

2007-02-27 Thread hadley wickham
Hi Lukas,

You may find my meifly package helpful.  It provides functions to
generate ensembles of models (eg. fitall) and then extract all the
coefficients, residuals etc (coef, summary, residual etc).  The main
point of the package is to visualise all these models, and I second
Frank's comment that merely selecting the best model will be perilous.

Unfortunately, the package is not on CRAN yet, but if you are
interested, contact me off list with your OS and I can email you the
package, and accompanying paper.

Regards,

Hadley




Re: [R] Crosstabbing multiple response data

2007-02-27 Thread John Kane
Thanks to everyone for this.  I was looking at the
same problem last night and was just about to write a
posting to R-help when I saw this.


--- Michael Wexler <[EMAIL PROTECTED]> wrote:

> 
> Thanks to Charles, Gabor, and a private message from
> Frank E Harrell with some good ideas and help.  This
> crossprod approach was very clever, I would never
> have thought of it.
> 
> Best, Michael
> 
> 
> - Original Message 
> From: Charles C. Berry <[EMAIL PROTECTED]>
> To: Michael Wexler <[EMAIL PROTECTED]>
> Cc: r-help@stat.math.ethz.ch
> Sent: Thursday, February 22, 2007 1:17:44 PM
> Subject: Re: [R] Crosstabbing multiple response data
> 
> 
> > res <- crossprod( as.matrix( ratings[ , -1] ) )
> > diag(res) <- ""
> > print(res, quote=F)
>      att1 att2 att3
> att1         2    1
> att2    2         2
> att3    1    2
> > 
> > res2 <- crossprod(as.matrix( ratings[ , -1])) *
> 100 / nrow( ratings )
> > res2[] <- paste( res2, "%", sep="" )
> > diag(res2) <- ""
> > print(res2, quote=F)
>      att1 att2 att3
> att1       50%  25%
> att2  50%       50%
> att3  25%  50%
> >
> 
> Be sure to bone up on format and sprintf before
> taking this into 
> production.
> 
> On Thu, 22 Feb 2007, Michael Wexler wrote:
> 
> > Using R version 2.4.1 (2006-12-18) on Windows, I
> have a dataset which resembles this:
> >
> > id  att1  att2  att3
> >  1     1     1     0
> >  2     1     0     0
> >  3     0     1     1
> >  4     1     1     1
> >
> > ratings <- data.frame(id = c(1,2,3,4), att1 =
> c(1,1,0,1), att2 = c(1,0,0,1), att3 = c(0,1,1,1))
> >
> > I would like to get a cross tab of counts of
> co-occurrence, which might resemble this:
> >
> >      att1 att2 att3
> > att1         2    1
> > att2    2         2
> > att3    1    2
> >
> > with the hope of understanding, at least pairwise,
> what things "hang together".   (Yes, there are much,
> much better ways to do this statistically including
> clustering and binary corrected correlation, but the
> audience I am working with asked for this version
> for a specific reason.)
> >
> > (Later on, I would also like to convert to
> percentages of the total unique pop, so the final
> version of the table would be
> >
> >
> >      att1 att2 att3
> > att1       50%  25%
> > att2  50%       50%
> > att3  25%  50%
> >
> >
> > But I can do this in excel if I can get the first
> table out.)
> >
> > I have tried the reshape library, but could not
> get anything resembling this (both on its own, as
> well as feeding in to table()).  (I have also played
> with transposing and using some comments from this
> list from 2002 and 2004, but the questioners appear
> to assume more knowledge than I have in use of R;
> the example in the posting guide was also more
> complex than I was ready for, I'm afraid.)
> >
> > Sample of some of my efforts:
> > library(reshape)
> > melt(ratings,id=c("id"))
> >
> > ds1 <- melt(ratings,id=c("id"))
> > table(ds1$variable, ds1$variable) # returns only
> rowcounts, 3 along diagonal
> > xtabs(formula = value ~ ds1$variable +
> ds1$variable , data=ds1) # returns only a single row
> of collapsed counts, appears to not allow 1 variable
> in multiple uses
> >
> > I suspect I am close, so any nudges in the right
> direction would be helpful.
> >
> > Thanks much, Michael
> >
> > PS: www.rseek.org is very impressive, I heartily
> encourage its use.
> >
> >
> > [[alternative HTML version deleted]]
> >
> 
> Charles C. Berry                   (858) 534-2098
> Dept of Family/Preventive Medicine
> E mailto:[EMAIL PROTECTED]         UC San Diego
> http://biostat.ucsd.edu/~cberry/   La Jolla, San Diego 92093-0901
> 
> 
> 
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained,
> reproducible code.
>




[R] How to put the dependent variable in GLM proportion model

2007-02-27 Thread Serguei Kaniovski

Hello everyone,

I am confused about how the dependent variable should be specified, e.g.
say S and F denote series of successes and failures. Is it

share<-S/(S+F)
glm(share~x,family=quasibinomial)

or

glm(cbind(S,F)~x,family=quasibinomial)

The two variants produce very different dispersion parameters and deviances.
The book by Crawley, the only R book I have, says the second variant is
correct for proportion data.

Serguei
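For what it's worth, a small simulated sketch (the names S, Fail, n, x are made up here): with binomial-type families the proportion form needs the binomial totals as prior weights, and then it coincides with the cbind() form. Without the weights the dispersion and deviance differ, which is likely what you are seeing.

```r
# Simulated sketch: S successes and Fail failures out of n trials
set.seed(1)
x    <- runif(50)
n    <- rpois(50, 20) + 1
S    <- rbinom(50, n, plogis(-1 + 2 * x))
Fail <- n - S

fit1 <- glm(cbind(S, Fail) ~ x, family = quasibinomial)
fit2 <- glm(S / n ~ x, weights = n, family = quasibinomial)  # totals as weights

all.equal(coef(fit1), coef(fit2))  # TRUE: the two fits are identical
```

Internally, glm() converts the two-column response into exactly this proportion-plus-weights form, so the two calls solve the same IRLS problem.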

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] fitting the gamma cumulative distribution function

2007-02-27 Thread Tim Bergsma
Hi.

I have a vector of quantiles and a vector of probabilities that, when 
plotted, look very much like the gamma cumulative distribution function.  I 
can guess some shape and scale parameters that give a similar result, 
but I'd rather let the parameters be estimated.  Is there a direct way 
to do this in R?

Thanks,

Tim.

week <- c(0,5,6,7,9,11,14,19,39)
fraction <- c(0,0.23279,0.41093,0.58198,0.77935,0.88057,0.94231,0.98583,1)
weeks <- 1:160/4
plot(weeks, pgamma(weeks,shape=6,scale=1.15),type="l")
points(week,fraction,col="red")

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] stl function

2007-02-27 Thread Anja Eggert
I want to apply the stl-function to decompose a time series (daily 
measurements over 22 years) into seasonal component, trend and 
residuals. I was able to get the diagrams.
However, I could not find out what the equations behind it are. I.e., it 
is presumably not an additive or multiplicative combination of a seasonal 
component (as sine and cosine functions) and a linear trend?
Furthermore, what are the grey bars on the right-hand side of the diagrams?
I would very much appreciate some information or perhaps a good 
reference.

Thank you very much,
Anja

-- 
*
Dr. Anja Eggert 
Universität Rostock
Institut für Biowissenschaften
AG Angewandte Ökologie

Albert-Einstein-Str. 3
18059 Rostock

T: ++49 381 498 6094
F: ++49 381 498 6072

e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting the gamma cumulative distribution function

2007-02-27 Thread Stephen Tucker
Hi Tim,

I believe fitdistr() in the MASS package is the function you are looking for.
(example given in help page)...

Best regards,
ST

--- Tim Bergsma <[EMAIL PROTECTED]> wrote:

> Hi.
> 
> I have a vector of quantiles and a vector of probabilites that, when 
> plotted, look very like the gamma cumulative distribution function.  I 
> can guess some shape and scale parameters that give a similar result, 
> but I'd rather let the parameters be estimated.  Is there a direct way 
> to do this in R?
> 
> Thanks,
> 
> Tim.
> 
> week <- c(0,5,6,7,9,11,14,19,39)
> fraction <- c(0,0.23279,0.41093,0.58198,0.77935,0.88057,0.94231,0.98583,1)
> weeks <- 1:160/4
> plot(weeks, pgamma(weeks,shape=6,scale=1.15),type="l")
> points(week,fraction,col="red")
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rpart minimum sample size

2007-02-27 Thread Amy Uhrin
Is there an optimal / minimum sample size for attempting to construct a 
classification tree using /rpart/?

I have 27 seagrass disturbance sites (boat groundings) that have been 
monitored for a number of years.  The monitoring protocol for each site 
is identical.  From the monitoring data, I am able to determine the 
level of recovery that each site has experienced.  Recovery is our 
categorical dependent variable with values of none, low, medium, high 
which are based upon percent seagrass regrowth into the injury over 
time.  I wish to be able to predict the level of recovery of future 
vessel grounding sites based upon a number of categorical / continuous 
predictor variables used here including (but not limited to) such 
parameters as:  sediment grain size, wave exposure, original size 
(volume) of the injury, injury age, injury location.

When I run /rpart/, the data is split into only two terminal nodes based 
solely upon values of the original volume of each injury.  No other 
predictor variables are considered, even though I have included about 
six of them in the model.  When I remove volume from the model the same 
thing happens but with injury area - two terminal nodes are formed based 
upon area values and no other variables appear.  I was hoping that this 
was a programming issue, me being a newbie and all, but I really think 
I've got the code right.  Now I am beginning to wonder if my N is too 
small for this method?

-- 
Amy V. Uhrin, Research Ecologist

NOAA, National Ocean Service
Center for Coastal Fisheries and Habitat Research
101 Pivers Island Road
Beaufort, NC 28516
(252) 728-8778
(252) 728-8784 (fax)
[EMAIL PROTECTED]


 \!/ \!/   <:}><   \!/ \!/  >^<**>^<  \!/ \!/ 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] survival analysis using rpart

2007-02-27 Thread Terry Therneau
> I use rpart to predict survival time and have a problem in interpreting the
> output of "estimated rate"

> (1) Is the "estimated rate" the estimated hazard rate ratio? 
> (2) How does rpart calculate this rate?
> (3) Suppose I use xpred.rpart(fit, xval=10) to perform 10-fold
> cross-validation using (a) the complete stagec data set and (b) only a
> subset of it, say, using the columns Age, EET, and G2 only. For the i-th
> patient, I am likely to obtain a different estimated rate. How can I
> meaningfully compare both rates? How can I say which one is "better"? 

For questions 1 and 2, you need to read the documentation.
   www.mayo.edu/biostatistics , get technical report #61.  (We should bundle
 this with the package, I suspect)
or the appropriate chapter in Venables and Ripley, Modern Applied Statistics
 with S, 4th edition.
 
 For question 3, rpart does not have the usual "nested model" likelihood
 ratio tests. I don't know how to say which model is better.
 
Terry Therneau

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] garch and extra explanatory variable

2007-02-27 Thread Bernd Dittmann
Hi useRs,

a daily GARCH(1,1) model can be extended so that the variance equation 
incorporates, say, a higher-frequency volatility measure.

The variance equation would look something like:

s(t)^2 = garch(1,1) + a*v(t-1)

where v(t-1) is yesterday's intraday volatility and "a" its coefficient.

How can this be implemented in R?

I checked "garch" of "tseries". An extended formula cannot be specified. 
garchFit() of "fSeries" might be able to do that. Unfortunately, I am not 
quite sure how to specify this in the fSeries package.

Or would the estimation have to be done manually?

Comments and hints highly appreciated!

Thanks!

Bernd
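If no packaged routine accepts the extra regressor (I am not certain of fSeries' capabilities here), the manual route can be sketched as follows: a hand-written negative log-likelihood maximized with optim(). The data r and v are placeholders, and the exp()/plogis() transforms are an ad hoc way to keep the variance recursion positive.

```r
# Placeholder data: daily returns r and yesterday's intraday volatility v
set.seed(42)
n <- 500
r <- rnorm(n) * 0.01
v <- abs(rnorm(n)) * 1e-4

# Negative log-likelihood of
#   s2[t] = omega + alpha*r[t-1]^2 + beta*s2[t-1] + a*v[t-1]
# exp()/plogis() keep the parameters in a sensible range during optimization
negll <- function(p, r, v) {
  omega <- exp(p[1]); alpha <- plogis(p[2]); beta <- plogis(p[3]); a <- exp(p[4])
  s2 <- numeric(length(r))
  s2[1] <- var(r)                      # initialize with the sample variance
  for (t in 2:length(r))
    s2[t] <- omega + alpha * r[t - 1]^2 + beta * s2[t - 1] + a * v[t - 1]
  -sum(dnorm(r, mean = 0, sd = sqrt(s2), log = TRUE))
}

fit <- optim(c(log(1e-5), 0, 0, log(1e-3)), negll, r = r, v = v)
exp(fit$par[1])  # omega estimate; back-transform the others analogously
```

This is only a sketch under a Gaussian innovation assumption; standard errors would need the Hessian (optim(..., hessian = TRUE)) and the delta method for the transformed parameters.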

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting the gamma cumulative distribution function

2007-02-27 Thread Adelchi Azzalini
On Tue, 27 Feb 2007 06:59:51 -0800 (PST), Stephen Tucker wrote:

ST> Hi Tim,
ST> 
ST> I believe fitdistr() in the MASS package is the function you are looking
ST> for. (example given in help page)...
ST> 
ST> Best regards,
ST> ST
ST> 
ST> --- Tim Bergsma <[EMAIL PROTECTED]> wrote:
ST> 
ST> > Hi.
ST> > 
ST> > I have a vector of quantiles and a vector of probabilites that, when 
ST> > plotted, look very like the gamma cumulative distribution function.  I 
ST> > can guess some shape and scale parameters that give a similar result, 
ST> > but I'd rather let the parameters be estimated.  Is there a direct way 
ST> > to do this in R?
ST> > 
ST> > Thanks,
ST> > 
ST> > Tim.
ST> > 
ST> > week <- c(0,5,6,7,9,11,14,19,39)
ST> > fraction <- c
ST> > (0,0.23279,0.41093,0.58198,0.77935,0.88057,0.94231,0.98583,1) weeks <-
ST> > 1:160/4 plot(weeks, pgamma(weeks,shape=6,scale=1.15),type="l")
ST> > points(week,fraction,col="red")
ST> > 

you can decide a "distance" criterion and select the parameters which
minimize that distance, something like
 
  criterion  <- function(param, week, fraction){
  cdf <- pgamma(week, param[1], param[2])
  p <- diff(cdf)
  sum((diff(fraction)-p)^2/p) # or some other function
  }
  
and then minimize this criterion with respect to the parameters
using optim() or nlminb().
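With Tim's data from the original post, a runnable version of this sketch might look like the following (the explicit shape/rate parameterization and the log-scale optimization are my additions, to keep both parameters positive):

```r
# Tim's data from the original post
week     <- c(0, 5, 6, 7, 9, 11, 14, 19, 39)
fraction <- c(0, 0.23279, 0.41093, 0.58198, 0.77935, 0.88057, 0.94231, 0.98583, 1)

# Same criterion, but optimized on the log scale so shape and rate stay positive
criterion <- function(logpar, week, fraction) {
  p <- diff(pgamma(week, shape = exp(logpar[1]), rate = exp(logpar[2])))
  sum((diff(fraction) - p)^2 / p)
}

fit <- optim(log(c(6, 1 / 1.15)), criterion, week = week, fraction = fraction)
exp(fit$par)  # estimated shape and rate
```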

You cannot use fitdistr() because it requires the individual sample values.
In fact you cannot even use MLE for grouped data or X^2, since the sample
size is not known (at least not reported), hence we do not have the absolute
frequencies. If the sample size were known, then the problem would change.
  
-- 
Adelchi Azzalini  <[EMAIL PROTECTED]>
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rpart minimum sample size

2007-02-27 Thread Wensui Liu
Amy,
without looking at your actual code, I would suggest you take a
look at rpart.control().
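For illustration, a hedged sketch of loosened controls on the kyphosis data that ships with rpart (Amy's own response and predictors would go in place of these; whether such small nodes are trustworthy at n = 27 is another matter):

```r
library(rpart)

# kyphosis (n = 81) stands in for the seagrass data; the defaults
# (minsplit = 20, cp = 0.01) suppress most splits when n is small
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(minsplit = 5, minbucket = 2, cp = 0.001))
printcp(fit)  # complexity table: which splits were attempted and kept
```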

On 2/27/07, Amy Uhrin <[EMAIL PROTECTED]> wrote:
> Is there an optimal / minimum sample size for attempting to construct a
> classification tree using /rpart/?
>
> I have 27 seagrass disturbance sites (boat groundings) that have been
> monitored for a number of years.  The monitoring protocol for each site
> is identical.  From the monitoring data, I am able to determine the
> level of recovery that each site has experienced.  Recovery is our
> categorical dependent variable with values of none, low, medium, high
> which are based upon percent seagrass regrowth into the injury over
> time.  I wish to be able to predict the level of recovery of future
> vessel grounding sites based upon a number of categorical / continuous
> predictor variables used here including (but not limited to) such
> parameters as:  sediment grain size, wave exposure, original size
> (volume) of the injury, injury age, injury location.
>
> When I run /rpart/, the data is split into only two terminal nodes based
> solely upon values of the original volume of each injury.  No other
> predictor variables are considered, even though I have included about
> six of them in the model.  When I remove volume from the model the same
> thing happens but with injury area - two terminal nodes are formed based
> upon area values and no other variables appear.  I was hoping that this
> was a programming issue, me being a newbie and all, but I really think
> I've got the code right.  Now I am beginning to wonder if my N is too
> small for this method?
>
> --
> Amy V. Uhrin, Research Ecologist
>
> NOAA, National Ocean Service
> Center for Coastal Fisheries and Habitat Research
> 101 Pivers Island Road
> Beaufort, NC 28516
> (252) 728-8778
> (252) 728-8784 (fax)
> [EMAIL PROTECTED]
>
> 
>  \!/ \!/   <:}><   \!/ \!/  >^<**>^<  \!/ \!/
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting of all possible models

2007-02-27 Thread Indermaur Lukas
Hi Frank
I fitted a set of 12 candidate models and evaluated the importance of variables 
based on model-averaged coefficients and SEs (model weights >= 0.9). Variables in 
my models were not distributed in equal numbers across all models, so I was 
not able to evaluate the importance of variables just by summing up the 
AIC weights of the models including a specific variable. Now, why so many models to 
fit: I was curious whether the ranking of the importance of variables is similar 
when just summing up the AIC weights over an all-possible-models set and 
looking at the ordered model-averaged coefficients (ordered by CV = SE/coefficient).
 
Any hint for me?
Cheers
Lukas

 

Indermaur Lukas wrote:
> Hi,
> Fitting all possible models (GLM) with 10 predictors will result in loads of 
> (2^10 - 1) models. I want to do that in order to get the importance of 
> variables (having an unbalanced variable design) by summing the up the 
> AIC-weights of models including the same variable, for every variable 
> separately. It's time consuming and annoying to define all possible models by 
> hand.
> 
> Is there a command, or easy solution to let R define the set of all possible 
> models itself? I defined models in the following way to process them with a 
> batch job:
> 
> # e.g. model 1
> preference<- formula(Y~Lwd + N + Sex + YY)
>
> # e.g. model 2
> preference_heterogeneity<- formula(Y~Ri + Lwd + N + Sex + YY) 
> etc.
> etc.
> 
> 
> I appreciate any hint
> Cheers
> Lukas

If you choose the model, from among the 2^10 - 1, having the best AIC, that model
will be badly biased.  Why look at so many?  Pre-specification of
models, or fitting full models with penalization, or using data
reduction (masked to Y) may work better.
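For the mechanical part of the question — letting R enumerate the model set rather than typing formulas by hand — one possible sketch uses combn() and reformulate() (leaving aside whether fitting them all is advisable; the variable names below are the five from the example):

```r
# Enumerate every non-empty subset of predictors as a model formula.
vars <- c("Ri", "Lwd", "N", "Sex", "YY")   # extend to all 10 predictors
models <- list()
for (k in seq_along(vars)) {
  cmb <- combn(vars, k)                    # all size-k subsets, one per column
  for (j in seq_len(ncol(cmb)))
    models[[length(models) + 1]] <- reformulate(cmb[, j], response = "Y")
}
length(models)  # 2^5 - 1 = 31 formulas here
# fits <- lapply(models, glm, data = mydata)  # 'mydata' is hypothetical
```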

Frank

> 
> 
> 
> 
> 
> °°°
> Lukas Indermaur, PhD student
> eawag / Swiss Federal Institute of Aquatic Science and Technology
> ECO - Department of Aquatic Ecology
> Überlandstrasse 133
> CH-8600 Dübendorf
> Switzerland
> 
> Phone: +41 (0) 71 220 38 25
> Fax: +41 (0) 44 823 53 15
> Email: [EMAIL PROTECTED]
> www.lukasindermaur.ch
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


--
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] str() to extract components

2007-02-27 Thread Simon Pickett
Hi,

I have been dabbling with str() to extract values from outputs such as
lmer etc and have found it very helpful sometimes.

but only seem to manage to extract the values when the output is one
simple table, any more complicated and I'm stumped :-(

take this example of the extracted coefficients from a lmer analysis...

using str(coef(lmer(resp3~b$age+b$size+b$pcfat+(1|sex), data=b))) yields

Formal class 'lmer.coef' [package "Matrix"] with 3 slots
  ..@ .Data :List of 1
  .. ..$ :`data.frame': 2 obs. of  4 variables:
  .. .. ..$ (Intercept): num [1:2] 1.07 1.13
  .. .. ..$ b$age  : num [1:2] 0.00702 0.00702
  .. .. ..$ b$size : num [1:2] 0.0343 0.0343
  .. .. ..$ b$pcfat: num [1:2] 0.0451 0.0451
  ..@ varFac: list()
  ..@ stdErr: num(0)

how do I "get inside" the first table to get the value 1.07 for instance?

Any help much appreciated.


Simon Pickett
PhD student
Centre For Ecology and Conservation
Tremough Campus
University of Exeter in Cornwall
TR109EZ
Tel 01326371852

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting of all possible models

2007-02-27 Thread Bert Gunter
... Below

-- Bert 

Bert Gunter
Genentech Nonclinical Statistics
South San Francisco, CA 94404
650-467-7374


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
Sent: Tuesday, February 27, 2007 5:14 AM
To: Indermaur Lukas
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] fitting of all possible models

Indermaur Lukas wrote:
> Hi,
> Fitting all possible models (GLM) with 10 predictors will result in loads
of (2^10 - 1) models. I want to do that in order to get the importance of
variables (having an unbalanced variable design) by summing the up the
AIC-weights of models including the same variable, for every variable
separately. It's time consuming and annoying to define all possible models
by hand. 
>  
> Is there a command, or easy solution to let R define the set of all
possible models itself? I defined models in the following way to process
them with a batch job:
>  
> # e.g. model 1
> preference<- formula(Y~Lwd + N + Sex + YY)

> # e.g. model 2
> preference_heterogeneity<- formula(Y~Ri + Lwd + N + Sex + YY)  
> etc.
> etc.
>  
>  
> I appreciate any hint
> Cheers
> Lukas

If you choose the model from amount 2^10 -1 having best AIC, that model 
will be badly biased.  Why look at so many?  Pre-specification of 
models, or fitting full models with penalization, 

--- ...the rub being how much to penalize. My impression from what I've read
is, for prediction, close to "the more, the better is the predictor...".
Nature rewards parsimony.

Cheers,
Bert


Frank

>  
>  
>  
>  
>  
> °°° 
> Lukas Indermaur, PhD student 
> eawag / Swiss Federal Institute of Aquatic Science and Technology 
> ECO - Department of Aquatic Ecology
> Überlandstrasse 133
> CH-8600 Dübendorf
> Switzerland
>  
> Phone: +41 (0) 71 220 38 25
> Fax: +41 (0) 44 823 53 15 
> Email: [EMAIL PROTECTED]
> www.lukasindermaur.ch
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rpart minimum sample size

2007-02-27 Thread Frank E Harrell Jr
Amy Uhrin wrote:
> Is there an optimal / minimum sample size for attempting to construct a 
> classification tree using /rpart/?
> 
> I have 27 seagrass disturbance sites (boat groundings) that have been 
> monitored for a number of years.  The monitoring protocol for each site 
> is identical.  From the monitoring data, I am able to determine the 
> level of recovery that each site has experienced.  Recovery is our 
> categorical dependent variable with values of none, low, medium, high 
> which are based upon percent seagrass regrowth into the injury over 
> time.  I wish to be able to predict the level of recovery of future 
> vessel grounding sites based upon a number of categorical / continuous 
> predictor variables used here including (but not limited to) such 
> parameters as:  sediment grain size, wave exposure, original size 
> (volume) of the injury, injury age, injury location.
> 
> When I run /rpart/, the data is split into only two terminal nodes based 
> solely upon values of the original volume of each injury.  No other 
> predictor variables are considered, even though I have included about 
> six of them in the model.  When I remove volume from the model the same 
> thing happens but with injury area - two terminal nodes are formed based 
> upon area values and no other variables appear.  I was hoping that this 
> was a programming issue, me being a newbie and all, but I really think 
> I've got the code right.  Now I am beginning to wonder if my N is too 
> small for this method?
> 

In my experience N needs to be around 20,000 to get both good accuracy 
and replicability of patterns if the number of potential predictors is 
not tiny.  In general, the R^2 from rpart is not competitive with that 
from an intelligently fitted regression model.  It's just a difficult 
problem, when relying on a single tree (hence the popularity of random 
forests, bagging, boosting).

Frank
-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] str() to extract components

2007-02-27 Thread Joerg van den Hoff
On Tue, Feb 27, 2007 at 03:52:54PM -, Simon Pickett wrote:
> Hi,
> 
> I have been dabbling with str() to extract values from outputs such as
> lmer etc and have found it very helpful sometimes.
> 
> but only seem to manage to extract the values when the output is one
> simple table, any more complicated and I'm stumped :-(
> 
> take this example of the extracted coeficients from a lmer analysis...
> 
> using str(coef(lmer(resp3~b$age+b$size+b$pcfat+(1|sex), data=b))) yields
> 
> Formal class 'lmer.coef' [package "Matrix"] with 3 slots
>   ..@ .Data :List of 1
>   .. ..$ :`data.frame': 2 obs. of  4 variables:
>   .. .. ..$ (Intercept): num [1:2] 1.07 1.13
>   .. .. ..$ b$age  : num [1:2] 0.00702 0.00702
>   .. .. ..$ b$size : num [1:2] 0.0343 0.0343
>   .. .. ..$ b$pcfat: num [1:2] 0.0451 0.0451
>   ..@ varFac: list()
>   ..@ stdErr: num(0)
> 
> how do I "get inside" the first table to get the value 1.07 for instance?
> 
> Any help much appreciated.
> 
maybe `unlist' would be enough?
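To spell out the slot route as well: for an S4 object the slots are reached with `@`, and the inner list/data.frame layers with ordinary `[[` and `$` indexing, mirroring the str() output. A toy class shaped like Simon's output (class and slot names here are made up for the demonstration):

```r
library(methods)

# A list-based S4 class with one extra slot, holding a single data.frame,
# mimicking the structure shown by str() above
setClass("toy.coef", representation(stdErr = "numeric"), contains = "list")
obj <- new("toy.coef",
           .Data  = list(data.frame("(Intercept)" = c(1.07, 1.13),
                                    check.names = FALSE)),
           stdErr = numeric(0))

d <- [email protected][[1]]                # @ reaches the slot, [[ the inner data.frame
d[["(Intercept)"]][1]            # the 1.07 in Simon's example
```

For Simon's actual object, the equivalent chain would be something like `coef(fit)@.Data[[1]][["(Intercept)"]][1]`, assuming that version of the Matrix package exposes the slot the same way.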

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read.csv size limits

2007-02-27 Thread andy1983

I have been using the read.csv function for a while now without any problems.
My files are usually 20-50 MBs and they take up to a minute to import. They
have all been under 50,000 rows and under 100 columns.

Recently, I tried importing a file of a similar size (which means about the
same amount of data), but with ~500,000 columns and ~20 rows. The process is
taking forever (~1 hour so far). In Task Manager, I see the CPU is at max,
but memory usage stalls at around 50 MB (far below the memory limit).

Is this normal? Is there a way to optimize this operation or at least check
the progress? Will this take 2 hours or 200 hours?

All I was trying to do is transpose my extra-wide table with a process that
I assumed would take 5 minutes. Maybe R is not the solution I am looking
for?

Thanks.
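One thing that may help: scan() reads everything into a single numeric vector and avoids read.csv()'s per-column bookkeeping, which is what tends to hurt with ~500,000 columns. A self-contained sketch using a small stand-in file (assuming the file is all numeric with no header):

```r
# Write a stand-in file (20 rows x 1,000 columns, no header) so the sketch
# is runnable; the real file would have ~500,000 columns
f <- tempfile(fileext = ".csv")
write.table(matrix(rnorm(20 * 1000), nrow = 20), f,
            sep = ",", row.names = FALSE, col.names = FALSE)

# scan() pulls the numbers into one vector; reshape and transpose
x  <- scan(f, what = numeric(), sep = ",", quiet = TRUE)
m  <- matrix(x, nrow = 20, byrow = TRUE)  # values arrive row by row
tm <- t(m)                                # the transposed table
dim(tm)  # 1000 rows, 20 columns
```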

-- 
View this message in context: 
http://www.nabble.com/read.csv-size-limits-tf3302366.html#a9186136
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stl function

2007-02-27 Thread Gavin Simpson
On Tue, 2007-02-27 at 15:55 +0100, Anja Eggert wrote:
> I want to apply the stl-function to decompose a time series (daily 
> measurements over 22 years) into seasonal component, trend and 
> residuals. I was able to get the diagrams.
> However, I could not find out what are the equations behind it. I.e. it 
> is probably not an additive or multiplicative combination of season (as 
> sin and cos-functions) and a linear trend?
> Furthermore, what are the grey bars on the right hand side of the diagrams?
> I would appreciate very much to receive some information or maybe a good 
> reference.
> 
> Thank you very much,
> Anja
> 

?stl tells you all you need to know to answer this, including the
reference to the academic publication that describes the method.

?plot.stl tells you that the grey bars are range bars - they are used to
assess the relative magnitude of various decomposed components.
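For completeness, a minimal example on the built-in monthly co2 series (s.window = "periodic" forces a strictly periodic seasonal component; a finite span would let the seasonal shape evolve over the 22 years):

```r
# Decompose the built-in monthly CO2 record into its three components
fit <- stl(co2, s.window = "periodic")
colnames(fit$time.series)  # "seasonal" "trend" "remainder"
plot(fit)                  # the grey bars at the right are the range bars
```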

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] compiling issues with Mandriva Linux 2007 Discovery

2007-02-27 Thread Bricklemyer, Ross S
All,

I am a new user to Linux but I am familiar with R.  I have previously
used and installed R on a Windows platform without problems.  I recently
set up a dual boot system (XP_64, Mandriva) to run R on a Linux platform
in order to more efficiently handle large datasets.  I have not done
compiling before, but read the R instructions and followed to my best
ability.  I downloaded the most recent tar and unpacked it into
/usr/local/R_HOME.  I was able to run ./configure and added the
additional Linux packages necessary to compile.  The problem arises
using the 'make' command.  When I run make I get an error that there is
not a target or a makefile.  There is, however, Makefile.in in the
directory.  I think I am close to getting this installed, but I am
stuck.  Any help would be greatly appreciated.  I do not suppose that
any of the pre compiled Linux versions (i.e. Debian, SuSe) available on
CRAN mirrors would run on a Mandriva distro?

Best,
Ross

***
Ross Bricklemyer
Dept. of Crop and Soil Sciences
Washington State University
201 Johnson Hall
Pullman, WA 99164-6420
Work: 509.335.3661
Cell/Home: 406.570.8576
Fax: 509.335.8674
Email: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting of all possible models

2007-02-27 Thread Frank E Harrell Jr
Indermaur Lukas wrote:
> Hi Frank
> I fitted a set of 12 candidate models and evaluated the importance of 
> variables based on model averaged coefficients and SE (model weights >=0.9). 
> Variables in my models were not distributed in equal numbers across all 
> models thus I was not able to evaluate the importance of variables just by 
> summing up the AIC-weights of models including a specific variable. Now, why 
> so many models to fit: I was curious, if the ranking in the importance of 
> variables is similar, when just summing up the AIC-weights over an 
> all-possible-models set and looking at the ordered model averaged 
> coefficients (order of CV=SE/coefficient).
>  
> Any hint for me?
> Cheers
> Lukas

I have seen the literature on Bayesian model averaging which uses 
weights from Bayes factors, related to BIC, but not the approach you are 
using.

Frank

> 
>  
> 
> Indermaur Lukas wrote:
>> Hi,
>> Fitting all possible models (GLM) with 10 predictors will result in loads of 
>> (2^10 - 1) models. I want to do that in order to get the importance of 
>> variables (having an unbalanced variable design) by summing the up the 
>> AIC-weights of models including the same variable, for every variable 
>> separately. It's time consuming and annoying to define all possible models 
>> by hand.
>>
>> Is there a command, or easy solution to let R define the set of all possible 
>> models itself? I defined models in the following way to process them with a 
>> batch job:
>>
>> # e.g. model 1
>> preference<- formula(Y~Lwd + N + Sex + YY)   
>> 
>> # e.g. model 2
>> preference_heterogeneity<- formula(Y~Ri + Lwd + N + Sex + YY) 
>> etc.
>> etc.
>>
>>
>> I appreciate any hint
>> Cheers
>> Lukas
> 
> If you choose the model from amount 2^10 -1 having best AIC, that model
> will be badly biased.  Why look at so many?  Pre-specification of
> models, or fitting full models with penalization, or using data
> reduction (masked to Y) may work better.
> 
> Frank
> 
>>
>>
>>
>>
>> °°°
>> Lukas Indermaur, PhD student
>> eawag / Swiss Federal Institute of Aquatic Science and Technology
>> ECO - Department of Aquatic Ecology
>> Überlandstrasse 133
>> CH-8600 Dübendorf
>> Switzerland
>>
>> Phone: +41 (0) 71 220 38 25
>> Fax: +41 (0) 44 823 53 15
>> Email: [EMAIL PROTECTED]
>> www.lukasindermaur.ch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fitting of all possible models

2007-02-27 Thread Frank E Harrell Jr
Bert Gunter wrote:
> ... Below
> 
> -- Bert 
> 
> Bert Gunter
> Genentech Nonclinical Statistics
> South San Francisco, CA 94404
> 650-467-7374
> 
> 
> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Frank E Harrell Jr
> Sent: Tuesday, February 27, 2007 5:14 AM
> To: Indermaur Lukas
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] fitting of all possible models
> 
> Indermaur Lukas wrote:
>> Hi,
>> Fitting all possible models (GLM) with 10 predictors will result in loads
> of (2^10 - 1) models. I want to do that in order to get the importance of
> variables (having an unbalanced variable design) by summing the up the
> AIC-weights of models including the same variable, for every variable
> separately. It's time consuming and annoying to define all possible models
> by hand. 
>>  
>> Is there a command, or easy solution to let R define the set of all
> possible models itself? I defined models in the following way to process
> them with a batch job:
>>  
>> # e.g. model 1
>> preference<- formula(Y~Lwd + N + Sex + YY)
> 
>> # e.g. model 2
>> preference_heterogeneity<- formula(Y~Ri + Lwd + N + Sex + YY)  
>> etc.
>> etc.
>>  
>>  
>> I appreciate any hint
>> Cheers
>> Lukas
> 
> If you choose the model from amount 2^10 -1 having best AIC, that model 
> will be badly biased.  Why look at so many?  Pre-specification of 
> models, or fitting full models with penalization, 
> 
> --- ...the rub being how much to penalize. My impression from what I've read
> is, for prediction, close to "the more, the better is the predictor..." .
> Nature rewards parsimony.
> 
> Cheers,
> Bert

Bert,

In my experience nature rewards complexity, if done right.  See Savage's 
antiparsimony principle  -Frank

@Article{gre00whe,
   author =   {Greenland, Sander},
   title ={When should epidemiologic regressions use random coefficients?},
   journal =  Biometrics,
   year = 2000,
   volume =   56,
   pages ={915-921},
   annote =   {Bayesian methods;causal inference;empirical Bayes
estimators;epidemiologic method;hierarchical regression;mixed
models;multilevel modeling;random-coefficient
regression;shrinkage;variance components;use of statistics in
epidemiology is largely primitive;stepwise variable selection on
confounders leaves important confounders uncontrolled;composition
matrix;example with far too many significant predictors with many
regression coefficients absurdly inflated when
overfit;lack of evidence for dietary effects mediated through
constituents;shrinkage instead of variable selection;larger effect on
confidence interval width than on point estimates with variable
selection;uncertainty about variance of random effects is just
uncertainty about prior opinion;estimation of variance is
pointless;instead the analysis should be repeated using different
values;"if one feels compelled to estimate $\tau^2$, I would recommend
giving it a proper prior concentrated around contextually reasonable
values";claim about ordinary MLE being unbiased is misleading because
it assumes the model is correct and is the only model
entertained;shrinkage towards compositional model;"models need to be
complex to capture uncertainty about the relations...an honest
uncertainty assessment requires parameters for all effects that we
know may be present.  This advice is implicit in an antiparsimony
principle often attributed to L. J. Savage 'All models should be as
big as an elephant (see Draper, 1995)'".  See also gus06per.}
}

> 
> 
> Frank
> 
>>  
>>  
>>  
>>  
>>  
>> °°° 
>> Lukas Indermaur, PhD student 
>> eawag / Swiss Federal Institute of Aquatic Science and Technology 
>> ECO - Department of Aquatic Ecology
>> Überlandstrasse 133
>> CH-8600 Dübendorf
>> Switzerland
>>  
>> Phone: +41 (0) 71 220 38 25
>> Fax: +41 (0) 44 823 53 15 
>> Email: [EMAIL PROTECTED]
>> www.lukasindermaur.ch
>>


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] str() to extract components

2007-02-27 Thread joris . dewolf

mod <- lmer(resp3~b$age+b$size+b$pcfat+(1|sex), data=b)
coef(mod)[1]$Subject[1,1]
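
[Added for reference, not part of the original reply: the str() output quoted
below shows an S4 'lmer.coef' object whose @.Data slot is a list holding one
data frame, so plain list indexing reaches the value. A minimal self-contained
sketch, with a hypothetical data frame mirroring that output:]

```r
## Mimic the structure str() reports: a list containing one data frame.
## The numbers are the hypothetical ones from the str() output below.
cf <- list(data.frame("(Intercept)" = c(1.07, 1.13),
                      "b$age" = c(0.00702, 0.00702),
                      check.names = FALSE))
cf[[1]][1, 1]                # -> 1.07 (first row of first column)
cf[[1]][1, "(Intercept)"]    # same value, selected by name
```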


[EMAIL PROTECTED] wrote on 27/02/2007 16:52:54:

> Hi,
>
> I have been dabbling with str() to extract values from outputs such as
> lmer etc and have found it very helpful sometimes.
>
> but only seem to manage to extract the values when the output is one
> simple table, any more complicated and I'm stumped :-(
>
> take this example of the extracted coeficients from a lmer analysis...
>
> using str(coef(lmer(resp3~b$age+b$size+b$pcfat+(1|sex), data=b))) yields
>
> Formal class 'lmer.coef' [package "Matrix"] with 3 slots
>   ..@ .Data :List of 1
>   .. ..$ :`data.frame': 2 obs. of  4 variables:
>   .. .. ..$ (Intercept): num [1:2] 1.07 1.13
>   .. .. ..$ b$age  : num [1:2] 0.00702 0.00702
>   .. .. ..$ b$size : num [1:2] 0.0343 0.0343
>   .. .. ..$ b$pcfat: num [1:2] 0.0451 0.0451
>   ..@ varFac: list()
>   ..@ stdErr: num(0)
>
> how do I "get inside" the first table to get the value 1.07 for instance?
>
> Any help much appreciated.
>
>
> Simon Pickett
> PhD student
> Centre For Ecology and Conservation
> Tremough Campus
> University of Exeter in Cornwall
> TR109EZ
> Tel 01326371852
>



[R] ts; decompose; plot and title

2007-02-27 Thread Alberto Monteiro
Is there any way to give a "decent" title after I plot something
generated by decompose?

For example:

# generate something with period 12
x <- rnorm(600) + sin(2 * pi * (1:600) / 12)

# transform to a monthly time series
y <- ts(x, frequency=12, start=c(1950,1))

# decompose
z <- decompose(y)

# plot
plot(z)

Now, the title is the ugly "Decomposition of additive time series".
How can I do this with a decent title, like "Analysis of UFO abductions"?

Alberto Monteiro



Re: [R] ts; decompose; plot and title

2007-02-27 Thread Gabor Grothendieck
Try this:

plot(cbind(observed = z$random +
z$trend * z$seasonal, trend = z$trend, seasonal = z$seasonal,
random = z$random), main = "My title")

Change the * to + if your setup is additive.

On 2/27/07, Alberto Monteiro <[EMAIL PROTECTED]> wrote:
> Is there any way to give a "decent" title after I plot something
> generated by decompose?
>
> For example:
>
> # generate something with period 12
> x <- rnorm(600) + sin(2 * pi * (1:600) / 12)
>
> # transform to a monthy time series
> y <- ts(x, frequency=12, start=c(1950,1))
>
> # decompose
> z <- decompose(y)
>
> # plot
> plot(z)
>
> Now, the title is the ugly "Decomposition of additive time series".
> How can do this with a decent title, like "Analysis of UFO abductions"?
>
> Alberto Monteiro
>



[R] ordered matrix question

2007-02-27 Thread Juan Pablo Fededa
Hi all,

Is there an easy way to generate an object which will be the same matrix, but
ordered by the cfp value?
The data frame consists of numeric columns:
"Block" "X" "Y" "cfp" "yfp" "ID"
0524244213.417957.184821091
055627065.3839049.5683726612
052831640.7894745.5737321753
0642432135.8173412.401344274
071643034.3591353.9449230775
0894362109.631583.1971603166
095813063.984523.3964520047
05069283.5139319.1054968568
047646491.6749199.1780894149
036442644.139322.06833436410

Thanks in advance,


Juan Pablo

[[alternative HTML version deleted]]



Re: [R] RDA and trend surface regression

2007-02-27 Thread MORLON
Thanks a lot for your answers,

I am concerned by your advice not to use polynomial constraints, or to use
QDA instead of RDA. My final goal is to perform variation partitioning using
partial RDA to assess the relative importance of environmental vs spatial
variables. For the spatial analyses, trend surface analysis (polynomial
constraints) is recommended in Legendre and Legendre 1998 (p739). Is there a
better method to integrate space as an explanatory variable in a variation
partitioning analyses? 

Also, I don't understand this: when I test for the significant contribution
of monomials (forward elimination)

>anova(rda(Helling ~ I(x^2)+Condition(x)+Condition(y)))

performs the permutation test as expected, whereas 

>anova(rda(Helling ~ I(y^2)+Condition(x)+Condition(y)))

Returns this error message:

Error in "names<-.default"(`*tmp*`, value = "Model") : 
attempt to set an attribute on NULL

Thanks again for your help
Kind regards,
Helene

Helene MORLON
University of California, Merced

-Original Message-
From: Jari Oksanen [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 26, 2007 11:27 PM
To: r-help@stat.math.ethz.ch
Cc: [EMAIL PROTECTED]
Subject: [R] RDA and trend surface regression


> 'm performing RDA on plant presence/absence data, constrained by
> geographical locations. I'd like to constrain the RDA by the "extended
> matrix of geographical coordinates" -ie the matrix of geographical
> coordinates completed by adding all terms of a cubic trend surface
> regression- . 
> 
> This is the command I use (package vegan):
> 
>  
> 
> >rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3) 
> 
>  
> 
> where Helling is the matrix of Hellinger-transformed presence/absence data
> 
> The result returned by R is exactly the same as the one given by:
> 
>  
> 
> >anova(rda(Helling ~ x+y)
> 
>  
> 
> Ie the quadratic and cubic terms are not taken into account
> 

You must *I*solate the polynomial terms with function I ("AsIs") so that
they are not interpreted as formula operators:

rda(Helling ~ x + y + I(x*y) + I(x^2) + I(y^2) + I(x*y^2) + I(y*x^2) +
I(x^3) + I(y^3))

If you don't have the interaction terms, then it is easier and better
(numerically) to use poly():

rda(Helling ~ poly(x, 3) + poly(y, 3))

Another issue is that in my opinion using polynomial constraints is an
Extremely Bad Idea(TM).

cheers, Jari Oksanen
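
[Editorial aside, not part of the original message: the I() point is easy to
check with a plain lm() fit on made-up data, where x^2 without I() is absorbed
into x by the formula algebra:]

```r
## Without I(), ^ is a formula operator: x^2 expands to just x,
## so the quadratic term is silently dropped from the model.
set.seed(1)
x <- runif(20)
y <- 1 + 2 * x + 3 * x^2 + rnorm(20, sd = 0.1)
length(coef(lm(y ~ x + x^2)))     # 2 coefficients: intercept and x only
length(coef(lm(y ~ x + I(x^2))))  # 3 coefficients: the quadratic is fitted
```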



Re: [R] RDA and trend surface regression

2007-02-27 Thread Kuhn, Max
Helene,

My point was only that RDA may fit a quadratic model for the terms
specified in your model. The terms that you had specified were already
higher order polynomials (some cubic). So a QDA classifier with the
model terms that you specified may be a fifth order polynomial in the
original data. I don't know the reference you cite or even the
subject-matter specifics. I'm just a simple cave man (for you SNL fans).
But I do know that there are more reliable ways to get nonlinear
classification boundaries than using x^5. 

If you want a quadratic model, I would suggest that you use QDA with the
predictors in the original units (or see Hastie's book for a good
example of using higher order terms with LDA). 

Looking at your email, you want "a variation partitioning analyses".
RDA works best as a classification technique. Perhaps a multivariate
ANOVA model may be a more direct way to meet your needs. There is a
connection between LDA and some multivariate linear models, but I don't
know of a similar connection to RDA.

Max

-Original Message-
From: MORLON [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, February 27, 2007 12:53 PM
To: 'Jari Oksanen'; r-help@stat.math.ethz.ch
Cc: Kuhn, Max
Subject: RE: [R] RDA and trend surface regression

Thanks a lot for your answers,

I am concerned by your advice not to use polynomial constraints, or to
use
QDA instead of RDA. My final goal is to perform variation partitioning
using
partial RDA to assess the relative importance of environmental vs
spatial
variables. For the spatial analyses, trend surface analysis (polynomial
constraints) is recommended in Legendre and Legendre 1998 (p739). Is
there a
better method to integrate space as an explanatory variable in a
variation
partitioning analyses? 

Also, I don't understand this: when I test for the significant
contribution
of monomials (forward elimination)

>anova(rda(Helling ~ I(x^2)+Condition(x)+Condition(y)))

performs the permutation test as expected, whereas 

>anova(rda(Helling ~ I(y^2)+Condition(x)+Condition(y)))

Returns this error message:

Error in "names<-.default"(`*tmp*`, value = "Model") : 
attempt to set an attribute on NULL

Thanks again for your help
Kind regards,
Helene

Helene MORLON
University of California, Merced

-Original Message-
From: Jari Oksanen [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 26, 2007 11:27 PM
To: r-help@stat.math.ethz.ch
Cc: [EMAIL PROTECTED]
Subject: [R] RDA and trend surface regression


> 'm performing RDA on plant presence/absence data, constrained by
> geographical locations. I'd like to constrain the RDA by the "extended
> matrix of geographical coordinates" -ie the matrix of geographical
> coordinates completed by adding all terms of a cubic trend surface
> regression- . 
> 
> This is the command I use (package vegan):
> 
>  
> 
> >rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3) 
> 
>  
> 
> where Helling is the matrix of Hellinger-transformed presence/absence
data
> 
> The result returned by R is exactly the same as the one given by:
> 
>  
> 
> >anova(rda(Helling ~ x+y)
> 
>  
> 
> Ie the quadratic and cubic terms are not taken into account
> 

You must *I*solate the polynomial terms with function I ("AsIs") so that
they are not interpreted as formula operators:

rda(Helling ~ x + y + I(x*y) + I(x^2) + I(y^2) + I(x*y^2) + I(y*x^2) +
I(x^3) + I(y^3))

If you don't have the interaction terms, then it is easier and better
(numerically) to use poly():

rda(Helling ~ poly(x, 3) + poly(y, 3))

Another issue is that in my opinion using polynomial constraints is an
Extremely Bad Idea(TM).

cheers, Jari Oksanen




Re: [R] ts; decompose; plot and title

2007-02-27 Thread Gavin Simpson
On Tue, 2007-02-27 at 15:24 -0200, Alberto Monteiro wrote:
> Is there any way to give a "decent" title after I plot something
> generated by decompose?
> 
> For example:
> 
> # generate something with period 12
> x <- rnorm(600) + sin(2 * pi * (1:600) / 12)
> 
> # transform to a monthy time series
> y <- ts(x, frequency=12, start=c(1950,1))
> 
> # decompose
> z <- decompose(y)
> 
> # plot
> plot(z)
> 
> Now, the title is the ugly "Decomposition of additive time series".
> How can do this with a decent title, like "Analysis of UFO abductions"?
> 
> Alberto Monteiro

It is because plot.decompose.ts decides to impose its own title for
some reason (use getAnywhere(plot.decompose.ts) to get the function
definition):

function (x, ...)
{
plot(cbind(observed = x$random + if (x$type == "additive")
x$trend + x$seasonal
else x$trend * x$seasonal, trend = x$trend, seasonal = x$seasonal,
random = x$random), main = paste("Decomposition of",
x$type, "time series"), ...)
}

I'd just write your own wrapper instead, using plot.decompose.ts, along
the lines of:

decomp.plot <- function(x, main = NULL, ...)
{
    if (is.null(main))
        main <- paste("Decomposition of", x$type, "time series")
    plot(cbind(observed = x$random + if (x$type == "additive")
        x$trend + x$seasonal
    else x$trend * x$seasonal, trend = x$trend, seasonal = x$seasonal,
        random = x$random), main = main, ...)
}

#then to complete your example:

# generate something with period 12
x <- rnorm(600) + sin(2 * pi * (1:600) / 12)

# transform to a monthly time series
y <- ts(x, frequency=12, start=c(1950,1))

# decompose
z <- decompose(y)

# plot
decomp.plot(z, main = "Analysis of UFO abductions")

Perhaps you could also file a bug report under the wish list category,
showing your example and the fact that

plot(z, main = "Analysis of UFO abductions") 

gives this error:

Error in plotts(x = x, y = y, plot.type = plot.type, xy.labels =
xy.labels,  :
formal argument "main" matched by multiple actual arguments

It isn't really a bug, but an infelicity in the way the function
currently works - my decomp.plot function may even be a suitable patch
or maybe the following is better:

decomp.plot2 <- function(x, main, ...)
{
    if (missing(main))
        main <- paste("Decomposition of", x$type, "time series")
    plot(cbind(observed = x$random + if (x$type == "additive")
        x$trend + x$seasonal
    else x$trend * x$seasonal, trend = x$trend, seasonal = x$seasonal,
        random = x$random), main = main, ...)
}

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] ordered matrix question

2007-02-27 Thread Mendiburu, Felipe \(CIP\)
Juan Pablo,

X is data.frame or matrix
X <- X[order(X[,4]),]
options see help(order)

Felipe de Mendiburu

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Juan Pablo Fededa
Sent: Tuesday, February 27, 2007 12:47 PM
To: R-help@stat.math.ethz.ch
Subject: [R] ordered matrix question


Hi all,

Is there an easy way to generate an object wich will be the same matrix, but
ordered by de cfp value?
The data frame consists of numeric columns:
"Block""X""Y""cfp""yfp""ID"
0524244213.417957.184821091
055627065.3839049.5683726612
052831640.7894745.5737321753
0642432135.8173412.401344274
071643034.3591353.9449230775
0894362109.631583.1971603166
095813063.984523.3964520047
05069283.5139319.1054968568
047646491.6749199.1780894149
036442644.139322.06833436410

Thanks in advance,


Juan Pablo

[[alternative HTML version deleted]]



Re: [R] ordered matrix question

2007-02-27 Thread Earl F. Glynn
"Juan Pablo Fededa" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hi all,
>
> Is there an easy way to generate an object wich will be the same matrix, 
> but
> ordered by de cfp value?

Does this help?

> RawData <- "BlockXYcfpyfpID
+ 0524244213.417957.184821091
+ 055627065.3839049.5683726612
+ 052831640.7894745.5737321753
+ 0642432135.8173412.401344274
+ 071643034.3591353.9449230775
+ 0894362109.631583.1971603166
+ 095813063.984523.3964520047
+ 05069283.5139319.1054968568
+ 047646491.6749199.1780894149
+ 036442644.139322.06833436410"
> d <- read.table(textConnection(RawData), header=TRUE)

> d.ordered <- data.matrix( d[order(d$cfp),] )

> d.ordered
   Block   X   Y   cfp   yfp ID
5  0 716 430  34.35914  3.944923  5
3  0 528 316  40.78947  5.573732  3
10 0 364 426  44.13932  2.068334 10
7  0 958 130  63.98452  3.396452  7
2  0 556 270  65.38390  9.568373  2
8  0 506  92  83.51393  9.105497  8
9  0 476 464  91.67492  9.178089  9
6  0 894 362 109.63158  3.197160  6
4  0 642 432 135.81734 12.401344  4
1  0 524 244 213.41795  7.184821  1


efg

Earl F. Glynn
Stowers Institute for Medical Research



[R] sample size for 2-sample proportion tests

2007-02-27 Thread Berta
Hi R-users,
I want to calculate the sample size needed to carry out a 2-sample 
proportion test.
I have the hypothesized treatment probability of success (0.80), the 
hypothesized control probability of success (0.05), and also the proportion of 
the sample devoted to the treated group (5%), (fraction=rho=0.05, n2/n1=19). 
Using the Hmisc library, it seems that I can use bsamsize (option 1) or 
samplesize.bin (option 2, alpha=0.05 or option 3 alpha=0.05/2, I am not sure 
after reading the help page), and I can use STATA (option 4).

library(Hmisc)

#OPTION 1:
bsamsize(p1=0.8, p2=0.05, fraction=0.05, alpha=.05, power=.9)
#  n1  =2.09,   n2=39.7,  TOTAL=42

#OPTION 2:
samplesize.bin(alpha=0.05, beta=0.9, pit=0.8, pic=0.05, rho=0.05)
#  n= 58, TOTAL= 58

#OPTION 3:
samplesize.bin(alpha=0.025, beta=0.9, pit=0.8, pic=0.05, rho=0.05)
# n= 72, TOTAL= 72

#OPTION 4:
sampsi 0.8 0.05, p(0.90) a(0.05) r(19)
#  n1=4,  n2 = 76   TOTAL=80

Can the method used produce the differences (42 vs 72 vs 80)? Can somebody 
give me hints about the possible reasons (asymptotic-exact distribution- 
continuity correction-my own error)? Which method would be recomended?

Thanks a lot in advance,

Berta
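
[Editorial aside, not an answer from the thread: base R's power.prop.test
gives an equal-allocation benchmark for these proportions. It assumes
n1 = n2, so it cannot reproduce the 5%/95% split above, but it shows the
scale of the per-group sample size:]

```r
## Equal-allocation sample size for comparing 0.80 vs 0.05 at power 0.90.
## power.prop.test() assumes n1 = n2 (unlike bsamsize's 'fraction' argument).
res <- power.prop.test(p1 = 0.80, p2 = 0.05, sig.level = 0.05, power = 0.90)
ceiling(res$n)   # n per group (about 7)
```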



[R] Function to do multiple named lookups faster?

2007-02-27 Thread David Reiss
Hi,
I apologize if this topic has been discussed - I could not figure out
a good search phrase for this question.

I have a named vector x, with multiple (duplicate) names, and I would
like to obtain a (shorter) vector with non-duplicate names in which
the values are the means of the values of the duplicated indexes in x.
My best (fastest) solution to this was this code:

nms <- names( x )
x.uniq <- sapply( unique( nms ), function( i ) mean( x[ nms == i ] ) )

However, this takes forever on my beefy Mac Pro. Is there a faster way
to this using pre-written functions in R?

Thanks a lot for any advice.
-David
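
[Editorial aside, not part of the original message: base R's tapply() does
this grouped mean in one vectorized call, using the names attribute as the
grouping factor. A toy sketch:]

```r
## Mean of values sharing a name, via tapply() on names(x).
x <- c(a = 1, b = 2, a = 3, c = 4, b = 6)
x.uniq <- tapply(x, names(x), mean)
x.uniq   # a = 2, b = 4, c = 4
```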



Re: [R] ordered matrix question

2007-02-27 Thread Gavin Simpson
On Tue, 2007-02-27 at 14:46 -0300, Juan Pablo Fededa wrote:
> Hi all,
> 
> Is there an easy way to generate an object wich will be the same matrix, but
> ordered by de cfp value?
> The data frame consists of numeric columns:
> "Block""X""Y""cfp""yfp""ID"
> 0524244213.417957.184821091
> 055627065.3839049.5683726612
> 052831640.7894745.5737321753
> 0642432135.8173412.401344274
> 071643034.3591353.9449230775
> 0894362109.631583.1971603166
> 095813063.984523.3964520047
> 05069283.5139319.1054968568
> 047646491.6749199.1780894149
> 036442644.139322.06833436410
> 
> Thanks in advance,

Yes, see ?order. E.g.:

mat <- scan()
0524244213.417957.184821091
055627065.3839049.5683726612
052831640.7894745.5737321753
0642432135.8173412.401344274
071643034.3591353.9449230775
0894362109.631583.1971603166
095813063.984523.3964520047
05069283.5139319.1054968568
047646491.6749199.1780894149
036442644.139322.06833436410


mat <- data.frame(matrix(mat, ncol = 6, byrow = TRUE))
names(mat) <- c("Block", "X", "Y", "cfp", "yfp", "ID")
mat
mat[order(mat$cfp), ]

HTH

G
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] how much performance penalty does this incur, scalar as a vector of one element?

2007-02-27 Thread Jason Liao
Dear Prof. Tierney, thank you very much for answering my question. It is good to 
know that the loss of efficiency can be small.

I came to this question after using R to implement a few low-level algorithms: 
a KD-tree and a recursive algorithm for the conditional Poisson binomial. R's 
speed has been slow, even much slower than Ruby's. 

I love R dearly and always tell my students that it is the best thing that ever 
happened to statistics. R is much more elegant than C or Fortran. Unfortunately 
Fortran or C is still needed when speed is a concern, and a statistician then 
has to confront that ugly and complex larger world. A huge gain in productivity 
and reduction in mental anguish could be achieved if R's speed were improved 
via compilation.

I did a little research. The following tool claims to make Python as fast as C

http://www-128.ibm.com/developerworks/linux/library/l-psyco.html

Recently, a new Ruby implementation makes it several times faster:

http://www.antoniocangiano.com/articles/2007/02/19/ruby-implementations-shootout-ruby-vs-yarv-vs-jruby-vs-gardens-point-ruby-net-vs-rubinius-vs-cardinal
 
Jason Liao, http://www.geocities.com/jg_liao
Associate Professor of Biostatistics
Drexel University School of Public Health
245 N. 15th Street, Mail Stop 660
Philadelphia, PA 19102-1192
phone 215-762-3934





 




Re: [R] ordered matrix question

2007-02-27 Thread Daniel Nordlund
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf Of Juan Pablo Fededa
> Sent: Tuesday, February 27, 2007 9:47 AM
> To: R-help@stat.math.ethz.ch
> Subject: [R] ordered matrix question
> 
> Hi all,
> 
> Is there an easy way to generate an object wich will be the same matrix, but
> ordered by de cfp value?
> The data frame consists of numeric columns:
> "Block""X""Y""cfp""yfp""ID"
> 0524244213.417957.184821091
> 055627065.3839049.5683726612
> 052831640.7894745.5737321753
> 0642432135.8173412.401344274
> 071643034.3591353.9449230775
> 0894362109.631583.1971603166
> 095813063.984523.3964520047
> 05069283.5139319.1054968568
> 047646491.6749199.1780894149
> 036442644.139322.06833436410
> 
> Thanks in advance,
> 
> 
> Juan Pablo

Juan,

Look at ?order.  Something like this should work

your.df[order(your.df$cfp, decreasing=TRUE), ]

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA  USA



Re: [R] Macros in R

2007-02-27 Thread Greg Snow
Others have pointed you to the answer to your question, but both FAQ
7.21 and the assign help page should really have a big banner at the top
saying "Here Be Dragons".

Using a loop or other automated procedure to create variables in the
main namespace can cause hard-to-find bugs, accidentally clobber
existing variables, and other non-fun things.

For this type of thing it is usually best to use a list (or an
environment, but I am more comfortable with lists).

For your example you could do something like:

> mymats <- list()
> for (i in 1:54){
+   myname <- paste('mymatrix', i, sep='')
+   mymats[[myname]] <- matrix(nrow = 1, ncol = 107) # insert whatever code you want here
+ }

A big advantage of this approach is that you can then deal with your
list of matrices as a single unit.  If you want to delete them, you
just delete the list rather than having to delete 54 individual
matrices.  The list can also be saved as a single unit to a file,
passed to another function, etc.

To access a single matrix (for example 'mymatrix5' which is in position
5) you have several options:

> mean( mymats[[5]] )
> mean( mymats[['mymatrix5']] )
> with( mymats, mean(mymatrix5) )
> attach(mymats)
> mean(mymatrix5) # as long as there is no mymatrix5 in the global environment
> detach()

And probably others.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Monika Kerekes
> Sent: Sunday, February 25, 2007 9:03 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Macros in R
> 
> Dear members,
> 
>  
> 
> I have started to work with R recently and there is one thing 
> which I could not solve so far. I don't know how to define 
> macros in R. The problem at hand is the following: I want R 
> to go through a list of 1:54 and create the matrices input1, 
> input2, input3 up to input54. I have tried the following:
> 
>  
> 
> for ( i in 1:54) {
> 
>   input[i] = matrix(nrow = 1, ncol = 107)
> 
>   input[i][1,]=datset$variable
> 
> }
> 
>  
> 
> However, R never creates the required matrices. I have also 
> tried to type input'i' and input$i, none of which worked. I 
> would be very grateful for help as this is a basic question 
> the answer of which is paramount to any further usage of the software.
> 
>  
> 
> Thank you very much
> 
>  
> 
> Monika
> 
>  
> 
> 
>   [[alternative HTML version deleted]]
> 



[R] plotting rpart objects - fancy option

2007-02-27 Thread Volker Bahn

Hi all,

I'm trying to create nice plots of rpart objects. In particular, I'd 
like to use the "fancy" option to text() that creates ellipses and 
rectangles at the splits and end nodes, respectively. This worked fine in 
the past, but now the ellipses do not interrupt the original tree lines 
anymore but overlay them (see attached ps file). I'd like it to look the 
way Figure 18 on page 49 of the rpart report "An Introduction to 
Recursive Partitioning Using the RPART Routines" looks and can't figure 
out why it doesn't.


I created the attached ps with the command:

> post(tree)

which according to the report is pretty much equivalent to:

> plot(tree, uniform = T, branch = 0.2, compress = T, margin = 0.1)
> text(tree, all = T, use.n=T, fancy = T)

As for my system info:

> sessionInfo()
R version 2.4.1 (2006-12-18)
i386-pc-mingw32

locale:
LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY=English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252

attached base packages:
[1] "stats" "graphics"  "grDevices" "utils" "datasets"  "methods" 
[7] "base"


other attached packages:
  rpart
"3.1-34"


Thank you,

Volker


temptree.pr.ps
Description: PostScript document


[R] Multiple conditional without if

2007-02-27 Thread bunny , lautloscrew.com
Dear all,

i am stuck with a syntax problem.

i have a matrix which has about 500 rows and 6 columns.
now i want to kick some data out.
i want to create a new matrix which is basically the old one except for all
entries which have a 4 in the 5th column AND a 1 in the 6th column.

i tried the following but couldn't get a new matrix, just some weird
errors:

newmatrix=oldmatrix[,2][oldmatrix[,5]==4]&&oldmatrix[,2][oldmatrix[,6] 
==1]

all i get is:
numeric(0)

does anybody have an idea how to fix this one ?

thx in advance

matthias
[[alternative HTML version deleted]]



Re: [R] RDA and trend surface regression

2007-02-27 Thread Gavin Simpson
On Tue, 2007-02-27 at 13:13 -0500, Kuhn, Max wrote:
> Helene,
> 
> My point was only that RDA may fit a quadratic model for the terms
> specified in your model. The terms that you had specified were already
> higher order polynomials (some cubic). So a QDA classifier with the
> model terms that you specified may be a fifth order polynomial in the
> original data. I don't know the reference you cite or even the
> subject-matter specifics. I'm just a simple cave man (for you SNL fans).
> But I do know that there are more reliable ways to get nonlinear
> classification boundaries than using x^5. 

I doubt that Helene is trying to do a classification - unless you
consider classification to mean that all rows/samples are in different
groups (i.e. n samples therefore n groups) - which is how RDA
(Redundancy Analysis) is used in ecology.

You could take a look at multispati in package ade4 for a different way
> to handle spatial constraints. There is also the principal coordinates
analysis of neighbour matrices (PCNM) method - not sure this is coded
anywhere in R yet though. Here are two references that may be useful:

Dray, S., P. Legendre, and P. R. Peres- Neto. 2006. Spatial modeling: a
comprehensive framework for principal coordinate analysis of neighbor
matrices (PCNM). Ecological Modelling, in press.

Griffith, D. A., and P. R. Peres- Neto. 2006. Spatial modeling in
ecology: the flexibility of eigenfunction spatial analyses. Ecology, in
press.

HTH

G

> 
> If you want a quadratic model, I would suggest that you use QDA with the
> predictors in the original units (or see Hastie's book for a good
> example of using higher order terms with LDA). 
> 
> Looking at your email, you want a "a variation partitioning analyses".
> RDA works best as a classification technique. Perhaps a multivariate
> ANOVA model may be a more direct way to meet your needs. There is a
> connection between LDA and some multivariate linear models, but I don't
> know of a similar connection to RDA.
> 
> Max
> 
> -Original Message-
> From: MORLON [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, February 27, 2007 12:53 PM
> To: 'Jari Oksanen'; r-help@stat.math.ethz.ch
> Cc: Kuhn, Max
> Subject: RE: [R] RDA and trend surface regression
> 
> Thanks a lot for your answers,
> 
> I am concerned by your advice not to use polynomial constraints, or to
> use
> QDA instead of RDA. My final goal is to perform variation partitioning
> using
> partial RDA to assess the relative importance of environmental vs
> spatial
> variables. For the spatial analyses, trend surface analysis (polynomial
> constraints) is recommended in Legendre and Legendre 1998 (p739). Is
> there a
> better method to integrate space as an explanatory variable in a
> variation
> partitioning analyses? 
> 
> Also, I don't understand this: when I test for the significant
> contribution
> of monomials (forward elimination)
> 
> >anova(rda(Helling ~ I(x^2)+Condition(x)+Condition(y)))
> 
> performs the permutation test as expected, whereas 
> 
> >anova(rda(Helling ~ I(y^2)+Condition(x)+Condition(y)))
> 
> Returns this error message:
> 
> Error in "names<-.default"(`*tmp*`, value = "Model") : 
> attempt to set an attribute on NULL
> 
> Thanks again for your help
> Kind regards,
> Helene
> 
> Helene MORLON
> University of California, Merced
> 
> -Original Message-
> From: Jari Oksanen [mailto:[EMAIL PROTECTED] 
> Sent: Monday, February 26, 2007 11:27 PM
> To: r-help@stat.math.ethz.ch
> Cc: [EMAIL PROTECTED]
> Subject: [R] RDA and trend surface regression
> 
> 
> > 'm performing RDA on plant presence/absence data, constrained by
> > geographical locations. I'd like to constrain the RDA by the "extended
> > matrix of geographical coordinates" -ie the matrix of geographical
> > coordinates completed by adding all terms of a cubic trend surface
> > regression- . 
> > 
> > This is the command I use (package vegan):
> > 
> >  
> > 
> > >rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3) 
> > 
> >  
> > 
> > where Helling is the matrix of Hellinger-transformed presence/absence
> data
> > 
> > The result returned by R is exactly the same as the one given by:
> > 
> >  
> > 
> > >anova(rda(Helling ~ x+y)
> > 
> >  
> > 
> > Ie the quadratic and cubic terms are not taken into account
> > 
> 
> You must *I*solate the polynomial terms with function I ("AsIs") so that
> they are not interpreted as formula operators:
> 
> rda(Helling ~ x + y + I(x*y) + I(x^2) + I(y^2) + I(x*y^2) + I(y*x^2) +
> I(x^3) + I(y^3))
> 
> If you don't have the interaction terms, then it is easier and better
> (numerically) to use poly():
> 
> rda(Helling ~ poly(x, 3) + poly(y, 3))
> 
> Another issue is that in my opinion using polynomial constraints is an
> Extremely Bad Idea(TM).
> 
> cheers, Jari Oksanen
> 
> --
> LEGAL NOTICE\ Unless expressly stated otherwise, this messag...{{dropped}}
> 
> __
>

Re: [R] Macros in R

2007-02-27 Thread Gabor Grothendieck
The FAQ does mention your point already.

On 2/27/07, Greg Snow <[EMAIL PROTECTED]> wrote:
> Others have pointed you to the answer to your question, but both FAQ
> 7.21 and the assign help page should really have a big banner at the top
> saying "Here Be Dragons".
>
> Using a loop or other automated procedure to create variables in the
> main namespace can cause hard to find bugs, accidentally clobber
> existing variables, and other non-fun things.
>
> For this type of thing it is usually best to use a list (or an
> environment, but I am more comfortable with lists).
>
> For your example you could do something like:
>
> > mymats <- list()
> > for (i in 1:54){
> +   myname <- paste('mymatrix',i,sep='')
> +   mymats[[myname]] <- matrix(nrow = 1, ncol = 107) # or whatever code you want here
> + }
>
> A big advantage of this approach is that you can then deal with your
> list of matrices as a single unit.  If you want to delete them, you
> just delete the list rather than having to delete 54 individual
> matrices.  The list can also be saved as a single unit to a file,
> passed to another function, etc.
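Indeed, the whole collection round-trips as one object (toy data here; the
file name is just an example):

```r
# a small list of matrices standing in for the 54 above
mymats <- list()
for (i in 1:3) {
  mymats[[paste("mymatrix", i, sep = "")]] <- matrix(i, nrow = 2, ncol = 2)
}

save(mymats, file = "mymats.RData")   # save the whole list in one go
rm(mymats)                            # now it is really gone ...
load("mymats.RData")                  # ... and back in one step
mean(mymats[["mymatrix2"]])           # 2
```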
>
> To access a single matrix (for example 'mymatrix5' which is in position
> 5) you have several options:
>
> > mean( mymats[[5]] )
> > mean( mymats[['mymatrix5']] )
> > with( mymats, mean(mymatrix5) )
> > attach(mymats)
> > mean(mymatrix5) # as long as there is no mymatrix5 in the global environment
> > detach()
>
> And probably others.
>
> Hope this helps,
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> (801) 408-8111
>
>
>
> > -Original Message-
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Monika Kerekes
> > Sent: Sunday, February 25, 2007 9:03 AM
> > To: r-help@stat.math.ethz.ch
> > Subject: [R] Macros in R
> >
> > Dear members,
> >
> >
> >
> > I have started to work with R recently and there is one thing
> > which I could not solve so far. I don't know how to define
> > macros in R. The problem at hand is the following: I want R
> > to go through a list of 1:54 and create the matrices input1,
> > input2, input3 up to input54. I have tried the following:
> >
> >
> >
> > for ( i in 1:54) {
> >
> >   input[i] = matrix(nrow = 1, ncol = 107)
> >
> >   input[i][1,]=datset$variable
> >
> > }
> >
> >
> >
> > However, R never creates the required matrices. I have also
> > tried to type input'i' and input$i, none of which worked. I
> > would be very grateful for help as this is a basic question
> > the answer of which is paramount to any further usage of the software.
> >
> >
> >
> > Thank you very much
> >
> >
> >
> > Monika
> >
> >
> >
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] recording graphics going from lattice to traditional plots, and issues with log axes

2007-02-27 Thread Michael D. Rennie


Hi there

I am using the options(graphics.record = TRUE) command to keep track of 
my plots as I execute them. However, if I do a couple of plots using the 
lattice package (i.e., xyplot(y ~ x, ...)) and then go back to using the 
traditional graphics state (i.e., plot(y, x, ...)), I can't go 
back to see the lattice plots.


Is the answer here "just stick with one graphics platform"? The reason 
I'm switching back and forth is that I am plotting logarithmic axes. In 
both the traditional state (plot(y, x, ...)) and in the lattice plots 
(xyplot(y ~ x, ...)) the arguments


...xlog = TRUE, ylog = TRUE, ...

don't do anything, and the only way I can get my log-log axes is to do a 
traditional plot with the argument


log = "xy"

which doesn't do anything in the lattice framework.
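For completeness, here is a minimal version of what I am after (made-up
data): the base-graphics log = "xy" route, plus the scales= argument, which
as far as I can tell is where lattice expects log-axis settings rather than
xlog/ylog:

```r
library(lattice)

x <- 10^(1:5)
y <- x^2

# traditional graphics: log-log axes via the log= argument
plot(x, y, log = "xy")

# lattice: log axes go through scales=, not log= or xlog=/ylog=
p <- xyplot(y ~ x,
            scales = list(x = list(log = 10), y = list(log = 10)))
print(p)
```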

Any suggestions on either of these issues (recording and log axes)?

Cheers,

Mike

--
Michael D. Rennie
Ph.D. Candidate
University of Toronto at Mississauga
3359 Missisagua Rd. N. 
Mississauga, ON L5L 1C6

Ph: 905-828-5452 Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie



[R] matplot on lattice graphics

2007-02-27 Thread Mario A. Morales R.
Can I use matplot on each panel of a lattice graphic? How?
 




Re: [R] read.csv size limits

2007-02-27 Thread jim holtman
Have you used colClasses to define what each of the columns contain?  Can
you use 'scan'?  I haven't tried anything with 500,000 columns, but if they
are numeric, this should not take too long.  So I created a 20-line file with
500,000 columns and here is what it took reading it both as numeric and
character:

> system.time(x <- scan('/tempyy.txt', what=0))
Read 1000 items
[1] 17.57  0.12 18.89    NA    NA
> str(x)
 num [1:1000] 12345 12345 12345 12345 12345 ...
> system.time(x <- scan('/tempyy.txt', what=''))
Read 1000 items
[1]  9.03  0.10 11.21    NA    NA
> str(x)
 chr [1:1000] "12345" "12345" "12345" "12345" "12345" "12345" "12345"
"12345" ...
>





On 2/27/07, andy1983 <[EMAIL PROTECTED]> wrote:
>
>
> I have been using the read.csv function for a while now without any
> problems.
> My files are usually 20-50 MBs and they take up to a minute to import.
> They
> have all been under 50,000 rows and under 100 columns.
>
> Recently, I tried importing a file of a similar size (which means about
> the
> same amount of data), but with ~500,000 columns and ~20 rows. The process
> is
> taking forever (~1 hour so far). In Task Manager, I see the CPU is at max,
> but memory usage stalls at around 50 MB (far below the memory limit).
>
> Is this normal? Is there a way to optimize this operation or at least
> check
> the progress? Will this take 2 hours or 200 hours?
>
> All I was trying to do is transpose my extra-wide table with a process
> that
> I assumed would take 5 minutes. Maybe R is not the solution I am looking
> for?
>
> Thanks.
>
> --
> View this message in context:
> http://www.nabble.com/read.csv-size-limits-tf3302366.html#a9186136
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

[[alternative HTML version deleted]]



Re: [R] Multiple conditional without if

2007-02-27 Thread Mendiburu, Felipe \(CIP\)
Dear matthias,

newmatrix = oldmatrix[ (oldmatrix[,5]==4 & oldmatrix[,6]==1) , ]

Felipe

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of bunny ,
lautloscrew.com
Sent: Tuesday, February 27, 2007 1:25 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Multiple conditional without if


Dear all,

I am stuck with a syntax problem.

I have a matrix with about 500 rows and 6 columns.
Now I want to kick some data out.
I want to create a new matrix which is basically the old one, except for all
rows which have a 4 in the 5th column AND a 1 in the 6th column.

I tried the following but couldn't get a new matrix, just some weird
errors:

newmatrix=oldmatrix[,2][oldmatrix[,5]==4]&&oldmatrix[,2][oldmatrix[,6] 
==1]

all i get is:
numeric(0)

does anybody have an idea how to fix this one ?

thx in advance

matthias
[[alternative HTML version deleted]]



Re: [R] RDA and trend surface regression

2007-02-27 Thread Roger Bivand
On Tue, 27 Feb 2007, Gavin Simpson wrote:

> On Tue, 2007-02-27 at 13:13 -0500, Kuhn, Max wrote:
> > Helene,
> > 
> > My point was only that RDA may fit a quadratic model for the terms
> > specified in your model. The terms that you had specified were already
> > higher order polynomials (some cubic). So a QDA classifier with the
> > model terms that you specified may be a fifth order polynomial in the
> > original data. I don't know the reference you cite or even the
> > subject-matter specifics. I'm just a simple cave man (for you SNL fans).
> > But I do know that there are more reliable ways to get nonlinear
> > classification boundaries than using x^5. 
> 
> I doubt that Helene is trying to do a classification - unless you
> consider classification to mean that all rows/samples are in different
> groups (i.e. n samples therefore n groups) - which is how RDA
> (Redundancy Analysis) is used in ecology.
> 
> You could take a look at multispati in package ade4 for a different way
> to handle spatial constraints. There is also the principal coordinates
> analysis of neighbour matrices (PCNM) method - not sure this is coded
> anywhere in R yet though. Here are two references that may be useful:
> 
> Dray, S., P. Legendre, and P. R. Peres- Neto. 2006. Spatial modeling: a
> comprehensive framework for principal coordinate analysis of neighbor
> matrices (PCNM). Ecological Modelling, in press.
> 
> Griffith, D. A., and P. R. Peres- Neto. 2006. Spatial modeling in
> ecology: the flexibility of eigenfunction spatial analyses. Ecology, in
> press.

Pedro Peres-Neto helped cast his matlab original to R as function ME() in
the spdep package, at least partly to see if it worked like
SpatialFiltering() in spdep, based on a forthcoming paper by Tiefelsdorf
and Griffith; code suggestions from Stéphane Dray were also used. For
sample data sets, including those used in the papers, ME() reproduces the
original results, and reproduces results from the matlab code from which
it was derived with possible differences for the stopping rule of the
semi-parametric stage - the Oribatid mites data set is in ade4.

Roger

> 
> HTH
> 
> G
> 
> > 
> > If you want a quadratic model, I would suggest that you use QDA with the
> > predictors in the original units (or see Hastie's book for a good
> > example of using higher order terms with LDA). 
> > 
> > Looking at your email, you want a "a variation partitioning analyses".
> > RDA works best as a classification technique. Perhaps a multivariate
> > ANOVA model may be a more direct way to meet your needs. There is a
> > connection between LDA and some multivariate linear models, but I don't
> > know of a similar connection to RDA.
> > 
> > Max
> > 
> > -Original Message-
> > From: MORLON [mailto:[EMAIL PROTECTED] 
> > Sent: Tuesday, February 27, 2007 12:53 PM
> > To: 'Jari Oksanen'; r-help@stat.math.ethz.ch
> > Cc: Kuhn, Max
> > Subject: RE: [R] RDA and trend surface regression
> > 
> > Thanks a lot for your answers,
> > 
> > I am concerned by your advice not to use polynomial constraints, or to
> > use
> > QDA instead of RDA. My final goal is to perform variation partitioning
> > using
> > partial RDA to assess the relative importance of environmental vs
> > spatial
> > variables. For the spatial analyses, trend surface analysis (polynomial
> > constraints) is recommended in Legendre and Legendre 1998 (p739). Is
> > there a
> > better method to integrate space as an explanatory variable in a
> > variation
> > partitioning analyses? 
> > 
> > Also, I don't understand this: when I test for the significant
> > contribution
> > of monomials (forward elimination)
> > 
> > >anova(rda(Helling ~ I(x^2)+Condition(x)+Condition(y)))
> > 
> > performs the permutation test as expected, whereas 
> > 
> > >anova(rda(Helling ~ I(y^2)+Condition(x)+Condition(y)))
> > 
> > Returns this error message:
> > 
> > Error in "names<-.default"(`*tmp*`, value = "Model") : 
> > attempt to set an attribute on NULL
> > 
> > Thanks again for your help
> > Kind regards,
> > Helene
> > 
> > Helene MORLON
> > University of California, Merced
> > 
> > -Original Message-
> > From: Jari Oksanen [mailto:[EMAIL PROTECTED] 
> > Sent: Monday, February 26, 2007 11:27 PM
> > To: r-help@stat.math.ethz.ch
> > Cc: [EMAIL PROTECTED]
> > Subject: [R] RDA and trend surface regression
> > 
> > 
> > > 'm performing RDA on plant presence/absence data, constrained by
> > > geographical locations. I'd like to constrain the RDA by the "extended
> > > matrix of geographical coordinates" -ie the matrix of geographical
> > > coordinates completed by adding all terms of a cubic trend surface
> > > regression- . 
> > > 
> > > This is the command I use (package vegan):
> > > 
> > >  
> > > 
> > > >rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3) 
> > > 
> > >  
> > > 
> > > where Helling is the matrix of Hellinger-transformed presence/absence
> > data
> > > 
> > > The result returned by R is exactly the same as the one giv

Re: [R] Multiple conditional without if

2007-02-27 Thread Petr Klasterecky
bunny , lautloscrew.com napsal(a):
> Dear all,
> 
> I am stuck with a syntax problem.
> 
> I have a matrix with about 500 rows and 6 columns.
> Now I want to kick some data out.
> I want to create a new matrix which is basically the old one, except for all
> rows which have a 4 in the 5th column AND a 1 in the 6th column.
> 
> I tried the following but couldn't get a new matrix, just some weird
> errors:
> 
> newmatrix=oldmatrix[,2][oldmatrix[,5]==4]&&oldmatrix[,2][oldmatrix[,6] 
> ==1]
This is nonsense.

 > m <- matrix(rep(1:12,3), ncol=6)
 > m[1,6] <- 1
 > m
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    1    7    1    7    1    1
 [2,]    2    8    2    8    2    8
 [3,]    3    9    3    9    3    9
 [4,]    4   10    4   10    4   10
 [5,]    5   11    5   11    5   11
 [6,]    6   12    6   12    6   12

 > m[!((m[,5]==4)|(m[,6]==1)),]
      [,1] [,2] [,3] [,4] [,5] [,6]
 [1,]    2    8    2    8    2    8
 [2,]    3    9    3    9    3    9
 [3,]    5   11    5   11    5   11
 [4,]    6   12    6   12    6   12

Please read the appropriate chapter in R-intro to become familiar with 
vector, matrix and list indexing.
Petr


> 
> all i get is:
> numeric(0)
> 
> does anybody have an idea how to fix this one ?
> 
> thx in advance
> 
> matthias
>   [[alternative HTML version deleted]]
> 
> 
> 
> 
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic



[R] Problem with R interface termination

2007-02-27 Thread Bhanu Kalyan.K
Dear Sir,

The R interface that I am currently using is R 2.4.1 (on Win XP). I am not able 
to terminate the program with the q() command. Each time I issue this 
command,

> q()
Error in .Last() : could not find function "finalizeSession"

this error creeps in and it neither allows me to save my workspace nor come out 
of R. Therefore, whenever this happens, I am forced to end the Rgui.exe process 
from the task manager.
Kindly help me fix this problem.

Regards,

Bhanu Kalyan K


Bhanu Kalyan K
B.Tech Final Year, CSE

Tel: +91-9885238228

Alternate E-Mail: 
[EMAIL PROTECTED]

 

[[alternative HTML version deleted]]



[R] interactions and GAM

2007-02-27 Thread Beaulaton Laurent
Dear R-users,

I have 1 remark and 1 question on the inclusion of interactions in the gam 
function from the gam package.

I need to fit quantitative predictors in interactions with factors. You can see 
an example of what I need in fig 9.13 p265  from Hastie and Tibshirani book 
(1990). 
It's clearly stated in ?gam that "Interactions with nonparametric smooth terms 
are not fully supported".
I have found a trick in a former posting 
(http://www.math.yorku.ca/Who/Faculty/Monette/S-news/2284.html), using NAs and 
the na.gam.replace argument, but some points are still unclear to me.

First, prediction for new data (using the predict function) is not so easy (see 
script below), and needs a close reading of section 7.3.2 of Chambers and 
Hastie (1992).

Second, I need to have the same intercept for all levels of the factor, and 
this is not achievable with this trick. My question is: why not replace NA by 0 
(or any other particular value)?

Here is a quite long (sorry for that) script with a generated dataset to better 
understand my question.
In this script the model to fit is (in GLM-like notation): y ~ s(x2):x1
The generated dataset follows this model, and y(x2=0)=10 whatever x1.


#start of script


#data construction  (with deliberately very small noise)
data1=data.frame(x1=rep(NA,27),x2=NA,y=NA)

data1$x1=factor(c(rep(1,11),rep(2,11),rep(3,5)))
data1$x2=c(rep(0:10,2),0:4)

data1[data1$x1==1,"y"]=data1[data1$x1==1,"x2"]^4*5+rnorm(11)+1
data1[data1$x1==2,"y"]=data1[data1$x1==2,"x2"]^4*(-3)+rnorm(11)+1
data1[data1$x1==3,"y"]=1*data1[data1$x1==3,"x2"]+rnorm(5)+1

library(lattice)
xyplot(data1$y~data1$x2,groups=data1$x1)

#creation of dummy variables for interactions
data1$x2_1=ifelse(data1$x1=="1",data1$x2,NA)
data1$x2_2=ifelse(data1$x1=="2",data1$x2,NA)
data1$x2_3=ifelse(data1$x1=="3",data1$x2,NA)

#model fitting
library(gam)
model=gam(y~s(x2_1)+s(x2_2)+s(x2_3)+x1,data=data1,na=na.gam.replace)

#prediction fit well data :
summary(model)
plot(data1$x2,data1$y)
points(data1$x2,model$fitted.value,col="red",pch="+")

#trying to see prediction:
predict(model) # does work
predict(model, newdata=data1) # produces NA

#trying to replace NA in data1 by mean, to mimic na.gam.replace:
Ndata=data1
Ndata$x2_1=ifelse(data1$x1=="1",data1$x2,mean(data1$x2_1,na.rm=TRUE))
Ndata$x2_2=ifelse(data1$x1=="2",data1$x2,mean(data1$x2_2,na.rm=TRUE))
Ndata$x2_3=ifelse(data1$x1=="3",data1$x2,mean(data1$x2_3,na.rm=TRUE))

predict(model,Ndata)-predict(model) #as you can see there is a systematic bias

#correct way to predict (=returned 0 for terms with NA value):
p=predict(model,data1,type="term")
rowSums(cbind(p,attr(p,"constant")),na.rm=TRUE)-predict(model)

#alternative solution, 0 instead of NA
data1$v1=ifelse(data1$x1=="1",data1$x2,0)
data1$v2=ifelse(data1$x1=="2",data1$x2,0)
data1$v3=ifelse(data1$x1=="3",data1$x2,0)

model1=gam(y~s(v1)+s(v2)+s(v3),data=data1)
summary(model1)
points(data1$x2,predict(model1,data1),col="green",pch="X")
#no particular problem with predict function

#what's happened in x2=0 ?
predict(model)[data1$x2==0]
predict(model1)[data1$x2==0]


#end of script


thanks in advance
best regards
Laurent Beaulaton

-
Laurent Beaulaton
###
# NEW    #
#  http://www.laurent-beaulaton.fr/#
# Tel + 33 (0)5 57 89 27 17 #
###
-
Cemagref (French Institute of Agricultural and Environmental Engineering 
Research )
Unité "Ecosystèmes estuariens et poissons migrateurs amphihalins"
(anciennement Unité "Ressources aquatiques continentales")
50 avenue de Verdun
F 33612 Cestas Cedex

Tel + 33 (0)5 57 89 27 17
Fax + 33 (0)5 57 89 08 01
mailto:[EMAIL PROTECTED]

http://www.laurent-beaulaton.fr/ 
http://www.bordeaux.cemagref.fr/rabx/



Re: [R] Multiple conditional without if [Broadcast]

2007-02-27 Thread Liaw, Andy
From: bunny, lautloscrew.com
> 
> Dear all,
> 
> I am stuck with a syntax problem.
> 
> I have a matrix with about 500 rows and 6 columns.
> Now I want to kick some data out.
> I want to create a new matrix which is basically the old one 
> except for all rows which have a 4 in the 5th column AND a 1 
> in the 6th column.
> 
> I tried the following but couldn't get a new matrix, just some weird
> errors:
> 
> newmatrix=oldmatrix[,2][oldmatrix[,5]==4]&&oldmatrix[,2][oldmatrix[,6]
> ==1]
> 
> all i get is:
> numeric(0)

That's not a `weird error', but a numeric vector of length 0.
 
> does anybody have an idea how to fix this one ?

Try:

newmatrix = oldmatrix[oldmatrix[, 5]==4 & oldmatrix[, 6] == 1, 2, drop=FALSE]

If you just want a subset of column 2 as a vector, you can leave off the 
drop=FALSE part.
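A toy matrix shows the difference:

```r
m <- matrix(1:12, nrow = 4)          # toy 4 x 3 matrix

v <- m[m[, 1] > 2, 2]                # default: a single column drops to a vector
is.matrix(v)                         # FALSE

m2 <- m[m[, 1] > 2, 2, drop = FALSE] # drop=FALSE keeps the matrix structure
is.matrix(m2)                        # TRUE
dim(m2)                              # 2 1
```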

Reading "An Introduction to R" should have saved you some trouble in the first 
place.

Andy
 
> thx in advance
> 
> matthias
>   [[alternative HTML version deleted]]
> 
> 


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}



Re: [R] How to put the dependent variable in GLM proportion model

2007-02-27 Thread Greg Snow


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 

The first one should be:
> n <- (S+F)
> share <- S/(S+F)
> glm(share~x, family=quasibinomial, weights=n)

This should give you results more comparable to the second one.  Either
way is acceptable.  
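A quick check on simulated data shows the two forms agree once the weights
are supplied (plain binomial here just to keep the comparison simple; all
data and variable names below are made up):

```r
set.seed(1)
n    <- rep(20, 50)                                   # trials per observation
x    <- rnorm(50)
S    <- rbinom(50, size = n, prob = plogis(0.5 + x))  # successes
Fail <- n - S                                         # failures

# proportion response with weights ...
fit1 <- glm(S / n ~ x, family = binomial, weights = n)
# ... versus the two-column (successes, failures) response
fit2 <- glm(cbind(S, Fail) ~ x, family = binomial)

all.equal(coef(fit1), coef(fit2))   # TRUE: identical coefficients
```

Without the weights, the proportion form fits a different model, which is
where the very different deviances come from.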

Hope this helps,

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Serguei Kaniovski
> Sent: Tuesday, February 27, 2007 6:37 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] How to put the dependent variable in GLM proportion model
> 
> 
> Hello everyone,
> 
> I am confused about how the dependent variable should be 
> specified, e.g.
> say S and F denote series of successes and failures. Is it
> 
> share<-S/(S+F)
> glm(share~x,family=quasibinomial)
> 
> or
> 
> glm(cbind(S,F)~x,family=quasibinomial)
> 
> The two variants produce very different dispersion parameters 
> and deviances.
> The book by Crawley, the only R book I have, says the 
> second variant is correct for proportions data.
> 
> Serguei
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] Angle of Bar Plot

2007-02-27 Thread Mohsen Jafarikia
Hello everyone,

I want to use 'angle' in barplot() with a different angle for different bars.

For example, if the 3rd column of the following data is '0.0', I want the
angle to be '45' degrees; if it is '1.92', I want '65'; and …



1  3.74  0.00
2  6.12  1.92
3  9.71  0.00
4  1.32  1.92
5  8.24  4.48
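The closest I have got so far is passing vectors to 'angle' and 'density'
(the shading lines only show when density is set). The rule below maps 0.0
to 45 and anything else to 65, which is just a placeholder for the full
mapping:

```r
height <- c(3.74, 6.12, 9.71, 1.32, 8.24)   # 2nd column above
flag   <- c(0.00, 1.92, 0.00, 1.92, 4.48)   # 3rd column above

# one hatch angle per bar
ang <- ifelse(flag == 0, 45, 65)

barplot(height, names.arg = 1:5, angle = ang, density = 20)
```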


Thanks,

[[alternative HTML version deleted]]



Re: [R] compiling issues with Mandriva Linux 2007 Discovery

2007-02-27 Thread Douglas Bates
On 2/27/07, Bricklemyer, Ross S <[EMAIL PROTECTED]> wrote:
> All,
>
> I am a new user to Linux but I am familiar with R.  I have previously
> used and installed R on a Windows platform without problems.  I recently
> set up a dual boot system (XP_64, Mandriva) to run R on a Linux platform
> in order to more efficiently handle large datasets.  I have not done
> compiling before, but read the R instructions and followed to my best
> ability.  I downloaded the most recent tar and unpacked it into
> /usr/local/R_HOME.  I was able to run ./configure and added the
> additional Linux packages necessary to compile.  The problem arises
> using the 'make' command.  When I run make I get an error that there is
> not a target or a makefile.  There is, however, Makefile.in in the
> directory.  I think I am close to getting this installed, but I am
> stuck.  Any help would be greatly appreciated.  I do not suppose that
> any of the precompiled Linux versions (i.e. Debian, SuSE) available on
> CRAN mirrors would run on a Mandriva distro?

You skipped a step.  Unpack, then run configure, then run make.



Re: [R] Plotting R graphics/symbols without user x-y scaling

2007-02-27 Thread Greg Snow
Beyond the other replies that you have received, here are a couple more
ideas.

The cnvrt.coords function in the TeachingDemos package can aid in
switching between the coordinate systems, so you could use it to
convert your x,y coordinates from the user system to the plot
coordinates (0 to 1), add values to these to represent the graphic you
want to plot, then convert these coordinates back to user coordinates
and use the lines function to do the plotting.

You could also do the conversion, then change the user plotting
coordinates rather than converting back to user coordinates.

You can also use the subplot function in the TeachingDemos package to
add small plots to an existing plot (see the examples).

Here is a quick stab at a function that uses subplot internally to do
something similar to symbols(): you give it the x and y coordinates where
the center of each symbol should be, and the 'symb' argument can be either
a matrix of points defining the shape of the symbol or a function that
creates a plot of the symbol.

my.symbols <- function(x, y=NULL, symb, inches=1, add=FALSE,
                       xlab=deparse(substitute(x)),
                       ylab=deparse(substitute(y)),
                       main=NULL, xlim=NULL, ylim=NULL,
                       vadj=0.5, hadj=0.5, pars=NULL, ...){

  if(!require(TeachingDemos)) stop('The TeachingDemos package is required')

  if(!add){
    plot(x, y, type='n', xlab=xlab, ylab=ylab, xlim=xlim, ylim=ylim)
  }

  if(is.function(symb)){
    symb2 <- symb
  } else {
    symb2 <- function(...){
      plot(rbind(symb, symb[1,]), xlab='', ylab='', xaxs='i', yaxs='i',
           ann=FALSE, axes=FALSE, type='l')
    }
  }

  for (i in seq(along=x)){
    subplot(symb2(i, ...), x[i], y[i], size=c(inches, inches),
            vadj=vadj, hadj=hadj, pars=pars)
  }

}

Here are a couple of quick examples that use the above function:

tmp.hex <- rbind( c(0,0), c(0,1), c(1,2), c(2,2), c(2,1), c(1,0) )

my.symbols( runif(10), runif(10), tmp.hex, inches=.25 )

tmp.fun <- function(which, r, ...){
  r <- r[which]
  plot( 0.5, 0.5, xlim=c(0,1), ylim=c(0,1), xaxs='i', yaxs='i', 
ann=FALSE, axes=FALSE, cex=r)
}

my.symbols( runif(10), runif(10), tmp.fun, r=sample(1:10) )


I will polish and document this function and add it to the next release
of the TeachingDemos package (when I get enough time to put out a new
release, hopefully soon, but don't hold your breath).

Hope this helps,


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Jonathan Lees
> Sent: Monday, February 26, 2007 7:26 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] PLotting R graphics/symbols without user x-y scaling
> 
> 
> Is it possible to add lines or other
> user defined graphics
> to a plot in R that does not depend on
> the user scale for the plot?
> 
> For example I have a plot
> plot(x,y)
> and I want to add some graphic that is
> scaled in inches or cm but I do not want the graphic to 
> change when the x-y scales are changed - like a thermometer, 
> scale bar or other symbol - How does one do this?
> 
> I want to build my own library of glyphs to add to plots but 
> I do not know how to plot them when their size is independent 
> of the device/user coordinates.
> 
> Is it possible to add to the list
> of symbols in the function symbols()
> other than:
>   _circles_, _squares_, _rectangles_, _stars_, _thermometers_, and
>   _boxplots_
> 
> can I make my own symbols and have symbols call these?
> 
> 
> Thanks-
> 
> 
> --
> Jonathan M. Lees
> Professor
> THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL Department of 
> Geological Sciences Campus Box #3315 Chapel Hill, NC  27599-3315
> TEL: (919) 962-0695
> FAX: (919) 966-4519
> [EMAIL PROTECTED]
> http://www.unc.edu/~leesj
> 


Re: [R] RDA and trend surface regression

2007-02-27 Thread Jari Oksanen

On 27 Feb 2007, at 20:55, Gavin Simpson wrote:

> On Tue, 2007-02-27 at 13:13 -0500, Kuhn, Max wrote:
>> Helene,
>>
>> My point was only that RDA may fit a quadratic model for the terms
>> specified in your model. The terms that you had specified were already
>> higher order polynomials (some cubic). So a QDA classifier with the
>> model terms that you specified my be a fifth order polynomial in the
>> original data. I don't know the reference you cite or even the
>> subject-matter specifics. I'm just a simple cave man (for you SNL 
>> fans).
>> But I do know that there are more reliable ways to get nonlinear
>> classification boundaries than using x^5.
>
> I doubt that Helene is trying to do a classification - unless you
> consider classification to mean that all rows/samples are in different
> groups (i.e. n samples therefore n groups) - which is how RDA
> (Redundancy Analysis) is used in ecology.
>
> You could take a look at multispati in package ade4 for a different way
> to handle spatial constraints. There is also the principal coordinates
> analysis of neighbour matrices (PCNM) method - not sure this is coded
> anywhere in R yet though. Here are two references that may be useful:
>
Stéphane Dray has R code for finding PCNM matrices. Google for his 
name: it's not that common. I also have a copy of his function and can 
send it if really needed, but it may be better to check Dray's page 
first. Stéphane Dray seems to think that not all functions need to be 
on CRAN. That may be true, but I think having it there would help many 
people.

There are at least three reasons not to use polynomial constraints in 
RDA. Max Kuhn mentioned one: polynomials typically flip wildly at the 
margins (or, in more neutral language, they are unstable there). The 
second reason is that they are almost impossible to interpret in an 
ordination display. The third reason is that RDA (or CCA) avoids some 
ordination artefacts (curvature, horseshoe, arc, etc.) precisely because 
the constraints are linear: allowing them to be curved allows curved 
solutions. These arguments are not necessarily valid if you only want 
variance partitioning, or if you use polynomial conditions ("partial 
out" polynomial effects, in Canoco language). In that case it may make 
sense to use quadratic (or polynomial) constraints or conditions.

cheers, Jari Oksanen



Re: [R] Multiple conditional without if

2007-02-27 Thread Mendiburu, Felipe \(CIP\)
Matthias,

Following the logic "a new matrix which is basically the old one, except
for all entries which have a 4 in the 5th column AND a 1 in the 6th column":

newmatrix <- oldmatrix[ !(oldmatrix[,5]==4 & oldmatrix[,6]==1) , ]

or, equivalently (distributing the negation turns & into |),

newmatrix <- oldmatrix[ (oldmatrix[,5]!=4 | oldmatrix[,6]!=1) , ]

Note that (oldmatrix[,5]!=4 & oldmatrix[,6]!=1) would be too strict: it
would also drop rows matching only one of the two conditions.

Regards,

Felipe
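A minimal, self-contained check of that filter (a made-up 6-column toy matrix standing in for the poster's 500-row one):

```r
# Rows 1 and 4 have a 4 in column 5 AND a 1 in column 6, so only
# rows 2 and 3 should survive the filter.
old <- cbind(1:4, 2, 3, 4, c(4, 4, 5, 4), c(1, 0, 1, 1))
new <- old[!(old[, 5] == 4 & old[, 6] == 1), , drop = FALSE]
nrow(new)  # 2
```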

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of bunny ,
lautloscrew.com
Sent: Tuesday, February 27, 2007 1:25 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Multiple conditional without if


Dear all,

I am stuck with a syntax problem.

I have a matrix which has about 500 rows and 6 columns.
Now I want to kick some data out.
I want to create a new matrix which is basically the old one except for all
entries which have a 4 in the 5th column AND a 1 in the 6th column.

I tried the following but couldn't get a new matrix, just some weird  
errors:

newmatrix=oldmatrix[,2][oldmatrix[,5]==4]&&oldmatrix[,2][oldmatrix[,6] 
==1]

all I get is:
numeric(0)

does anybody have an idea how to fix this one?

thx in advance

matthias


Re: [R] looping

2007-02-27 Thread Greg Snow
For the example that you give, using lapply, sapply, or replicate may be
the better way to go:

> mysample <- replicate( 50, dataset[ sample(10^5, 100), ] )

If you really want to use a loop, then use a list:

> mysamples <- list()
> mysampdata <- list()
> for (i in 1:50){
+   mysamples[[i]] <- sample(10^5, 100)
+   mysampdata[[i]] <- dataset[ mysamples[[i]], ]
+ }

Then you can use lapply or sapply to do something with each sampled
dataset:

> sapply( mysampdata, summary )

Or you can access individual elements in a number of ways:

> summary( mysampdata[[1]] )
> names(mysampdata) <- paste('d',1:50, sep='')
> with(mysampdata, summary(d2))
> summary( mysampdata$d3 )
> attach(mysampdata)
> summary(d4)
> detach()
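For completeness, the naming scheme asked about in the original post (sample.1, sample.2, ...) can be produced with assign(), though the list approach above is usually easier to work with afterwards. The data frame here is a stand-in for the real dataset:

```r
dataset <- data.frame(x = rnorm(1000))  # stand-in for the real data
for (i in 1:3) {
  # build the name "sample.i" and bind the sampled rows to it
  assign(paste("sample", i, sep = "."),
         dataset[sample(nrow(dataset), 100), , drop = FALSE])
}
nrow(sample.2)  # 100
```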

Hope this helps,


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Neil Hepburn
> Sent: Monday, February 26, 2007 5:11 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] looping
> 
> 
> Greetings:
> 
> I am looking for some help (probably really basic) with 
> looping. What I want to do is repeatedly sample observations 
> (about 100 per sample) from a large dataset (100,000 
> observations).  I would like the samples labelled sample.1, 
> sample.2, and so on (or some other suitably simple naming 
> scheme).  To do this manually I would 
> 
> >smp.1 <- sample(10^5, 100)
> >sample.1 <- dataset[smp.1,]
> >smp.2 <- sample(10^5, 100)
> >sample.2 <- dataset[smp.2,]
> .
> .
> .
> >smp.50 <- sample(10^5, 100)
> >sample.50 <- dataset[smp.50,]
> 
> and so on.
> 
> I tried the following loop code to generate 50 samples:
> 
> >for (i in 1:50){
> >+ smp.[i] <- sample(10^5, 100)
> >+ sample.[i] <- dataset[smp.[i],]}
> 
> Unfortunately, that does not work -- specifying the looping 
> variable i in the way that I have does not work since R uses 
> that to reference places in a vector (x[i] would be the ith 
> element in the vector x)
> 
> Is it possible to assign the value of the looping variable in 
> a name within the loop structure?
> 
> Cheers,
> Neil Hepburn
> 
> ===
> Neil Hepburn, Economics Instructor
> Social Sciences Department,
> The University of Alberta Augustana Campus
> 4901 - 46 Avenue
> Camrose, Alberta
> T4V 2R3
> 
> Phone (780) 697-1588
> email [EMAIL PROTECTED]
> 


Re: [R] Multiple conditional without if

2007-02-27 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Petr 
> Klasterecky
> Sent: Tuesday, February 27, 2007 12:15 PM
> To: bunny , lautloscrew.com
> Cc: r-help@stat.math.ethz.ch
> Subject: Re: [R] Multiple conditional without if
> 
> bunny , lautloscrew.com napsal(a):
> > Dear all,
> > 
> > i am stuck with a syntax problem.
> > 
> > i have a matrix which has about 500 rows and 6 columns.
> > now i want to kick some data out.
> > i want create a new matrix which is basically the old one 
> except for all
> > entries which have a 4 in the 5 column AND a 1 in the 6th column.
> > 
> > i tried the following but couldn´t get a new matrix, just 
> some wierd  
> > errors:
> > 
> > 
> newmatrix=oldmatrix[,2][oldmatrix[,5]==4]&&oldmatrix[,2][oldma
> trix[,6] 
> > ==1]
> This is nonsense.
> 
>  > m <- matrix(rep(1:12,3),ncol=6)
>  > m[1,6] <- 1
>  > m
>       [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]    1    7    1    7    1    1
> [2,]    2    8    2    8    2    8
> [3,]    3    9    3    9    3    9
> [4,]    4   10    4   10    4   10
> [5,]    5   11    5   11    5   11
> [6,]    6   12    6   12    6   12
> 
>  > m[!((m[,5]==4)|(m[,6]==1)),]

I think what was requested was to exclude rows where column 5 was 4 AND column 
6 was 1, so,

m[((m[,5]!=4)|(m[,6]!=1)),]

<<>>

Hope this is helpful,

Dan

Daniel J. Nordlund
Research and Data Analysis
Washington State Department of Social and Health Services
Olympia, WA  98504-5204



Re: [R] fitting of all possible models

2007-02-27 Thread Greg Snow
You may want to look at the package 'leaps'.  I don't think it does glms, but 
possibly you could modify it to.

Otherwise here is one quick approach (though there are probably better ones):

> apply( expand.grid( c(TRUE,FALSE),c(TRUE,FALSE),c(TRUE,FALSE) ),
+ 1, function(x) as.formula(paste(c('y~1', c('x1','x2','x3')[x]), 
collapse='+')))
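Putting that idea together with the poster's goal, here is a hedged sketch of the whole AIC-weight computation on three made-up predictors (the data and variable names x1..x3 are illustrative, standing in for the real model terms):

```r
set.seed(1)
d <- data.frame(y = rnorm(50), x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))

# All 2^3 subsets of {x1, x2, x3}; the all-FALSE row yields y ~ 1.
grid  <- expand.grid(x1 = c(TRUE, FALSE), x2 = c(TRUE, FALSE), x3 = c(TRUE, FALSE))
forms <- apply(grid, 1, function(x)
  as.formula(paste(c("y ~ 1", c("x1", "x2", "x3")[x]), collapse = " + ")))

fits <- lapply(forms, glm, data = d)   # gaussian GLMs by default
aic  <- sapply(fits, AIC)
w    <- exp(-(aic - min(aic)) / 2)
w    <- w / sum(w)                     # Akaike weights (sum to 1)

importance.x1 <- sum(w[grid$x1])       # sum of weights of models containing x1
```

With 10 predictors the same grid just grows to 2^10 rows, so nothing needs to be written out by hand.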

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Indermaur Lukas
> Sent: Tuesday, February 27, 2007 12:46 AM
> To: r-help@stat.math.ethz.ch
> Subject: [R] fitting of all possible models
> 
> Hi,
> Fitting all possible models (GLM) with 10 predictors will 
> result in loads of (2^10 - 1) models. I want to do that in 
> order to get the importance of variables (having an 
> unbalanced variable design) by summing up the AIC-weights 
> of models including the same variable, for every variable 
> separately. It's time consuming and annoying to define all 
> possible models by hand. 
>  
> Is there a command, or easy solution to let R define the set 
> of all possible models itself? I defined models in the 
> following way to process them with a batch job:
>  
> # e.g. model 1
> preference<- formula(Y~Lwd + N + Sex + YY)
> 
> # e.g. model 2
> preference_heterogeneity<- formula(Y~Ri + Lwd + N + Sex + YY) etc.
> etc.
>  
>  
> I appreciate any hint
> Cheers
> Lukas
>  
>  
>  
>  
>  
> °°° 
> Lukas Indermaur, PhD student 
> eawag / Swiss Federal Institute of Aquatic Science and Technology 
> ECO - Department of Aquatic Ecology
> Überlandstrasse 133
> CH-8600 Dübendorf
> Switzerland
>  
> Phone: +41 (0) 71 220 38 25
> Fax: +41 (0) 44 823 53 15 
> Email: [EMAIL PROTECTED]
> www.lukasindermaur.ch
> 


Re: [R] Macros in R

2007-02-27 Thread Greg Snow
The FAQ does mention using a list (and I did not mean to imply that it
did not).  Personally I think it is a little soft on this point for the
following reasons:

1. It mentions lists at the very end, some users may read about the
assign function, think that that answers the question that they think
they have, and never read on to the end.
2. The phrase "often easier" seems a soft sell to me, like "consider
this", not "DO IT THIS WAY".
3. It does not point out any of the dangers of using assign and the
additional benefits of using lists.
4. The phrase "Here Be Dragons" is fun to say, and seeing it grabs
attention and gets people thinking (except maybe on September 19th,
International Talk Like a Pirate Day).

And the parenthetical remark on number 4 brings up the obvious addition
to the R Infrequently Asked Questions list:

Q: Aye, mateys, what be a pirate's favorite statistical package?
A: R, of course (but you need to pronounce it R :-).

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 

> -Original Message-
> From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, February 27, 2007 12:11 PM
> To: Greg Snow
> Cc: Monika Kerekes; r-help@stat.math.ethz.ch
> Subject: Re: [R] Macros in R
> 
> The FAQ does mention your point already.
> 
> On 2/27/07, Greg Snow <[EMAIL PROTECTED]> wrote:
> > Others have pointed you to the answer to your question, but both FAQ
> > 7.21 and the assign help page should really have a big 
> banner at the 
> > top saying "Here Be Dragons".
[snip]



[R] R in linux

2007-02-27 Thread Aimin Yan
I used to use R-WinEdt in Windows. I know R-WinEdt is only for R in Windows.
Now I am trying to use R in Linux. I am wondering if there is something 
that can do the same thing as R-WinEdt for R in Linux?

Aimin



[R] help with NSST models in GaussRf

2007-02-27 Thread Eleonora Demaria
Hi R-Users,

I am using GaussRF in the RandomFields package to generate spatio-temporal 
fields.

I am having a hard time understanding how to use the NSST model. Does anybody 
have experience with this model, or with any of the non-separable space-time 
models available?

I got all the papers cited in the manual but I am still lost.

Thanks a lot

Ele

--
Eleonora Demaria
http://www.u.arizona.edu/~edemaria/


[R] factor documentation issue

2007-02-27 Thread Geoff Russell
There is a warning in the documentation for ?factor  (R version 2.3.0)
as follows:

" The interpretation of a factor depends on both the codes and the
  '"levels"' attribute.  Be careful only to compare factors with the
  same set of levels (in the same order).  In particular,
  'as.numeric' applied to a factor is meaningless, and may happen by
  implicit coercion.  To "revert" a factor 'f' to its original
  numeric values, 'as.numeric(levels(f))[f]' is recommended and
  slightly more efficient than 'as.numeric(as.character(f))'.


But as.numeric seems to work fine whereas as.numeric(levels(f))[f] doesn't
always do anything useful.

For example:

> f<-factor(1:3,labels=c("A","B","C"))
> f
[1] A B C
Levels: A B C
> as.numeric(f)
[1] 1 2 3
> as.numeric(levels(f))[f]
[1] NA NA NA
Warning message:
NAs introduced by coercion

And also,

> f<-factor(1:3,labels=c(1,5,6))
> f
[1] 1 5 6
Levels: 1 5 6
> as.numeric(f)
[1] 1 2 3
> as.numeric(levels(f))[f]
[1] 1 5 6

Is the documentation wrong, or is the code wrong, or have I missed
something?
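For what it's worth, the documented idiom assumes the levels are the original numeric values, which explains both results above: in the first example the levels are "A", "B", "C" (not numbers at all, so NA is the expected result), while as.numeric(f) always returns the internal integer codes, which only coincide with the data by accident:

```r
f <- factor(c(10, 20, 30))
as.numeric(f)              # internal codes 1 2 3, not the data
as.numeric(levels(f))[f]   # the original values 10 20 30
```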

Cheers,
Geoff Russell



Re: [R] Angle of Bar Plot

2007-02-27 Thread Ben Bolker
Mohsen Jafarikia  gmail.com> writes:

> 
> Hello everyone,
> 
> I want to use 'angle' in Bar Plot with different angles for different bars.
> 
> For example, if the 3rd column of the following data is '0.0', I want the
> angle to be '45' degrees; if it is '1.92', I want '65'; and …

[snip]

 I couldn't read the end of your e-mail, but chose to set angle=80
for the third value ...

z <- matrix(c(3.74,0,6.12,1.92,9.71,0,1.32,1.92,8.24,4.48),byrow=TRUE,
+ ncol=2)
> z
 [,1] [,2]
[1,] 3.74 0.00
[2,] 6.12 1.92
[3,] 9.71 0.00
[4,] 1.32 1.92
[5,] 8.24 4.48

> barplot(z[,1],angle=c(45,65,80)[as.numeric(factor(z[,2]))],density=5)

  I will point out that this is pretty ugly though ...

  Ben Bolker



[R] .C HoltWinters

2007-02-27 Thread sj
Hello,

I would like to look at the compiled C code behind HoltWinters from the
stats package. Is that possible? If so where do I find it?

thanks,

Spencer



[R] Datamining-package-?

2007-02-27 Thread j.joshua thomas
Dear Group,

I am looking for a package that will help me with data preprocessing
methods in data mining.

Is there any package in R 2.4.0 to support DM, or what is a suitable
package that I can adopt to do the work?

Kindly need your assistance.

Thanks & Regards




JJ
---

-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia



Re: [R] Datamining-package-?

2007-02-27 Thread Wensui Liu
What do you mean by data preprocessing? There are tons of R functions
that you can use to process data and do data mining.

On 2/27/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> Dear Group,
>
> I am looking for a package that is going to help me on Data preprocessing
> methods in Datamining.
>
> Is there any package in R2.4.0 to support DM? or what is the suitable
> package that i can adopt do the work?
>
> Kindly need your assistance.
>
> Thanks & Regards
>
>
>
>
> JJ
> ---
>
> --
> Lecturer J. Joshua Thomas
> KDU College Penang Campus
> Research Student,
> University Sains Malaysia
>


-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] Datamining-package-?

2007-02-27 Thread j.joshua thomas
Hi again,
The idea of preprocessing is mainly based on the need to prepare the data
before they are actually used in pattern extraction, or fed into EAs
(genetic algorithms). There is no standard practice yet; however, the
frequently used ones are:

1. the extraction of derived attributes, that is, quantities that accompany
but are not directly related to the data patterns and may prove meaningful
or increase the understanding of the patterns;

2. the removal of some existing attributes that should be of no concern to
the mining process, due to their insignificance.

So I am looking for a package that can do the two points mentioned above.

Initially I would like to visualize the data as patterns and understand the
patterns.


Thanks in Advance,

JJ



On 2/28/07, Wensui Liu <[EMAIL PROTECTED]> wrote:
>
> what do you mean by data preprocessing? there are tons of R functions
> that you can use to process data and do data mining.
>
> On 2/27/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> > Dear Group,
> >
> > I am looking for a package that is going to help me on Data
> preprocessing
> > methods in Datamining.
> >
> > Is there any package in R2.4.0 to support DM? or what is the suitable
> > package that i can adopt do the work?
> >
> > Kindly need your assistance.
> >
> > Thanks & Regards
> >
> >
> >
> >
> > JJ
> > ---
> >
> > --
> > Lecturer J. Joshua Thomas
> > KDU College Penang Campus
> > Research Student,
> > University Sains Malaysia
> >
>
>
> --
> WenSui Liu
> A lousy statistician who happens to know a little programming
> (http://spaces.msn.com/statcompute/blog)
>



-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia



[R] lm ANOVA vs. AOV

2007-02-27 Thread Matthew Bridgman
Why would someone use lm with anova (anova(lm(x))) instead of aov (or  
the other way around)?
The mean squares and sums of squares are the same, but the F values  
and p-values are slightly different.

I am modeling a dependent~independent1*independent2.

Thanks,
Matt Bridgman


Re: [R] Datamining-package-?

2007-02-27 Thread Daniel Nordlund
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> On Behalf Of j.joshua thomas
> Sent: Tuesday, February 27, 2007 5:52 PM
> To: r-help@stat.math.ethz.ch
> Subject: Re: [R] Datamining-package-?
> 
> Hi again,
> The idea of preprocessing is mainly based on the need to prepare the data
> before they are actually used in pattern extraction.or feed the data
> into EA's (Genetic Algorithm) There are no standard practice yet however,
> the frequently used on are
> 
> 1. the extraction of derived attributes that is quantities that accompany
> but not directly related to the data patterns and may prove meaningful or
> increase the understanding of the patterns
> 
> 2. the removal of some existing attributes that should be of no concern to
> the mining process and its insignificance
> 
> So i looking for a package that can do this two above mentioned points
> 
> Initially i would like to visualize the data into pattern and understand the
> patterns.
> 
> 
<<>>

Joshua,

You might take a look at the package rattle on CRAN for initially looking at 
your data and doing some basic data mining.

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA, USA



Re: [R] Problem with R interface termination

2007-02-27 Thread Henrik Bengtsson
Hi,

This is due to the R.utils package.  Interestingly, others reported
this yesterday.  There are three ways to get around the problem right
now: 1) exit R so that it does not call .Last(), via
quit(runLast=FALSE); 2) load the R.utils package and then exit via
quit() as usual; or 3) remove the faulty .Last() function with
rm(.Last).  (Thus, you don't have to kill the R process.)

Brief explanation: when you load the R.utils package, it modifies or
creates .Last() so that finalizeSession() (defined in R.utils) is
called, which in turn calls so-called registered hook functions,
allowing optional code to be evaluated "whenever" R exits, cf.
http://tolstoy.newcastle.edu.au/R/devel/05/06/1206.html.  So, if you
exit R and save the session, this .Last() function is saved to your
.RData file.  If you then start a new R session and load the saved
.RData, that modified .Last() function is also loaded.  However, if
you now try to exit R again, .Last() is called, which in turn calls
finalizeSession(), which is undefined since R.utils is not loaded, and
you get the error that you reported.

I'm testing an updated version of R.utils (v0.8.6), which will not
cause this problem in the first place, and which will fix the problem
in a "faulty" .RData.  I will put it on CRAN as soon as I know it has
been tested thoroughly.  In the meantime, you can install it by:

source("http://www.braju.com/R/hbLite.R")
hbLite("R.utils")

To fix an old faulty .RData file, start R in the same directory, then
load the updated R.utils (which will replace that faulty .Last() with
a better one), and then save the session again when you exit R.  When
you do this you will see "Warning message: 'package:R.utils' may not
be available when loading", which is ok.  That should fix your
problems.

Let me know if it works

Henrik (author of R.utils)

On 2/27/07, Bhanu Kalyan.K <[EMAIL PROTECTED]> wrote:
> Dear Sir,
>
> The R interface that i am currently using is R 2.4.1( in Win XP). I am not 
> able to terminate the program by giving the q() command. Each time I pass 
> this command,
>
> > q()
> Error in .Last() : could not find function "finalizeSession"
>
> this error creeps in and it neither allows me to save my workspace nor come 
> out of R. Therefore, whenever this happens, I am forced to end the Rgui.exe 
> process from the task manager.
> Kindly help me fix this problem.
>
> Regards,
>
> Bhanu Kalyan K
>
>
> Bhanu Kalyan K
> B.Tech Final Year, CSE
>
> Tel: +91-9885238228
>
> Alternate E-Mail:
> [EMAIL PROTECTED]
>
>



Re: [R] Datamining-package-?

2007-02-27 Thread j.joshua thomas
I couldn't locate the package rattle.  Can someone help?


JJ
---



On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
>
> > -Original Message-
> > From: [EMAIL PROTECTED] [mailto:
> [EMAIL PROTECTED]
> > On Behalf Of j.joshua thomas
> > Sent: Tuesday, February 27, 2007 5:52 PM
> > To: r-help@stat.math.ethz.ch
> > Subject: Re: [R] Datamining-package-?
> >
> > Hi again,
> > The idea of preprocessing is mainly based on the need to prepare the
> data
> > before they are actually used in pattern extraction.or feed the data
> > into EA's (Genetic Algorithm) There are no standard practice yet
> however,
> > the frequently used on are
> >
> > 1. the extraction of derived attributes that is quantities that
> accompany
> > but not directly related to the data patterns and may prove meaningful
> or
> > increase the understanding of the patterns
> >
> > 2. the removal of some existing attributes that should be of no concern
> to
> > the mining process and its insignificance
> >
> > So i looking for a package that can do this two above mentioned
> points
> >
> > Initially i would like to visualize the data into pattern and understand
> the
> > patterns.
> >
> >
> <<>>
>
> Joshua,
>
> You might take a look at the package rattle on CRAN for initially looking
> at your data and doing some basic data mining.
>
> Hope this is helpful,
>
> Dan
>
> Daniel Nordlund
> Bothell, WA, USA
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
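
[Editorial note: the two preprocessing steps described in the quoted message
map directly onto plain data-frame operations. A minimal sketch with a
made-up data frame; the column names are hypothetical, not from the thread.]

```r
## hypothetical raw data
dat <- data.frame(id     = 1:5,
                  height = c(1.60, 1.70, 1.80, 1.65, 1.75),
                  weight = c(60, 72, 85, 58, 90))

## 1. derive a new attribute from existing ones
dat$bmi <- dat$weight / dat$height^2

## 2. remove an attribute of no concern to the mining step
dat$id <- NULL

str(dat)  # bmi added, id gone
```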



-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Datamining-package-?

2007-02-27 Thread j.joshua thomas
Dear Group,

I have found the package.

Thanks very much


JJ
---


On 2/28/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
>
>
> I couldn't locate package rattle?  Need some one's help.
>
>
> JJ
> ---
>
>
>
> On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
> >
> > > -Original Message-
> > > From: [EMAIL PROTECTED] [mailto:
> > [EMAIL PROTECTED]
> > > On Behalf Of j.joshua thomas
> > > Sent: Tuesday, February 27, 2007 5:52 PM
> > > To: r-help@stat.math.ethz.ch
> > > Subject: Re: [R] Datamining-package-?
> > >
> > > Hi again,
> > > The idea of preprocessing is mainly based on the need to prepare the
> > data
> > > before they are actually used in pattern extraction.or feed the data
> > > into EA's (Genetic Algorithm) There are no standard practice yet
> > however,
> > > the frequently used on are
> > >
> > > 1. the extraction of derived attributes that is quantities that
> > accompany
> > > but not directly related to the data patterns and may prove meaningful
> > or
> > > increase the understanding of the patterns
> > >
> > > 2. the removal of some existing attributes that should be of no
> > concern to
> > > the mining process and its insignificance
> > >
> > > So i looking for a package that can do this two above mentioned
> > points
> > >
> > > Initially i would like to visualize the data into pattern and
> > understand the
> > > patterns.
> > >
> > >
> > <<>>
> >
> > Joshua,
> >
> > You might take a look at the package rattle on CRAN for initially
> > looking at your data and doing some basic data mining.
> >
> > Hope this is helpful,
> >
> > Dan
> >
> > Daniel Nordlund
> > Bothell, WA, USA
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Lecturer J. Joshua Thomas
> KDU College Penang Campus
> Research Student,
> University Sains Malaysia
>



-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Datamining-package-?

2007-02-27 Thread jim holtman
From the R GUI, select Packages -> Install package(s) and you should find it.

On 2/27/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
>
> I couldn't locate package rattle?  Need some one's help.
>
>
> JJ
> ---
>
>
>
> On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
> >
> > > -Original Message-
> > > From: [EMAIL PROTECTED] [mailto:
> > [EMAIL PROTECTED]
> > > On Behalf Of j.joshua thomas
> > > Sent: Tuesday, February 27, 2007 5:52 PM
> > > To: r-help@stat.math.ethz.ch
> > > Subject: Re: [R] Datamining-package-?
> > >
> > > Hi again,
> > > The idea of preprocessing is mainly based on the need to prepare the
> > data
> > > before they are actually used in pattern extraction.or feed the data
> > > into EA's (Genetic Algorithm) There are no standard practice yet
> > however,
> > > the frequently used on are
> > >
> > > 1. the extraction of derived attributes that is quantities that
> > accompany
> > > but not directly related to the data patterns and may prove meaningful
> > or
> > > increase the understanding of the patterns
> > >
> > > 2. the removal of some existing attributes that should be of no
> concern
> > to
> > > the mining process and its insignificance
> > >
> > > So i looking for a package that can do this two above mentioned
> > points
> > >
> > > Initially i would like to visualize the data into pattern and
> understand
> > the
> > > patterns.
> > >
> > >
> > <<>>
> >
> > Joshua,
> >
> > You might take a look at the package rattle on CRAN for initially
> looking
> > at your data and doing some basic data mining.
> >
> > Hope this is helpful,
> >
> > Dan
> >
> > Daniel Nordlund
> > Bothell, WA, USA
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
> Lecturer J. Joshua Thomas
> KDU College Penang Campus
> Research Student,
> University Sains Malaysia
>
>[[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Datamining-package-?

2007-02-27 Thread Roberto Perdisci
Hi,
  out of curiosity, what is the name of the package you found?

Roberto

On 2/27/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> Dear Group,
>
> I have found the package.
>
> Thanks very much
>
>
> JJ
> ---
>
>
> On 2/28/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> >
> >
> > I couldn't locate package rattle?  Need some one's help.
> >
> >
> > JJ
> > ---
> >
> >
> >
> > On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
> > >
> > > > -Original Message-
> > > > From: [EMAIL PROTECTED] [mailto:
> > > [EMAIL PROTECTED]
> > > > On Behalf Of j.joshua thomas
> > > > Sent: Tuesday, February 27, 2007 5:52 PM
> > > > To: r-help@stat.math.ethz.ch
> > > > Subject: Re: [R] Datamining-package-?
> > > >
> > > > Hi again,
> > > > The idea of preprocessing is mainly based on the need to prepare the
> > > data
> > > > before they are actually used in pattern extraction.or feed the data
> > > > into EA's (Genetic Algorithm) There are no standard practice yet
> > > however,
> > > > the frequently used on are
> > > >
> > > > 1. the extraction of derived attributes that is quantities that
> > > accompany
> > > > but not directly related to the data patterns and may prove meaningful
> > > or
> > > > increase the understanding of the patterns
> > > >
> > > > 2. the removal of some existing attributes that should be of no
> > > concern to
> > > > the mining process and its insignificance
> > > >
> > > > So i looking for a package that can do this two above mentioned
> > > points
> > > >
> > > > Initially i would like to visualize the data into pattern and
> > > understand the
> > > > patterns.
> > > >
> > > >
> > > <<>>
> > >
> > > Joshua,
> > >
> > > You might take a look at the package rattle on CRAN for initially
> > > looking at your data and doing some basic data mining.
> > >
> > > Hope this is helpful,
> > >
> > > Dan
> > >
> > > Daniel Nordlund
> > > Bothell, WA, USA
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
> >
> > --
> > Lecturer J. Joshua Thomas
> > KDU College Penang Campus
> > Research Student,
> > University Sains Malaysia
> >
>
>
>
> --
> Lecturer J. Joshua Thomas
> KDU College Penang Campus
> Research Student,
> University Sains Malaysia
>
> [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] Datamining-package rattle() Errors

2007-02-27 Thread j.joshua thomas
Dear Group

I got a few errors while installing the package rattle from CRAN.

I did the install from local zip files.

I am using R 2.4.0; do I have to upgrade to R 2.4.1?


~~

utils:::menuInstallLocal()
package 'rattle' successfully unpacked and MD5 sums checked
updating HTML package descriptions
> help(rattle)
No documentation for 'rattle' in specified packages and libraries:
you could try 'help.search("rattle")'
> library(rattle)
Rattle, Graphical interface for data mining using R, Version 2.2.0.
Copyright (C) 2006 [EMAIL PROTECTED], GPL
Type "rattle()" to shake, rattle, and roll your data.
Warning message:
package 'rattle' was built under R version 2.4.1
> rattle()
Error in rattle() : could not find function "gladeXMLNew"
In addition: Warning message:
there is no package called 'RGtk2' in: library(package, lib.loc = lib.loc,
character.only = TRUE, logical = TRUE,
> local({pkg <- select.list(sort(.packages(all.available = TRUE)))
+ if(nchar(pkg)) library(pkg, character.only=TRUE)})
> update.packages(ask='graphics')


On 2/28/07, Roberto Perdisci <[EMAIL PROTECTED]> wrote:
>
> Hi,
> out of curiosity, what is the name of the package you found?
>
> Roberto
>
> On 2/27/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> > Dear Group,
> >
> > I have found the package.
> >
> > Thanks very much
> >
> >
> > JJ
> > ---
> >
> >
> > On 2/28/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> > >
> > >
> > > I couldn't locate package rattle?  Need some one's help.
> > >
> > >
> > > JJ
> > > ---
> > >
> > >
> > >
> > > On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
> > > >
> > > > > -Original Message-
> > > > > From: [EMAIL PROTECTED] [mailto:
> > > > [EMAIL PROTECTED]
> > > > > On Behalf Of j.joshua thomas
> > > > > Sent: Tuesday, February 27, 2007 5:52 PM
> > > > > To: r-help@stat.math.ethz.ch
> > > > > Subject: Re: [R] Datamining-package-?
> > > > >
> > > > > Hi again,
> > > > > The idea of preprocessing is mainly based on the need to prepare
> the
> > > > data
> > > > > before they are actually used in pattern extraction.or feed the
> data
> > > > > into EA's (Genetic Algorithm) There are no standard practice yet
> > > > however,
> > > > > the frequently used on are
> > > > >
> > > > > 1. the extraction of derived attributes that is quantities that
> > > > accompany
> > > > > but not directly related to the data patterns and may prove
> meaningful
> > > > or
> > > > > increase the understanding of the patterns
> > > > >
> > > > > 2. the removal of some existing attributes that should be of no
> > > > concern to
> > > > > the mining process and its insignificance
> > > > >
> > > > > So i looking for a package that can do this two above mentioned
> > > > points
> > > > >
> > > > > Initially i would like to visualize the data into pattern and
> > > > understand the
> > > > > patterns.
> > > > >
> > > > >
> > > > <<>>
> > > >
> > > > Joshua,
> > > >
> > > > You might take a look at the package rattle on CRAN for initially
> > > > looking at your data and doing some basic data mining.
> > > >
> > > > Hope this is helpful,
> > > >
> > > > Dan
> > > >
> > > > Daniel Nordlund
> > > > Bothell, WA, USA
> > > >
> > > > __
> > > > R-help@stat.math.ethz.ch mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > PLEASE do read the posting guide
> > > > http://www.R-project.org/posting-guide.html
> > > > and provide commented, minimal, self-contained, reproducible code.
> > > >
> > >
> > >
> > >
> > > --
> > > Lecturer J. Joshua Thomas
> > > KDU College Penang Campus
> > > Research Student,
> > > University Sains Malaysia
> > >
> >
> >
> >
> > --
> > Lecturer J. Joshua Thomas
> > KDU College Penang Campus
> > Research Student,
> > University Sains Malaysia
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>



-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Function to do multiple named lookups faster?

2007-02-27 Thread jim holtman
try this:

> x <- 1:30
> names(x) <- sample(LETTERS[1:5], 30, TRUE)
>
> x
 B  B  C  E  B  E  E  D  D  A  B  A  D  B  D  C  D  E  B  D  E  B  D  A  B
B  A  B  E  B
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30
> x <- 1:30
> names(x) <- sample(LETTERS[1:5], 30, TRUE)
> x
 C  C  C  A  E  D  D  A  D  C  E  D  D  C  C  D  A  C  D  D  C  E  C  B  A
A  B  C  D  C
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30
> tapply(x, names(x), mean)
   A    B    C    D    E
16.0 25.5 15.0 14.6 12.7
>
>
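
[Editorial note: the grouped mean can also be computed without tapply(); a
sketch of an equivalent alternative — rowsum() sums per group in one pass,
and dividing by the group counts gives the mean.]

```r
set.seed(42)  # for a reproducible example
x <- 1:30
names(x) <- sample(LETTERS[1:5], 30, TRUE)

## tapply(), as in the transcript above
m1 <- tapply(x, names(x), mean)

## rowsum() sums per group; divide by group sizes for the mean
m2 <- rowsum(x, names(x))[, 1] / table(names(x))

all.equal(as.vector(m1), as.vector(m2))  # TRUE
```

Both results are ordered by the sorted group labels, so they line up
element by element.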



On 2/27/07, David Reiss <[EMAIL PROTECTED]> wrote:
>
> Hi,
> I apologize if this topic has been discussed - I could not figure out
> a good search phrase for this question.
>
> I have a named vector x, with multiple (duplicate) names, and I would
> like to obtain a (shorter) vector with non-duplicate names in which
> the values are the means of the values of the duplicated indexes in x.
> My best (fastest) solution to this was this code:
>
> nms <- names( x )
> x.uniq <- sapply( unique( nms ), function( i ) mean( subtracted[ nms == i
> ] ) )
>
> However, this takes forever on my beefy Mac Pro. Is there a faster way
> to this using pre-written functions in R?
>
> Thanks a lot for any advice.
> -David
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Datamining-package rattle() Errors

2007-02-27 Thread Tim Churches
j.joshua thomas wrote:
> Dear Group
> 
> I have few errors while installing package rattle from CRAN
> 
> i do the installing from the local zip files...
> 
>  I am using R 2.4.0 do i have to upgrade to R2.4.1 ?

You *do* have to read the r-help posting guide and take exact heed of
what it suggests: http://www.r-project.org/posting-guide.html

Tim C

> ~~
> 
> utils:::menuInstallLocal()
> package 'rattle' successfully unpacked and MD5 sums checked
> updating HTML package descriptions
>> help(rattle)
> No documentation for 'rattle' in specified packages and libraries:
> you could try 'help.search("rattle")'
>> library(rattle)
> Rattle, Graphical interface for data mining using R, Version 2.2.0.
> Copyright (C) 2006 [EMAIL PROTECTED], GPL
> Type "rattle()" to shake, rattle, and roll your data.
> Warning message:
> package 'rattle' was built under R version 2.4.1
>> rattle()
> Error in rattle() : could not find function "gladeXMLNew"
> In addition: Warning message:
> there is no package called 'RGtk2' in: library(package, lib.loc = lib.loc,
> character.only = TRUE, logical = TRUE,
>> local({pkg <- select.list(sort(.packages(all.available = TRUE)))
> + if(nchar(pkg)) library(pkg, character.only=TRUE)})
>> update.packages(ask='graphics')
> 
> 
> On 2/28/07, Roberto Perdisci <[EMAIL PROTECTED]> wrote:
>> Hi,
>> out of curiosity, what is the name of the package you found?
>>
>> Roberto
>>
>> On 2/27/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
>>> Dear Group,
>>>
>>> I have found the package.
>>>
>>> Thanks very much
>>>
>>>
>>> JJ
>>> ---
>>>
>>>
>>> On 2/28/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:

 I couldn't locate package rattle?  Need some one's help.


 JJ
 ---



 On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
>> -Original Message-
>> From: [EMAIL PROTECTED] [mailto:
> [EMAIL PROTECTED]
>> On Behalf Of j.joshua thomas
>> Sent: Tuesday, February 27, 2007 5:52 PM
>> To: r-help@stat.math.ethz.ch
>> Subject: Re: [R] Datamining-package-?
>>
>> Hi again,
>> The idea of preprocessing is mainly based on the need to prepare
>> the
> data
>> before they are actually used in pattern extraction.or feed the
>> data
>> into EA's (Genetic Algorithm) There are no standard practice yet
> however,
>> the frequently used on are
>>
>> 1. the extraction of derived attributes that is quantities that
> accompany
>> but not directly related to the data patterns and may prove
>> meaningful
> or
>> increase the understanding of the patterns
>>
>> 2. the removal of some existing attributes that should be of no
> concern to
>> the mining process and its insignificance
>>
>> So i looking for a package that can do this two above mentioned
> points
>> Initially i would like to visualize the data into pattern and
> understand the
>> patterns.
>>
>>
> <<>>
>
> Joshua,
>
> You might take a look at the package rattle on CRAN for initially
> looking at your data and doing some basic data mining.
>
> Hope this is helpful,
>
> Dan
>
> Daniel Nordlund
> Bothell, WA, USA
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


 --
 Lecturer J. Joshua Thomas
 KDU College Penang Campus
 Research Student,
 University Sains Malaysia

>>>
>>>
>>> --
>>> Lecturer J. Joshua Thomas
>>> KDU College Penang Campus
>>> Research Student,
>>> University Sains Malaysia
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
> 
> 
>

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Datamining-package rattle() Errors

2007-02-27 Thread Nordlund, Dan (DSHS/RDA)
> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of j.joshua thomas
> Sent: Tuesday, February 27, 2007 6:54 PM
> To: r-help@stat.math.ethz.ch
> Subject: [R] Datamining-package rattle() Errors
> 
> Dear Group
> 
> I have few errors while installing package rattle from CRAN
> 
> i do the installing from the local zip files...
> 
>  I am using R 2.4.0 do i have to upgrade to R2.4.1 ?
> 
> 
> ~~
> 
> utils:::menuInstallLocal()
> package 'rattle' successfully unpacked and MD5 sums checked
> updating HTML package descriptions
> > help(rattle)
> No documentation for 'rattle' in specified packages and libraries:
> you could try 'help.search("rattle")'
> > library(rattle)
> Rattle, Graphical interface for data mining using R, Version 2.2.0.
> Copyright (C) 2006 [EMAIL PROTECTED], GPL
> Type "rattle()" to shake, rattle, and roll your data.
> Warning message:
> package 'rattle' was built under R version 2.4.1
> > rattle()
> Error in rattle() : could not find function "gladeXMLNew"
> In addition: Warning message:
> there is no package called 'RGtk2' in: library(package, 
> lib.loc = lib.loc,
> character.only = TRUE, logical = TRUE,
> > local({pkg <- select.list(sort(.packages(all.available = TRUE)))
> + if(nchar(pkg)) library(pkg, character.only=TRUE)})
> > update.packages(ask='graphics')
> 
> 
> On 2/28/07, Roberto Perdisci <[EMAIL PROTECTED]> wrote:
> >
> > Hi,
> > out of curiosity, what is the name of the package you found?
> >
> > Roberto
> >
> > On 2/27/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> > > Dear Group,
> > >
> > > I have found the package.
> > >
> > > Thanks very much
> > >
> > >
> > > JJ
> > > ---
> > >
> > >
> > > On 2/28/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> > > >
> > > >
> > > > I couldn't locate package rattle?  Need some one's help.
> > > >
> > > >
> > > > JJ
> > > > ---
> > > >
> > > >
> > > >
> > > > On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > -Original Message-
> > > > > > From: [EMAIL PROTECTED] [mailto:
> > > > > [EMAIL PROTECTED]
> > > > > > On Behalf Of j.joshua thomas
> > > > > > Sent: Tuesday, February 27, 2007 5:52 PM
> > > > > > To: r-help@stat.math.ethz.ch
> > > > > > Subject: Re: [R] Datamining-package-?
> > > > > >
> > > > > > Hi again,
> > > > > > The idea of preprocessing is mainly based on the 
> need to prepare
> > the
> > > > > data
> > > > > > before they are actually used in pattern 
> extraction.or feed the
> > data
> > > > > > into EA's (Genetic Algorithm) There are no standard 
> practice yet
> > > > > however,
> > > > > > the frequently used on are
> > > > > >
> > > > > > 1. the extraction of derived attributes that is 
> quantities that
> > > > > accompany
> > > > > > but not directly related to the data patterns and may prove
> > meaningful
> > > > > or
> > > > > > increase the understanding of the patterns
> > > > > >
> > > > > > 2. the removal of some existing attributes that 
> should be of no
> > > > > concern to
> > > > > > the mining process and its insignificance
> > > > > >
> > > > > > So i looking for a package that can do this two 
> above mentioned
> > > > > points
> > > > > >
> > > > > > Initially i would like to visualize the data into 
> pattern and
> > > > > understand the
> > > > > > patterns.
> > > > > >
> > > > > >
> > > > > <<>>
> > > > >
> > > > > Joshua,
> > > > >
> > > > > You might take a look at the package rattle on CRAN 
> for initially
> > > > > looking at your data and doing some basic data mining.
> > > > >
> > > > > Hope this is helpful,
> > > > >
> > > > > Dan
> > > > >
> > > > > Daniel Nordlund
> > > > > Bothell, WA, USA
> > > > >
> > > > > __
> > > > > R-help@stat.math.ethz.ch mailing list
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide
> > > > > http://www.R-project.org/posting-guide.html
> > > > > and provide commented, minimal, self-contained, 
> reproducible code.
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Lecturer J. Joshua Thomas
> > > > KDU College Penang Campus
> > > > Research Student,
> > > > University Sains Malaysia
> > > >
> > >
> > >
> > >
> > > --
> > > Lecturer J. Joshua Thomas
> > > KDU College Penang Campus
> > > Research Student,
> > > University Sains Malaysia
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> 
> 
> 
> -- 
> Lecturer J. Joshua Thomas
> KDU College Penang Campus
> Research Student,
> University Sains Malaysia
> 
>   [[alternative HTML version deleted]]
> 

[R] matplot on lattice graphics

2007-02-27 Thread Mario A. Morales R.
Can I use matplot on each panel of a lattice graphic? How?
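
[Editorial note: no answer appears in the thread. One common approach — a
sketch, not from the original posts — is a custom panel function, since
matplot() draws with base graphics and cannot target a lattice panel. The
data below are hypothetical.]

```r
library(lattice)

## hypothetical data: x values, a grouping factor, and a matrix of curves
x    <- 1:10
ymat <- cbind(a = x, b = x^1.5, c = x^2)
dat  <- data.frame(x = rep(x, 2), g = rep(c("p1", "p2"), each = 10))

## the formula only sets up the panels and x axis; the panel function
## emulates matplot by drawing one line per column of ymat
xyplot(x ~ x | g, data = dat, type = "n", ylim = range(ymat),
       panel = function(x, y, ...) {
         for (j in seq_len(ncol(ymat)))
           panel.lines(x, ymat[x, j], col = j, lty = j)
       })
```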

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bootcov and cph error

2007-02-27 Thread Frank E Harrell Jr
Williams Scott wrote:
> Hi all,
> I am trying to get bootstrap resampled estimates of covariates in a Cox
> model using cph (Design library).
> 
> Using the following I get the error:
> 
>> ddist2.abr <- datadist(data2.abr)
>> options(datadist='ddist2.abr') 
>> cph1.abr <- cph(Surv(strt3.abr,loc3.abr)~cov.a.abr+cov.b.abr,
> data=data2.abr, x=T, y=T) 
>> boot.cph1 <- bootcov(cph1.abr, B=100, coef.reps=TRUE, pr=T)
> 1 Error in oosl(f, matxv(X, cof), Y) : not implemented for cph models
> 
> Removing coef.reps argument works fine, but I really need the
> coefficients if at all possible. I cant find anything in the help files
> suggesting that I cant use coef.reps in a cph model. Any help
> appreciated.
> 
> Cheers
> 
> Scott

Sorry it's taken so long to get to this.  The documentation needs to be 
clarified.  Add loglik=FALSE to allow coef.reps=TRUE to work for cph models.

Frank
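
[Editorial note: applied to the call from the original post, the suggested
fix would look like this — a sketch, with object and variable names taken
from the thread.]

```r
library(Design)  # cph() and bootcov() live in the Design package

cph1.abr <- cph(Surv(strt3.abr, loc3.abr) ~ cov.a.abr + cov.b.abr,
                data = data2.abr, x = TRUE, y = TRUE)

## loglik=FALSE allows coef.reps=TRUE to work for cph fits
boot.cph1 <- bootcov(cph1.abr, B = 100, coef.reps = TRUE,
                     loglik = FALSE, pr = TRUE)

## the bootstrap coefficient replicates are then stored in the fit
## (as the boot.Coef component, per the Design documentation)
boot.cph1$boot.Coef
```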

> 
> _
> 
>  
> 
> Dr. Scott Williams MD
> 
> Peter MacCallum Cancer Centre
> 
> Melbourne, Australia
> 
> [EMAIL PROTECTED]


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cluster analysis under contiguity constraints with R ?

2007-02-27 Thread Friedrich Leisch
> On Fri, 16 Feb 2007 13:10:44 +0100,
> Bellanger Lise (BL) wrote:

  > Hello,
  > I would like to know if there is a function in an R library that 
  > allows to do cluster analysis under contiguity constraints ?
 
I don't know what exactly you mean by "contiguity constraints", but
package flexclust can cluster with group constraints, see

http://www.ci.tuwien.ac.at/papers/Leisch+Gruen-2006.pdf

Hope this helps,
Fritz Leisch

-- 
---
Prof. Dr. Friedrich Leisch 

Institut für Statistik  Tel: (+49 89) 2180 3165
Ludwig-Maximilians-Universität  Fax: (+49 89) 2180 5308
Ludwigstraße 33
D-80539 München http://www.stat.uni-muenchen.de/~leisch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm ANOVA vs. AOV

2007-02-27 Thread Richard M. Heiberger
The underlying least squares arithmetic of aov and lm is identical.
In R, the QR decomposition is used.  The difference between the two lies in
the intent of the analysis and the default presentation of the results.

With lm [Linear Model], the focus is on the effect of the individual
columns of the predictor matrix.  The columns are usually interpreted
as values of real-valued observations.  The regression coefficients
are usually meaningful and interesting.

With aov [Analysis Of Variance], the focus is on the effects of
factors.  These are multi-degree of freedom effects associated with
categorical variables.  The arithmetic is based on a set of dummy
variables constructed from a contrast matrix.  The individual
regression coefficients themselves are not easily interpretable.

You can pursue the details of this summary in any good statistical
methods book.

Rich
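
[Editorial note: the point can be seen by fitting the same model both ways;
a minimal sketch with simulated data.]

```r
set.seed(1)
g <- factor(rep(c("A", "B", "C"), each = 10))
y <- rnorm(30, mean = c(0, 1, 2)[g])  # group means 0, 1, 2

fit.lm  <- lm(y ~ g)   # focus: coefficients of the dummy columns
fit.aov <- aov(y ~ g)  # same arithmetic, focus: the factor effect

coef(fit.lm)      # per-level contrasts against the baseline level "A"
summary(fit.aov)  # one 2-df F test for the factor as a whole
anova(fit.lm)     # the identical ANOVA table, from the lm fit
```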

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] factor documentation issue

2007-02-27 Thread Peter Dalgaard
Geoff Russell wrote:
> There is a warning in the documentation for ?factor  (R version 2.3.0)
> as follows:
>
> " The interpretation of a factor depends on both the codes and the
>   '"levels"' attribute.  Be careful only to compare factors with the
>   same set of levels (in the same order).  In particular,
>   'as.numeric' applied to a factor is meaningless, and may happen by
>   implicit coercion.  To "revert" a factor 'f' to its original
>   numeric values, 'as.numeric(levels(f))[f]' is recommended and
>   slightly more efficient than 'as.numeric(as.character(f))'.
>
>
> But as.numeric seems to work fine whereas as.numeric(levels(f))[f] doesn't
> always do anything useful.
>
> For example:
>
>   
>> f<-factor(1:3,labels=c("A","B","C"))
>> f
>> 
> [1] A B C
> Levels: A B C
>   
>> as.numeric(f)
>> 
> [1] 1 2 3
>   
>> as.numeric(levels(f))[f]
>> 
> [1] NA NA NA
> Warning message:
> NAs introduced by coercion
>
> And also,
>
>   
>> f<-factor(1:3,labels=c(1,5,6))
>> f
>> 
> [1] 1 5 6
> Levels: 1 5 6
>   
>> as.numeric(f)
>> 
> [1] 1 2 3
>   
>> as.numeric(levels(f))[f]
>> 
> [1] 1 5 6
>
> Is the documentation wrong, or is the code wrong, or have I missed
> something?
>   

The documentation is somewhat unclear: The last sentence presupposes 
that the factor was generated from numeric data, i.e. the 
factor(c(7,9,13)) syndrome:

 > f <- factor (c(7,9,13))
 > f
[1] 7  9  13
Levels: 7 9 13
 > as.numeric(f)
[1] 1 2 3

Also, the statement that as.numeric(f) is meaningless is a bit strong. 
Probably should say "meaningless without knowledge of the levels and 
their order". And you can actually compare factors with their levels in 
different order:

 > g <- factor (c("7",9,13))
 > g
[1] 7  9  13
Levels: 13 7 9
 > f==g
[1] TRUE TRUE TRUE
 > as.numeric(f)==as.numeric(g)
[1] FALSE FALSE FALSE

Where you need to be careful is that if you do things like
   sexsymbols <- c(16, 19)
   plot(x, y, pch=sexsymbols[sex]),
then you should also do
   legend(x0, y0, legend=levels(sex), pch=sexsymbols)
in order to be sure the symbols match the legend. (Notice that indexing 
with  [sex] implicitly coerces sex to numeric).

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Datamining-package rattle() Errors

2007-02-27 Thread j.joshua thomas
I browsed through the web link http://rattle.togaware.com/ and followed the
instructions.

Installation of RGtk2 completed without errors; however, I have a problem
with rattle.

So I downloaded rattle 2.2.4.zip and unpacked it. Earlier I had tried
rattle 2.2.0.zip from CRAN; it showed some warning messages, please see the
following:


~
> install.packages("rattle", repos="http://rattle.togaware.com")
Warning: unable to access index for repository
http://rattle.togaware.com/bin/windows/contrib/2.4
Warning in download.packages(pkgs, destdir = tmpd, available = available,  :

 no package 'rattle' at the repositories
>
> library(rattle)
Rattle, Graphical interface for data mining using R, Version 2.2.0.
Copyright (C) 2006 [EMAIL PROTECTED], GPL
Type "rattle()" to shake, rattle, and roll your data.
Warning message:
package 'rattle' was built under R version 2.4.1
> utils:::menuInstallLocal()
Warning: package 'rattle' is in use and will not be installed
updating HTML package descriptions
>
JJ
---
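Reading the transcripts above, the rattle() error ("could not find function
gladeXMLNew ... there is no package called 'RGtk2'") indicates RGtk2 was
missing at that point. A hedged sketch of one way to proceed, based only on
what the messages show (the local zip path is a hypothetical example):

```r
## Install the missing dependency from CRAN first
install.packages("RGtk2")

## Then install rattle from the downloaded zip; repos = NULL tells
## install.packages to treat the argument as a local file path
install.packages("C:/downloads/rattle_2.2.4.zip", repos = NULL)

library(rattle)
rattle()
```

The "built under R version 2.4.1" warning is usually harmless under R 2.4.0, but upgrading R would silence it.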




On 2/28/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
>
> Dan,
> I browse through the web link http://rattle.togaware.com/ and followed the
> instructions
>
> Installation on RGtk2 has been done without errors. however i have problem
> with rattle.
>
> So i download rattle 2.2.4.zip and unpack it. earlier i have tried from
> CRAN rattle 2.2.0.zip it showed some warning messages please see the
> following
>
>
>
> ~
> > install.packages("rattle", repos="http://rattle.togaware.com")
> Warning: unable to access index for repository
> http://rattle.togaware.com/bin/windows/contrib/2.4
> Warning in download.packages(pkgs, destdir = tmpd, available = available,
> :
>  no package 'rattle' at the repositories
> >
> > library(rattle)
> Rattle, Graphical interface for data mining using R, Version 2.2.0.
> Copyright (C) 2006 [EMAIL PROTECTED], GPL
> Type "rattle()" to shake, rattle, and roll your data.
> Warning message:
> package 'rattle' was built under R version 2.4.1
> > utils:::menuInstallLocal()
> Warning: package 'rattle' is in use and will not be installed
> updating HTML package descriptions
> >
>
>
>
>  On 2/28/07, Nordlund, Dan (DSHS/RDA) <[EMAIL PROTECTED]> wrote:
> >
> > > -Original Message-
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of j.joshua thomas
> > > Sent: Tuesday, February 27, 2007 6:54 PM
> > > To: r-help@stat.math.ethz.ch
> > > Subject: [R] Datamining-package rattle() Errors
> > >
> > > Dear Group
> > >
> > > I have few errors while installing package rattle from CRAN
> > >
> > > i do the installing from the local zip files...
> > >
> > >  I am using R 2.4.0 do i have to upgrade to R2.4.1 ?
> > >
> > >
> > > ~~
> > >
> > > utils:::menuInstallLocal()
> > > package 'rattle' successfully unpacked and MD5 sums checked
> > > updating HTML package descriptions
> > > > help(rattle)
> > > No documentation for 'rattle' in specified packages and libraries:
> > > you could try 'help.search("rattle")'
> > > > library(rattle)
> > > Rattle, Graphical interface for data mining using R, Version 2.2.0.
> > > Copyright (C) 2006 [EMAIL PROTECTED], GPL
> > > Type "rattle()" to shake, rattle, and roll your data.
> > > Warning message:
> > > package 'rattle' was built under R version 2.4.1
> > > > rattle()
> > > Error in rattle() : could not find function "gladeXMLNew"
> > > In addition: Warning message:
> > > there is no package called 'RGtk2' in: library(package,
> > > lib.loc = lib.loc,
> > > character.only = TRUE, logical = TRUE,
> > > > local({pkg <- select.list(sort(.packages(all.available = TRUE)))
> > > + if(nchar(pkg)) library(pkg, character.only=TRUE)})
> > > > update.packages(ask='graphics')
> > >
> > >
> > > On 2/28/07, Roberto Perdisci <[EMAIL PROTECTED] > wrote:
> > > >
> > > > Hi,
> > > > out of curiosity, what is the name of the package you found?
> > > >
> > > > Roberto
> > > >
> > > > On 2/27/07, j.joshua thomas < [EMAIL PROTECTED]> wrote:
> > > > > Dear Group,
> > > > >
> > > > > I have found the package.
> > > > >
> > > > > Thanks very much
> > > > >
> > > > >
> > > > > JJ
> > > > > ---
> > > > >
> > > > >
> > > > > On 2/28/07, j.joshua thomas <[EMAIL PROTECTED]> wrote:
> > > > > >
> > > > > >
> > > > > > I couldn't locate package rattle?  Need some one's help.
> > > > > >
> > > > > >
> > > > > > JJ
> > > > > > ---
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 2/28/07, Daniel Nordlund <[EMAIL PROTECTED]> wrote:
> > > > > > >
> > > > > > > > -Original Message-
> > > > > > > > From: [EMAIL PROTECTED] [mailto:
> > > > > > > [EMAIL PROTECTED] ]
> > > > > > > > On Behalf Of j.joshua thomas
> > > > > > > > Sent: Tuesday, February 27, 2007 5:52 PM
> > > > > > > > To: r-help@stat.math.ethz.ch
> > > > > > > > Subject: Re: [R] Datamin
