Re: [R] Adding predicted values as a new variable in a data frame

2006-09-14 Thread Gabor Grothendieck
Specify na.action = na.exclude, e.g.

 x <- y <- 1:10; x[5] <- NA
 fitted(lm(y ~ x, na.action = na.exclude))
 1  2  3  4  5  6  7  8  9 10
 1  2  3  4 NA  6  7  8  9 10
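A fuller sketch of the same idea (the data frame `d` and column `y.hat` are made-up names for illustration): because `na.exclude` pads `fitted()` with `NA` at the dropped rows, the result has the same length as the data and can be assigned straight back as a new column.

```r
## Hypothetical data frame; the names d and y.hat are illustrative only.
d <- data.frame(x = 1:10, y = 1:10)
d$x[5] <- NA                                   # one missing predictor value

fit <- lm(y ~ x, data = d, na.action = na.exclude)
d$y.hat <- fitted(fit)                         # length 10, NA in row 5

nrow(d)                                        # 10
sum(is.na(d$y.hat))                            # 1
```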

On 9/14/06, Robi Ragan [EMAIL PROTECTED] wrote:
 I am running a regression:

 ols.reg1 <- lm(y ~ x1 + x2 + x3 + x4)

 on a data.frame

 and then generating fitted values:

 y.hat <- ols.reg1$fitted.values

 Then I would like to add these fitted values to the data.frame as a
 new variable. The problem is that when the values are predicted the
 resulting output has too few rows. For some reason certain
 observations do not get predicted values, so this shrinks the column
 down, and I then cannot combine the output with the original
 data.frame.

 If someone could please help I would appreciate it. Stata automatically
 adds a new column to the data set when you find the fitted values. So
 having to fight with R just to do something I used to do routinely has
 made me think of turning back to the dark side. I hope I have just
 missed something trivial in all the help files I have been
 looking through.

 Thanks,


 --

 Robi Ragan
 Graduate Student
 Department of Economics
 Department of Political Science
 The University of Georgia

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




Re: [R] Rv generation

2006-09-14 Thread nmi13
Hi,
 Can someone tell me how to generate random variates from the CDF below
using the inverse-transform technique?
Thanks for your help and time.

My CDF is as follows
\[
F(x) = 0 \quad \text{if } x < 0
\]
\[
F(x) = \left\{ \frac{x - x_i}{x_{i+1} - x_i} \, (p_{i+1} - p_i) \right\} + p_i
\quad \forall \ x_i \leq x < x_{i+1}
\]
\[
F(x) = 1 \quad \text{if } x > x_{i+1}
\]
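The piecewise-linear form above inverts in closed form: given u ~ U(0,1), find the interval [p_i, p_{i+1}] containing u and interpolate linearly back to x. A minimal R sketch (the function name `rpwlin` and the breakpoint values are made up for illustration):

```r
## Inverse-transform sampling from a piecewise-linear CDF.
## xs = breakpoints x_i, ps = CDF values p_i = F(x_i); both illustrative.
rpwlin <- function(n, xs, ps) {
  u <- runif(n)
  i <- findInterval(u, ps, all.inside = TRUE)  # which [p_i, p_{i+1}] holds u
  xs[i] + (u - ps[i]) / (ps[i + 1] - ps[i]) * (xs[i + 1] - xs[i])
}

xs <- c(0, 1, 3, 5)       # example breakpoints
ps <- c(0, 0.2, 0.7, 1)   # example CDF values
set.seed(1)
samp <- rpwlin(1e4, xs, ps)
range(samp)               # all values fall within [0, 5]
```

`approx(ps, xs, runif(n))$y` performs the same linear interpolation in one call.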
Regards
Murthy



Re: [R] Adding predicted values as a new variable in a data frame

2006-09-14 Thread Prof Brian D Ripley
?na.exclude should help you: my guess is that you asked (by using the 
default, na.action = na.omit) for rows with missing values to be excluded 
from the residuals. But since you have not mentioned missing values, we 
have to guess what 'for some reason' was: please note the footer of this 
message.

On Thu, 14 Sep 2006, Robi Ragan wrote:

 I am running a regression:

 ols.reg1 <- lm(y ~ x1 + x2 + x3 + x4)

 on a data.frame

Hmm, no data frame is mentioned: you want a data= argument.

 and then generating fitted values:

 y.hat <- ols.reg1$fitted.values

Please use the accessor functions rather than diving into the internal 
details, e.g.

y.hat <- fitted(ols.reg1)

BTW: where did you get the use of ols.reg1$fitted.values from?

 Then I would like to add these fitted values to the data.frame as a
 new variable. The problem is that when the values are predicted the
 resulting output has too few rows. For some reason certain
 observations do not get predicted values, so this shrinks the column
 down, and I then cannot combine the output with the original
 data.frame.

fit <- lm(formula, data=data_frame, na.action=na.exclude)
data_frame$fitted <- fitted(fit)

 If someone could please help I would appreciate it. Stata automatically
 adds a new column to the data set when you find the fitted values. So
 having to fight with R just to do something I used to do routinely has
 made me think of turning back to the dark side. I hope I have just
 missed something trivial in all the help files I have been
 looking through.

The above looks trivial to me.  It was not in R or S when lm was first 
introduced (1991 White Book), but was added last century (thanks to the 
ideas and persistent advocacy of Terry Therneau).

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] an error message with 't.test' with R under Unix

2006-09-14 Thread Peter Dalgaard
Tao Shi [EMAIL PROTECTED] writes:

 Again, I don't see the error in R under Windows (R 2.3.0) or Unix (R 2.3.1).
 Is this the bug you were talking about?


CHANGES IN R VERSION 2.3.0

o   Some of the classical tests put unnecessary restrictions on the
LHS in the formula interface (e.g., t.test(x+y ~ g) was not
allowed).

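A quick check of the NEWS entry (toy data; `x`, `y`, and `g` are made-up names): in R >= 2.3.0 an arithmetic expression is accepted on the LHS of the formula interface.

```r
## Illustrative data for the formula-LHS check.
set.seed(42)
x <- rnorm(20)
y <- rnorm(20)
g <- gl(2, 10)       # two-level grouping factor

t.test(x + y ~ g)    # accepted in R >= 2.3.0; older versions rejected the LHS
```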


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Conservative ANOVA tables in lmer

2006-09-14 Thread Dimitris Rizopoulos

- Original Message - 
From: Manuel Morales [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: Douglas Bates [EMAIL PROTECTED]; Manuel Morales 
[EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Sent: Wednesday, September 13, 2006 1:04 PM
Subject: Re: [R] Conservative ANOVA tables in lmer


 On Wed, 2006-09-13 at 08:04 +1000, Andrew Robinson wrote:
 On Tue, September 12, 2006 7:34 am, Manuel Morales wrote:
  On Mon, 2006-09-11 at 11:43 -0500, Douglas Bates wrote:
  Having made that offer I think I will now withdraw it.  Peter's
  example has convinced me that this is the wrong thing to do.
 
  I am encouraged by the fact that the results from mcmcsamp 
  correspond
  closely to the correct theoretical results in the case that 
  Peter
  described.  I appreciate that some users will find it difficult 
  to
  work with a MCMC sample (or to convince editors to accept 
  results
  based on such a sample) but I think that these results indicate 
  that
  it is better to go after the marginal distribution of the fixed
  effects estimates (which is what is being approximated by the 
  MCMC
  sample - up to Bayesian/frequentist philosophical differences) 
  than to
  use the conditional distribution and somehow try to adjust the
  reference distribution.
 
  Am I right that the MCMC sample can not be used, however, to 
  evaluate
  the significance of parameter groups. For example, to assess the
  significance of a three-level factor? Are there better 
  alternatives than
  simply adjusting the CI for the number of factor levels
  (1-alpha/levels).

 I wonder whether the likelihood ratio test would be suitable here? 
 That
 seems to be supported.  It just takes a little longer.

  require(lme4)
  data(sleepstudy)
  fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
  fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)
  anova(fm1, fm2)

 So, a brief overview of the popular inferential needs and solutions 
 would
 then be:

 1) Test the statistical significance of one or more fixed or random
 effects - fit a model with and a model without the terms, and use 
 the LRT.

 I believe that the LRT is anti-conservative for fixed effects, as
 described in Pinheiro and Bates's companion book to NLME.


You see this effect if you're using REML; for ML I don't think there 
is any problem using the LRT between nested models with different 
fixed-effects structures.
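A sketch of the ML comparison described here, adapted from the sleepstudy example earlier in the thread. One hedge on the interface: current lme4 takes `REML = FALSE`, whereas the lme4 of 2006 used `method = "ML"` instead.

```r
library(lme4)
data(sleepstudy)

## Fit both nested models by ML, then compare with a likelihood-ratio test.
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days | Subject), sleepstudy,
            REML = FALSE)
anova(fm1, fm2)   # chi-square LRT on the ML log-likelihoods
```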

Best,
Dimitris


 2) Obtain confidence intervals for one or more fixed or random 
 effects -
 use mcmcsamp

 Did I miss anything important? - What else would people like to do?

 Cheers

 Andrew

 Andrew Robinson
 Senior Lecturer in Statistics               Tel: +61-3-8344-9763
 Department of Mathematics and Statistics    Fax: +61-3-8344-4599
 University of Melbourne, VIC 3010 Australia
 Email: [EMAIL PROTECTED]    Website: http://www.ms.unimelb.edu.au


 





Re: [R] Conservative ANOVA tables in lmer

2006-09-14 Thread Douglas Bates
On 9/13/06, Dimitris Rizopoulos [EMAIL PROTECTED] wrote:

 - Original Message -
 From: Manuel Morales [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: Douglas Bates [EMAIL PROTECTED]; Manuel Morales
 [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
 Sent: Wednesday, September 13, 2006 1:04 PM
 Subject: Re: [R] Conservative ANOVA tables in lmer


  On Wed, 2006-09-13 at 08:04 +1000, Andrew Robinson wrote:
  On Tue, September 12, 2006 7:34 am, Manuel Morales wrote:
   On Mon, 2006-09-11 at 11:43 -0500, Douglas Bates wrote:
   Having made that offer I think I will now withdraw it.  Peter's
   example has convinced me that this is the wrong thing to do.
  
   I am encouraged by the fact that the results from mcmcsamp
   correspond
   closely to the correct theoretical results in the case that
   Peter
   described.  I appreciate that some users will find it difficult
   to
   work with a MCMC sample (or to convince editors to accept
   results
   based on such a sample) but I think that these results indicate
   that
   it is better to go after the marginal distribution of the fixed
   effects estimates (which is what is being approximated by the
   MCMC
   sample - up to Bayesian/frequentist philosophical differences)
   than to
   use the conditional distribution and somehow try to adjust the
   reference distribution.
  
   Am I right that the MCMC sample can not be used, however, to
   evaluate
   the significance of parameter groups. For example, to assess the
   significance of a three-level factor? Are there better
   alternatives than
   simply adjusting the CI for the number of factor levels
   (1-alpha/levels).
 
  I wonder whether the likelihood ratio test would be suitable here?
  That
  seems to be supported.  It just takes a little longer.
 
   require(lme4)
   data(sleepstudy)
   fm1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
   fm2 <- lmer(Reaction ~ Days + I(Days^2) + (Days|Subject), sleepstudy)
   anova(fm1, fm2)
 
  So, a brief overview of the popular inferential needs and solutions
  would
  then be:
 
  1) Test the statistical significance of one or more fixed or random
  effects - fit a model with and a model without the terms, and use
  the LRT.
 
  I believe that the LRT is anti-conservative for fixed effects, as
  described in Pinheiro and Bates companion book to NLME.
 

 You see this effect if you're using REML; for ML I don't think there
 is any problem using the LRT between nested models with different
 fixed-effects structures.

There are two issues here: how the test statistic is evaluated and
what reference distribution is used for the test statistic (i.e. how
do you convert a value of the test statistic to a p-value).  Manuel is
addressing the latter issue.  If you compare the difference of
-2*logLik for the models to a chi-square with degrees of freedom
determined by the difference in the number of parameters the test will
be anti-conservative when the number of observations is small.  The
use of the chi-square as a reference distribution is based on
asymptotic properties.

The other question is how does one evaluate the likelihood-ratio test
statistic and that is the issue that Dimitris is addressing.  The REML
criterion is a modified likelihood and it is inappropriate to look at
differences in the REML criterion when the models being compared have
different fixed-effects specifications, or even a different
parameterization of the fixed effects.  However, the anova method for
an lmer object does not use the REML criterion even when the model has
been estimated by REML.  It uses the profiled log-likelihood evaluated
at the REML estimates of the relative variances of the random effects.
 That's a complicated statement so let me break it down.

The optimization in lmer is done with respect to the elements of the
variance-covariance matrix of the random effects relative to
$\sigma^2$.  Given these values the conditional estimates of the
fixed-effects parameters and of $\sigma^2$ can be evaluated directly
with some linear algebra.  In the summary or show output of an lmer
model there are two quantities called the MLdeviance and the
REMLdeviance.  Those are based on the same relative variances but
different conditional estimates of $\sigma^2$ (and hence different
estimates of the elements of the variance-covariance of the random
effects).  It turns out that there is very little difference in the
value of the profiled log-likelihood at the ML estimates and at the
REML estimates.  This is not to say that the log-likelihood is similar
at the two (complete) sets of estimates - it is the profiled
log-likelihoods that are similar and these are what are used to create
the likelihood ratio test statistic, even when the models have been
fit by REML.
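One way to see this empirical result for yourself (a rough check, not Bates's own code): compare the log-likelihood of an ML fit with the log-likelihood evaluated at the REML estimates of the same model, using current lme4.

```r
library(lme4)
data(sleepstudy)

fml <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = FALSE)
fre <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = TRUE)

logLik(fml)                # ML log-likelihood
logLik(fre, REML = FALSE)  # log-likelihood evaluated at the REML estimates
## The two values are typically very close.
```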

As I said, this is complicated and until I went to reply to this
message I hadn't really sorted it out for myself.  I knew the
empirical result, which I had checked before releasing the code, but I
hadn't worked out the details of why 

[R] ANOVA in R

2006-09-14 Thread Russell Compton
Despite having used R on a daily basis for the past two years, I’m
encountering some difficulty performing an ANOVA on my data. What I’m trying
to do is the following:
 
Given data such as:
 
Day 1    Day 4    Day 8
2  7  2
3  2  8  
3  4  7
6  6  8
1  3  4
 
I want to use ANOVA to determine if there is a significant change over the
three days. In other stats packages I have used, I can just select this data
and run the ANOVA function and get the F and p values. However in R, the
anova function seems to only work with a fitted model, e.g. linear
regression. This function seems to assume there is a relationship such as
day1 ~ day4 + day8, but in my case there isn’t – I just want to perform an
ANOVA without regression. If anyone could point me in the right direction
I’d greatly appreciate it.
 
Thanks





Re: [R] ANOVA in R

2006-09-14 Thread Andrew Robinson


Try

test <- data.frame(day.1=c(2,3,3,6,1), 
                   day.4=c(7,2,4,6,3),
                   day.8=c(2,8,7,8,4))

test

test.long <- reshape(test, direction="long", 
                     varying=c("day.1","day.4","day.8"),
                     v.names="response",
                     timevar="day",
                     times=names(test))

test.long$day <- factor(test.long$day)

test.long

aov(response ~ day, data=test.long)
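A follow-up not in the original reply: wrapping the `aov()` call in `summary()` prints the ANOVA table with the F and p values the poster asked for. A self-contained version of the long-format data:

```r
## Long-format version of the five-subject, three-day example data.
test.long <- data.frame(
  response = c(2, 3, 3, 6, 1,  7, 2, 4, 6, 3,  2, 8, 7, 8, 4),
  day      = factor(rep(c("day.1", "day.4", "day.8"), each = 5))
)

summary(aov(response ~ day, data = test.long))  # one-way ANOVA table
```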


I hope that this helps,

Andrew


On Thu, Sep 14, 2006 at 09:23:13AM +0100, Russell Compton wrote:
 Despite having used R on a daily basis for the past two years, I'm
 encountering some difficulty performing an ANOVA on my data. What I'm trying
 to do is the following:
 
  
 
 Given data such as:
 
  
 
 Day 1Day 4Day 8
 
 2  7  2
 
 3  2  8  
 
 3  4  7
 
 6  6  8
 
 1  3  4
 
  
 
 I want to use ANOVA to determine if there is a significant change over the
 three days. In other stats packages I have used, I can just select this data
 and run the ANOVA function and get the F and p values. However in R, the
 anova function seems to only work with a fitted model, eg. Linear
 regression. This function seems to assume there is a relationship such as
 day1~ day 4 + day 8, but in my case there isn't - I just want to perform an
 ANOVA without regression. If anyone could point me in the right direction
 I'd greatly appreciate it,
 
  
 
 Thanks
 
 
   [[alternative HTML version deleted]]
 

-- 
Andrew Robinson
Department of Mathematics and Statistics    Tel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
Email: [EMAIL PROTECTED]    http://www.ms.unimelb.edu.au



[R] R input output error

2006-09-14 Thread Johannes Jenkner
Dear R users!

I have some problems with some cron jobs containing R programs in batch 
mode. The CPU load is always quite high, as I plot some weather charts 
that require extensive interpolation procedures. The crucial point is 
that I frequently get R input/output errors when my Linux PC operates 
at full capacity. I am curious whether any of you have similar problems. 
Is there a way to prevent those runtime errors?

Thanks in advance!

Johannes

-- 
Johannes Jenkner
Institute for Atmospheric and Climate Science
ETH Zuerich
Universitätsstrasse 16
ETH Zentrum, CHN M18
CH-8092 Zuerich
phone: +41/44/6332773
www: http://www.iac.ethz.ch



Re: [R] ANOVA in R

2006-09-14 Thread Peter Dalgaard
Andrew Robinson [EMAIL PROTECTED] writes:

 Try
 
 test <- data.frame(day.1=c(2,3,3,6,1), 
                    day.4=c(7,2,4,6,3),
                    day.8=c(2,8,7,8,4))
 
 test
 
 test.long <- reshape(test, direction="long", 
                      varying=c("day.1","day.4","day.8"),
                      v.names="response",
                      timevar="day",
                      times=names(test))
 
 test.long$day <- factor(test.long$day)
 
 test.long
 
 aov(response ~ day, data=test.long)


Was a one-way ANOVA intended? He never said.

On a more elementary level,

y <- with(test, c(day.1, day.4, day.8))
day <- factor(rep(c(1,4,8), each=5)) # or gl(3,5,labels=c(1,4,8))
sub <- factor(rep(1:5, 3))           # or gl(5,1,15)

print(data.frame(y,day,sub)) # just to show the point

anova(lm(y~day))# 1-way
anova(lm(y~day+sub))# 2-way

# This could be better for unbalanced designs:

drop1(lm(y~day+sub), test="F")

 

 
 
 I hope that this helps,
 
 Andrew
 
 
 On Thu, Sep 14, 2006 at 09:23:13AM +0100, Russell Compton wrote:
  Despite having used R on a daily basis for the past two years, I'm
  encountering some difficulty performing an ANOVA on my data. What I'm trying
  to do is the following:
  
   
  
  Given data such as:
  
   
  
  Day 1Day 4Day 8
  
  2  7  2
  
  3  2  8  
  
  3  4  7
  
  6  6  8
  
  1  3  4
  
   
  
  I want to use ANOVA to determine if there is a significant change over the
  three days. In other stats packages I have used, I can just select this data
  and run the ANOVA function and get the F and p values. However in R, the
  anova function seems to only work with a fitted model, eg. Linear
  regression. This function seems to assume there is a relationship such as
  day1~ day 4 + day 8, but in my case there isn't - I just want to perform an
  ANOVA without regression. If anyone could point me in the right direction
  I'd greatly appreciate it,
  
   
  
  Thanks
  
  
  [[alternative HTML version deleted]]
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 -- 
 Andrew Robinson  
 Department of Mathematics and StatisticsTel: +61-3-8344-9763
 University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
 Email: [EMAIL PROTECTED] http://www.ms.unimelb.edu.au
 
 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] ANOVA in R

2006-09-14 Thread Russell Compton
Andrew, Peter,

Thanks both for the help, that's exactly what I was after. 

It is for a one-way ANOVA, looking at identifying differentially expressed
genes across time in a microarray dataset. Also, one of the datasets I'm
working with is unbalanced, so that additional code will be most useful.

Thanks again,

Russell Compton

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter
Dalgaard
Sent: 14 September 2006 10:26
To: Andrew Robinson
Cc: Russell Compton; r-help@stat.math.ethz.ch
Subject: Re: [R] ANOVA in R

Andrew Robinson [EMAIL PROTECTED] writes:

 Try
 
 test <- data.frame(day.1=c(2,3,3,6,1), 
                    day.4=c(7,2,4,6,3),
                    day.8=c(2,8,7,8,4))
 
 test
 
 test.long <- reshape(test, direction="long", 
                      varying=c("day.1","day.4","day.8"),
                      v.names="response",
                      timevar="day",
                      times=names(test))
 
 test.long$day <- factor(test.long$day)
 
 test.long
 
 aov(response ~ day, data=test.long)


Was a one-way ANOVA intended? He never said.

On a more elementary level,

y <- with(test, c(day.1, day.4, day.8))
day <- factor(rep(c(1,4,8), each=5)) # or gl(3,5,labels=c(1,4,8))
sub <- factor(rep(1:5, 3))           # or gl(5,1,15)

print(data.frame(y,day,sub)) # just to show the point

anova(lm(y~day))# 1-way
anova(lm(y~day+sub))# 2-way

# This could be better for unbalanced designs:

drop1(lm(y~day+sub), test="F")

 

 
 
 I hope that this helps,
 
 Andrew
 
 
 On Thu, Sep 14, 2006 at 09:23:13AM +0100, Russell Compton wrote:
  Despite having used R on a daily basis for the past two years, I'm
  encountering some difficulty performing an ANOVA on my data. What I'm
trying
  to do is the following:
  
   
  
  Given data such as:
  
   
  
  Day 1Day 4Day 8
  
  2  7  2
  
  3  2  8  
  
  3  4  7
  
  6  6  8
  
  1  3  4
  
   
  
  I want to use ANOVA to determine if there is a significant change over
the
  three days. In other stats packages I have used, I can just select this
data
  and run the ANOVA function and get the F and p values. However in R, the
  anova function seems to only work with a fitted model, eg. Linear
  regression. This function seems to assume there is a relationship such
as
  day1~ day 4 + day 8, but in my case there isn't - I just want to perform
an
  ANOVA without regression. If anyone could point me in the right
direction
  I'd greatly appreciate it,
  
   
  
  Thanks
  
  
  [[alternative HTML version deleted]]
  
 
 -- 
 Andrew Robinson  
 Department of Mathematics and StatisticsTel: +61-3-8344-9763
 University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
 Email: [EMAIL PROTECTED] http://www.ms.unimelb.edu.au
 
 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] Greedy triangulation

2006-09-14 Thread Dan Bebber
Hello,

does anyone have code that will generate a greedy triangulation (a 
triangulation that uses the shortest non-overlapping edges) for a set of 
points in Euclidean space?
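No standard package comes to mind, but the algorithm as described is short enough to sketch directly in R (an illustration under stated assumptions; all function names here are made up): sort all candidate edges by length and keep an edge only if it does not properly cross any edge already kept.

```r
## Signed area: > 0 if point b lies left of the directed line o -> a.
cross <- function(o, a, b)
  (a[1] - o[1]) * (b[2] - o[2]) - (a[2] - o[2]) * (b[1] - o[1])

## TRUE if segments p1-p2 and p3-p4 properly cross (shared endpoints allowed).
crosses <- function(p1, p2, p3, p4) {
  cross(p3, p4, p1) * cross(p3, p4, p2) < 0 &&
  cross(p1, p2, p3) * cross(p1, p2, p4) < 0
}

greedy_triangulation <- function(pts) {
  cand <- t(combn(nrow(pts), 2))                   # all point pairs
  len  <- sqrt(rowSums((pts[cand[, 1], ] - pts[cand[, 2], ])^2))
  cand <- cand[order(len), , drop = FALSE]         # shortest edges first
  kept <- matrix(integer(0), ncol = 2)
  for (k in seq_len(nrow(cand))) {
    e  <- cand[k, ]
    ok <- TRUE
    for (j in seq_len(nrow(kept)))
      if (crosses(pts[e[1], ], pts[e[2], ],
                  pts[kept[j, 1], ], pts[kept[j, 2], ])) { ok <- FALSE; break }
    if (ok) kept <- rbind(kept, e)
  }
  unname(kept)   # index pairs of the accepted triangulation edges
}

set.seed(1)
pts   <- cbind(runif(8), runif(8))
edges <- greedy_triangulation(pts)
```

The O(n^2) candidate list and pairwise crossing checks are fine for small point sets; degenerate (collinear) configurations would need extra care.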

Thanks,
Dan Bebber
___
Dr. Daniel P. Bebber
Department of Plant Sciences
University of Oxford
South Parks Road
Oxford OX1 3RB
UK
Tel. 01865 275060



Re: [R] Conservative

2006-09-14 Thread Gregor Gorjanc
Douglas Bates bates at stat.wisc.edu writes:

 On 9/13/06, Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.be
   I believe that the LRT is anti-conservative for fixed effects, as
   described in Pinheiro and Bates companion book to NLME.
  
  You have this effect if you're using REML, for ML I don't think there
  is any problem to use LRT between nested models with different
  fixed-effects structure.
...
 The other question is how does one evaluate the likelihood-ratio test
 statistic and that is the issue that Dimitris is addressing.  The REML
 criterion is a modified likelihood and it is inappropriate to look at
 differences in the REML criterion when the models being compared have
 different fixed-effects specifications, or even a different
 parameterization of the fixed effects.  However, the anova method for
 an lmer object does not use the REML criterion even when the model has
 been estimated by REML.  It uses the profiled log-likelihood evaluated
 at the REML estimates of the relative variances of the random effects.
  That's a complicated statement so let me break it down.
...

Is this then the same answer as given by Robinson:1991 (ref at the end) to
the question by Robin Thompson on which likelihood (ML or REML) should be used
in testing the fixed effects? Robinson answered (page 49, near bottom
right) that both likelihoods give the same conclusion about fixed effects.
Can anyone comment on this issue?

Thanks, Gregor

@Article{Robinson:1991,
  author =   {Robinson, G. K.},
  title ={That {BLUP} is a good thing: the estimation of random
  effects},
  journal =  ss,
  year = {1991},
  volume =   {6},
  number =   {1},
  pages ={15--51},
  keywords = {BLUP, example, derivations, links, applications},
  vnos = {GG}
}



Re: [R] kendall's w

2006-09-14 Thread Jim Lemon
Bianca Vieru wrote:
 Hi,
 
 I try to calculate Kendall's W coefficient and I have a bizarre error.
 
 
 little.app.mat <- matrix(c(1,3,4,2,6,5,2,4,3,1,5,6,3,2,5,1,5,4), nrow=3, byrow=TRUE)
 print(kendall.w(little.app.mat[-1,]))
   Kendall's W for ordinal data
   W = 0.7753623Error in if (is.na(x$p.table)) { : argument is of 
 length zero
 
 big.app.mat <- matrix(c(1,3,4,2,6,5,2,4,3,1,5,6,3,2,5,1,5,42,3,5,3,6,7,9,9,8,7), nrow=3, byrow=TRUE)
 print(kendall.w(big.app.mat[-1,]))
  Kendall's W for ordinal data
  W = 0.4568966  p(X2[8]) = 0.5035488
 
 Why is that working for the big matrix and not for the little one?
 
Thanks for finding this and thanks to David Barron for the correct 
answer - I'll insert a check for out of range matrices like this and 
submit a new version.

Jim



Re: [R] Conservative

2006-09-14 Thread Peter Dalgaard
Gregor Gorjanc [EMAIL PROTECTED] writes:

 Douglas Bates bates at stat.wisc.edu writes:
 
  On 9/13/06, Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.be
I believe that the LRT is anti-conservative for fixed effects, as
described in Pinheiro and Bates companion book to NLME.
   
   You have this effect if you're using REML, for ML I don't think there
   is any problem to use LRT between nested models with different
   fixed-effects structure.
 ...
  The other question is how does one evaluate the likelihood-ratio test
  statistic and that is the issue that Dimitris is addressing.  The REML
  criterion is a modified likelihood and it is inappropriate to look at
  differences in the REML criterion when the models being compared have
  different fixed-effects specifications, or even a different
  parameterization of the fixed effects.  However, the anova method for
  an lmer object does not use the REML criterion even when the model has
  been estimated by REML.  It uses the profiled log-likelihood evaluated
  at the REML estimates of the relative variances of the random effects.
   That's a complicated statement so let me break it down.
 ...
 
 Is this then the same answer as given by Robinson:1991 (ref at the end) to
 question by Robin Thompson on which likelihood (ML or REML) should be used
 in testing the fixed effects. Robinson answered (page 49 near bottom 
 right) that both likelihoods give the same conclusion about fixed effects. 
 Can anyone comment on this issue?

At the risk of sticking my foot in it due to not reading the paper
carefully enough: there appear to be two other likelihoods in play,
one traditional one depending on fixed effects and variances and
another depending on fixed effects and BLUPs (most likely
unobservables). I think Robinson is talking about the equivalence of
those two.

(and BTW ss=Statistical Science in the ref.)

 
 Thanks, Gregor
 
 @Article{Robinson:1991,
   author =   {Robinson, G. K.},
   title ={That {BLUP} is a good thing: the estimation of random
   effects},
   journal =  ss,
   year = {1991},
   volume =   {6},
   number =   {1},
   pages ={15--51},
   keywords = {BLUP, example, derivations, links, applications},
   vnos = {GG}
 }





Re: [R] Rv generation

2006-09-14 Thread Duncan Murdoch
On 9/14/2006 2:16 AM, nmi13 wrote:
 Hi,
  Can Someone inform me how to generate RV's using the below CDF, by inverse 
 technique.
 Thanks for your help and time.
 
 My CDF is as follows
 \[ F(x)=0 \ \text{if} \ x < 0 \]
 \[ F(x)=\left\{\frac{x-x_{i}}{x_{i+1}-x_{i}}\,(p_{i+1}-p_{i})\right\}+p_{i}\
 \forall \ x_{i}\leq x < x_{i+1} \]
 \[ F(x)=1 \ \text{if} \ x > x_{i+1} \]

This sounds an awful lot like a class assignment.  You should ask your 
instructor for help with it.

Duncan Murdoch
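That said, the general inverse-transform recipe for a piecewise-linear CDF like the one above is short. A minimal sketch, assuming knot locations `x` and cumulative probabilities `p` (both illustrative values, not from the original post):

```r
# Inverse-transform sampling from a piecewise-linear CDF:
# draw U ~ Uniform(0,1) and return F^{-1}(U) by linear interpolation.
rpiecewise <- function(n, x, p) {
  u <- runif(n)
  approx(p, x, xout = u)$y  # invert F: interpolate x as a function of p
}

set.seed(1)
draws <- rpiecewise(1000, x = c(0, 1, 2, 4), p = c(0, 0.2, 0.7, 1))
range(draws)  # all draws lie inside [0, 4]
```

Because F is piecewise linear, its inverse is too, so approx() does all the work.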



[R] Binomial test using R

2006-09-14 Thread Srinivasan Ramachandran
Hullo,

Can someone suggest whether the binomial test as described in the link
http://home.clara.net/sisa/binomial.htm is available in an equivalent form
in R? I have downloaded the R package from the CRAN site.

Using R will help me do this test rapidly.
Many Thanks

Ramachandran
Dr. S. Ramachandran
Scientist E I
G.N. Ramachandran Knowledge Centre for Genome Informatics
Institute of Genomics and Integrative Biology
Mall Road, Delhi 110 007
Tel: 091-11-2766-6156
Fax: 091-11-2766-7471
http://www.igib.res.in/Sprofile/cram.html




Re: [R] Binomial test using R

2006-09-14 Thread Peter Dalgaard
Srinivasan Ramachandran [EMAIL PROTECTED] writes:

 Hullo,
 
 Can someone suggest whether the binomial test as described in the link
 http://home.clara.net/sisa/binomial.htm is available in an equivalent form
 in R? I have downloaded the R package from the CRAN site.
 
 Using R will help me do this test rapidly

I think you could be expected to do a bit more of your homework before
querying the list! Anyways, have a look at binom.test().
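A minimal illustration of that function, with made-up counts:

```r
# Exact binomial test: 14 successes in 20 trials against p = 0.5
bt <- binom.test(14, 20, p = 0.5)
bt$p.value   # two-sided exact p-value
bt$conf.int  # Clopper-Pearson 95% confidence interval for the proportion
```

See ?binom.test for the alternative= and conf.level= arguments.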




[R] Run-time error 521 in SciViews

2006-09-14 Thread Wuming Gong
Dear list,

I use SciViews R Console 0.8.9 with R-2.2.1 under Windows XP SP2, and
it works very well most of the time.  However, sometimes, when
commands are executed by pressing F5, the error "Run-time error 521:
Can't open clipboard" pops up.  After choosing Yes, both the R
console and SciViews exit, and everything in the Command window
is lost, which is very annoying.  Could anyone tell me how to avoid this
problem?

Thanks.

Wuming



[R] plotting all subgroups with augPred

2006-09-14 Thread Afshartous, David
All,

I have a question RE plotting the prediction lines of a random effects
model via augPred.  I'll illustrate via the Pixel dataset within the
nlme package: 

library(nlme)
attach(Pixel)
fm1Pixel = lme(pixel ~ day + I(day^2), data = Pixel, random = list(Dog =
~ 1))
plot(augPred(fm1Pixel))   ### 10 fitted lines since there are 10 dogs

fm2Pixel = update(fm1Pixel, . ~ . + Side)
plot(augPred(fm2Pixel))## also produces 10 fitted lines

For the second plot, shouldn't we have 2 fitted lines per dog, one for
each level
of the fixed effect Side?  

1) How does one ensure that this is plotted accordingly?

2) How does one plot say only the first 5 dogs?


Thanks!
Dave
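One avenue worth trying for question 2 (a sketch, not verified against the poster's setup): plot() on an augPred object returns a lattice "trellis" object, and trellis objects can be subscripted, so the first five dogs might be obtained like this:

```r
library(nlme)

# The Pixel example from the post
fm1Pixel <- lme(pixel ~ day + I(day^2), data = Pixel,
                random = list(Dog = ~ 1))

p <- plot(augPred(fm1Pixel))  # a trellis object, one panel per dog
print(p[1:5])                 # subscript the trellis object: panels 1-5 only
```

The `[1:5]` subscripting assumes the panels are ordered as desired; reorder the grouping factor first if not.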



[R] working with strptime data

2006-09-14 Thread Richard Evans
Dear R-forum,

I am looking for a good resource/help on working with POSIXct values and
controlling the pretty x-axis labels and tick marks for data vs. time
plots. Specifically, I wish to write programming logic that creates
different vertical abline() calls based on the range of times
I am working on. The default plot results are adequate, but I
would love to make explicit calls on how the x-axis annotates the
timestamps.

Does anyone have example code or know of a good reference for these
kinds of R-programming tasks?


Here's a simplified example:
--
I have a large data set that consists of N pairs of values and
timestamps strings.
Like this:

TheData <- c(1.2, 0.9, etc...[to the Nth
datapoint])
TheTime <- c("9/1/2006 8:00", "9/1/2006 8:13", etc...[to the Nth
timestamp])

I convert the timestamp strings into POSIXct types as:

TheTime <- as.POSIXct(strptime(TheTime, format="%m/%d/%Y %H:%M"))

And create a plot as such:

plot(TheTime, TheData)

--

And here is a specific question:

How do I calculate the number of months than are spanned between two
POSIXct values?
(i.e. NumOfMonths <- MonthFunction(range(MyTimeStampData)))

Thanks-in-advance,
- rich
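Two sketches for the questions above. The variable names follow the post, and the month count assumes "months spanned" means the difference in calendar months, computed from as.POSIXlt components:

```r
TheTime <- as.POSIXct(strptime(c("9/1/2006 8:00", "11/15/2006 9:30"),
                               format = "%m/%d/%Y %H:%M"))
TheData <- c(1.2, 0.9)

# Explicit control of the time axis: suppress the default, then annotate
plot(TheTime, TheData, xaxt = "n")
axis.POSIXct(1, at = seq(min(TheTime), max(TheTime), by = "month"),
             format = "%b %d")

# Calendar months spanned between two POSIXct values
MonthFunction <- function(r) {
  lt <- as.POSIXlt(r)
  (lt$year[2] * 12 + lt$mon[2]) - (lt$year[1] * 12 + lt$mon[1])
}
MonthFunction(range(TheTime))  # Sep -> Nov spans 2 calendar months
```

axis.POSIXct() accepts the same format strings as strptime(), so the tick labels can be annotated however needed.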



[R] time varying covariates

2006-09-14 Thread Martin Wagner

Hello,

I am trying to model an intensity function with time-varying covariates.
Before, I have successfully defined a log likelihood function for a 
Power-Law Process (lambda(t)=alpha*beta*t^(beta-1)) with two parameters 
and no covariates for repairable systems with failure times (t).

This function was maximized with R optim. No problem!

But now I want to include a covariate indicating a time-varying value at 
each failure time t. For constant covariates, the procedure is feasible:



leads to the following log likelihood function:


here zi are the covariates, which are constant for each unit i under 
observation; tij is the failure time for failure j of unit i.


Do you know how to formulate a log likelihood function for covariates 
which vary for each tj of each unit i?


Thank you very much
Best regards

Martin Wagner
Berlin University of Technology


[R] Problem setting options(error=recover) in my Rprofile.site

2006-09-14 Thread Marcus, Jeffrey
I'd like to be able to set options(error=recover) in my Rprofile.site file.
When I do this I get the message
 Error in options(error = recover) : object "recover" not found

I assume this is because the utils package (where recover and dump.frames
are defined) has not been loaded at the time I make this call. 

I suppose I could explicitly do library(utils) before setting the
options even though it will be loaded again when the default packages are
loaded. 

Any simpler suggestions of how to get options(error=recover) set
automatically every time I start R? 

More generally, is there a way to have code executed on startup but *after*
the default packages are loaded? 

Thanks.

  Jeff



Re: [R] Greedy triangulation

2006-09-14 Thread Greg Snow
Does the deldir package do what you want? 


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Dan Bebber
Sent: Thursday, September 14, 2006 3:56 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Greedy triangulation

Hello,

does anyone have code that will generate a greedy triangulation
(triangulation that uses shortest non-overlapping edges) for a set of
points in Euclidean space?

Thanks,
Dan Bebber
___
Dr. Daniel P. Bebber
Department of Plant Sciences
University of Oxford
South Parks Road
Oxford OX1 3RB
UK
Tel. 01865 275060




Re: [R] Greedy triangulation

2006-09-14 Thread roger koenker
Or, perhaps, tripack?

url:   www.econ.uiuc.edu/~roger        Roger Koenker
email: [EMAIL PROTECTED]               Department of Economics
vox:   217-333-4558                    University of Illinois
fax:   217-244-6678                    Champaign, IL 61820


On Sep 14, 2006, at 10:32 AM, Greg Snow wrote:

 Does the deldir package do what you want?


 --  
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 [EMAIL PROTECTED]
 (801) 408-8111


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Dan Bebber
 Sent: Thursday, September 14, 2006 3:56 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Greedy triangulation

 Hello,

 does anyone have code that will generate a greedy triangulation
 (triangulation that uses shortest non-overlapping edges) for a set of
 points in Euclidean space?

 Thanks,
 Dan Bebber
 ___
 Dr. Daniel P. Bebber
 Department of Plant Sciences
 University of Oxford
 South Parks Road
 Oxford OX1 3RB
 UK
 Tel. 01865 275060





Re: [R] Greedy triangulation

2006-09-14 Thread Dan Bebber
Thanks, but no, the Delaunay is different.

I have written the following, which interested persons might want to check 
for accuracy and streamlining:

#GREEDY TRIANGULATION
#
#Pick all lines that are shortest and don't overlap, starting with shortest.
#
greedy <- function(xy){  #input is a matrix with 2 columns (x,y)
require(spatstat)  #need the crossing.psp function
w <- owin(range(xy[,1]),range(xy[,2]))  #set the window for crossing.psp
dists <- dist(xy)  #calculate Euclidean distances for each line
ind <- which(lower.tri(dists),arr.ind=TRUE)  #matrix with 2 columns (start point, end point)
x1 <- xy[ind[,1],1]  #x of start point
y1 <- xy[ind[,1],2]  #y of start point
x2 <- xy[ind[,2],1]  #x of end point
y2 <- xy[ind[,2],2]  #y of end point
dat <- data.frame(strt=ind[,1],fin=ind[,2],
x1=x1,y1=y1,x2=x2,y2=y2,len=as.vector(dists))  #put all data for lines together
dat <- dat[order(dists),]  #order data by line length, shortest first
inc <- dat[1,]  #include the shortest line in the triangulation
dat <- dat[-1,]  #keep remaining lines
while(nrow(dat) > 0){  #while lines remain to be tested, do the following...
A <- psp(inc$x1,inc$y1,inc$x2,inc$y2,w)  #psp object for the lines already included
B <- psp(dat$x1[1],dat$y1[1],dat$x2[1],dat$y2[1],w)  #psp object for the next line to be tested
cross <- crossing.psp(A,B)  #check for crossings of the test line over the lines already included
p.match <- sum(cross$x %in% c(inc$x1,inc$x2))  #check whether the crossings occur because the lines share endpoints
if(cross$n - p.match == 0){inc[nrow(inc)+1,] <- dat[1,]}  #if the only crossings are due to shared points, include the line
dat <- dat[-1,]}  #remove the line from the untested lines
return(inc)}  #when all lines have been tested, return all included lines
#
#Plot the greedy triangulation
#
plot.greedy <- function(xy,gr,...){
plot(xy,asp=1,xlab="x",ylab="y",...)
segments(gr$x1,gr$y1,gr$x2,gr$y2)}
#
#Test it
#
xy <- matrix(runif(40,0,1),nc=2)
gr <- greedy(xy)
plot.greedy(xy,gr,main="Greedy Triangulation")
#END

- Original Message - 
From: Greg Snow [EMAIL PROTECTED]
To: Dan Bebber [EMAIL PROTECTED]; r-help@stat.math.ethz.ch
Sent: Thursday, September 14, 2006 4:32 PM
Subject: RE: [R] Greedy triangulation


Does the deldir package do what you want?


-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111



Re: [R] time varying covariates

2006-09-14 Thread David Barron
Have a look at the survSplit function in the survival package.  It looks to
me as though you could use survreg with the weibull option to achieve what
you want.  Otherwise, you'll have to rewrite the likelihood in terms of both
start and end times.

On 14/09/06, Martin Wagner [EMAIL PROTECTED] wrote:

 Hello,

I am trying to model an intensity function with time-varying covariates.
Before, I have successfully defined a log likelihood function for a
Power-Law Process (lambda(t)=alpha*beta*t^(beta-1)) with two parameters
and no covariates for repairable systems with failure times (t).
This function was maximized with R optim. No problem!

But now I want to include a covariate indicating a time-varying value at
each failure time t. For constant covariates, the procedure is feasible :


leads to the following log likelihood function:


here zi are the covariates, which are constant for each unit i under
observation; tij is the failure time for failure j of unit i.

Do you know how to formulate a log likelihood function for covariates
which vary for each tj of each unit i ?

Thank you very much
Best regards

Martin Wagner
Berlin University of Technology







-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
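To make the survSplit suggestion concrete, here is a sketch using the formula interface of recent versions of the survival package, on its bundled lung data rather than the poster's:

```r
library(survival)

# Split each subject's follow-up at 180 and 365 days; the new `tgroup`
# column indexes the episode, so a covariate that changes over time can
# be merged in per (subject, episode) before refitting with survreg().
lung2 <- survSplit(Surv(time, status) ~ ., data = lung,
                   cut = c(180, 365), episode = "tgroup")
c(nrow(lung), nrow(lung2))  # splitting adds rows
```

The cut points and dataset here are illustrative; in the poster's setting the cuts would be the times at which the covariate changes.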






Re: [R] Problem setting options(error=recover) in my Rprofile.site

2006-09-14 Thread Prof Brian D Ripley
On Thu, 14 Sep 2006, Marcus, Jeffrey wrote:

 I'd like to be able to set options(error=recover) in my Rprofile.site file.
 When I do this I get the message
  Error in options(error = recover) : object "recover" not found

 I assume this is because the utils package (where recover and dump.frames
 are defined) has not been loaded at the time I make this call.

 I suppose I could explicitly do library(utils) before setting the
 options even though it will be loaded again when the default packages are
 loaded.

It will not be loaded again.

 Any simpler suggestions of how to get options(error=recover) set
 automatically every time I start R?

options(error=utils::recover)

 More generally, is there a way to have code executed on startup but *after*
 the default packages are loaded?

Load the default packages yourself in your startup code, by calling 
.First.sys() (and resetting it to avoid the slight overhead of subsequent 
require() calls if you want).

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] Beta stochastic simulation

2006-09-14 Thread Mark Pinkerton
Hi,
I am finding that I get quite different results when I interchange the
following equivalent lines for sampling from a beta distribution in my
R script. The rbeta line is correct judging by the summary statistics of
the simulated values, while the qbeta line consistently leads to a
higher mean simulated value.
 
simulation <- rbeta(1, alpha, beta)
simulation <- qbeta(runif(1), alpha, beta)
 
Are there any implementation reasons for this?
 
Thanks
 
Mark Pinkerton
Risk Management Solutions 
Peninsular House
30 Monument Street
London EC3R 8HB 
UK 
 
www.RMS.com http://www.rms.com/  
Tel:  +44 20 7444 7783 
Fax: +44 20 7444 7601
 


This message and any attachments contain information that may be RMS Inc. 
confidential and/or privileged.  If you are not the intended recipient 
(or authorized to receive for the intended recipient), and have received 
this message in error, any use, disclosure or distribution is strictly 
prohibited.   If you have received this message in error, please notify 
the sender immediately by replying to the e-mail and permanently deleting 
the message from your computer and/or storage system.




[R] polr

2006-09-14 Thread grazia
Dear R heplper,
I'm using polr:

fm <- polr(factor(y) ~ x, method = 'logistic', data = dat, Hess = TRUE)

and I'm getting a very strange message:
Error in optim(start, fmin, gmin, method = "BFGS", hessian = Hess, ...) :
initial value in 'vmmin' is not finite
Can you help me?
Thank you in advance,
Grazia
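In case it helps others hitting the same message: "initial value in 'vmmin' is not finite" typically means the default starting values give a non-finite log-likelihood, often because a predictor is on a very large scale or a response category is empty. A sketch of the rescaling workaround, on simulated data standing in for the poster's `dat`:

```r
library(MASS)

set.seed(42)
dat <- data.frame(x = rnorm(200, sd = 1e4))            # badly scaled predictor
dat$y <- cut(dat$x + rnorm(200, sd = 1e4), 3, labels = FALSE)

# polr(factor(y) ~ x, data = dat)  # may fail with the 'vmmin' error
fm <- polr(factor(y) ~ scale(x), data = dat,
           method = "logistic", Hess = TRUE)           # rescaled predictor
coef(fm)
```

Also worth checking: table(dat$y) to confirm no response level is empty, and supplying start= values to polr() directly.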



M. Grazia Pittau, Ph.D.
Post-Doctoral Research Fellow
Department of Statistics
Columbia University
1255 Amsterdam Avenue
New York, NY  10027

[EMAIL PROTECTED]
Phone: 212.851.2160
Fax: 212.851.2164



Re: [R] Beta stochastic simulation

2006-09-14 Thread Duncan Murdoch
On 9/14/2006 1:42 PM, Mark Pinkerton wrote:
 Hi,
 I am finding that I get quite different results when I interchange the
 following equivalent lines for sampling from a beta distribution in my
 r script. The rbeta line is correct judging by the summary statistics of
 the simulated values, while the qbeta line consistently leads to a
 higher mean simulated value.
  
  simulation <- rbeta(1, alpha, beta)
  simulation <- qbeta(runif(1), alpha, beta)
  
 Are there any implementation reasons for this?

You need to be more specific about the R version, the parameter values 
you're using, and the size of the differences you're seeing.  I just tried

  x <- rbeta(10, 1, 6)
  y <- qbeta(runif(10), 1, 6)
  summary(x); summary(y)
  Min.   1st Qu.Median  Mean   3rd Qu.  Max.
4.076e-07 4.681e-02 1.085e-01 1.423e-01 2.055e-01 8.773e-01
  Min.   1st Qu.Median  Mean   3rd Qu.  Max.
2.803e-06 4.583e-02 1.078e-01 1.419e-01 2.047e-01 8.861e-01

and I think those results look reasonable (and the qbeta method has a 
smaller mean, this run).

Duncan Murdoch
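A quick way to convince oneself that the two forms agree in distribution is to compare large samples (the shape parameters here are illustrative):

```r
set.seed(123)
alpha <- 2; beta <- 5                        # illustrative shapes
a <- rbeta(1e5, alpha, beta)
b <- qbeta(runif(1e5), alpha, beta)
c(mean(a), mean(b), alpha / (alpha + beta))  # both sample means near 2/7
```

Any persistent difference in means would point to a problem elsewhere in the script (e.g. the parameters being modified between the two calls) rather than in the generators.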

  
 Thanks
  
 Mark Pinkerton
 Risk Management Solutions 
 Peninsular House
 30 Monument Street
 London EC3R 8HB 
 UK 
  
 www.RMS.com http://www.rms.com/  
 Tel:  +44 20 7444 7783 
 Fax: +44 20 7444 7601
  
 
 
 This message and any attachments contain information that may be RMS Inc. 
 confidential and/or privileged.  If you are not the intended recipient 
 (or authorized to receive for the intended recipient), and have received 
 this message in error, any use, disclosure or distribution is strictly 
 prohibited.   If you have received this message in error, please notify 
 the sender immediately by replying to the e-mail and permanently deleting 
 the message from your computer and/or storage system.
 
 



[R] Help On EBAM

2006-09-14 Thread Learning R
Dear RUsers,

I am new to R. I am learning how to use R. I am a PC user and run R on
Windows. I would appreciate it if someone could guide me on a few questions I
have:

1) I have 4 CEL files (2 replicates for NORM and Disease resp.). I have been
able to run siggenes on this dataset, where I have 4 labels in the class file
groupsnhi.cl <- c(0,0,1,1), and my data has been read into datrmanhi after
performing rma. When I run these commands I receive these errors:

 plot(samnhi.out,seq(0.1,0.6,0.1))
 identify(samnhi.out,ll=FALSE)
warning: no point within 0.25 inches
warning: no point within 0.25 inches
warning: no point within 0.25 inches

Why does this happen?

2) Now I am trying to run EBAM, and when I type the following I get an error:

  find.out <- find.a0(datrmanhi, groupsnhi.cl, rand = 123)
Loading required package: affy
Loading required package: affyio
EBAM Analysis for the two class unpaired case.

Warning: There are 1 genes which have variance Zero or no non-missing
values.
 The d-value of these genes is set to NA.


The following object(s) are masked _by_ .GlobalEnv :

 n


The following object(s) are masked from mat.repeat ( position 5 ) :

 center log.bino n p success x1 x2 x3 x4 x5

Error in optim(rep(0, 6), neglogLik.repeat, method = "BFGS") :
non-finite finite-difference value [1]
In addition: Warning message:
collapsing to unique 'x' values in: approx(Z, Z.norm, z, rule = 2)


-
I have also tried :
 find.out <- find.a0(exprs2, c(1,1,1,0,0,0), rand = 123)
EBAM Analysis for the two class unpaired case.

Warning: There are 1 genes which have variance Zero or no non-missing
values.
 The d-value of these genes is set to NA.


The following object(s) are masked _by_ .GlobalEnv :

 n


The following object(s) are masked from mat.repeat ( position 3 ) :

 center log.bino n p success x1 x2 x3 x4 x5


The following object(s) are masked from mat.repeat ( position 6 ) :

 center log.bino n p success x1 x2 x3 x4 x5

Error in optim(rep(0, 6), neglogLik.repeat, method = "BFGS") :
non-finite finite-difference value [1]
In addition: Warning message:
collapsing to unique 'x' values in: approx(Z, Z.norm, z, rule = 2)


I would greatly appreciate any solutions and help to solve this problem.

Thank you,
Appreciate your time.
Regards,
Lolita




[R] converting strings to expressions

2006-09-14 Thread Deepayan Sarkar
Hi,

consider this:

--

estr <- c("2^4", "alpha[1]")
eexp <- expression(2^4, alpha[1])


## Is it possible to get 'eexp' starting from 'estr'? The closest I could
## get was:

do.call(expression, lapply(estr, as.name))

## but it is not quite the same; e.g. the following behave differently:

library(lattice)

xyplot(1:10 ~ 1:10,
   scales = list(x = list(at = c(3, 6), labels = eexp)))

xyplot(1:10 ~ 1:10,
   scales = list(x = list(at = c(3, 6),
 labels = do.call(expression,
  lapply(estr, as.name)))))

---

This happens in both 2.3.1 and pre-2.4.0.

Deepayan


 sessionInfo()
Version 2.3.1 Patched (2006-08-27 r39012)
x86_64-unknown-linux-gnu

attached base packages:
[1] methods   stats graphics  grDevices utils datasets
[7] base

other attached packages:
  lattice
0.13-10

 sessionInfo()
R version 2.4.0 Under development (unstable) (2006-08-30 r39022)
x86_64-unknown-linux-gnu

locale:
C

attached base packages:
[1] methods   stats graphics  grDevices utils datasets
[7] base

other attached packages:
 lattice
0.14-3



Re: [R] converting strings to expressions

2006-09-14 Thread Peter Dalgaard
Deepayan Sarkar [EMAIL PROTECTED] writes:

 Hi,
 
 consider this:
 
 --
 
 estr <- c("2^4", "alpha[1]")
 eexp <- expression(2^4, alpha[1])
 
 
 ## Is it possible to get 'eexp' starting from 'estr'? The closest I could
 ## get was:
 
 do.call(expression, lapply(estr, as.name))
 
 ## but it is not quite the same; e.g. the following behave differently:

Er, how about

  estr <- c("2^4", "alpha[1]")
 parse(text=estr)
expression(2^4, alpha[1])

or (brain teaser alert!)

 parse(text=deparse(parse(text=estr)))[[1]]
expression(2^4, alpha[1])

which is _not_ quite the same thing.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
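
A minimal sketch of the parse(text = ) approach from this thread, using the
same example strings:

```r
# Strings in, expression vector out: parse(text = ) does the conversion
estr <- c("2^4", "alpha[1]")
eexp <- parse(text = estr)

length(eexp)     # 2: one element per input string
eval(eexp[[1]])  # 16: each element is an unevaluated call
```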


Re: [R] converting strings to expressions

2006-09-14 Thread Sundar Dorai-Raj


Deepayan Sarkar said the following on 9/14/2006 2:31 PM:
 Hi,
 
 consider this:
 
 --
 
 estr <- c("2^4", "alpha[1]")
 eexp <- expression(2^4, alpha[1])
 
 
 ## Is it possible to get 'eexp' starting from 'estr'? The closest I could
 ## get was:
 
 do.call("expression", lapply(estr, as.name))
 
 ## but it is not quite the same; e.g. the following behave differently:
 
 library(lattice)
 
 xyplot(1:10 ~ 1:10,
        scales = list(x = list(at = c(3, 6), labels = eexp)))

 xyplot(1:10 ~ 1:10,
        scales = list(x = list(at = c(3, 6),
                               labels = do.call("expression",
                                                lapply(estr, as.name)))))
 
 ---
 
 This happens in both 2.3.1 and pre-2.4.0.
 
 Deepayan
 
 
 sessionInfo()
 Version 2.3.1 Patched (2006-08-27 r39012)
 x86_64-unknown-linux-gnu
 
 attached base packages:
 [1] methods   stats graphics  grDevices utils datasets
 [7] base
 
 other attached packages:
   lattice
 0.13-10
 
 sessionInfo()
 R version 2.4.0 Under development (unstable) (2006-08-30 r39022)
 x86_64-unknown-linux-gnu
 
 locale:
 C
 
 attached base packages:
 [1] methods   stats graphics  grDevices utils datasets
 [7] base
 
 other attached packages:
  lattice
 0.14-3
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

Hi, Deepayan,

Will this work for you?

estr <- c("2^4", "alpha[1]")
eexp <- expression(2^4, alpha[1])

library(lattice)

xyplot(1:10 ~ 1:10,
scales = list(x = list(at = c(3, 6), labels = eexp)))

estr.2 <- sprintf("expression(%s)", paste(estr, collapse = ","))
eexp.2 <- eval(parse(text = estr.2))
xyplot(1:10 ~ 1:10,
scales = list(x = list(at = c(3, 6), labels = eexp.2)))

Works in both R-2.3.1 and R-2.4.0dev.

Typically, I don't like using eval(parse(text = ...)), but when I've run
into this problem before, I could not see another way. Perhaps Gabor will
enlighten us with something slicker.

HTH,

--sundar

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting strings to expressions

2006-09-14 Thread Deepayan Sarkar
On 14 Sep 2006 21:44:01 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
 Deepayan Sarkar [EMAIL PROTECTED] writes:

  Hi,
 
  consider this:
 
  --
 
  estr <- c("2^4", "alpha[1]")
  eexp <- expression(2^4, alpha[1])
 
 
  ## Is it possible to get 'eexp' starting from 'estr'? The closest I could
  ## get was:
 
  do.call("expression", lapply(estr, as.name))
 
  ## but it is not quite the same; e.g. the following behave differently:

 Er, how about

  estr <- c("2^4", "alpha[1]")
  parse(text=estr)
 expression(2^4, alpha[1])

 or (brain teaser alert!)

  parse(text=deparse(parse(text=estr)))[[1]]
 expression(2^4, alpha[1])

 which is _not_ quite the same thing.

Ah, I'd forgotten about parse. A link from ?expression might be reasonable.

Thanks,
Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting strings to expressions

2006-09-14 Thread Richard M. Heiberger
sapply(estr, FUN=function(x) parse(text=x))

and it does print the greek letter in the xlab.

xyplot(1:10 ~ 1:10,
       scales = list(x = list(at = c(3, 6),
                     labels = sapply(estr, FUN=function(x) parse(text=x)))))

Rich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting strings to expressions

2006-09-14 Thread Prof Brian Ripley
On Thu, 14 Sep 2006, Deepayan Sarkar wrote:

[...]

 Ah, I'd forgotten about parse. A link from ?expression might be reasonable.

But it says

Details:

  'Expression' here is not being used in its colloquial sense, that
  of mathematical expressions.  Those are calls (see 'call') in R,
  and an R expression vector is a list of calls, typically as
  returned by 'parse'.

so it already has a link.

'Expression' was often misused in the R documentation, and a couple of 
months ago I tried to move to a more careful usage.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting strings to expressions

2006-09-14 Thread Peter Dalgaard
Prof Brian Ripley [EMAIL PROTECTED] writes:

 On Thu, 14 Sep 2006, Deepayan Sarkar wrote:
 
 [...]
 
  Ah, I'd forgotten about parse. A link from ?expression might be reasonable.
 
 But it says
 
 Details:
 
   'Expression' here is not being used in its colloquial sense, that
   of mathematical expressions.  Those are calls (see 'call') in R,
   and an R expression vector is a list of calls, typically as
   returned by 'parse'.
 
 so it already has a link.
 
 'Expression' was often misused in the R documentation, and a couple of
 months ago I tried to move to a more careful usage.

Actually, you need to be even more careful because has commonly been
called "unevaluated expressions" cover symbols and constants too.
I.e., quote(1) and quote(x) are not calls. And such non-call objects
can be elements of an expression vector.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
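
Peter's distinction between calls, symbols, and constants can be checked
directly; a small sketch in base R:

```r
e <- expression(x + 1, x, 1)  # expression vector: a call, a symbol, a constant

is.call(e[[1]])   # TRUE:  x + 1 is a call
is.call(e[[2]])   # FALSE: x is a symbol (a 'name')
is.call(e[[3]])   # FALSE: 1 is a constant
```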


Re: [R] group bunch of lines in a data.frame, an additional requirement

2006-09-14 Thread Emmanuel Levy
Thanks Gabor, that is much faster than using a loop!

I've got a last question:

Can you think of a fast way of keeping track of the number of
observations collapsed for each entry?

i.e. I'd like to end up with:

A 2.0 400 ID1 3 (3obs in the first matrix)
B 0.7 35 ID2 2 (2obs in the first matrix)
C 5.0 70 ID1 1 (1obs in the first matrix)

Or is it required to use a temporary matrix that is merged later? (As
exemplified by Mark in a previous email?)

Thanks a lot for your help,

  Emmanuel

On 9/13/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 See below.

 On 9/13/06, Emmanuel Levy [EMAIL PROTECTED] wrote:
  Thanks for pointing me out aggregate, that works fine!
 
  There is one complication though: I have mixed types (numerical and 
  character),
 
  So the matrix is of the form:
 
  A 1.0 200 ID1
  A 3.0 800 ID1
  A 2.0 200 ID1
  B 0.5 20   ID2
  B 0.9 50   ID2
  C 5.0 70   ID1
 
  One letter always has the same ID but one ID can be shared by many
  letters (like ID1)
 
  I just want to keep track of the ID, and get a matrix like:
 
  A 2.0 400 ID1
  B 0.7 35 ID2
  C 5.0 70 ID1
 
  Any idea on how to do that without a loop?

 If V4 is a function of V1 then you can aggregate by it too and it will
 appear but have no effect on the classification:

  aggregate(DF[2:3], DF[c(1,4)], mean)
   V1  V4  V2  V3
 1  A ID1 2.0 400
 2  C ID1 5.0  70
 3  B ID2 0.7  35


 
   Many thanks,
 
  Emmanuel
 
  On 9/12/06, Emmanuel Levy [EMAIL PROTECTED] wrote:
   Hello,
  
   I'd like to group the lines of a matrix so that:
   A 1.0 200
   A 3.0 800
   A 2.0 200
   B 0.5 20
   B 0.9 50
   C 5.0 70
  
   Would give:
   A 2.0 400
   B 0.7 35
   C 5.0 70
  
   So all lines corresponding to a letter (level), become a single line
   where all the values of each column are averaged.
  
   I've done that with a loop but it doesn't sound right (it is very
   slow). I imagine there is a
   sort of apply shortcut but I can't figure it out.
  
   Please note that it is not exactly a matrix I'm using, the function
   typeof tells me it's a list, however I access to it like it was a
   matrix.
  
   Could someone help me with the right function to use, a help topic or
   a piece of code?
  
   Thanks,
  
 Emmanuel
  
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting strings to expressions

2006-09-14 Thread Peter Dalgaard
Peter Dalgaard [EMAIL PROTECTED] writes:

 Actually, you need to be even more careful because has commonly been

Drats! ...because _what_ has..., of course. Talk about being careful. 

 called "unevaluated expressions" cover symbols and constants too.
 I.e., quote(1) and quote(x) are not calls. And such non-call objects
 can be elements of an expression vector.
 
 -- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting strings to expressions

2006-09-14 Thread Deepayan Sarkar
On 9/14/06, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 On Thu, 14 Sep 2006, Deepayan Sarkar wrote:

 [...]

  Ah, I'd forgotten about parse. A link from ?expression might be reasonable.

 But it says

 Details:

   'Expression' here is not being used in its colloquial sense, that
   of mathematical expressions.  Those are calls (see 'call') in R,
   and an R expression vector is a list of calls, typically as
   returned by 'parse'.

 so it already has a link.

Right, I missed that (although I remember reading the first part of
that paragraph).

Deepayan


 'Expression' was often misused in the R documentation, and a couple of
 months ago I tried to move to a more careful usage.

 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] hist for two sets

2006-09-14 Thread Ethan Johnsons
A quick question, please.

x = c(0.0001, 0.0059, 0.0855, 0.9082)
y = c(0.54, 0.813, 0.379, 0.35)

where A = 1st set, B = 2nd set, C = 3rd set, D = 4th set respectively.

How do you make hist plot side by side for x & y?

i.e. 0.0001 and then to the right 0.54, 0.0059 and then to the right 0.813,
etc.

thx much

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Beta stochastic simulation

2006-09-14 Thread Duncan Murdoch
On 9/14/2006 5:26 PM, Mark Pinkerton wrote:
 Hi Duncan,
 I had also validated the logic with a simple test which is why I was 
 surprised by the differences I was seeing from tthe more complex simulation. 
 I am running R on a Windows 2000 - I'll have to check which version at my 
 desk tomorrow but it's pretty up to date, maybe 6 months old. Attached is a 
 code snippet  from my simulation program which is used to estimate 
 multi-event annual losses for US hurricanes. The event set being sampled from 
 is quite large (~14000) with each event and account combination having a 
 unique beta loss distribution. Simply swapping lines 23 and 24 has the effect 
 on results that I mentioned in the previous email. The simulation is large 
 enough that the MC error in the estimated means are negligible.

The code you sent isn't usable, because it's missing your data.  Could 
you please do the following?

  - verify that the behaviour still happens in the current alpha test 
version

  - try to simplify the example code so someone else can run it?  It 
could be that certain values of alpha and beta trigger a bug but the 
ones I tried were fine.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
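
As a reference point for the kind of validation Duncan suggests, a minimal
rbeta() self-check (hypothetical shape parameters, not Mark's actual loss
distributions):

```r
set.seed(123)              # reproducibility
a <- 2; b <- 5             # hypothetical beta parameters
sim <- rbeta(1e5, a, b)

mean(sim)                  # should be close to the theoretical mean
a / (a + b)                # theoretical mean = 2/7, about 0.286
```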


[R] EBAM ERROR

2006-09-14 Thread Learning R
Dear RUsers,

I am new to R. I am learning how to use R. I am a PC user and run R on
Windows. I would appreciate it if someone could guide me on a few questions I
have:

1) I have 4 cel files (2 replicates for NORM and Disease resp). I have been
able to run siggenes on this dataset where I have 4 labels in the class file
groupsnhi.cl <- c(0,0,1,1) and my data has been read into datrmanhi after
performing rma. When I run these commands here I receive these errors:

> plot(samnhi.out, seq(0.1, 0.6, 0.1))
> identify(samnhi.out, ll=FALSE)
warning: no point within 0.25 inches
warning: no point within 0.25 inches
warning: no point within 0.25 inches

Why does this happen?

2) Now I am trying to run EBAM: and when I type I get an error

> find.out <- find.a0(datrmanhi, groupsnhi.cl, rand=123)
Loading required package: affy
Loading required package: affyio
EBAM Analysis for the two class unpaired case.

Warning: There are 1 genes which have variance Zero or no non-missing
values.
 The d-value of these genes is set to NA.


The following object(s) are masked _by_ .GlobalEnv :

 n


The following object(s) are masked from mat.repeat ( position 5 ) :

 center log.bino n p success x1 x2 x3 x4 x5

Error in optim(rep(0, 6), neglogLik.repeat, method = "BFGS") :
non-finite finite-difference value [1]
In addition: Warning message:
collapsing to unique 'x' values in: approx(Z, Z.norm, z, rule = 2)


-

I have also tried :
> find.out <- find.a0(exprs2, c(1,1,1,0,0,0), rand=123)
EBAM Analysis for the two class unpaired case.

Warning: There are 1 genes which have variance Zero or no non-missing
values.
 The d-value of these genes is set to NA.


The following object(s) are masked _by_ .GlobalEnv :

 n


The following object(s) are masked from mat.repeat ( position 3 ) :

 center log.bino n p success x1 x2 x3 x4 x5


The following object(s) are masked from mat.repeat ( position 6 ) :

 center log.bino n p success x1 x2 x3 x4 x5

Error in optim(rep(0, 6), neglogLik.repeat, method = "BFGS") :
non-finite finite-difference value [1]
In addition: Warning message:
collapsing to unique 'x' values in: approx(Z, Z.norm, z, rule = 2)


I would greatly appreciate any solutions and help to solve this problem.

Thank you,
Appreciate your time.
Regards,
Lolita

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hist for two sets

2006-09-14 Thread Marc Schwartz
On Thu, 2006-09-14 at 19:37 -0400, Ethan Johnsons wrote:
 A quick question, please.
 
 x = c(0.0001, 0.0059, 0.0855, 0.9082)
 y = c(0.54, 0.813, 0.379, 0.35)
 
  where A = 1st set, B = 2nd set, C = 3rd set, D = 4th set respectively.
 
  How do you make hist plot side by side for x & y?
 
 i.e. 0.0001 and then to the right 0.54, 0.0059 and then to the right 0.813,
 etc.
 
 thx much

You don't want a histogram, but a barplot:

 x <- c(0.0001, 0.0059, 0.0855, 0.9082)
 y <- c(0.54, 0.813, 0.379, 0.35)

 # create a two row matrix with x and y
 height <- rbind(x, y)

 # Use height and set 'beside = TRUE' to get pairs
 # save the bar midpoints in 'mp'
 # Set the bar pair labels to A:D
 mp <- barplot(height, beside = TRUE, ylim = c(0, 1),
   names.arg = LETTERS[1:4])

 # Draw the bar values above the bars
 text(mp, height, labels = format(height, 4), pos = 3, cex = .75)


See ?barplot, ?text and ?format (or ?formatC or ?sprintf).

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] what does Height represent?

2006-09-14 Thread zhaoshi
hi--

I am new to R and am trying to use R to cluster my binary data. I use
hierarchical clustering:
plot(hclust(dist(x, method="binary"), method="average"), cex=0.1)
I end up with a cluster dendrogram. On the left of my dendrogram, there
is a scale called Height, running from 0.0 to 1.0.
I don't understand what Height represents. If Height represents the
distance between two data points, then adding up the lengths of the
branches gives a distance > 1 for some pairs. It should not be possible
for the distance between any two data points to be greater than 1.
Could someone help me out?
 
Thanks


Zhaoshi

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
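
For what it's worth, the Height axis shows the dissimilarity at which each
merge occurs, not a cumulative branch length, so it stays within the range of
the pairwise distances; a sketch with made-up binary data:

```r
# Made-up binary data: six rows, none all-zero
x <- rbind(c(1,0,0,1,1), c(1,0,0,1,0), c(0,1,1,0,0),
           c(0,1,1,0,1), c(1,1,0,0,1), c(0,0,1,1,1))

hc <- hclust(dist(x, method = "binary"), method = "average")
hc$height        # one merge height per join (5 joins for 6 rows)
max(hc$height)   # stays <= 1, since binary dissimilarity is bounded by 1
```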


Re: [R] group bunch of lines in a data.frame, an additional requirement

2006-09-14 Thread Marc Schwartz
Emmanuel,

I wouldn't be surprised if Gabor comes up with something, but since
aggregate() can only return scalars, you can't do it in one step here.
There are possibilities using other functions such as split(), tapply()
or by(), but each has its own respective limitations, requiring more than
one step or post-consolidation reformatting. 

You could certainly write a unified wrapper function that would do this
in a single call, but unless you plan on doing this sort of operation a
lot, it is probably easier with multiple calls.

I suspect that using Gabor's approach below, combined with my own using
aggregate() (now twice) and then merge(), may be the easiest.

HTH,

Marc

On Thu, 2006-09-14 at 21:35 +0100, Emmanuel Levy wrote:
 Thanks Gabor, that is much faster than using a loop!
 
 I've got a last question:
 
 Can you think of a fast way of keeping track of the number of
 observations collapsed for each entry?
 
 i.e. I'd like to end up with:
 
 A 2.0 400 ID1 3 (3obs in the first matrix)
 B 0.7 35 ID2 2 (2obs in the first matrix)
 C 5.0 70 ID1 1 (1obs in the first matrix)
 
  Or is it required to use a temporary matrix that is merged later? (As
  exemplified by Mark in a previous email?)
 
 Thanks a lot for your help,
 
   Emmanuel
 
 On 9/13/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
  See below.
 
  On 9/13/06, Emmanuel Levy [EMAIL PROTECTED] wrote:
   Thanks for pointing me out aggregate, that works fine!
  
   There is one complication though: I have mixed types (numerical and 
   character),
  
   So the matrix is of the form:
  
   A 1.0 200 ID1
   A 3.0 800 ID1
   A 2.0 200 ID1
   B 0.5 20   ID2
   B 0.9 50   ID2
   C 5.0 70   ID1
  
   One letter always has the same ID but one ID can be shared by many
   letters (like ID1)
  
   I just want to keep track of the ID, and get a matrix like:
  
   A 2.0 400 ID1
   B 0.7 35 ID2
   C 5.0 70 ID1
  
   Any idea on how to do that without a loop?
 
  If V4 is a function of V1 then you can aggregate by it too and it will
  appear but have no effect on the classification:
 
   aggregate(DF[2:3], DF[c(1,4)], mean)
V1  V4  V2  V3
  1  A ID1 2.0 400
  2  C ID1 5.0  70
  3  B ID2 0.7  35
 
 
  
Many thanks,
  
   Emmanuel
  
   On 9/12/06, Emmanuel Levy [EMAIL PROTECTED] wrote:
Hello,
   
I'd like to group the lines of a matrix so that:
A 1.0 200
A 3.0 800
A 2.0 200
B 0.5 20
B 0.9 50
C 5.0 70
   
Would give:
A 2.0 400
B 0.7 35
C 5.0 70
   
So all lines corresponding to a letter (level), become a single line
where all the values of each column are averaged.
   
I've done that with a loop but it doesn't sound right (it is very
slow). I imagine there is a
sort of apply shortcut but I can't figure it out.
   
Please note that it is not exactly a matrix I'm using, the function
typeof tells me it's a list, however I access to it like it was a
matrix.
   
Could someone help me with the right function to use, a help topic or
a piece of code?
   
Thanks,
   
  Emmanuel
   
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide 
   http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
  
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
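
Marc's suggestion (aggregate() twice, then merge()) can be sketched with the
data from this thread:

```r
DF <- data.frame(V1 = c("A","A","A","B","B","C"),
                 V2 = c(1, 3, 2, 0.5, 0.9, 5),
                 V3 = c(200, 800, 200, 20, 50, 70),
                 V4 = c("ID1","ID1","ID1","ID2","ID2","ID1"))

means  <- aggregate(DF[2:3], DF[c(1, 4)], mean)    # group means of V2, V3
counts <- aggregate(DF[2],   DF[c(1, 4)], length)  # observations per group
names(counts)[3] <- "n"

merge(means, counts)   # joins on the common columns V1 and V4
```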


Re: [R] working with strptime data

2006-09-14 Thread Gabor Grothendieck
Read the help desk article in R News 4/1.

On 9/14/06, Richard Evans [EMAIL PROTECTED] wrote:
 Dear R-forum,

 I am looking for a good resource/help on working with POSIXct values and
 controlling the pretty x-axis labels and tick marks for data vs. time
 plots. Specifically, I wish to write programming logic that creates
 different vertical abline() calls based on the range of times
 which I am working on. The default plot results are adequate, but I
 would love to make explicit calls on how the x-axis annotates the
 timestamps.

 Does anyone have example code or know of a good reference for these
 kinds of R-programming tasks?


 Here's a simplified example:
 --
 I have a large data set that consists of N pairs of values and
 timestamps strings.
 Like this:

 TheData <- c(1.2, 0.9, etc...[to the Nth datapoint])
 TheTime <- c("9/1/2006 8:00", "9/1/2006 8:13", etc...[to the Nth timestamp])

 I convert the timestamp strings into POSIXct types as:

 TheTime <- as.POSIXct(strptime(TheTime, format="%m/%d/%Y %H:%M"))

 And create a plot as such:

 plot(TheTime, TheData)

 --

 And here is a specific question:

 How do I calculate the number of months that are spanned between two
 POSIXct values?
 (i.e. NumOfMonths <- MonthFunction(range(MyTimeStampData)))

 Thanks-in-advance,
 - rich

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
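
One way to answer the specific month-counting question (MonthFunction is the
hypothetical name from the post; this counts calendar-month boundaries
crossed):

```r
MonthFunction <- function(r) {
  # r: two POSIXct values, e.g. from range()
  lt <- as.POSIXlt(r)
  12 * (lt$year[2] - lt$year[1]) + (lt$mon[2] - lt$mon[1])
}

tt <- as.POSIXct(strptime(c("9/1/2006 8:00", "11/15/2006 9:30"),
                          format = "%m/%d/%Y %H:%M"))
MonthFunction(range(tt))   # 2: September to November
```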


[R] Table manipulation question

2006-09-14 Thread Geoff Russell
I have a table:

          C1
RowName1   3
RowName2   2

and another table:

            C2
RowName1   5.6
RowName1a  4.3
RowName2    NA

I want to join join the tables with matching rows:

           C1   C2
RowName1    3  5.6
RowName2    2   NA

I'm thinking of something like:

T1$C2=T2$C2[index-expression-to-pullout the matching ones]

Any ideas would be appreciated.

Cheers,
Geoff Russell

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] group bunch of lines in a data.frame, an additional requirement

2006-09-14 Thread Gabor Grothendieck
Here are three different ways to do it:

# base R
fb <- function(x)
   c(V1 = x$V1[1], V4 = x$V4[1], V2.mean = mean(x$V2),
     V3.mean = mean(x$V3), n = length(x$V1))
do.call(rbind, by(DF, DF[c(1,4)], fb))

# package doBy
library(doBy)
summaryBy(V2 + V3 ~ V1 + V4, DF, FUN = c(mean, length))[,-5]

# package reshape
library(reshape)
f <- function(x) c(mean = mean(x), n = length(x))
cast(melt(DF, id = c(1,4)), V1 + V4 ~ variable, fun.aggregate = f)[,-6]



> # base R
> fb <- function(x)
+    c(V1 = x$V1[1], V4 = x$V4[1], V2.mean = mean(x$V2),
+      V3.mean = mean(x$V3), n = length(x$V1))
> do.call(rbind, by(DF, DF[c(1,4)], fb))
     V1 V4 V2.mean V3.mean n
[1,]  1  1     2.0     400 3
[2,]  3  1     5.0      70 1
[3,]  2  2     0.7      35 2

> # package doBy
> library(doBy)
> summaryBy(V2 + V3 ~ V1 + V4, DF, FUN = c(mean, length))[,-5]
  V1  V4 mean.V2 mean.V3 length.V3
1  A ID1     2.0     400         3
2  C ID1     5.0      70         1
3  B ID2     0.7      35         2

> # package reshape
> library(reshape)
> f <- function(x) c(mean = mean(x), n = length(x))
> cast(melt(DF, id = c(1,4)), V1 + V4 ~ variable, fun.aggregate = f)[,-6]
  V1  V4 V2_mean V2_n V3_mean
1  A ID1     2.0    3     400
2  B ID2     0.7    2      35
3  C ID1     5.0    1      70









On 9/14/06, Emmanuel Levy [EMAIL PROTECTED] wrote:
 Thanks Gabor, that is much faster than using a loop!

 I've got a last question:

 Can you think of a fast way of keeping track of the number of
 observations collapsed for each entry?

 i.e. I'd like to end up with:

 A 2.0 400 ID1 3 (3obs in the first matrix)
 B 0.7 35 ID2 2 (2obs in the first matrix)
 C 5.0 70 ID1 1 (1obs in the first matrix)

  Or is it required to use a temporary matrix that is merged later? (As
  exemplified by Mark in a previous email?)

 Thanks a lot for your help,

  Emmanuel

 On 9/13/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
  See below.
 
  On 9/13/06, Emmanuel Levy [EMAIL PROTECTED] wrote:
   Thanks for pointing me out aggregate, that works fine!
  
   There is one complication though: I have mixed types (numerical and 
   character),
  
   So the matrix is of the form:
  
   A 1.0 200 ID1
   A 3.0 800 ID1
   A 2.0 200 ID1
   B 0.5 20   ID2
   B 0.9 50   ID2
   C 5.0 70   ID1
  
   One letter always has the same ID but one ID can be shared by many
   letters (like ID1)
  
   I just want to keep track of the ID, and get a matrix like:
  
   A 2.0 400 ID1
   B 0.7 35 ID2
   C 5.0 70 ID1
  
   Any idea on how to do that without a loop?
 
  If V4 is a function of V1 then you can aggregate by it too and it will
  appear but have no effect on the classification:
 
   aggregate(DF[2:3], DF[c(1,4)], mean)
V1  V4  V2  V3
  1  A ID1 2.0 400
  2  C ID1 5.0  70
  3  B ID2 0.7  35
 
 
  
Many thanks,
  
   Emmanuel
  
   On 9/12/06, Emmanuel Levy [EMAIL PROTECTED] wrote:
Hello,
   
I'd like to group the lines of a matrix so that:
A 1.0 200
A 3.0 800
A 2.0 200
B 0.5 20
B 0.9 50
C 5.0 70
   
Would give:
A 2.0 400
B 0.7 35
C 5.0 70
   
So all lines corresponding to a letter (level), become a single line
where all the values of each column are averaged.
   
I've done that with a loop but it doesn't sound right (it is very
slow). I imagine there is a
sort of apply shortcut but I can't figure it out.
   
Please note that it is not exactly a matrix I'm using, the function
typeof tells me it's a list, however I access to it like it was a
matrix.
   
Could someone help me with the right function to use, a help topic or
a piece of code?
   
Thanks,
   
  Emmanuel
   
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide 
   http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
  
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Table manipulation question

2006-09-14 Thread Wensui Liu
Hi, Russell,

Here is a piece of code and you might need to tweak it a little.
###### MERGE 2 DATA FRAMES ###########
# MERGE 2 DATA FRAMES:               #
# INNER JOIN, LEFT JOIN, RIGHT JOIN, #
# FULL JOIN, CARTESIAN PRODUCT       #
######################################

data1 <- data.frame(id1 = 1:10, x = rnorm(length(1:10)));
data2 <- data.frame(id2 = seq(1, 20, by = 2),
                    y = rnorm(length(seq(1, 20, by = 2))));

# INNER JOIN
inner.join <- merge(data1, data2, by.x = "id1", by.y = "id2");

# LEFT JOIN
left.join <- merge(data1, data2, by.x = "id1", by.y = "id2", all.x = TRUE);

# RIGHT JOIN
right.join <- merge(data1, data2, by.x = "id1", by.y = "id2", all.y = TRUE);

# FULL JOIN
full.join <- merge(data1, data2, by.x = "id1", by.y = "id2", all = TRUE);

# CARTESIAN PRODUCT
cartesian <- merge(data1, data2);



On 9/14/06, Geoff Russell [EMAIL PROTECTED] wrote:

 I have a table:

           C1
 RowName1   3
 RowName2   2

 and another table:

             C2
 RowName1   5.6
 RowName1a  4.3
 RowName2    NA

 I want to join join the tables with matching rows:

            C1   C2
 RowName1    3  5.6
 RowName2    2   NA

 I'm thinking of something like:

 T1$C2=T2$C2[index-expression-to-pullout the matching ones]

 Any ideas would be appreciated.

 Cheers,
 Geoff Russell

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
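
The index-expression idea from the original question can also be written with
match(), assuming the row names carry the keys:

```r
T1 <- data.frame(C1 = c(3, 2), row.names = c("RowName1", "RowName2"))
T2 <- data.frame(C2 = c(5.6, 4.3, NA),
                 row.names = c("RowName1", "RowName1a", "RowName2"))

# Pull out the rows of T2 that match T1's row names; unmatched rows get NA
T1$C2 <- T2$C2[match(rownames(T1), rownames(T2))]
T1
```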


Re: [R] hist for two sets

2006-09-14 Thread Ethan Johnsons
thx so much, Marc.

ej

On 9/14/06, Marc Schwartz [EMAIL PROTECTED] wrote:

 On Thu, 2006-09-14 at 19:37 -0400, Ethan Johnsons wrote:
  A quick question, please.
 
  x = c(0.0001, 0.0059, 0.0855, 0.9082)
  y = c(0.54, 0.813, 0.379, 0.35)
 
   where A = 1st set, B = 2nd set, C = 3rd set, D = 4th set respectively.
 
   How do you make hist plot side by side for x & y?
 
  i.e. 0.0001 and then to the right 0.54, 0.0059 and then to the right
 0.813,
  etc.
 
  thx much

 You don't want a histogram, but a barplot:

 x <- c(0.0001, 0.0059, 0.0855, 0.9082)
 y <- c(0.54, 0.813, 0.379, 0.35)

 # create a two row matrix with x and y
 height <- rbind(x, y)

 # Use height and set 'beside = TRUE' to get pairs
 # save the bar midpoints in 'mp'
 # Set the bar pair labels to A:D
 mp <- barplot(height, beside = TRUE, ylim = c(0, 1),
names.arg = LETTERS[1:4])

 # Draw the bar values above the bars
 text(mp, height, labels = format(height, digits = 4), pos = 3, cex = .75)


 See ?barplot, ?text and ?format (or ?formatC or ?sprintf).

 HTH,

 Marc Schwartz




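Marc's answer runs as-is; the sketch below just restates it on a null graphics device so it can be executed non-interactively (the `pdf(NULL)` device is only there for the sketch), and shows what `barplot` returns:

```r
x <- c(0.0001, 0.0059, 0.0855, 0.9082)
y <- c(0.54, 0.813, 0.379, 0.35)
height <- rbind(x, y)

pdf(NULL)  # null device: draw without opening a window or writing a file
mp <- barplot(height, beside = TRUE, ylim = c(0, 1),
              names.arg = LETTERS[1:4])
text(mp, height, labels = format(height, digits = 4), pos = 3, cex = 0.75)
dev.off()

dim(mp)  # 2 4 -- one bar midpoint per value, grouped in pairs A..D
```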



[R] plot region too large

2006-09-14 Thread Kamila Naxerova
Hi!

I don't understand this:

layout(matrix(c(1:10),5,2),heights=c(1,rep(2,4)))
plot(1,1)
Error in plot.new() : plot region too large

Why??

Thanks!
Kamila



Re: [R] plot region too large

2006-09-14 Thread Kiermeier, Andreas (PIRSA - SARDI)
The figure margins come from what is set in par(mar), eg

 layout(matrix(c(1:10),5,2),heights=c(1,rep(2,4)))
 par("mar")
[1] 5.1 4.1 4.1 2.1
 

There is not enough space left to plot anything with those margins.  You
will need to make them smaller first, e.g.

 par(mar = c(1, 1, 1, 1))
 plot(1,1)

In which case things work.
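Putting the fix together, a runnable sketch (the margin values and the `pdf(NULL)` null device are just one workable choice for the sake of the example, not the only one):

```r
pdf(NULL)  # null device so the sketch runs anywhere, even non-interactively

# Ten panels in a 5x2 layout, first row half the height of the others
layout(matrix(1:10, 5, 2), heights = c(1, rep(2, 4)))

# Shrink margins (bottom, left, top, right, in lines of text) so each
# small panel still has a positive plot region
par(mar = c(2, 2, 1, 1))

for (i in 1:10) plot(i, i)

dev.off()
```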

Cheers,

Andreas

_
Dr Andreas Kiermeier
Senior Statistician
SARDI FOOD SAFETY PROGRAM

33 Flemington Street
Glenside   SA   5065
Ph: +61 8 8207 7884
Fax:+61 8 8207 7854
Mob:0423 028 565

Email: [EMAIL PROTECTED]
Web: http://www.sardi.sa.gov.au/
_
If you would like to correspond with me in a secure way, using public
key encryption, please contact me directly for details of my GPG public
key or visit the SARDI website, where you can download my public key
from my personal staff page.



