[R] mvpart analyses with covariables

2011-09-12 Thread 'Ben Ford'
Hi all,

I am fairly new to R and I am trying to run mvpart and create a MRT using
explanatory variables and covariables. I've been following the procedures in
Numerical Ecology with R.

The command (no covariables) which works fine -

ABUNDTMRT <- mvpart(abundance ~
.,factors,margin=0.08,cp=0,xv="1se",xval=nrow(abundance),xvmult=100,which=4)

where abundance is 4th root transformed fish abundance (103 species x 168
samples), and factors is the relief (high, medium, low profile, sand
inundated reef, flat), benthos (coral, sessile inverts, kelp, macroalgae,
seagrass, sand), depth (continuous in meters), latitude, and longitude of
each sample.

To try and incorporate spatial autocorrelation (as a covariate) into this I
have been trying the command -

ABUNDTMRT <- mvpart(abundance ~ environ + spatial,
data.frame,margin=0.08,cp=0,xv="1se",xval=nrow(abundance),xvmult=100,which=4)

where abundance is as above, environ is the environmental factors (from
above) and spatial is the eigenfunctions from a PCNM analysis. data.frame is
the environ and spatial factors as a data.frame.

This gives the error -

"Error in `[[<-.data.frame`(`*tmp*`, preds, value = c(72L, 72L, 80L, 72L,
 :
  replacement has 504 rows, data has 168"

As I am new to this, I am not sure if I am entering an incorrect formula
when trying to include the covariables, or if this is just something which
mvpart cannot do.

Thanks.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Radial basis function network

2011-09-12 Thread ospoz
Hi,

Does anyone know where I can find a package which implements this network?

http://en.wikipedia.org/wiki/Radial_basis_function_network

The package "neural" was good for that I think, but it doesn't exist anymore
(I don't know why).

Thanks a lot

--
View this message in context: 
http://r.789695.n4.nabble.com/Radial-basis-function-network-tp3809219p3809219.html
Sent from the R help mailing list archive at Nabble.com.



[R] Getting Rcpp SEXP data in C++

2011-09-12 Thread Worik R
Friends

I am looking at Rcpp and I am a bit stuck on a simple matter.

(I am calling R from c++, if there is a better way...)

Given this simple example using the TTR package and the SMA function which
returns a simple moving average

Rcpp::NumericVector rv;
for(int i = 0; i < 100; i++){
  rv.push_back(rand());
}
Rcpp::Environment TTR("package:TTR");
Rcpp::Function sma = TTR["SMA"];
SEXP sma_res = sma(rv);

How do I get the values of sma_res from a SEXP type into something
useful in C++, like a vector?

Where can I find some documentation of SEXP?  I have tried googling it
to no avail.

cheers
Worik



Re: [R] as.POSIXct on vector weird output

2011-09-12 Thread jim holtman
'f$V1' is a factor.

try

as.POSIXct(as.character(f$V1), format="%m/%d/%Y %H:%M:%S")

You need to convert to a character first.
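
A minimal sketch of the factor-versus-character point, using two made-up timestamps (the names f, codes and stamps are only for illustration, and tz = "UTC" is an assumption to keep the output stable):

```r
## Two made-up timestamps stored as a factor (names are illustrative only)
f <- factor(c("09/11/2011 13:46:39", "09/11/2011 13:45:18"))

## A factor is stored as integer level codes plus a table of labels
codes <- as.integer(f)

## Converting to character first hands the actual strings to the parser
stamps <- as.POSIXct(as.character(f), format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
format(stamps, "%Y-%m-%d %H:%M:%S")
```

How a factor passed directly to as.POSIXct is treated may differ between R versions, so the explicit as.character() step is the safer habit.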

On Mon, Sep 12, 2011 at 7:24 PM, bradford  wrote:
> I don't know R, so maybe I've done something wrong, but I'm working off an
> example I saw on the web and wondering why as.POSIXct isn't returning the
> same result on f$V1 as it is on z.  Did I do something wrong?  Or is it a
> problem with my build?
>
>> f$V1
>  [1] 09/11/2011 13:46:39 09/11/2011 13:45:18 09/11/2011 13:44:58
>  [4] 09/11/2011 13:40:02 09/11/2011 13:37:58 09/11/2011 13:36:09
>  [7] 09/11/2011 13:32:31 09/11/2011 13:25:29 09/11/2011 13:24:40
> [10] 09/11/2011 13:23:48
> 10 Levels: 09/11/2011 13:23:48 09/11/2011 13:24:40 ... 09/11/2011 13:46:39
>> z
> [1] "09/11/2011 13:46:39"
>> as.POSIXct(z, format="%m/%d/%Y %H:%M:%S")
> [1] "2011-09-11 13:46:39 EDT"
>> as.POSIXct(f$V1, format="%m/%d/%Y %H:%M:%S")
>  [1] "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST"
>  [5] "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST"
>  [9] "0009-11-20 EST" "0009-11-20 EST"
>
> -
> R version 2.12.2 (2011-02-25)
> Copyright (C) 2011 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?



Re: [R] help with glmm.admb

2011-09-12 Thread Ben Bolker
eeadie  unm.edu> writes:


> Now I have a new problem with the same model that I've been working on. Here
> is the model and the error message:
> 
> > modelnbbb<-glmmadmb(total_bites_rounded~age_class_back+
> (1|focal_individual)+(1|food.dif.id)+
> offset(log(forage_time)),data=data,family="nbinom")
> Error in parse(text = x) : :1:41: unexpected ')'
> 1: total_bites_rounded ~ age_class_back ++ )

  Sorry I didn't respond sooner, I lost track of this thread.
  I think glmmADMB is getting confused by trying to evaluate
the offset term on the fly.  Try instead:

  data$logforage <- log(data$forage_time)

 glmmadmb(...+offset(logforage),...)
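
  A rough sketch of the same workaround, using glm() from base R as a
stand-in (an assumption, since glmmADMB may not be installed) and
simulated data with made-up names.  It only shows that a precomputed
offset column gives the same fit as offset(log(...)) in the formula:

```r
## Simulated count data; 'dat', 'bites', 'age', 'forage_time' are made-up names
set.seed(101)
dat <- data.frame(age = gl(2, 50, labels = c("adult", "juvenile")),
                  forage_time = runif(100, 5, 60))
dat$bites <- rpois(100, lambda = dat$forage_time *
                                 ifelse(dat$age == "adult", 0.5, 0.3))

## offset computed inside the formula
m1 <- glm(bites ~ age + offset(log(forage_time)), data = dat, family = poisson)

## offset precomputed as a column, as in the workaround
dat$logforage <- log(dat$forage_time)
m2 <- glm(bites ~ age + offset(logforage), data = dat, family = poisson)

## the two fits should agree
all.equal(coef(m1), coef(m2))
```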

> It seems like this message is trying to tell me that I have an extra )
> somewhere, but I don't think this is true because the exact same model works
> fine with lmer. Is there something special about glmmadmb syntax that is
> giving me problems?

  Eventually I will try to fix this (and at the very least put some
more warnings in the documentation) but in the meantime I hope
this workaround helps.

  If it fails, would you be willing to send me your data to
have a look?



[R] nls, the four parameter logistic equation, and prediction band

2011-09-12 Thread sg
I have uploaded a datafile that contains the following two variables: time
(X value) and response (Y value).  This is a fairly extensive file (with >
16000 entries).  I have two questions:

1. I want to use the following equation to regress Y on X: Y-hat = min +
(max-min)/(1 + (X/EC50)^Hillslope).

Here is my R command:

nlsout <- nls(Y ~ (0 - (100-0)/(1 + (X/EC50)^hill)), start=c(EC50=125,
hill=-1))

However, I get the following error message:

Error in numericDeriv(form[[3L]], names(ind), env) :
  Missing value or an infinity produced when evaluating the model
Could someone explain the error message to me, please, and what I need to do
to be able to run the command without error?  The problem is that this exact
same formula works on the exact same dataset when I use a macro in Excel
(unfortunately I don't have the code, though).


2. I want to compute the prediction band for the above regression.

Any help will be greatly appreciated.

Thanks,

Joe
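
A hedged sketch for question 1, on simulated data only (the data frame d and the true values EC50 = 125 and hill = -1 are assumptions, not the uploaded file). It fits the stated equation with min fixed at 0 and max at 100. Note that the posted call has "0 - (100-0)/..." where the stated equation has "min + (max-min)/...", which may be worth checking. The last line shows one way a non-finite value can arise: a negative base raised to a non-integer power is NaN, which can happen if X contains negative values or if EC50 steps negative during iteration.

```r
## Simulated data only; the true values (EC50 = 125, hill = -1) are assumptions
set.seed(1)
d <- data.frame(X = seq(1, 500, length.out = 200))   # strictly positive X
d$Y <- 0 + (100 - 0) / (1 + (d$X / 125)^(-1)) + rnorm(200, sd = 2)

## min fixed at 0 and max at 100, matching "min + (max - min)/(...)"
fit <- nls(Y ~ 0 + (100 - 0) / (1 + (X / EC50)^hill),
           data = d, start = list(EC50 = 100, hill = -0.8))
est <- coef(fit)
est

## a negative base with a non-integer exponent evaluates to NaN
bad <- (-1)^(-1.5)
bad
```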



[R] as.POSIXct on vector weird output

2011-09-12 Thread bradford
I don't know R, so maybe I've done something wrong, but I'm working off an
example I saw on the web and wondering why as.POSIXct isn't returning the
same result on f$V1 as it is on z.  Did I do something wrong?  Or is it a
problem with my build?

> f$V1
 [1] 09/11/2011 13:46:39 09/11/2011 13:45:18 09/11/2011 13:44:58
 [4] 09/11/2011 13:40:02 09/11/2011 13:37:58 09/11/2011 13:36:09
 [7] 09/11/2011 13:32:31 09/11/2011 13:25:29 09/11/2011 13:24:40
[10] 09/11/2011 13:23:48
10 Levels: 09/11/2011 13:23:48 09/11/2011 13:24:40 ... 09/11/2011 13:46:39
> z
[1] "09/11/2011 13:46:39"
> as.POSIXct(z, format="%m/%d/%Y %H:%M:%S")
[1] "2011-09-11 13:46:39 EDT"
> as.POSIXct(f$V1, format="%m/%d/%Y %H:%M:%S")
 [1] "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST"
 [5] "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST" "0009-11-20 EST"
 [9] "0009-11-20 EST" "0009-11-20 EST"

-
R version 2.12.2 (2011-02-25)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin10.8.0 (64-bit)



[R] Question on reading nodes info from all random forests generated trees

2011-09-12 Thread k
Hi All,

I have a quick question on random forests. Simply, I am not sure how to read 
the values of independent variables related to the highest value of a response 
variable from all trees generated from random forests. It is easy to do this in 
a single regression tree. But I am not clear how to do it for all trees 
produced from random forests.

Thanks,
Kai



Re: [R] nested anova<-R crashing

2011-09-12 Thread Ben Bolker
joerg stephan  rhrk.uni-kl.de> writes:

> 
> Hi,
> 
> I tried to do a nested Anova with the attached Data. My response 
> variable is "survivors" and I would like to know the effect of 
> (insect-egg clutch) "size", "position" (of clutch on twig) and "clone" 
> (/plant genotype) on the survival of eggs (due to predation). Each plant 
> was provided with three different sizes of clutches (45,15,5) and had 
> pseudo-replications of size 15 and 5 on it.

This may not be what you wanted at all, but here is my take on
what you could try with these data.

You have binomial data with lots of zeros and ones (i.e. 0 or 100%
survival), so they are unlikely to be transformable to normality
however you try.  You *might* be able to get away with modeling
these as normally distributed and then rely on permutation tests
to get your significance levels right, but (tricky as it is) I
actually think GLMMs (Zuur et al 2009, Bolker et al 2009) are
the best way to go ...

## read in the data
x <- read.table("aovmisc.dat",header=TRUE)

## take a look
summary(x)
## with(x,table(clone,as.factor(size),as.factor(position)))

## make categorical versions of size & plant variables,
##  and reconstitute the "total number surviving" variable
x <- transform(x,
   fsize=factor(size),
   fplant=factor(plant),
   tsurv=round(survivors*size)
   )

library(ggplot2)

## trying to plot everything ...
zspace <- opts(panel.margin=unit(0,"lines"))  ## squash panels together
theme_update(theme_bw())  ## white background

## plot all data: survival vs position, with all plants represented;
##  separate subplots for each clone
## overlay GLM fit
ggplot(x,aes(x=position,y=survivors,size=size,colour=fplant))+
   geom_point(alpha=0.5)+
  facet_wrap(~clone)+xlim(0,15)+geom_line(size=0.5,alpha=0.2)+zspace+
  geom_smooth(aes(group=1,weight=size),
  colour="black",method="glm",family="binomial")

## it looks like there is a positive effect of position, and also
##  a positive effect of size (large bubbles toward high-survival)

## there are a few outliers in 'position' -- are these typos?
subset(x,position>15)


## focus on size and clone instead, suppress position:

ggplot(x,aes(x=fsize,y=survivors,colour=clone))+stat_sum(aes(size=..n..))+
  facet_grid(.~clone)+
  geom_smooth(method="glm",family="binomial",aes(weight=size,
 x=as.numeric(fsize)))
## OR
## +  stat_summary(fun.y="mean",geom="line",aes(x=as.numeric(fsize)))+zspace

library(mgcv)
ggplot(x,aes(x=position,y=survivors,colour=clone))+stat_sum(aes(size=..n..))+
  facet_grid(fsize~clone)+
  geom_smooth(method="gam",family="binomial")+xlim(0,15)+zspace

## appears to be an effect of size, and a clone*size interaction.
## possible bimodal distribution (0/1) at size=5?

with(x,table(clone,plant)) ## each plant is only in one clone 
## (explicit nesting)
library(lme4)

## fit with size*clone interaction, main effect of position
## no need to nest plant in clone because they are explicitly nested
##  (i.e. unique plant IDs)
## binomial response is cbind(successes, failures)
g1 <- glmer(cbind(tsurv,size-tsurv)~fsize*clone+position+(1|fplant),
data=x,
family="binomial")

## doesn't LOOK like overdispersion ...
sum(residuals(g1)^2)  ## >> residual df
nrow(x)-length(fixef(g1))
pchisq(778,df=537)

## ... but try fitting observation-level random effect anyway
x <- transform(x,obs=seq(nrow(x)))
g2 <- update(g1,.~.+(1|obs))
summary(g2)
## among-observation variance 2x among-plant variance
anova(g1,g2)  ## looks much better (10 AIC units lower / p <<< 0.05)

## 
drop1(g2,test="Chisq")
## looks like we can drop the size*clone interaction?

g3 <- update(g2,.~.-clone:fsize)
drop1(g3,test="Chisq")
## strong effects of size, position;
## weak effects of clone
fixef(g3)
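
A small base-R sketch of the binomial response layout and a rough overdispersion check, on simulated data with made-up names (it uses glm() rather than glmer(), so there are no random effects here, and it is only meant to show the mechanics):

```r
## Simulated clutch data; every name and value here is hypothetical
set.seed(42)
n <- 200
size <- sample(c(5, 15, 45), n, replace = TRUE)   # eggs per clutch
position <- runif(n, 0, 15)
surv <- rbinom(n, size, plogis(-1 + 0.1 * position))

## two-column binomial response: cbind(successes, failures)
fit <- glm(cbind(surv, size - surv) ~ position, family = binomial)

## Pearson chi-square relative to residual df; values near 1 suggest
## no overdispersion
chisq <- sum(residuals(fit, type = "pearson")^2)
disp <- chisq / df.residual(fit)

## upper-tail p-value for the same statistic
pval <- pchisq(chisq, df = df.residual(fit), lower.tail = FALSE)
```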



Re: [R] Solve your R problems

2011-09-12 Thread Rolf Turner

On 13/09/11 11:27, Carl Witthoft wrote:

Love the page.

Just out of interest,  is this an updated version or the same ol' 
Inferno document?


And why do I keep thinking you (Patrick Burns) are the  Hab's coach?  :-)



Now *that's* a blast from the past!  Burns coached the Canadiens
way back when they were a real hockey team! :-)

cheers,

Rolf Turner



Re: [R] calc.relimp pmvd for US R-user

2011-09-12 Thread Ben Bolker
YAddo  gmail.com> writes:

> 
> Dear All:
> 
> I am calculating  the relative importance of a regressor in a linear model. 
> Does anyone know how I can obtain/install the 'pmvd' computation type? I am
> a US user.
> 

  I didn't know what the heck you were talking about, but having looked
at http://prof.beuth-hochschule.de/groemping/relaimpo/ it seems that
you can't legally install that version of the package because of
conflicts with .

  Would some of the other results of 
library("sos")
findFn("{relative importance}")

work for you?

  Ben Bolker



Re: [R] Very slow using S4 classes

2011-09-12 Thread André Rossi
Thank you a lot Morgan.  Your suggestion helped me to speed up my code.  But
I still believe that the inefficiency is an S4 issue.

Best regards,

André Rossi
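
A minimal sketch of the "collection of examples" idea from the quoted reply below, with a hypothetical class name (ExampleSet) and an assumed size of 1000. It shows a single object holding list slots, updated element by element and then in one vectorized assignment. No timings are claimed here.

```r
library(methods)

## Hypothetical collection class: one object holding many examples,
## rather than a list of many small S4 objects
setClass("ExampleSet",
         representation(attr.value = "list", target.value = "list"))

n <- 1000L
ex <- new("ExampleSet",
          attr.value = vector("list", n),
          target.value = vector("list", n))

## element-by-element update of the first 100 entries
for (i in 1:100) ex@attr.value[[i]] <- 2

## vectorized update of the next 100 entries in a single assignment
idx <- 101:200
ex@attr.value[idx] <- list(2)
```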


2011/9/12 Martin Morgan 

> Hi André...
>
>
> On 09/12/2011 07:20 AM, André Rossi wrote:
>
>> Dear Martin Morgan and Martin Maechler...
>>
>> Here is an example of the computational time when a slot of a S4 class
>> is of another S4 class and when it is just one object. I'm sending you
>> the data file.
>>
>> Thank you!
>>
>> Best regards,
>>
>> André Rossi
>>
>> ##**##
>>
>> setClass("SupervisedExample",
>> representation(
>> attr.value = "ANY",
>> target.value = "ANY"
>> ))
>>
>> setClass("StreamBuffer",
>> representation=representation(
>> examples = "list", #SupervisedExample
>> max.length = "integer"
>> ),
>> prototype=list(
>> max.length = as.integer(1)
>> )
>> )
>> b <- new("StreamBuffer")
>>
>> load("~/Dropbox/dataList2.RData")
>>
>
> For a reproducible example, I guess you have something like
>
>  data <- replicate(10000, new("SupervisedExample"))
>
>
>  b@examples <- data #data is a list of SupervisedExample class.
>>
>>  > system.time({for (i in 1:100) b@examples[[1]]@attr.value[1] = 2 })
>>
>
> Yes, this is slow. [[<-,S4 is not as clever as [[<-,list and performs extra
> duplication, including those 10,000 S4 objects it contains.
>
> As before, an improvement is to think in terms of vectors, maybe a
> 'SupervisedExamples' class to act as a collection of examples
>
> setClass("SupervisedExamples",
> representation=representation(
>   attr.value = "list",
>   target.value = "list"))
>
> setClass("StreamBuffer",
> representation=representation(
>   examples="SupervisedExamples"))
>
> SupervisedExamples <-
>function(attr.value=vector("list", n),
> target.value=vector("list", n), n, ...)
> {
>new("SupervisedExamples", attr.value=attr.value,
>target.value=target.value, ...)
> }
>
> StreamBuffer <-
>function(examples, ...)
> {
>new("StreamBuffer", examples=examples, ...)
> }
>
> data <- SupervisedExamples(n=10)
>
> b <- StreamBuffer(data)
>
> I then have
>
> > system.time({for (i in 1:100) data@attr.value[[1]] = 2 })
>   user  system elapsed
>  1.081   0.013   1.094
> > system.time({for (i in 1:100) b@examples@attr.value[[1]] <- 2})
>   user  system elapsed
>  4.283   0.000   4.295
>
> (note the 10x increase in size); still slower, but this will be amortized
> when the updates are vectorized, e.g.,
>
> > idx = sample(length(b@examples@attr.value), 100)
> > system.time(b@examples@attr.value[idx] <- list(2))
>   user  system elapsed
>  0.013   0.000   0.014
>
> A further change might be to recognize 'StreamBuffer' as an abstract class
> that SupervisedExamples extends
>
> setClass("StreamBuffer",
> representation=representation(
>   "VIRTUAL", max.len="integer"),
> prototype=prototype(max.len=10L),
> validity=function(object) {
> if (object@max.len < length(object))
> "too many elements"
> else TRUE
> })
>
> setMethod(length, "StreamBuffer", function(x) {
>stop("'length' undefined on '", class(x), "'")
> })
>
> setClass("SupervisedExamples",
> representation=representation(
>   attr.value = "list",
>   target.value = "list"),
> contains="StreamBuffer")
>
> setMethod(length, "SupervisedExamples", function(x) {
>length(x@attr.value)
> })
>
> SupervisedExamples <-
>function(attr.value=vector("list", n),
> target.value=vector("list", n), n, ...)
> {
>new("SupervisedExamples", attr.value=attr.value,
>target.value=target.value, ...)
> }
>
> data <- SupervisedExamples(n=10)
>
> > system.time({for (i in 1:100) data@attr.value[[1]] = 2 })
>   user  system elapsed
>  1.043   0.014   1.061
>
> Martin Morgan
>
> user  system elapsed
>>  16.837   0.108  18.244
>>
>>  > system.time({for (i in 1:100) data[[1]]@attr.value[1] = 2 })
>>user  system elapsed
>>   0.024   0.000   0.026
>>
>> ##**##
>>
>>
>> 2011/9/10 Martin Morgan mailto:mtmor...@fhcrc.org>>
>>
>>
>>On 09/10/2011 08:08 AM, André Rossi wrote:
>>
>>Hi everybody!
>>
>>I'm creating an object of a S4 class that has two slots:
>>ListExamples, which
>>is a list, and idx, which is an integer (as the code below).
>>
>>Then, I read a data.frame file with 10000 (ten thousand)
>>lines and 10
>>columns, do some pre-processing and, basically, I store each
>>line as an
>>element of a list in the slot ListExamples of the S4 object.
>>However, many
>>operations after this take a considerable time.
>>
>>Can anyone explain to me why this happens? Is it possible to
>>speed up an
>

Re: [R] Solve your R problems

2011-09-12 Thread Carl Witthoft

Love the page.

Just out of interest,  is this an updated version or the same ol' 
Inferno document?


And why do I keep thinking you (Patrick Burns) are the  Hab's coach?  :-)

--
-
Sent from my Cray XK6



Re: [R] On-line machine learning packages?

2011-09-12 Thread Jason Edgecombe
I already provided the link to the task view, which provides a list of 
the more popular machine learning algorithms for R.


Do you have a particular algorithm or technique in mind? Does it have a 
name?


How does sequential classification differ from running a one-off 
classifier for each run?


On 09/12/2011 05:24 AM, Jay wrote:

In my mind this sequential classification task with feedback is
somewhat different from a completely offline, once-off
classification. Am I wrong?
However, it looks like the mentality on this topic is to refer me to
cran/google in order to look for solutions myself. Obviously I know
about these sources, and as I said, I used rseek.org among other
sources to look for solutions. I did not start this topic for fun, I'm
asking for help to find a suitable machine learning package that
readily incorporates feedback loops and online learning. If somebody
has experience with these kinds of problems in R, please respond.


Or will
"http://cran.r-project.org
Look for 'Task Views'"
be my next piece of advice?

On Sep 12, 11:31 am, Dennis Murphy  wrote:

http://cran.r-project.org/web/views/

Look for 'machine learning'.

Dennis



On Sun, Sep 11, 2011 at 11:33 PM, Jay  wrote:

If the answer is so obvious, could somebody please spell it out?
On Sep 11, 10:59 pm, Jason Edgecombe  wrote:

Try this:
http://cran.r-project.org/web/views/MachineLearning.html
On 09/11/2011 12:43 PM, Jay wrote:

Hi,
I used the rseek search engine to look for suitable solutions, however
as I was unable to find anything useful, I'm asking for help.
Anybody have experience with these kinds of problems? I looked into
dynaTree, but as information is a bit scarce and as I understand it,
it might not be what I'm looking for..(?)
BR,
Jay
On Sep 11, 7:15 pm, David Winsemius wrote:

On Sep 11, 2011, at 11:42 AM, Jay wrote:

What R packages are available for performing classification tasks?
That is, when the predictor has done its job on the dataset (based on
the training set and a range of variables), feedback about the true
label will be available and this information should be integrated for
the next classification round.

You should look at CRAN Task Views. Extremely easy to find from the
main R-project page.
--
David Winsemius, MD
West Hartford, CT


[R] calc.relimp pmvd for US R-user

2011-09-12 Thread YAddo
Dear All:

I am calculating  the relative importance of a regressor in a linear model. 
Does anyone know how I can obtain/install the 'pmvd' computation type? I am
a US user.

Regards,
Y 

--
View this message in context: 
http://r.789695.n4.nabble.com/calc-relimp-pmvd-for-US-R-user-tp3808752p3808752.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] plot 3 lines with ggplot2

2011-09-12 Thread J Toll
Justin,

Thanks for your help.

On Mon, Sep 12, 2011 at 2:19 PM, Justin Haynes  wrote:
>
> the data you've given is all character vectors!

Yes, I'm sorry about that.  I should not have used cbind when forming
my data.frame.  It changed my numeric data to character.  This command
would have been better.

x <- data.frame(Symbol = sym, X1 = x1/sum(x1), X2 = x2/sum(x2), X3 =
x3/sum(x3), stringsAsFactors = FALSE)
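
A tiny sketch of why cbind caused the problem, with three made-up symbols (bad and good are illustrative names). cbind first builds a matrix, and a matrix can hold only one type, so the numbers are coerced to character before data.frame ever sees them:

```r
## Made-up symbols and weights, for illustration only
sym <- c("AAA", "BBB", "CCC")
x1 <- c(3, 2, 1)

## cbind() makes a character matrix, so X1 ends up as text
bad <- data.frame(cbind(Symbol = sym, X1 = x1 / sum(x1)),
                  stringsAsFactors = FALSE)

## passing the columns directly keeps X1 numeric
good <- data.frame(Symbol = sym, X1 = x1 / sum(x1),
                   stringsAsFactors = FALSE)

class(bad$X1)
class(good$X1)
```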

>> library(ggplot2)
>> x.melt<-melt(x,id.vars='Symbol')
>
> now, i think what you're looking for looks like:
>
>> ggplot(x.melt, aes(x=1:100, y=value, colour=variable)) + geom_line()

Yes, that's 90% of what I was looking for.  The rest is just formatting.


> but I'm not totally sure.  You aren't misunderstanding the syntax per se,
> but are asking it to plot a factor along the x-axis.  the behaviour of
> geom_line in that case is not what you are expecting.  you have a density
> value at each point but a categorical variable along the x-axis, so you can't
> draw a continuous line through them, they are discrete.
>
> the plot you reference uses more data and does the density calculation
> internally:
>
>> dat<-data.frame(x.var=rnorm(1000),cat.var=rep(letters[1:4],250))
>> ggplot(dat,aes(x=x.var,colour=cat.var))+geom_density()
>
> let me know if this helps or if you would like further explanation.

Thanks so much for the help.  It's much easier understanding when I
can see an example with my own data.  I'll keep working to get the
more custom formatting of the example plot.

Thank you,

James



> On Mon, Sep 12, 2011 at 11:43 AM, J Toll  wrote:
>>
>> Hi,
>>
>> I am trying to learn to use ggplot2 for what I had hoped would be a
>> fairly simple task.  I have a relatively small data.frame (100 by 4).
>> The first column contains symbols.  The 2nd, 3rd and 4th columns
>> represent percentage weightings for each symbol using 3 different
>> methodologies.  For example:
>>
>> sym <- make.unique(replicate(100, paste(sample(LETTERS, 3, replace =
>> TRUE), collapse = "")))
>> x1 <- sort(rexp(100) * 100, decreasing = TRUE)
>> x2 <- sort(rexp(100) * 100, decreasing = TRUE)
>> x3 <- sort(rexp(100) * 100, decreasing = TRUE)
>>
>> x <- data.frame(cbind(Symbol = sym, x1/sum(x1), x2/sum(x2),
>> x3/sum(x3)), stringsAsFactors = FALSE)
>>
>> I'd like to plot a line for each of the 3 different methodologies.
>> The y-axis would be percentage weight, the x-axis would be the symbol
>> or row number, although I'd prefer that not to be shown.
>> Aesthetically, I'd like the plot to look like this one (except with 3
>> lines), although I believe that's not the proper kind of plot for my
>> data:
>>
>>
>> http://www.ling.upenn.edu/~joseff/rstudy/plots/ggplotintro/densityidentity.png
>> http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html
>>
>> Is there a way to do this with my data?  From what I've been reading,
>> I get the sense that my data might be in the wrong format for what I'm
>> trying to do.  I've tried experimenting with melt to reformat my data,
>> but have gotten nowhere. I've tried a number of different ways to try
>> to get this working.  Most of the time, I don't even get any output.
>> The closest I've gotten is this simple command, but the output is
>> terrible.
>>
>> ggplot(x, aes(Symbol, c(V2,V3,V4))) + geom_line()
>>
>> I think I'm just completely misunderstanding the syntax of ggplot.
>> Could someone please point me in the right direction?  Thank you.
>>
>> James
>>
>
>



Re: [R] Problem in put.var.ncdf

2011-09-12 Thread David William Pierce
On Mon, Sep 12, 2011 at 1:32 PM, Claudia Stocker
 wrote:
> Dear all,
>
> I have a problem in writing a variable to a NetCDF-File.
> My code works pretty well until the step put.var.ncdf():
>
>
[...code omitted...]
> R prints the following error:
> #-
>
> Error in put.var.ncdf(spi, var.spi.24.03.me, spi24.me) :   NA/NaN/Inf in
> foreign function call (arg 5)
>
> I can exclude an error in length (dim.time.spi and spi24.me have the same
> length) or NA (there are no NA in the spi24.me vector). Where is the
> error?? The code seems so simple..

Hi Claudia,

could you put the following code in right before the put.var.ncdf call
and see what it prints:

print(paste("num of NAs in time:", sum(is.na(dim.time)) ))
print(paste("length of time:", length(dim.time) ))
print(paste("num of NAs in spi24.me:", sum(is.na(spi24.me)) ))
print(paste("length of spi24.me:", length(spi24.me) ))
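
Since the error message mentions NA/NaN/Inf and not only NA, a related check that might also be worth running is is.finite(), which flags infinite values that is.na() does not. A small sketch on a made-up vector (vals is a hypothetical name, not the poster's data):

```r
## Made-up values covering each kind of non-finite entry
vals <- c(1.5, NA, NaN, Inf, -Inf, 2)

## is.na() counts NA and NaN but not Inf or -Inf
n_na <- sum(is.na(vals))

## !is.finite() counts NA, NaN, Inf and -Inf
n_nonfinite <- sum(!is.finite(vals))

print(paste("num of NA/NaN:", n_na))
print(paste("num of non-finite values:", n_nonfinite))
```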

If that doesn't show anything wrong, you can email me the data and
your code and I can give it a try. Or, you can try the newest version
of the ncdf/ncdf4 packages, which you can download from:

http://cirrus.ucsd.edu/~pierce/ncdf/

it may be a bug that I've already fixed.

Regards,

--Dave

-- 
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography
(858) 534-8276 (voice)  /  (858) 534-8561 (fax)    dpie...@ucsd.edu



Re: [R] Multiple regression intercept

2011-09-12 Thread Daniel Malter
This suggests that this is a dangerous office to be in because this is a
basic question. I am sure somebody in your office knows this. Anyway, the
intercept gives you the average value of the group that constitutes the
baseline when all other covariates are zero. Let's say you measure whether
men or women are happier.

set.seed(23423)
sex<-sample(c("male","female"),100,replace=T)

#Now assume that men reach on average 1 point on the happiness scale and
#women 2 points on the happiness scale, and some random noise

y<-1*(sex=="male")+2*(sex=="female")+rnorm(100)

summary(lm(y~sex))

The summary tells you that the intercept, i.e., females, reach an average of
1.98 happiness points and that this value is significantly different from
zero. The male coefficient tells you that men are, on average, 0.75 points
less happy on the happiness scale and that this is significantly different
from the average happiness of women. If there were more groups, then each
coefficient would compare the group for which it is estimated with
the baseline group.
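
A short sketch of how the baseline can be changed, assuming the same kind of simulated data as above (a different seed, and fit_f and fit_m are made-up names). With a single factor predictor the intercept equals the mean of whichever group is the reference level, and relevel() switches that reference:

```r
## Simulated data in the spirit of the example above; names are illustrative
set.seed(1)
sex <- factor(sample(c("male", "female"), 100, replace = TRUE))
y <- 1 * (sex == "male") + 2 * (sex == "female") + rnorm(100)

## default baseline is the first level alphabetically ("female")
fit_f <- lm(y ~ sex)

## same model with "male" as the baseline
fit_m <- lm(y ~ relevel(sex, ref = "male"))

## each intercept matches the mean of its baseline group
c(coef(fit_f)[[1]], mean(y[sex == "female"]))
c(coef(fit_m)[[1]], mean(y[sex == "male"]))
```

The group difference is the same in both fits, only with the opposite sign.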

HTH,
Daniel





 


burdy wrote:
> 
> Hi I am having difficulty interpreting the multiple regression output. I
> would like to know what it means when one of the factors is assigned as
> the intercept?
> 
> In my data I am looking at the relationship between environmental
> parameters and biological production.
> 
> One of my variables in the analysis is substratum type and gravel is
> identified as the intercept and the P-value is significant,... 
> Does this mean that I can talk about the relationship which it has with
> production as being significant or because it is the intercept can I not
> use it because it has been selected and used as the basis for the
> relationships between the other factors in the variable substratum?
> 
> No one in the PhD office can answer this  
> 
> I look forward to any replies 
> 
> Thanks
> 
> Matt
> 

--
View this message in context: 
http://r.789695.n4.nabble.com/Multiple-regression-intercept-tp3808045p3808599.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Multilevel model in lme4 and nlme

2011-09-12 Thread Ben Bolker
jonas garcia  googlemail.com> writes:

> I am trying to fit some mixed models using packages lme4 and nlme.
> 
> I did the model selection using lmer but I suspect that I may have some
> autocorrelation going on in my data so I would like to have a look using the
> handy correlation structures available in nlme.
> 
> The problem is that I cannot translate my lmer model to lme:
> 
> mod1<- lmer(y~x + (1|a:b) + (1|b:c), data=mydata)
> 
> "a", "b" and "c" are factors with "c" nested in "b" and "b" nested in "a"
> 
> The best I can do with lme is:
> 
> mod2<- lme(y~x, random=list(a=~1, b=~1, c=~1), data=mydata)
> 
> which is the same as:
> 
> lmer(y~x + (1|a) + (1|a:b) + (1|a:b:c), data=mydata)
> 
> I am not at all interested in random effects (1|a) and (1|a:b:c) as they are
> not significant. I just need two random intercepts as specified in mod1. How
> can I translate mod1 into lme language?
> 
> Any help on this would be much appreciated.

  This would probably be better on the r-sig-mixed-models list.

  Does random=list(~1|a:b,~1|b:c) work?

  I would be a little bit careful throwing out ~1|a (non-significance
is not necessarily sufficient reason to discard a term from the model --
it depends a lot on your procedure), and with the interpretation of
your nesting.  If b is only implicitly and not explicitly nested in a
(i.e. if there are levels of 'b' that occur in more than one level of 'a',
for example if a corresponded to families, b corresponded to individuals,
and you labeled individuals 1..N_b_i in each family) then I'm not
sure how you would actually interpret b:c, as it would be crossed
rather than nested.  But assuming that your model specification in
lmer is correct and sensible, I think my suggestion above should (?)
work to get the equivalent in lme.
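
The labelling issue can be checked directly in base R. The sketch below is an illustration under assumed data: the helper name is_explicitly_nested is hypothetical, and the factors are toy families and individuals, not the poster's data.

```r
# Hypothetical helper: TRUE if every level of `inner` occurs within at most
# one level of `outer`, i.e. the labels are unique across `outer`.
is_explicitly_nested <- function(inner, outer) {
  tab <- table(inner, outer)
  all(rowSums(tab > 0) <= 1)
}

# Toy data: two families, individuals labelled 1..3 within each family
a <- factor(rep(c("fam1", "fam2"), each = 6))
b <- factor(rep(rep(1:3, each = 2), times = 2))

reused_labels <- is_explicitly_nested(b, a)

# Build labels that are unique across families
b_unique <- interaction(a, b, drop = TRUE)
unique_labels <- is_explicitly_nested(b_unique, a)

print(c(reused = reused_labels, unique = unique_labels))
```

When the check fails for the raw labels, a term such as b:c groups together individuals from different families that merely share a label.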
> 
> Jonas



Re: [R] creating a new column with values from another

2011-09-12 Thread Daniel Malter
You can do this using ifelse(). See example below.

x<-rpois(100,100)
NA.x<-sample(1:100,40)
x[NA.x]=NA

y<-rpois(100,100)
NA.y<-sample(1:100,40)
y[NA.y]=NA

z<-ifelse(!is.na(y),y,ifelse(!is.na(x),x,NA))
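
Applied to a data frame shaped like the one in the question, the same idea might look as follows. The values are taken from the example rows, arranged here as one row per observation with an assumed ID column; WT2 is preferred and WT1 is used only where WT2 is missing.

```r
# Example rows modelled on the question (IDs are illustrative)
wt <- data.frame(
  ID  = 1:4,
  WT1 = c(134, 145, NA, NA),
  WT2 = c(NA, 155, 175, 187)
)

# Prefer WT2; fall back to WT1 when WT2 is NA
wt$WT <- ifelse(is.na(wt$WT2), wt$WT1, wt$WT2)
print(wt)
```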

HTH,
Daniel



holly shakya wrote:
> 
> I have 2 columns for weight. There are NAs in each column but not for the
> same observation. Some observations have values for both.  I would want to
> prioritize the WT2 values so I would like to do the following:
> 
>>From this:
> ID  WT1WT2
> 1   134  NA
> 2   145   155
> 1NA  175
> 3NA  187
> 
> To this:
> IDWT1   WT2WT
> 1NA 175 175
> 2  145 155  155
> 3   NA 187  187
> 
> Populating the NA values of wt2 with those of wt1 would work as well. Any
> suggestions would be greatly appreciated.
> -- 
> Holly Shakya
> 
> Doctoral Student
> San Diego State University/University of California, San Diego
> Joint Doctoral Program in Public Health
> (Global Health)
> 

--
View this message in context: 
http://r.789695.n4.nabble.com/creating-a-new-column-with-values-from-another-tp3808528p3808550.html
Sent from the R help mailing list archive at Nabble.com.



[R] Problem in put.var.ncdf

2011-09-12 Thread Claudia Stocker
Dear all,

I have a problem in writing a variable to a NetCDF-File.
My code works pretty well until the step put.var.ncdf():


# Get variables
#-

data1 <- open.ncdf("PREC_me_03-1500.nc")

prec1 <- get.var.ncdf(data1,"PRECT")
dim.time <- get.var.ncdf(data1,"time2")


close.ncdf(data1)


# Calculation
#-

spi24.me <- spi.func(prec1)


--> spi24.me is a vector with 437270 values, ranging from -4 to 4 with 10
decimal places.


# Save as NetCDF

#-


dim.time.spi <- dim.def.ncdf("time", "days since 1499-01-01 00:00:00",
dim.time)



var.spi.24.03.me <- var.def.ncdf("spi03_24_me", "unitless", dim.time.spi,
0, longname="SPI tr_1500_03 24mt MA", prec="float")


spi <- create.ncdf("spi_regions_03.nc", var.spi.24.03.me)

put.var.ncdf(spi, var.spi.24.03.me, spi24.me)

close.ncdf(spi)


R prints the following error:
#-

Error in put.var.ncdf(spi, var.spi.24.03.me, spi24.me) :   NA/NaN/Inf in
foreign function call (arg 5)

I can exclude an error in length (dim.time.spi and spi24.me have the same
length) or NA (there are no NA in the spi24.me vector). Where is the
error?? The code seems so simple..

I found a similar post of Sashi Challa on Thu, 04 Nov 2010 22:09:23 -0700,
but there was unfortunately no answer.



Thanks for your help!
claudia



Re: [R] Error message for .csv file

2011-09-12 Thread David Winsemius
You are asking a question about a package that seldom appears on
rhelp, leading me to infer that there is not a large user community
that reads this mailing list. You are also not providing the data
needed to reproduce the problem. You would be better off taking the
time to contact the package authors.


--
David.

On Sep 12, 2011, at 1:11 PM, vkent wrote:

> I am using the package SPACECAP which provides an interface for R. I used
> this interface to import the csv file into R as well as the other two csv
> files that are required.
>
> This is the output I get when querying the error message
>
> Error in NN[i, 1:length(od)] <- od : subscript out of bounds
> > str(NN)
> Error in str(NN) : object 'NN' not found
> > dim(NN)
> Error: object 'NN' not found
> > length(od)
> Error: object 'od' not found
> > i
> Error: object 'i' not found
>
> I hope this helps
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Error-message-for-csv-file-tp3807651p3808052.html
> Sent from the R help mailing list archive at Nabble.com.


David Winsemius, MD
West Hartford, CT



Re: [R] function to include factors in summary data frame

2011-09-12 Thread Daniel Malter
I have read it three times and still have no concrete idea what you are
actually trying to do, mainly because there is no information as to which
level/variable you are aggregating on. It'd help if you provided the
aggregated data (or sample rows thereof) so that we know what you want the
result to be.

Best,
Daniel


Wade Wall wrote:
> 
> Hi all,
> 
> I have a dataframe that includes data on individuals that are distributed
> across multiple rows.  I have aggregated the data using ddply, but I have
> columns in the original data frame that are factors ( such as sites "A",
> "B", and "C") that I would like to include in the new data frame.  I have
> done this in a clunky way using match() and a loop, but am wondering if
> there is a more elegant approach.  Here is an example data set.
> 
> #Example
> b<-c(rep("A",10),rep("B",10),rep("C",10))
> a<-c(rep(1:5,6)); b<-sort(b)
> d<-c(2008,2008,2009,2009,2010,2010);d<-rep(d,5)
> e<-rnorm(30,2,1)
> df<-data.frame(a,b,d,e) ; names(df)<-c("ind","site","year","height")
> 
> I created a new factor using ind and year, and would basically like to
> include site in the new dataframe. Does anyone know how this could easily
> be
> done?
> 
> Thanks in advance.
> 

--
View this message in context: 
http://r.789695.n4.nabble.com/function-to-include-factors-in-summary-data-frame-tp3808391p3808538.html
Sent from the R help mailing list archive at Nabble.com.



[R] creating a new column with values from another

2011-09-12 Thread holly shakya
I have 2 columns for weight. There are NAs in each column but not for the
same observation. Some observations have values for both.  I would want to
prioritize the WT2 values so I would like to do the following:

>From this:
ID  WT1WT2
1   134  NA
2   145   155
1NA  175
3NA  187

To this:
IDWT1   WT2WT
1NA 175 175
2  145 155  155
3   NA 187  187

Populating the NA values of wt2 with those of wt1 would work as well. Any
suggestions would be greatly appreciated.
-- 
Holly Shakya

Doctoral Student
San Diego State University/University of California, San Diego
Joint Doctoral Program in Public Health
(Global Health)



[R] nested anova<-R chrashing

2011-09-12 Thread joerg stephan

Hi,

I tried to do a nested Anova with the attached Data. My response 
variable is "survivors" and I would like to know the effect of 
(insect-egg clutch) "size", "position" (of clutch on twig) and "clone" 
(/plant genotype) on the survival of eggs (due to predation). Each plant 
was provided with three different sizes of clutches (45,15,5) and had 
pseudo-replications of size 15 and 5 on it.


_Question 1: right code for nested Anova_

Here part of the code I used:

clone1<-as.factor(ifelse(clone=="Gudrun",1,ifelse(clone=="Loden",2,ifelse(clone=="021",3,4))))

plant1<-as.factor(ifelse(plant==1,1,ifelse(plant==2,2,ifelse(plant==3,3,ifelse(plant==4,4,ifelse(plant==5,5,ifelse(plant==6,6,

ifelse(plant==7,7,ifelse(plant==8,8,ifelse(plant==9,9,ifelse(plant==10,10,ifelse(plant==11,11,ifelse(plant==12,12,

ifelse(plant==13,13,ifelse(plant==14,14,ifelse(plant==15,15,ifelse(plant==16,16,ifelse(plant==17,17,ifelse(plant==18,18,

ifelse(plant==19,19,ifelse(plant==20,20,ifelse(plant==21,21,ifelse(plant==22,22,ifelse(plant==23,23,ifelse(plant==24,24,

ifelse(plant==25,25,ifelse(plant==26,26,ifelse(plant==27,27,ifelse(plant==28,28,ifelse(plant==29,29,ifelse(plant==30,30,

ifelse(plant==31,31,ifelse(plant==32,32,ifelse(plant==33,33,ifelse(plant==34,34,ifelse(plant==35,35,ifelse(plant==36,36,

ifelse(plant==37,37,ifelse(plant==38,38,ifelse(plant==39,39,ifelse(plant==40,40,ifelse(plant==41,41,ifelse(plant==42,42,43)))))))))))))))))))))))))))))))))))))))))))

size1<-as.factor(ifelse(size==5,1,ifelse(size==15,2,3)))

position1<-as.factor(ifelse(position==1,1,ifelse(position==2,2,ifelse(position==3,3,

ifelse(position==4,4,ifelse(position==5,5,ifelse(position==6,6,ifelse(position==7,7,

ifelse(position==8,8,ifelse(position==9,9,ifelse(position==10,10,ifelse(position==11,11,ifelse(position==12,12,13)))))))))))))

ANOVA<-aov(survivors~(clone1*size1*position1)+Error(plant1/(size1*position1)))

After that command it says: "Error() model is singular". Even after
googling (it looks like many people have had that problem) I am still
uncertain how to solve it, mainly because I am not sure whether the code
used for the Anova matches my experimental setup. If I use
"ANOVA<-aov(survivors~clone1*plant1/(size1*position1))" R sometimes even
crashes.
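
As a side note on the recoding step: the nested ifelse() chains only map each variable to integer codes, which factor() does directly. The sketch below uses a small invented data frame with the column names from the post; the values are placeholders, not the attached data.

```r
# Invented rows using the column names from the post
dat <- data.frame(
  clone    = c("Gudrun", "Gudrun", "Loden", "021"),
  plant    = c(1, 1, 2, 3),
  size     = c(45, 15, 5, 15),
  position = c(3, 1, 2, 8)
)

# One call per variable replaces each ifelse() chain
dat$clone1    <- factor(dat$clone)
dat$plant1    <- factor(dat$plant)
dat$size1     <- factor(dat$size, levels = c(5, 15, 45))  # explicit level order
dat$position1 <- factor(dat$position)

str(dat)
```

Keeping the factors inside one data frame also allows the model functions to be called with a data argument instead of relying on free-standing vectors.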


_Question 2: normal distribution and homogeneity of variance_

So far I have tested "normal" Anovas by fitting a linear model, followed by
plotting, a KS test and a Bartlett test, including a Q-Q plot.


fm<-lm(survivors~clone1*plant1*size1*position1)

Residuen<-resid(fm)  # Here R is crashing again!!!

ks.test(Residuen,"pnorm",mean(Residuen),sd(Residuen))

plot(density(Residuen))

qqnorm(Residuen)

par(mfrow=c(2,2))

plot(fm)

bartlett.test(survivors~clone1*plant1*position1*size1)

Is that too much for R to compute? Is it OK to check the preconditions
for the nested Anova in that way, or do I have to use the same model
formula for the Bartlett test as well (or evaluate only by looking at the
graphs (Zuur et al. 2010))?
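
A small self-contained sketch of these residual checks on simulated data (not the attached data set) may help. ks.test() is given the name of the reference distribution, "pnorm", followed by its parameters, and bartlett.test() is given a single grouping factor. The variable names and the one-way layout are assumptions made only for illustration.

```r
set.seed(42)
# Simulated one-way layout: 3 groups of 20 observations each
g <- gl(3, 20)
y <- rnorm(60, mean = as.integer(g))

fm  <- lm(y ~ g)
res <- resid(fm)

# Kolmogorov-Smirnov test of the residuals against a normal distribution
ks <- ks.test(res, "pnorm", mean(res), sd(res))

# Bartlett test of equal variances across the groups
bt <- bartlett.test(y ~ g)

print(ks$p.value)
print(bt$p.value)

# Graphical checks
qqnorm(res)
qqline(res)
```

A fully crossed model such as clone*plant*size*position can have as many parameters as observations, in which case the residuals carry little information and the graphical checks on a simpler model may be more useful.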


Thank you very much in advance

Jörg

    clone  plant size position survivors
1   Gudrun     1   45        3    0.9556
2   Gudrun     1   15        1    1.
3   Gudrun     1   15        8    0.9333
4   Gudrun     1   15       11    0.
5   Gudrun     1    5        2    0.
6   Gudrun     1    5        4    0.4000
7   Gudrun     1    5        5    0.6000
8   Gudrun     1    5        6    0.8000
9   Gudrun     1    5        7    0.8000
10  Gudrun     1    5        9    0.4000
11  Gudrun     1    5       10    0.8000
12  Gudrun     1    5       12    0.8000
13  Gudrun     1    5       13    1.
14  Gudrun     2   45        1    0.8222
15  Gudrun     2   15        5    0.
16  Gudrun     2   15        7    0.9333
17  Gudrun     2   15       13    0.8000
18  Gudrun     2    5        2    0.2000
19  Gudrun     2    5        3    0.
20  Gudrun     2    5        4    0.6000
21  Gudrun     2    5        6    0.6000
22  Gudrun     2    5        8    1.
23  Gudrun     2    5        9    0.
24  Gudrun     2    5       10    0.4000
25  Gudrun     2    5       11    1.
26  Gudrun     2    5       12    1.
27  Gudrun     3   45        8    1.
28  Gudrun     3   15        1    1.
29  Gudrun     3   15        7    1.
30  Gudrun     3   15        9    1.
31  Gudrun     3    5        2    0.
32  Gudrun     3    5        3    1.
33  Gudrun     3    5        4    0.
34  Gudrun     3    5        5    0.6000
35  Gudrun     3    5        6    1.
36  Gudrun     3    5       10    0.8000
37  Gudrun     3    5       11    1.
38  Gudrun     3    5       12    1.
39  Gudrun     3    5       13    0.6000
40  Gudrun     4   45       13    0.8667
41  Gudrun     4   15        4    0.8667
42  Gudrun     4   15        8    0.9333
43  Gudrun     4   15       11    1.
44  Gudrun     4    5        1    1.
45  Gudrun     4    5        2    1.
46  Gudrun     4    5        3    0.2000
47  Gudrun 

Re: [R] findFreqTerms vs minDocFreq in Package 'tm'

2011-09-12 Thread vioravis
Thanks, Bettina.

--
View this message in context: 
http://r.789695.n4.nabble.com/findFreqTerms-vs-minDocFreq-in-Package-tm-tp3806644p3808134.html
Sent from the R help mailing list archive at Nabble.com.



[R] Multilevel model in lme4 and nlme

2011-09-12 Thread jonas garcia
Dear list,



I am trying to fit some mixed models using packages lme4 and nlme.

I did the model selection using lmer but I suspect that I may have some
autocorrelation going on in my data so I would like to have a look using the
handy correlation structures available in nlme.



The problem is that I cannot translate my lmer model to lme:


mod1<- lmer(y~x + (1|a:b) + (1|b:c), data=mydata)


"a", "b" and "c" are factors with "c" nested in "b" and "b" nested in "a"


The best I can do with lme is:



mod2<- lme(y~x, random=list(a=~1, b=~1, c=~1), data=mydata)



which is the same as:



lmer(y~x + (1|a) + (1|a:b) + (1|a:b:c), data=mydata)



I am not at all interested in random effects (1|a) and (1|a:b:c) as they are
not significant. I just need two random intercepts as specified in mod1. How
can I translate mod1 into lme language?



Any help on this would be much appreciated.



Jonas



Re: [R] Error message for .csv file

2011-09-12 Thread vkent
I am using the package SPACECAP which provides an interface for R. I used
this interface to import the csv file into R as well as the other two csv
files that are required.

This is the output I get when querying the error message

Error in NN[i, 1:length(od)] <- od : subscript out of bounds
> str(NN)
Error in str(NN) : object 'NN' not found
> dim(NN)
Error: object 'NN' not found
> length(od)
Error: object 'od' not found
> i
Error: object 'i' not found
> 

I hope this helps

--
View this message in context: 
http://r.789695.n4.nabble.com/Error-message-for-csv-file-tp3807651p3808052.html
Sent from the R help mailing list archive at Nabble.com.



[R] Multiple regression intercept

2011-09-12 Thread burdy
Hi I am having difficulty interpreting the multiple regression output. I
would like to know what it means when one of the factors is assigned as the
intercept?

In my data I am looking at the relationship between environmental parameters
and biological production.

One of my variables in the analysis is substratum type and gravel is
identified as the intercept and the P-value is significant,... 
Does this mean that I can talk about the relationship which it has with
production as being significant or because it is the intercept can I not use
it because it has been selected and used as the basis for the relationships
between the other factors in the variable substratum?

No one in the PhD office can answer this  

I look forward to any replies 

Thanks

Matt 

--
View this message in context: 
http://r.789695.n4.nabble.com/Multiple-regression-intercept-tp3808045p3808045.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Centering lines on barplot centers.

2011-09-12 Thread Greg Snow
Setting par(usr=something) does not survive the creation of a new high level
plot.  Using par(new=TRUE) is to be avoided if at all possible (it just leads
to problems like yours); it would be better for you to use matlines, which
adds lines to the current plot, instead of matplot.

If you want to set the limits for a new plot to exactly match the previous one 
use the xlim argument along with xaxs='i' (or an additional buffer will be 
added again).

If you really need to adjust the range of one of the axes to combine the plots 
(not recommended, this can cause confusion and emphasize meaningless crossings, 
it is better to juxtapose two aligned plots) then consider using the updateusr 
function in the TeachingDemos package.
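
For the case where a second y-scale really is wanted, the xlim plus xaxs='i' suggestion can be sketched as below. The data are invented stand-ins for sum.of.weights and observed.means, and the object names bar_usr and line_usr are only for illustration.

```r
# Invented stand-ins for the poster's data
weights <- c(12, 30, 25, 18, 15)
means <- cbind(observed  = c(1.0, 1.4, 1.2, 0.9, 1.1),
               rebal.old = c(1.1, 1.3, 1.1, 1.0, 1.2),
               rebal.new = c(0.9, 1.5, 1.3, 0.8, 1.0))

# Bar plot; barplot() returns the x positions of the bar centers
mids <- barplot(weights, axes = FALSE)
axis(2)

# User coordinates of the bar plot (x range already includes its buffer)
bar_usr <- par("usr")

# Overlay: reuse the exact x range and suppress the extra 4% buffer
par(new = TRUE)
matplot(mids, means, type = "b", xlim = bar_usr[1:2], xaxs = "i",
        axes = FALSE, xlab = "", ylab = "")
axis(4)

# The x range of the overlay now matches that of the bar plot
line_usr <- par("usr")
print(rbind(bar = bar_usr[1:2], lines = line_usr[1:2]))
```

If the lines can share the bars' y-scale, calling matlines(mids, means) right after barplot() avoids par(new=TRUE) altogether.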

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of gerald.j...@dgag.ca
> Sent: Monday, September 12, 2011 1:34 PM
> To: r-h...@stat.math.ethz.ch
> Subject: [R] Centering lines on barplot centers.
> 
> 
> Hello,
> 
> I am trying to port one of my plotting S+ functions to R and I am
> having
> difficulties!!!  I am including here only the troublesome code!
> 
> I first produce a barplot, saving the positions of the bar's centers.
> 
>   par(mar = c(6.1, 5.1, 4.1, 4.1), mgp = c(3, 3.0, 0))
>   ticks.loc  <- barplot(sum.of.weights, col = 5, xlab = "", ylab = "",
> axes = FALSE, axisnames = FALSE)
>   pretty.bar <-pretty(c(0, sum.of.weights), 6)
>   pretty.lab <- paste(pretty.bar, "%", sep = "")
>   axis(side = 2, at = pretty.bar, labels = pretty.lab, col = 1, line =
> 0,
>cex.axis = 0.80, las = 2, mgp = c(3, 2, 0))
>   my.axis(side = 1, at = ticks.loc, labels = plot.labels,
>   las = 2, col = 1, adj = 1, cex.axis = labels.cex, mgp = c(2,
> 1,
> 0))
>   box()
>   title(main = titre, cex = title.cex)
> 
> Then I would like to plot three lines with the x-positions of the
> points on
> the bar's centers.  In S+ I use "par(new = TRUE, xaxs = "d")", which
> freezes the "x-axis" and that does the trick.  In R this option is not
> supported, I tried to play with the user-coordinates, using the same
> ones
> from one plot to the other but there is a slight offset in the location
> of
> the points w.r. to the centers of the bar???
> 
>   user.corr <- par("usr")
>   par(new = TRUE, mar = c(6.1, 5.1, 4.1, 4.1), mgp = c(3, 3.0, 0),
>   usr = user.corr)
>   matplot(x = ticks.loc,
>   y = observed.means[, c("observed", "rebal.old",
> "rebal.new")],
>   type = "b", lwd = 1.0, cex = 0.80, xlab = "", ylab = "",
>   axes = FALSE)
>   pretty.bar <- pretty(c(0, max(observed.means[, c("observed",
> "rebal.old",
>"rebal.new")])))
>   axis(side = 4, at = pretty.bar, line = 0, mgp = c(2, 2, 0), las = 2,
>col.axis = 1, cex.axis = 0.80)
> 
> Any suggestions???  Thanks in advance,
> 
> Gérald Jean
> Conseiller senior en statistiques,
> VP Actuariat et Solutions d'assurances,
> Desjardins Groupe d'Assurances Générales
> télephone: (418) 835-4900 poste (7639)
> télecopieur  : (418) 835-6657
> courrier électronique: gerald.j...@dgag.ca



[R] Centering lines on barplot centers.

2011-09-12 Thread gerald.jean

Hello,

I am trying to port one of my plotting S+ functions to R and I am having
difficulties!!!  I am including here only the troublesome code!

I first produce a barplot, saving the positions of the bar's centers.

  par(mar = c(6.1, 5.1, 4.1, 4.1), mgp = c(3, 3.0, 0))
  ticks.loc  <- barplot(sum.of.weights, col = 5, xlab = "", ylab = "",
axes = FALSE, axisnames = FALSE)
  pretty.bar <-pretty(c(0, sum.of.weights), 6)
  pretty.lab <- paste(pretty.bar, "%", sep = "")
  axis(side = 2, at = pretty.bar, labels = pretty.lab, col = 1, line = 0,
   cex.axis = 0.80, las = 2, mgp = c(3, 2, 0))
  my.axis(side = 1, at = ticks.loc, labels = plot.labels,
  las = 2, col = 1, adj = 1, cex.axis = labels.cex, mgp = c(2, 1,
0))
  box()
  title(main = titre, cex = title.cex)

Then I would like to plot three lines with the x-positions of the points on
the bar's centers.  In S+ I use "par(new = TRUE, xaxs = "d")", which
freezes the "x-axis" and that does the trick.  In R this option is not
supported, I tried to play with the user-coordinates, using the same ones
from one plot to the other but there is a slight offset in the location of
the points w.r. to the centers of the bar???

  user.corr <- par("usr")
  par(new = TRUE, mar = c(6.1, 5.1, 4.1, 4.1), mgp = c(3, 3.0, 0),
  usr = user.corr)
  matplot(x = ticks.loc,
  y = observed.means[, c("observed", "rebal.old", "rebal.new")],
  type = "b", lwd = 1.0, cex = 0.80, xlab = "", ylab = "",
  axes = FALSE)
  pretty.bar <- pretty(c(0, max(observed.means[, c("observed", "rebal.old",
   "rebal.new")])))
  axis(side = 4, at = pretty.bar, line = 0, mgp = c(2, 2, 0), las = 2,
   col.axis = 1, cex.axis = 0.80)

Any suggestions???  Thanks in advance,

Gérald Jean
Conseiller senior en statistiques,
VP Actuariat et Solutions d'assurances,
Desjardins Groupe d'Assurances Générales
télephone: (418) 835-4900 poste (7639)
télecopieur  : (418) 835-6657
courrier électronique: gerald.j...@dgag.ca


[R] function to include factors in summary data frame

2011-09-12 Thread Wade Wall
Hi all,

I have a dataframe that includes data on individuals that are distributed
across multiple rows.  I have aggregated the data using ddply, but I have
columns in the original data frame that are factors ( such as sites "A",
"B", and "C") that I would like to include in the new data frame.  I have
done this in a clunky way using match() and a loop, but am wondering if
there is a more elegant approach.  Here is an example data set.

#Example
b<-c(rep("A",10),rep("B",10),rep("C",10))
a<-c(rep(1:5,6)); b<-sort(b)
d<-c(2008,2008,2009,2009,2010,2010);d<-rep(d,5)
e<-rnorm(30,2,1)
df<-data.frame(a,b,d,e) ; names(df)<-c("ind","site","year","height")

I created a new factor using ind and year, and would basically like to
include site in the new dataframe. Does anyone know how this could easily be
done?
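
One possible approach, sketched with base R's aggregate() on invented data: if each individual belongs to exactly one site, adding site to the grouping variables does not change the groups and simply carries the column through to the summary. The data frame below is an assumption for illustration, not the example above, and the same idea should carry over to ddply by listing site among the grouping variables.

```r
set.seed(1)
# Invented data: 6 individuals, each at a single site, measured twice a year
df <- data.frame(
  ind    = rep(1:6, each = 4),
  site   = rep(c("A", "B", "C"), each = 8),
  year   = rep(c(2008, 2008, 2009, 2009), times = 6),
  height = rnorm(24, 2, 1)
)

# Mean height per individual and year; site is listed as a grouping
# variable so that it appears in the result
summary_df <- aggregate(height ~ ind + site + year, data = df, FUN = mean)
print(summary_df)
```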

Thanks in advance.



[R] plot 3 lines with ggplot2

2011-09-12 Thread J Toll
Hi,

I am trying to learn to use ggplot2 for what I had hoped would be a
fairly simple task.  I have a relatively small data.frame (100 by 4).
The first column contains symbols.  The 2nd, 3rd and 4th columns
represent percentage weightings for each symbol using 3 different
methodologies.  For example:

sym <- make.unique(replicate(100, paste(sample(LETTERS, 3, replace =
TRUE), collapse = "")))
x1 <- sort(rexp(100) * 100, decreasing = TRUE)
x2 <- sort(rexp(100) * 100, decreasing = TRUE)
x3 <- sort(rexp(100) * 100, decreasing = TRUE)

x <- data.frame(cbind(Symbol = sym, x1/sum(x1), x2/sum(x2),
x3/sum(x3)), stringsAsFactors = FALSE)

I'd like to plot a line for each of the 3 different methodologies.
The y-axis would be percentage weight, the x-axis would be the symbol
or row number, although I'd prefer that not to be shown.
Aesthetically, I'd like the plot to look like this one (except with 3
lines), although I believe that's not the proper kind of plot for my
data:

http://www.ling.upenn.edu/~joseff/rstudy/plots/ggplotintro/densityidentity.png
http://www.ling.upenn.edu/~joseff/rstudy/summer2010_ggplot2_intro.html

Is there a way to do this with my data?  From what I've been reading,
I get the sense that my data might be in the wrong format for what I'm
trying to do.  I've tried experimenting with melt to reformat my data,
but have gotten nowhere. I've tried a number of different ways to try
to get this working.  Most of the time, I don't even get any output.
The closest I've gotten is this simple command, but the output is
terrible.

ggplot(x, aes(Symbol, c(V2,V3,V4))) + geom_line()

I think I'm just completely misunderstanding the syntax of ggplot.
Could someone please point me in the right direction?  Thank you.
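
A possible direction, sketched under assumptions: cbind() with a character column turns the weights into character strings, so the sketch builds the data frame directly, then stacks the three weight columns into a long format with one row per symbol and method. The column names rank, method and weight are invented, and the plotting step only runs if ggplot2 is installed.

```r
set.seed(1)
n <- 100

# Wide data: one row per symbol, one numeric column per methodology
wide <- data.frame(
  Symbol = sprintf("S%03d", seq_len(n)),
  x1 = sort(rexp(n), decreasing = TRUE),
  x2 = sort(rexp(n), decreasing = TRUE),
  x3 = sort(rexp(n), decreasing = TRUE),
  stringsAsFactors = FALSE
)
methods <- c("x1", "x2", "x3")
wide[methods] <- lapply(wide[methods], function(v) v / sum(v))

# Long data: one row per (row number, method) pair
long <- data.frame(
  rank   = rep(seq_len(n), times = length(methods)),
  method = rep(methods, each = n),
  weight = unlist(wide[methods], use.names = FALSE)
)

# One line per method, coloured by method
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(long, ggplot2::aes(x = rank, y = weight,
                                          colour = method)) +
    ggplot2::geom_line()
  print(p)
}
```

Using the row number on the x-axis keeps the axis numeric, which is what lets geom_line() connect the points within each method.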

James



Re: [R] Loops on data˜1

2011-09-12 Thread R. Michael Weylandt
I may be totally off base with this, but I'm wondering what exactly this
would suggest or why you want to do it. Specifically "multiple regression
with only intercept" -- how is it multiple if you don't have any regressors?
Furthermore, you want to run a "regression" on a single data point --
really?

Best I can figure, an "intercept-only" regression is basically just the mean
of the data (if you have no variates, your best estimate as to the mean of
what you'll see is, well, just the mean [plus or minus some stuff about the
median we'll ignore here]) If I'm right about this, use of the lm()
function is far more powerful than you actually need and a simple cumulative
rolling mean, in conjunction with the rev() function, will suffice.
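
A minimal sketch of that idea, on simulated data standing in for the 276 observations (the names head_mean, tail_mean and pairs are mine):

```r
set.seed(1)
x <- rnorm(276)   # stand-in for the attached series
n <- length(x)

# head_mean[k] is the mean of x[1:k]
head_mean <- cumsum(x) / seq_len(n)

# tail_mean[j] is the mean of x[j:n], obtained by reversing twice
tail_mean <- rev(cumsum(rev(x)) / seq_len(n))

# Pair k: mean of the first k observations and of the remaining n - k
k <- seq_len(n - 1)
pairs <- data.frame(k = k,
                    first_k   = head_mean[k],
                    remaining = tail_mean[k + 1])

# Spot check against an intercept-only regression for k = 10
coef(lm(x[1:10] ~ 1))
pairs[10, ]
```

Each entry matches the intercept that lm(... ~ 1) would report for the corresponding subset, without fitting 550 separate models.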

Maybe you could say more about this odd request and we could provide a
little more guidance,

Michael Weylandt

On Mon, Sep 12, 2011 at 10:20 AM, Trying To learn again <
tryingtolearnag...@gmail.com> wrote:

> Hi all,
>
> I have a time series stored as a column vector, ordered so that the first
> element is the first observation and so on.
>
> I want to run a multiple regression with only an intercept.
>
> My first task is to run the regression on the first observation (1 of
> 276) and, at the same time, the same type of regression on the remaining
> 275 observations.
>
> Then I want to run a regression on the first 2 observations and another
> on the remaining 274 observations,
>
> ...
>
> The final pass of the loop would run one regression on the first 275
> observations and one on the last observation.
>
> I want to save each pair of regressions made in each iteration.
>
> I have seen that the regression I want is
>
> data~1
>
> But how should I use mapply or sapply to avoid using loops?
>
> Thanks in advance!
>
> I hope someone can send me similar examples or documents so I can try to
> do this on my own.
>
> I attach the data.
>



Re: [R] Difference in function arima estimation between 2.11.1 and R 2.12.2

2011-09-12 Thread Berend Hasselman

Luis Felipe Parra wrote:
> 
> 
> and as you can see in the results some coefficients (for example ar2 and
> ar8) are different in the different R versions. does anybody know what
> might
> be going on. Was there any change in the arima function between the two
> versions?
> 

You asked the same question about these results 3 days ago.
The results are not significantly different.
You were told that the very small differences can be caused by lots of
factors and you are not providing relevant information about e.g. OS,
32bit/64bit. Besides you are using an old version of R.
I really don't think there is anything to worry about.

/Berend

--
View this message in context: 
http://r.789695.n4.nabble.com/Difference-in-function-arima-estimation-between-2-11-1-and-R-2-12-2-tp3807782p3808170.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Writting excel files

2011-09-12 Thread Bos, Roger
Marc's links list many packages.  Of those, I would recommend
XLConnect. 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Marc Schwartz
Sent: Monday, September 12, 2011 12:53 PM
To: Damian Abalo
Cc: r-help
Subject: Re: [R] Writting excel files

On Sep 12, 2011, at 10:28 AM, Damian Abalo wrote:

> Hello.
> I need to generate, using R code, an excel file with multiple sheets, 
> I wonder if any of you know how to do so.
> 
> Thanks for the help


See the following:

R Data Import/Export Manual:
http://cran.r-project.org/doc/manuals/R-data.html

R Wiki: http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows

HTH,

Marc Schwartz



Re: [R] suggestion for proportions

2011-09-12 Thread csrabak

On 9/9/2011 14:32, array chip wrote:

Thanks all again for the suggestions. I agree with Wolfgang that
mcnemar.test() is what I am looking for. The accuracy is the
proportion of correct diagnosis compared to a gold standard, and I am
interested in which diagnosis test is better, not particular
interested in assessing the agreement between the 2 tests.


John,

Since you mention that you already have the comparison of each test against
the gold standard, wouldn't the measures obtainable from the confusion
matrix (specificity, sensitivity, etc.) help you assess the tests in
a more complete way?


If you're interested, take a look at epi.tests() in package epiR or
diagnosis() in package DiagnosisMed.
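
As a rough sketch of that idea in base R only: the counts below are invented
for illustration (they are not the poster's data), and diag_measures is a
hypothetical helper name. Sensitivity and specificity are read off each
test's confusion matrix, and mcnemar.test() then compares the paired
correct/incorrect calls of the two tests.

```r
# Invented paired results for 100 subjects (TRUE = diseased / test positive)
gold  <- rep(c(TRUE, FALSE), times = c(40, 60))
testA <- c(rep(TRUE, 36), rep(FALSE, 4),  rep(TRUE, 6),  rep(FALSE, 54))
testB <- c(rep(TRUE, 30), rep(FALSE, 10), rep(TRUE, 12), rep(FALSE, 48))

# Hypothetical helper: sensitivity and specificity from the 2 x 2 table
diag_measures <- function(test, gold) {
    tab <- table(test = factor(test, c(TRUE, FALSE)),
                 gold = factor(gold, c(TRUE, FALSE)))
    c(sensitivity = tab[1, 1] / sum(tab[, 1]),   # true positives / diseased
      specificity = tab[2, 2] / sum(tab[, 2]))   # true negatives / healthy
}

mA <- diag_measures(testA, gold)
mB <- diag_measures(testB, gold)

# Paired comparison of accuracy: was each test correct for each subject?
correctA <- testA == gold
correctB <- testB == gold
res <- mcnemar.test(table(correctA, correctB))

mA
mB
res$p.value
```

Reporting sensitivity and specificity separately shows where the two tests
differ, which a single accuracy figure can hide.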


--
Cesar Rabak



Re: [R] 1 not equal to 1, and rep command

2011-09-12 Thread Daniel Nordlund


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Benjamin Høyer
> Sent: Monday, September 12, 2011 6:19 AM
> To: r-help@r-project.org
> Cc: t...@novozymes.com
> Subject: [R] 1 not equal to 1, and rep command
> 
> Hi
> 
> I need to use rep() to get a vector out, but I have spotted something very
> strange.  See the reproducible example below.
> 
> N <- 79
> seg <- 5
> segN <- N / seg   # = 15.8
> 
> d1 <- seg - ( segN - floor(segN) ) * seg
> d1                    # = 1

Not on my machine.

> d1-1
[1] -3.552714e-15

See FAQ 7.31

> 
> rep(2, d1)  # = numeric(0), strange - why doesn't it print one "2"?
> d1-1
[1] -3.552714e-15

<<>>
> 
> 
> Seems like there's some binary maths errors here somewhere.  Very strange
> to
> me.  Anyway, I need to be able to use the result d1 in a rep() command.
> Any
> way to force rep not to be *too* specific in how it reads its "times"
> argument?

Maybe round() will work for you, since you seem to be expecting a whole number.

rep(2, round(d1))
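
A minimal sketch of what is going on, assuming standard IEEE 754 double
precision (the variable names are the ones from the original post): d1 prints
as 1 but is slightly smaller, and rep() truncates its "times" argument, so the
count becomes 0 unless it is rounded first.

```r
N    <- 79
seg  <- 5
segN <- N / seg                          # 15.8 cannot be stored exactly

d1 <- seg - (segN - floor(segN)) * seg   # prints as 1, but is a bit below 1

d1 == 1                     # exact comparison fails
isTRUE(all.equal(d1, 1))    # comparison with a tolerance succeeds

n_raw     <- length(rep(2, d1))          # times is truncated towards zero
n_rounded <- length(rep(2, round(d1)))   # times is rounded to 1 first
```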



Hope this is helpful.

Dan

Daniel Nordlund
Bothell, WA USA



Re: [R] Multiple t.test

2011-09-12 Thread Greg Snow
Here is another approach.  A linear regression with a single binary predictor 
will give the same results as a pooled t-test (if you insist on non-pooled then 
use sapply as previously suggested).  The lm function will fit multiple 
regressions if given a matrix as the y-variable, so you can do a whole bunch of 
t-tests like:

> tmp <- summary( lm( as.matrix(example[,-3]) ~ example[,3] ) )
> sapply( tmp, function(x) coef(x)[2,4] )
   Response age Response height 
 0.02131164  0.02131164 
>

But note that you will be dealing with a bunch of tests and could have the 
standard multiple-comparison problems that go with that.  Also note that what 
you are doing (while common) is really backwards: you are seeing if disease 
status is predictive of height and age, when what would be more interesting is 
whether height or age is predictive of disease status.  You can test each 
predictor individually like:

> fit <- glm( disease ~ 1, data=example, family=binomial )
> fit2 <- glm( disease ~ ., data=example, family=binomial )
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred 
> add1( fit, fit2, test='Chisq' )[[5]][-1]
[1] 0.003925917 0.003925917
Warning messages:
1: glm.fit: fitted probabilities numerically 0 or 1 occurred 
2: glm.fit: fitted probabilities numerically 0 or 1 occurred

(the warnings are because of the small unrealistic sample data; for real data 
they are much less likely).
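
For comparison, here is a self-contained sketch that puts the sapply route and
the matrix-response lm route side by side on the example data from the
original post. The object names (vars, p_welch, p_pooled, p_lm) are
illustrative, and the p.adjust() call is an added suggestion for the
multiple-testing issue, not part of the original advice.

```r
example <- data.frame(age     = c(1, 2, 3, 4, 5, 6),
                      height  = c(100, 110, 120, 130, 140, 150),
                      disease = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))

vars <- setdiff(names(example), "disease")

# One t-test per variable: Welch (the t.test default) and pooled variance
p_welch  <- sapply(vars, function(v)
    t.test(example[[v]] ~ example$disease)$p.value)
p_pooled <- sapply(vars, function(v)
    t.test(example[[v]] ~ example$disease, var.equal = TRUE)$p.value)

# Matrix response: lm fits one regression per column in a single call
fits <- summary(lm(as.matrix(example[vars]) ~ example$disease))
p_lm <- sapply(fits, function(s) coef(s)[2, 4])

# With many variables, adjust for multiple testing (Holm shown here)
p_holm <- p.adjust(p_pooled, method = "holm")
```

The pooled t-test and the lm approach should agree; with 200 variables the
only change needed is the set of column names in vars.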

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of C.H.
> Sent: Monday, September 12, 2011 1:54 AM
> To: R-help
> Subject: [R] Multiple t.test
> 
> Dear R experts,
> 
> Suppose I have a data frame like this:
> 
> > example <- data.frame(age=c(1,2,3, 4,5,6),
> height=c(100,110,120,130,140,150), disease=c(TRUE, TRUE, TRUE, FALSE,
> FALSE, FALSE))
> 
> > example
>   age height disease
> 1   1    100    TRUE
> 2   2    110    TRUE
> 3   3    120    TRUE
> 4   4    130   FALSE
> 5   5    140   FALSE
> 6   6    150   FALSE
> 
> Is there any way to compare the age and height between those with
> disease=TRUE and disease=FALSE using t.test and extract the p-values
> quickly?
> 
> I can do this individually
> 
> t.test(example$age~example$disease)[3]
> 
> But when the number of variables grows to something like 200 it is not
> easy any more.
> 
> Thanks!
> 
> Regards,
> 
> CH
> 
> --
> CH Chan
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error message for .csv file

2011-09-12 Thread Sarah Goslee
Go to your R session.

First off, tell the R-help list how you got your csv file into R as NN.
read.csv() ? read.table()?

Then tell us exactly what code you're using. Where did i come from?

Also type in exactly the commands I gave you (four lines), and share the
output with the R-help list.

str(NN)
dim(NN)
length(od)
i

The likely problems: either you imported the csv file incorrectly, or
there's something wrong with the code (a loop?) in which it is
processed. Without
the R commands you used for either, there's no way to tell which, or what.
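
To illustrate what those checks can reveal, here is a small sketch that
reproduces the error. It assumes NN is a matrix; the sizes and values are made
up, since the real NN, od and i have not been posted.

```r
NN <- matrix(NA_real_, nrow = 3, ncol = 2)   # stand-in for the real NN
od <- c(10, 20, 30)                          # three values, NN has two columns
i  <- 1

# Checks to run just before the failing assignment
i <= nrow(NN)             # TRUE here: the row index is fine
length(od) <= ncol(NN)    # FALSE here: od is too long for one row of NN

# Capture the error message instead of stopping
msg <- tryCatch({
    NN[i, 1:length(od)] <- od
    "no error"
}, error = function(e) conditionMessage(e))
msg
```

Whichever of the two comparisons comes out FALSE in the real session points
to the dimension that is too small.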

Sarah

On Mon, Sep 12, 2011 at 12:49 PM, Vivien Kent  wrote:
> Hi,
>
> I don't really understand what all that means I'm afraid. I am attaching the
> csv file which is just a list of coordinates classified as either habitat or
> non habitat. The other data I enter into the package are locations of
> captures (of animals) and locations of camera traps. I have got it to run in
> other scenarios with more rows so don't see what the problem can be.
>
> Thanks
>
> Vivien
>
> On 12/09/2011 17:43, Sarah Goslee wrote:
>
> Hi,
>
> On Mon, Sep 12, 2011 at 11:07 AM, vkent  wrote:
>
> I would be grateful if anyone could tell me what the error message:
>
> Error in NN[i, 1:length(od)] <- od : subscript out of bounds
>
> It means that either i ends up being larger than the number of rows in
> NN, or that length(od) ends up being larger than the number of
> columns.
>
> More than that we can't tell you without more information, like, say,
> a reproducible example, or at least some information like
> str(NN)
> dim(NN)
> length(od)
> and where i comes from.
>
> How you got the csv file into R might also be relevant, or not.
>
> means for a large .csv file containing gps coordinates. I am using package
> SPACECAP and have successfully run it with other .csv files but now keep
> getting this error message. I can't see anything wrong with the file which
> is simply three columns of data.
>
> I am relatively new to R and am at a loss as to what I can do to correct
> this.
>
> Can anyone help?
>
> Thanks
>
> Vivien
>
>



-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] barplot in hexagram layout

2011-09-12 Thread Schatzi
I updated the code as follows:
dev.new(width=2.5, height=3,mar=c(0,0,0,0))
par(mfrow=c(5,1),mar=c(.5, .5, 1.5, .5), oma=c(.4, 0,.5, 0))
barplot(c(1,1,1,1,1),col=c("white","white","red","white","white"), axes =
FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("orange","white","white","white","yellow"), axes
= FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("white","white","white","white","white"), axes =
FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("blue","white","white","white","green"), axes =
FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("white","white","brown","white","white"), axes =
FALSE,border=NA)

I added the middle plot to add a space between the two middle barplots. The
problem is that it is a bit too large of a space (they are further apart
than the other points). I tried adjusting different options - mar, mgp, omd
- of par and the "height" option of barplot and none of them seemed to help.
Do you have any ideas of how to make the invisible middle plot a bit shorter
or the plots above and below the middle slightly closer?
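
One possible way to get a shorter spacer row, sketched here as a suggestion
rather than something from the thread, is layout() with unequal row heights
instead of par(mfrow = ...). The relative height 0.4 is an arbitrary starting
value to tune by eye, and row_cols is an illustrative name.

```r
# Five stacked panels; the third (spacer) row gets a smaller relative height
n_panels <- layout(matrix(1:5, ncol = 1), heights = c(1, 1, 0.4, 1, 1))
par(mar = c(0.5, 0.5, 1.5, 0.5), oma = c(0.4, 0, 0.5, 0))

row_cols <- list(
    c("white", "white", "red", "white", "white"),
    c("orange", "white", "white", "white", "yellow"),
    NULL,                                   # spacer row, nothing drawn
    c("blue", "white", "white", "white", "green"),
    c("white", "white", "brown", "white", "white")
)

for (cols in row_cols) {
    if (is.null(cols)) {
        op <- par(mar = rep(0, 4))          # no margins for the thin spacer
        plot.new()                          # empty panel, moves to next row
        par(op)
    } else {
        barplot(rep(1, 5), col = cols, axes = FALSE, border = NA)
    }
}
```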

Thank you,

Adele

-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/barplot-in-hexagram-layout-tp3807600p3808044.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Hourly data with zoo

2011-09-12 Thread steven mosher
worked beautifully.

Thanks.

On Mon, Sep 12, 2011 at 9:45 AM, Gabor Grothendieck
 wrote:
> On Mon, Sep 12, 2011 at 11:57 AM, steven mosher  
> wrote:
>> Gabor.. thanks.
>>
>> zr <- zooreg(rnorm(24), as.chron("2011-01-01"), frequency = 24)
>>
>> a couple issues:  my date data  has missing days and missing hours..
>> Sorry if I was not clear on
>> that.. I input it to a data frame and dates are of the form  20110101
>> and hours are in the format
>> 0,100,200
>>
>> The end goal is to create a data structure for around 200 series aligned by 
>> time
>
> Try this:
>
> v <- df$data
> z <- zoo(v, as.chron(d, format = "%Y%m%d") + h / 2400)
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>



Re: [R] 1 not equal to 1, and rep command

2011-09-12 Thread Sarah Goslee
Not so strange, in fact this is FAQ 7.31, and has to do (as you guess)
with the way that computers store numbers.

You need to do as you did, and use round() or floor() or similar to
ensure that you get the results you expect.
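
If silently rounding feels too permissive, one option is a small guard that
rounds only when the value is within a tolerance of a whole number. This is
only a sketch: as_count is a hypothetical helper name and the default
tolerance is an assumption borrowed from the scale all.equal() uses.

```r
# Hypothetical helper: convert a nearly-whole number to an integer count
as_count <- function(x, tol = sqrt(.Machine$double.eps)) {
    r <- round(x)
    if (any(abs(x - r) > tol)) {
        stop("value is not close to a whole number")
    }
    as.integer(r)
}

d1 <- 5 - (79 / 5 - floor(79 / 5)) * 5   # same computation as in the post
rep(2, as_count(d1))                     # one "2", as intended

# A value that is genuinely not a whole number is rejected
bad <- tryCatch(as_count(1.5), error = function(e) conditionMessage(e))
```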

Sarah

2011/9/12 Benjamin Høyer :
> Hi
>
> I need to use rep() to get a vector out, but I have spotted something very
> strange.  See the reproducible example below.
>
> N <- 79
> seg <- 5
> segN <- N / seg   # = 15.8
>
> d1 <- seg - ( segN - floor(segN) ) * seg
> d1                    # = 1
>
> rep(2, d1)          # = numeric(0), strange - why doesn't it print one "2"?
> rep(2, 1)            # 2, ok
> rep(2, d1 / 1,1)   # 2, this does work
> rep(2, d1 + 2)    # "2 2" - also works but...
> d1 + 2              # = 3! so why does it print two 2s above?
>
> d1 == 1             # FALSE
> all.equal(d1, 1)   # TRUE
> identical(d1, 1)   # FALSE
>
> Try something else...
>
> d2 <- 4 - ( (79/4) - floor(79/4))* 4
> d2                     # = 1
>
> rep(2, d2)          # 2 : this works!
>
> d2 == 1             # TRUE
> all.equal(d2, 1)   # TRUE
> identical(d2, 1)   # TRUE
>
> #version info
> platform       x86_64-pc-linux-gnu
> arch           x86_64
> os             linux-gnu
> system         x86_64, linux-gnu
> status
> major          2
> minor          10.1
> year           2009
> month          12
> day            14
> svn rev        50720
> language       R
> version.string R version 2.10.1 (2009-12-14)
>
>
> Seems like there's some binary maths errors here somewhere.  Very strange to
> me.  Anyway, I need to be able to use the result d1 in a rep() command.  Any
> way to force rep not to be *too* specific in how it reads its "times"
> argument?
>
> Thanks in advance,
> Benjamin Hoyer
>



-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] barplot in hexagram layout

2011-09-12 Thread Schatzi
Here is the new code. It works just like I wanted.

dev.new(width=6, height=6.5,mar=c(0,0,0,0))
par(mfrow=c(5,1),mar=c(.5, .5, 1.5, .5), oma=c(.4, 0,.5, 0))
barplot(c(1,1,1,1,1),col=c("white","white","red","white","white"), axes =
FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("orange","white","white","white","yellow"), axes
= FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("white","white","white","white","white"), axes =
FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("blue","white","white","white","green"), axes =
FALSE,border=NA)
barplot(c(1,1,1,1,1),col=c("white","white","brown","white","white"), axes =
FALSE,border=NA)

-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/barplot-in-hexagram-layout-tp3807600p3807953.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] barplot in hexagram layout

2011-09-12 Thread Schatzi
I will try stacking 5 barplots (with 5 bars per plot) and somehow only
showing the middle bar for the top and bottom plots and the two end bars for
the two middle plots.

-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein
--
View this message in context: 
http://r.789695.n4.nabble.com/barplot-in-hexagram-layout-tp3807600p3807939.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Very slow using S4 classes

2011-09-12 Thread Martin Morgan

Hi André...

On 09/12/2011 07:20 AM, André Rossi wrote:

Dear Martin Morgan and Martin Maechler...

Here is an example of the computational time when a slot of a S4 class
is of another S4 class and when it is just one object. I'm sending you
the data file.

Thank you!

Best regards,

André Rossi



setClass("SupervisedExample",
 representation(
 attr.value = "ANY",
 target.value = "ANY"
))

setClass("StreamBuffer",
 representation=representation(
 examples = "list", #SupervisedExample
 max.length = "integer"
 ),
 prototype=list(
 max.length = as.integer(1)
 )
)
b <- new("StreamBuffer")

load("~/Dropbox/dataList2.RData")


For a reproducible example, I guess you have something like

  data <- replicate(10000, new("SupervisedExample"))


b@examples <- data #data is a list of SupervisedExample class.

 > system.time({for (i in 1:100) b@examples[[1]]@attr.value[1] = 2 })


Yes, this is slow. [[<-,S4 is not as clever as [[<-,list and performs 
extra duplication, including those 10,000 S4 objects it contains.


As before, an improvement is to think in terms of vectors, maybe a 
'SupervisedExamples' class to act as a collection of examples


setClass("SupervisedExamples",
 representation=representation(
   attr.value = "list",
   target.value = "list"))

setClass("StreamBuffer",
 representation=representation(
   examples="SupervisedExamples"))

SupervisedExamples <-
function(attr.value=vector("list", n),
 target.value=vector("list", n), n, ...)
{
new("SupervisedExamples", attr.value=attr.value,
target.value=target.value, ...)
}

StreamBuffer <-
function(examples, ...)
{
new("StreamBuffer", examples=examples, ...)
}

data <- SupervisedExamples(n=100000)

b <- StreamBuffer(data)

I then have

> system.time({for (i in 1:100) data@attr.value[[1]] = 2 })
   user  system elapsed
  1.081   0.013   1.094
> system.time({for (i in 1:100) b@examples@attr.value[[1]] <- 2})
   user  system elapsed
  4.283   0.000   4.295

(note the 10x increase in size); still slower, but this will be 
amortized when the updates are vectorized, e.g.,


> idx = sample(length(b@examples@attr.value), 100)
> system.time(b@examples@attr.value[idx] <- list(2))
   user  system elapsed
  0.013   0.000   0.014
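
A self-contained sketch of the same contrast, using a hypothetical class name
(ExamplesSketch) and a small size so it runs instantly. It shows only the two
update styles, not the timings.

```r
library(methods)

# Hypothetical class: one object holding all examples as parallel lists
setClass("ExamplesSketch",
         representation(attr.value = "list", target.value = "list"))

ex <- new("ExamplesSketch",
          attr.value   = vector("list", 1000),
          target.value = vector("list", 1000))

idx <- c(1, 10, 100)

# Element-wise: one slot replacement call per element
for (i in idx) ex@attr.value[[i]] <- 1

# Vectorized: a single slot replacement call covers all of idx
ex@attr.value[idx] <- list(2)
```

Each replacement on a slot can copy the object, so fewer, larger replacements
are the cheaper pattern.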

A further change might be to recognize 'StreamBuffer' as an abstract 
class that SupervisedExamples extends


setClass("StreamBuffer",
 representation=representation(
   "VIRTUAL", max.len="integer"),
 prototype=prototype(max.len=10L),
 validity=function(object) {
 if (object@max.len < length(object))
 "too many elements"
 else TRUE
 })

setMethod(length, "StreamBuffer", function(x) {
stop("'length' undefined on '", class(x), "'")
})

setClass("SupervisedExamples",
 representation=representation(
   attr.value = "list",
   target.value = "list"),
 contains="StreamBuffer")

setMethod(length, "SupervisedExamples", function(x) {
length(x@attr.value)
})

SupervisedExamples <-
function(attr.value=vector("list", n),
 target.value=vector("list", n), n, ...)
{
new("SupervisedExamples", attr.value=attr.value,
target.value=target.value, ...)
}

data <- SupervisedExamples(n=10)

> system.time({for (i in 1:100) data@attr.value[[1]] = 2 })
   user  system elapsed
  1.043   0.014   1.061
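
To see the validity check of the virtual class in action, here is a reduced
sketch. The class names (BoundedBuffer, BoundedExamples) and the limit of 3
are hypothetical, chosen so the example does not clash with the definitions
above.

```r
library(methods)

# Virtual base class: knows the maximum length and validates against it
setClass("BoundedBuffer",
         representation("VIRTUAL", max.len = "integer"),
         prototype = prototype(max.len = 3L),
         validity = function(object) {
             if (object@max.len < length(object))
                 "too many elements"
             else TRUE
         })

# Concrete class: supplies the storage and the meaning of length()
setClass("BoundedExamples",
         representation(attr.value = "list"),
         contains = "BoundedBuffer")

setMethod("length", "BoundedExamples", function(x) length(x@attr.value))

ok  <- new("BoundedExamples", attr.value = vector("list", 3))
msg <- tryCatch(new("BoundedExamples", attr.value = vector("list", 4)),
                error = function(e) conditionMessage(e))
```

The subclass inherits both the max.len prototype and the validity function,
so the limit is enforced without repeating the check in each concrete class.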

Martin Morgan


user  system elapsed
  16.837   0.108  18.244

 > system.time({for (i in 1:100) data[[1]]@attr.value[1] = 2 })
user  system elapsed
   0.024   0.000   0.026




2011/9/10 Martin Morgan <mtmor...@fhcrc.org>:

On 09/10/2011 08:08 AM, André Rossi wrote:

Hi everybody!

I'm creating an object of a S4 class that has two slots:
ListExamples, which
is a list, and idx, which is an integer (as the code below).

Then, I read a data.frame file with 10000 (ten thousand) lines and 10
columns, do some pre-processing and, basically, I store each
line as an element of a list in the slot ListExamples of the S4 object.
However, many operations after this take a considerable time.

Can anyone explain to me why this happens? Is it possible to
speed up a script that deals with a large amount of data (it might be
a data.frame or a list)?

Thank you,

André Rossi

setClass("Buffer",
 representation=representation(
 Listexamples = "list",
 idx = "integer"
 )
)


Hi André,

Can you provide a simpler and more reproducible example, for instance

 > setClass("Buf", representation=representation(lst="list"))
[1] "Buf"
 > b=

Re: [R] Writting excel files

2011-09-12 Thread Marc Schwartz
On Sep 12, 2011, at 10:28 AM, Damian Abalo wrote:

> Hello.
> I need to generate, using R code, an excel file with multiple sheets,
> I wonder if any of you know how to do so.
> 
> Thanks for the help


See the following:

R Data Import/Export Manual: http://cran.r-project.org/doc/manuals/R-data.html

R Wiki: http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows

HTH,

Marc Schwartz



Re: [R] envfit vector labels with ordiplot3d

2011-09-12 Thread Gavin Simpson
On Mon, 2011-09-12 at 03:24 -0700, Briony wrote:
> Thank you very much for the suggestion. And while I'm here, thank you for
> vegan and the documentation that goes with it.
> 
> >Function ordiplot3d uses scatterplot3d, and it returns also all
> >scatterplot3d items, like functions xyz.converter and points3d that
> >can be used for tuning labels.
> 
> I tried ordilabel(pl$arrows) but the labels only seem to be in two
> dimensions.

It *really* does help to read the documentation for functions. Like most
of the functions in vegan that extract information from ordinations,
ordilabel() extracts information for "axes" 1 and 2 only, by default.

Does

ordilabel(pl$arrows, choices = 1:3)

work for you?

G

> >With ordixyplot I can see no other choice than that you edit the
> >function and preferably contribute your edited function to vegan
> >(and will be credited with the function help).
> 
> I'm not a skilled enough user of R to edit the ordixyplot function - so I'll
> pass on that invitation to anyone else who reads this thread?
> 
> Thanks again,
> Briony 
> 
> Briony  gmail.com> writes:
> 
> >
> > Hi R experts,
> >
> > I'm looking for some help with plotting vectors from envfit in vegan, onto
> > a
> > 3d plot using ordiplot3d. So far I have
> >
> > data.mds <- metaMDS(data, k=3,trace = FALSE)
> > vect_data<-envfit(data.mds,vegdata[,3:21],choices=1:3,permu=)
> > ordiplot3d(data.mds,envfit=vect_data)
> > ordixyplot(data.mds,pch=pts,envfit=vect_data)
> >
> > (my data's not really called data, I thought it might be easier to
> > communicate this way)
> >
> > These display the vectors as arrows, but what I would really like is for
> > the
> > arrows to be labelled, like what comes up automatically in ordirgl or with
> > a
> > 2D ordiplot.
> >
> > I've gone through the help and tried everything I can work out, but I must
> > be missing something important, because nothing's worked so far. I would
> > be
> > happy to use ordixyplot and show a series of 2D plots, but I can't get
> > labels on those arrows either.
> >
> > Any pointers in the right direction would be gratefully received.
> > Briony
> 
> Briony,
> 
> There really is no way to do this automatically, but if someone fixes the
> functions, we are happy to incorporate those changes in vegan.
> 
> You may be able to achieve something like that with ordiplot3d, but I am
> not sure it looks completely satisfactory. Function ordiplot3d invisibly
> returns the plotting object, which contains, among other items, the
> coordinates of arrow heads in the flattened graph. So this could work:
> 
> pl <- ordiplot3d(data.mds,envfit=vect_data)
> ordilabel(pl$arrows)
> 
> Function ordiplot3d uses scatterplot3d, and it returns also all
> scatterplot3d items, like functions xyz.converter and points3d that
> can be used for tuning labels.
> 
> With ordixyplot I can see no other choice than that you edit the
> function and preferably contribute your edited function to vegan
> (and will be credited with the function help).
> 
> Cheers, Jari Oksanen
> 
> __
> [hidden email] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code. 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/envfit-vector-labels-with-ordiplot3d-tp3800669p3807015.html
> Sent from the R help mailing list archive at Nabble.com.
> 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



[R] Writting excel files

2011-09-12 Thread Damian Abalo
Hello.
I need to generate, using R code, an excel file with multiple sheets,
I wonder if any of you know how to do so.

Thanks for the help



Re: [R] Very slow using S4 classes

2011-09-12 Thread André Rossi
Dear Martin Morgan and Martin Maechler...

Here is an example of the computational time when a slot of a S4 class is of
another S4 class and when it is just one object. I'm sending you the data
file.

Thank you!

Best regards,

André Rossi



setClass("SupervisedExample",
representation(
attr.value = "ANY",
target.value = "ANY"
))

setClass("StreamBuffer",
representation=representation(
examples = "list", #SupervisedExample
max.length = "integer"
),
prototype=list(
max.length = as.integer(1)
)
)

b <- new("StreamBuffer")

load("~/Dropbox/dataList2.RData")

b@examples <- data #data is a list of SupervisedExample class.

> system.time({for (i in 1:100) b@examples[[1]]@attr.value[1] = 2 })
   user  system elapsed
 16.837   0.108  18.244

> system.time({for (i in 1:100) data[[1]]@attr.value[1] = 2 })
   user  system elapsed
  0.024   0.000   0.026




2011/9/10 Martin Morgan 

> On 09/10/2011 08:08 AM, André Rossi wrote:
>
>> Hi everybody!
>>
>> I'm creating an object of a S4 class that has two slots: ListExamples,
>> which
>> is a list, and idx, which is an integer (as the code below).
>>
>> Then, I read a data.frame file with 10000 (ten thousand) lines and 10
>> columns, do some pre-processing and, basically, I store each line as an
>> element of a list in the slot ListExamples of the S4 object. However, many
>> operations after this take a considerable time.
>>
>> Can anyone explain to me why this happens? Is it possible to speed up a
>> script that deals with a large amount of data (it might be a data.frame
>> or a list)?
>>
>> Thank you,
>>
>> André Rossi
>>
>> setClass("Buffer",
>> representation=representation(
>> Listexamples = "list",
>> idx = "integer"
>> )
>> )
>>
>
> Hi André,
>
> Can you provide a simpler and more reproducible example, for instance
>
> > setClass("Buf", representation=representation(lst="list"))
> [1] "Buf"
> > b=new("Buf", lst=replicate(1, list(10), simplify=FALSE))
> > system.time({ b@lst[[1]][[1]] = 2 })
>   user  system elapsed
>  0.005   0.000   0.005
>
> Generally it sounds like you're modeling the rows as elements of
> Listofelements, but you're better served by modeling the columns (lst =
> replicate(10, integer(1)), if all of your 10 columns were
> integer-valued, for instance). Also, S4 is providing some measure of type
> safety, and you're undermining that by having your class contain a 'list'.
> I'd go after
>
> setClass("Buffer",
> representation=representation(
>   col1="integer",
>   col2="character",
>   col3="numeric"
>   ## etc.
>   ),
> validity=function(object) {
> nms <- slotNames(object)
> len <- sapply(nms, function(nm) length(slot(object, nm)))
> if (1L != length(unique(len)))
> "slots must all be of same length"
> else TRUE
> })
>
> Buffer <-
>function(col1, col2, col3, ...)
> {
>new("Buffer", col1=col1, col2=col2, col3=col3, ...)
> }
>
> Let's see where the inefficiencies are before deciding that this is an S4
> issue.
>
> Martin
>
>
>
>>
>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
>


Re: [R] Hourly data with zoo

2011-09-12 Thread Gabor Grothendieck
On Mon, Sep 12, 2011 at 11:57 AM, steven mosher  wrote:
> Gabor.. thanks.
>
> zr <- zooreg(rnorm(24), as.chron("2011-01-01"), frequency = 24)
>
> a couple issues:  my date data  has missing days and missing hours..
> Sorry if I was not clear on
> that.. I input it to a data frame and dates are of the form  20110101
> and hours are in the format
> 0,100,200
>
> The end goal is to create a data structure for around 200 series aligned by 
> time

Try this:

v <- df$data
z <- zoo(v, as.chron(d, format = "%Y%m%d") + h / 2400)
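
The h / 2400 term works because hours coded as 0, 100, ..., 2300 become
fractions of a day when divided by 2400. As a base R sketch of the same
time-stamp construction, using POSIXct instead of chron and zoo (so this is a
swapped-in technique, not the code above), with a made-up data frame whose
column names are assumptions:

```r
# Hypothetical stand-in for the poster's data frame
df <- data.frame(date = c(20110101, 20110101, 20110103),
                 hour = c(0, 100, 2300),
                 data = c(1.5, 2.5, 3.5))

# Hours coded 0, 100, ..., 2300 are fractions of a day after dividing by 2400
day_fraction <- df$hour / 2400

# Paste date and hour into one string, then parse it as a time stamp
stamp <- as.POSIXct(sprintf("%08d %04d",
                            as.integer(df$date), as.integer(df$hour)),
                    format = "%Y%m%d %H%M", tz = "GMT")

# Regular hourly grid, so missing days and hours show up as NA after merging
grid    <- data.frame(stamp = seq(min(stamp), max(stamp), by = "hour"))
aligned <- merge(grid, data.frame(stamp = stamp, data = df$data),
                 all.x = TRUE)
```

Merging every series onto the same hourly grid is one way to align many
series by time when each has its own gaps.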



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



[R] Loops on data˜1

2011-09-12 Thread Trying To learn again
Hi all,

I have a time series stored as a column vector, ordered so that the first
element is the first observation and so on.

I want to run a regression with only an intercept.

My first task is to run the regression on the first observation (1 of 276)
and, at the same time, the same type of regression on the remaining 275.

Then I want to run a regression on the first 2 observations and another
on the remaining 274 observations,

...

The final step of the loop would be to run one regression on the first 275
observations and one on the last observation.

I want to save each pair of regressions made in each iteration.

I have seen that the regression I want is

data ~ 1

But how should I use mapply or sapply to avoid using loops?

Thanks in advance!

I hope someone can send me similar examples or documents so that I can try
to do it on my own.
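
A minimal sketch of one way to set this up with lapply and sapply. The series
y below is simulated as a stand-in for the 276 observations, and the names
(fits, means, first, rest) are illustrative. The scan() call shows how values
written with decimal commas, like the attached ones, can be read.

```r
# Decimal commas: the first three values of the attached series
y_head <- scan(text = "0,044018608 -0,110930705 -0,004672806",
               dec = ",", quiet = TRUE)

set.seed(1)
y <- rnorm(276)            # simulated stand-in for the real series
n <- length(y)

# For each split point k: intercept-only fits on 1..k and on (k+1)..n
fits <- lapply(seq_len(n - 1), function(k) {
    list(first = lm(y[1:k] ~ 1),
         rest  = lm(y[(k + 1):n] ~ 1))
})

# Collect the two intercepts per split into a 275 x 2 matrix
means <- t(sapply(fits, function(p) {
    c(first = unname(coef(p$first)), rest = unname(coef(p$rest)))
}))
```

Each element of fits holds one pair of fitted models, so fits[[10]]$first is
the regression on the first 10 observations. An intercept-only regression
estimates the mean, so the intercepts equal the means of the two parts.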

I attach the data.
0,044018608
-0,110930705
-0,004672806
0,036287839
0,076363838
0,184673507
0,054624796
0,004673399
-0,350369342
-0,018010741
0,04551358
0,103917907
0,003625427
0,053179293
-0,033782013
0,045388376
0,046058974
-0,026597853
-0,043024922
0,000107398
0,023316523
-0,025215586
-0,022010788
0,032180548
-0,046694162
0,04754586
0,04477713
0,050400827
-0,006016904
-0,009683332
0,060819249
0,02778801
-0,066637611
-0,015145689
-0,024107075
-0,054245015
-0,028267407
-0,106028435
0,058280299
0,048688903
0,03650455
-0,011679631
-0,118281484
-0,20053834
0,111253729
-0,006097049
-0,017805671
0,03705825
0,116787459
0,061440659
-0,012156779
0,038099168
-0,039006922
-0,016260521
0,021109252
0,022975469
-0,023373208
-0,05750481
-0,002784887
0,040829062
0,052226191
-0,038081497
-0,025343725
0,048157965
-0,09697759
-0,109103964
-0,057933746
-0,061468063
0,037968306
0,104264106
0,0007851
0,075840568
0,018310751
0,031504956
0,010294806
0,046695997
0,014182633
0,043380758
0,119688009
-0,055959275
0,082225757
-0,050393069
0,097278802
0,096262265
-0,06241493
-0,069361955
0,009030263
-0,002314697
-0,092980997
0,049805643
-0,008103107
-0,04925431
0,005615955
0,028565833
-0,062582675
-0,015542748
-0,005118238
-0,031176554
0,053190392
0,042622007
-0,002308574
0,046831888
0,017443516
-0,016442058
-0,029890565
0,068621158
0,033791768
0,02816927
0,04733688
-0,015027352
0,056886729
0,00701932
0,036404299
-0,069112649
0,011523185
0,046143139
0,021396424
0,080506631
0,099234631
0,032944569
-0,004136244
0,022154347
0,085647286
0,064147326
0,088601439
-0,010758421
-0,041759107
0,106954082
-0,130488552
0,082933574
0,045583222
0,092556088
0,111759282
0,137223967
-0,01813
-0,001992872
0,013962033
0,033656114
-0,238781644
-0,073829719
0,136588007
0,091739764
0,019618638
0,004280924
0,01192401
-0,026002073
0,023809079
0,00966702
0,014420507
-0,084361973
0,043157025
-0,029042726
0,022433195
0,117683799
0,060488801
-0,071776846
0,14977833
-0,053093935
-0,039923432
-0,070383431
-0,010080105
-0,004708031
0,032977872
0,005981323
-0,055088036
-0,11747309
-0,011427573
0,104767572
-0,057430588
-0,025781265
0,047488379
-0,027029374
-0,06774412
-0,045910911
-0,018915992
-0,129004138
0,061033102
0,073197048
0,003925481
-0,042224171
0,010515422
0,013939634
-0,011619177
-0,025398309
-0,139755653
-0,100934239
0,029391158
-0,169608455
0,122474857
0,085228942
-0,102065325
-0,014886045
0,008654874
-0,021719655
0,100245677
0,002923529
0,05288993
0,028686871
0,006999252
-0,059040373
0,06159641
0,017105131
0,064693628
0,02460056
0,039500045
-0,028438984
0,011334728
-0,018695158
0,014840398
-0,019878636
-0,00630829
0,020090369
0,047323011
0,032110196
0,04364419
0,015635647
0,017953842
-0,014177333
-0,028172111
0,046186181
0,037078109
0,033412158
-0,01060409
0,077357646
-0,030047737
0,006080317
0,016542034
0,033925463
0,055729017
0,009629233
0,003217278
-0,04752756
0,018140532
0,023102872
0,027269068
0,063020762
0,06134336
0,006977708
0,021232554
0,028343657
-0,021166281
0,027229002
-0,018410862
0,064309793
-0,028948397
-0,006034826
-0,022034755
0,006656068
0,086310803
-0,008252707
-0,037338464
-0,137718886
-0,004439502
0,0074586
0,039114909
-0,014409429
-0,12138671
-0,013783522
-0,014753157
-0,063454316
-0,186727152
-0,022789532
0,031505279
-0,084533081
-0,103319305
0,025150491
0,145392947
0,041853549
0,037845252
0,103498303
0,045912241
0,033824999
-0,029461496
0,019940382
0,025042967
-0,097638684
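
One way to approach the split-sample fits described above, as a minimal
sketch rather than a definitive answer: lapply() over the split point k and
fit an intercept-only lm() on each part. The simulated y is a stand-in for
the attached series (an assumption; the real values use decimal commas and
could be read with something like scan(file, dec = ",")), and the names
fits and intercepts are illustrative.

```r
set.seed(1)
y <- rnorm(276)          # stand-in for the 276 attached observations
n <- length(y)

# For each split point k, fit one model on y[1:k] and one on y[(k+1):n].
fits <- lapply(seq_len(n - 1), function(k) {
  list(head = lm(y[1:k] ~ 1),
       tail = lm(y[(k + 1):n] ~ 1))
})

# Collect the two intercepts of every pair into a 275 x 2 matrix.
intercepts <- t(sapply(fits, function(f) {
  c(head = unname(coef(f$head)), tail = unname(coef(f$tail)))
}))
head(intercepts)
```

Each element of fits holds the pair of models for one split. Because an
intercept-only fit just estimates the mean, intercepts contains the means
of the two parts for every split.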


Re: [R] Error message for .csv file

2011-09-12 Thread Sarah Goslee
Hi,

On Mon, Sep 12, 2011 at 11:07 AM, vkent  wrote:
> I would be grateful if anyone could tell me what the error message:
>
> Error in NN[i, 1:length(od)] <- od : subscript out of bounds

It means that either i ends up being larger than the number of rows in
NN, or that length(od) ends up being larger than the number of
columns.

More than that we can't tell you without more information, like, say,
a reproducible example, or at least some information like
str(NN)
dim(NN)
length(od)
and where i comes from.

How you got the csv file into R might also be relevant, or not.
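
As an illustration of the second case, a minimal sketch with made-up
objects (the real NN and od live inside SPACECAP and are not shown in the
post): assigning a vector that is longer than the number of columns
reproduces the message.

```r
# Hypothetical stand-ins for the objects named in the error message.
NN <- matrix(NA_real_, nrow = 3, ncol = 2)  # only 2 columns
od <- c(10, 20, 30)                         # 3 values to store
i <- 1

msg <- tryCatch({
  NN[i, 1:length(od)] <- od   # column 3 does not exist
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Checks that would reveal the mismatch before assigning:
dim(NN)
length(od) <= ncol(NN)
```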

> means for a large .csv file containing gps coordinates. I am using package
> SPACECAP and have successfully run it with other .csv files but now keep
> getting this error message. I can't see anything wrong with the file which
> is simply three columns of data.
>
> I am relatively new to R and am at a loss as to what I can do to correct
> this.
>
> Can anyone help?
>
> Thanks
>
> Vivien
>


-- 
Sarah Goslee
http://www.functionaldiversity.org



[R] 1 not equal to 1, and rep command

2011-09-12 Thread Benjamin Høyer
Hi

I need to use rep() to get a vector out, but I have spotted something very
strange.  See the reproducible example below.

N <- 79
seg <- 5
segN <- N / seg   # = 15.8

d1 <- seg - ( segN - floor(segN) ) * seg
d1# = 1

rep(2, d1)  # = numeric(0), strange - why doesn't it print one "2"?
rep(2, 1)# 2, ok
rep(2, d1 / 1,1)   # 2, this does work
rep(2, d1 + 2)# "2 2" - also works but...
d1 + 2  # = 3! so why does it print two 2s above?

d1 == 1 # FALSE
all.equal(d1, 1)   # TRUE
identical(d1, 1)   # FALSE

Try something else...

d2 <- 4 - ( (79/4) - floor(79/4))* 4
d2 # = 1

rep(2, d2)  # 2 : this works!

d2 == 1 # TRUE
all.equal(d2, 1)   # TRUE
identical(d2, 1)   # TRUE

#version info
platform   x86_64-pc-linux-gnu
arch   x86_64
os linux-gnu
system x86_64, linux-gnu
status
major  2
minor  10.1
year   2009
month  12
day14
svn rev50720
language   R
version.string R version 2.10.1 (2009-12-14)


Seems like there's some binary maths errors here somewhere.  Very strange to
me.  Anyway, I need to be able to use the result d1 in a rep() command.  Any
way to force rep not to be *too* specific in how it reads its "times"
argument?
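
A sketch of what is probably happening, assuming standard IEEE double
arithmetic: 79/5 cannot be stored exactly, so d1 ends up a hair below 1,
and rep() truncates a non-integer "times" toward zero. Rounding before the
call is one way around it (whether round() or a tolerance check fits
better depends on the application).

```r
N <- 79
seg <- 5
segN <- N / seg
d1 <- seg - (segN - floor(segN)) * seg

print(d1, digits = 17)     # shows the digits hidden by the default 7-digit print
d1 == 1                    # exact comparison fails
isTRUE(all.equal(d1, 1))   # comparison with a tolerance succeeds

# Round to the nearest whole number before using it as a count.
times <- round(d1)
rep(2, times)
```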

Thanks in advance,
Benjamin Hoyer

DTU MSc Mathematical Modelling (2012) / Novozymes student helper



[R] kernel weight

2011-09-12 Thread Soberon Velez, Alexandra Pilar
Hello dear members,



I need to calculate a local linear regression "by hand", so I need to compute
kernel weights.



Does somebody know how to get a kernel to use as weights? I can calculate a
kernel density function and then pre-multiply it by the sample size. However, I
know this is not what I have to do.
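
A minimal sketch of a local linear fit "by hand", assuming a Gaussian
kernel and a fixed bandwidth h (both are illustrative choices, as are the
simulated data and the helper name local_linear): the kernel weight of
observation i at the evaluation point x0 is K((x_i - x0) / h), and these
weights go straight into a weighted least-squares fit.

```r
set.seed(42)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)
h <- 0.5                                  # bandwidth (assumed, not tuned)

# Local linear estimate of E[y | x = x0].
local_linear <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)                # Gaussian kernel weights
  fit <- lm(y ~ I(x - x0), weights = w)   # weighted least squares around x0
  unname(coef(fit)[1])                    # the intercept is the fitted value at x0
}

grid <- seq(1, 9, by = 0.5)
m_hat <- sapply(grid, local_linear, x = x, y = y, h = h)
cbind(grid, m_hat, truth = sin(grid))
```

Any other kernel can be swapped in by replacing dnorm(); the weights do
not need to be normalised, since lm() only uses their relative sizes.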



Please, can somebody help me?



Thanks a lot.

Alexandra



[R] Error message for .csv file

2011-09-12 Thread vkent
I would be grateful if anyone could tell me what the error message: 

Error in NN[i, 1:length(od)] <- od : subscript out of bounds

means for a large .csv file containing gps coordinates. I am using package
SPACECAP and have successfully run it with other .csv files but now keep
getting this error message. I can't see anything wrong with the file which
is simply three columns of data.

I am relatively new to R and am at a loss as to what I can do to correct
this.

Can anyone help?

Thanks

Vivien



Re: [R] Hourly data with zoo

2011-09-12 Thread steven mosher
Gabor.. thanks.

zr <- zooreg(rnorm(24), as.chron("2011-01-01"), frequency = 24)

A couple of issues: my date data has missing days and missing hours.
Sorry if I was not clear on that. I read it into a data frame; dates are
of the form 20110101 and hours are in the format 0, 100, 200, ...

The end goal is to create a data structure for around 200 series aligned by time
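
For dates stored as numbers like 20110101 and hours like 0, 100, ...,
2300, a sketch of one way to build the time index without assuming a
complete hourly grid. It uses base R date-times instead of chron (a
swapped-in choice, not the approach from the thread), the toy data frame
is made up, and the zoo step only runs if the zoo package is installed.

```r
# Toy data with a missing day and missing hours.
df <- data.frame(LST_DATE = c(20110101, 20110101, 20110103),
                 LST_TIME = c(0, 100, 2300),
                 data     = c(1.5, 2.5, 3.5))

# Pad the hour to 4 digits so that 100 becomes "0100", then parse.
stamp <- sprintf("%08d %04d", as.integer(df$LST_DATE), as.integer(df$LST_TIME))
tm <- as.POSIXct(stamp, format = "%Y%m%d %H%M", tz = "UTC")
tm

if (requireNamespace("zoo", quietly = TRUE)) {
  z <- zoo::zoo(df$data, order.by = tm)   # irregular series; gaps stay gaps
  print(z)
}
```

Several such series could then be aligned on their time stamps with
merge(), which zoo provides for zoo objects.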

On Mon, Sep 12, 2011 at 3:48 AM, Gabor Grothendieck
 wrote:
> On Mon, Sep 12, 2011 at 1:58 AM, steven mosher  wrote:
>> I have date data as a numeric and hourly data in 0 to 2300 hours in a 
>> dataframe.
>>
>> d  <-  rep(20110101,24)
>> h  <-  seq(from =  0, to  =  2300, by  = 100)
>>
>> df  <-  data.frame(LST_DATE  =  d,  LST_TIME  =  h,  data  =  rnorm(24, 0, 
>> 1))
>>
>> S  <-  chron(dates. = as.character(df$LST_DATE), times. =
>> paste(as.character(df$LST_TIME/100), ":0:0", sep  = ""),
>>           format  = c(dates  =  "Ymd",  times =  "h:m:s"))
>> X  <-  zoo(df$data, order.by = S)
>>
>> And I want to create a regular zoo series,  The above works but its
>> pretty ugly. Is there a more elegant way to do this.
>
> You probably want to create a zooreg object:
>
> library(zoo)
> library(chron)
>
> zr <- zooreg(rnorm(24), as.chron("2011-01-01"), frequency = 24)
>
> although if you really do want a zoo object that is not a zooreg
> object then you can do it like this:
>
> z <- as.zoo(zr)
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>



[R] Difference in function arima estimation between 2.11.1 and R 2.12.2

2011-09-12 Thread Luis Felipe Parra
Hello , I have estimated the following model, a sarima:

p=9
d=1
q=2
P=0
D=1
Q=1
S=12


In R 2.12.2
Call:
arima(x = xdata, order = c(p, d, q),
      seasonal = list(order = c(P, D, Q), period = S),
      optim.control = list(reltol = tol))

Coefficients:
         ar1     ar2      ar3     ar4     ar5     ar6      ar7      ar8     ar9
      0.3152  0.8762  -0.4413  0.0152  0.1500  0.0001  -0.0413  -0.1811  0.0646
s.e.  0.0865  0.0885   0.1141  0.1181  0.1196  0.1220   0.1120   0.0908  0.0865
          ma1      ma2     sma1
      -0.0221  -0.9779  -0.7635
s.e.   0.0539   0.0534   0.0834

sigma^2 estimated as 1.965e+17:  log likelihood = -3316.07,  aic = 6658.13


and in R 2.11.1
Call:
arima(x = xdata, order = c(p, d, q),
      seasonal = list(order = c(P, D, Q), period = S),
      optim.control = list(reltol = tol))

Coefficients:
         ar1     ar2      ar3     ar4     ar5    ar6      ar7      ar8     ar9
      0.3152  0.8761  -0.4413  0.0153  0.1500  0.000  -0.0413  -0.1810  0.0646
s.e.  0.0865  0.0885   0.1141  0.1181  0.1196  0.122   0.1120   0.0908  0.0865
          ma1      ma2     sma1
      -0.0221  -0.9779  -0.7635
s.e.   0.0539   0.0534   0.0834

sigma^2 estimated as 1.965e+17:  log likelihood = -3316.07,  aic = 6658.13

As you can see in the results, some coefficients (for example ar2 and
ar8) differ between the two R versions, although only in the fourth decimal
place. Does anybody know what might be going on? Was there any change in
the arima function between the two versions?
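
To put the size of the discrepancy in context, a small piece of
arithmetic on the numbers printed above (nothing here explains the cause;
it only compares the reported differences with the reported standard
errors):

```r
# Coefficients that differ between the two printouts above.
new <- c(ar2 = 0.8762, ar4 = 0.0152, ar6 = 0.0001, ar8 = -0.1811)  # R 2.12.2
old <- c(ar2 = 0.8761, ar4 = 0.0153, ar6 = 0.0000, ar8 = -0.1810)  # R 2.11.1
se  <- c(ar2 = 0.0885, ar4 = 0.1181, ar6 = 0.1220, ar8 = 0.0908)

abs_diff <- abs(new - old)
abs_diff                 # each difference is about 1e-4
round(abs_diff / se, 4)  # difference as a fraction of the standard error
```

Differences of this size are far smaller than the standard errors, and
sigma^2, the log likelihood and the AIC are identical in both printouts.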

Thank you

Felipe Parra



[R] Superimposing titles on dotcharts

2011-09-12 Thread Mikkel Grum
I've created a chart with times that employees have entered data on named tasks 
as in the following example:

Employee <- c(rep("Tom", 127),
              rep("Dick", 121),
              rep("Sally", 130))
Time <- c(seq(as.POSIXct("2011-09-12 07:00:00"),
              as.POSIXct("2011-09-12 14:00:00"), by = 200),
          seq(as.POSIXct("2011-09-12 07:00:00"),
              as.POSIXct("2011-09-12 14:00:00"), by = 210),
          seq(as.POSIXct("2011-09-12 07:05:00"),
              as.POSIXct("2011-09-12 13:55:00"), by = 190))
Task <- c(rep("Task A", 56), rep("Task B", 27), rep("Task C", 44),
          rep("Task A", 22), rep("Task D", 99),
          rep("Task B", 44), rep("Task E", 26), rep("Task F", 38),
          rep("Task G", 22))
Schedule <- data.frame(Employee, Time, Task)

require(lattice)
ticks = seq(as.POSIXct("2011-09-12 06:30:00"),
            as.POSIXct("2011-09-12 14:30:00"), by = '30 min')
marks = c("", "07:00", "", "08:00", "", "09:00", "",
          "10:00", "", "11:00", "", "12:00", "", "13:00", "", "14:00", "")
dotplot(Employee ~ Time, group = Task, data = Schedule, xlab = "",
        horizontal = TRUE,
        scales = list(x = list(at = ticks, labels = marks, cex = 0.4),
                      cex = 0.5),
        cex = 0.5)

I would like to label the tasks in the chart, i.e. have a left aligned label 
above each new task. This would mean plotting the data from 

aggregate(Schedule$Time, by = list(Schedule$Employee, Schedule$Task), min)

How do I do this? A legend becomes unwieldy when there are many tasks and 
employees.
Any help would greatly be appreciated.

Regards
Mikkel




Re: [R] barplot in hexagram layout

2011-09-12 Thread Schatzi
I'm not sure this is the right location (maybe R-devel would be better). 

-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein


[R] barplot in hexagram layout

2011-09-12 Thread Schatzi
dev.new(width=6, height=1.5,mar=c(0,0,0,0))
par(mfrow=c(1,1),mar=c(.5, .5, 1.5, .5), oma=c(.4, 0,.5, 0))
barplot(c(1,1,1,1,1,1),col=c("blue","purple","red","green","orange","yellow"),
axes = FALSE)

I have a barplot that returns six colors in a line. I would like to get the
same six color blocks in a hexagram layout (if it were a clock, the blocks
would be at 12, 2, 4, 6, 8 and 10 o'clock).

Is this possible with R? I do not know java or c++ to make this with GUI, so
I have been doing it in R instead and it has worked great, except now I need
to change the layout just a bit.
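
A minimal sketch of the clock-style layout with base graphics, assuming
plain coloured squares are an acceptable replacement for the bars (the
square size of 0.3 and the variable names are illustrative): the six
centres are placed on a unit circle, starting at 12 o'clock and moving
clockwise in steps of 60 degrees.

```r
cols <- c("blue", "purple", "red", "green", "orange", "yellow")

# Angles for 12, 2, 4, 6, 8 and 10 o'clock (clockwise from the top).
angle <- pi / 2 - (0:5) * pi / 3
cx <- cos(angle)
cy <- sin(angle)

half <- 0.3   # half the side length of each block
plot(NA, xlim = c(-1.5, 1.5), ylim = c(-1.5, 1.5), asp = 1,
     axes = FALSE, xlab = "", ylab = "")
rect(cx - half, cy - half, cx + half, cy + half, col = cols, border = NA)
```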

Thank you,

Adele


-
In theory, practice and theory are the same. In practice, they are not - Albert 
Einstein


Re: [R] On-line machine learning packages?

2011-09-12 Thread Paul Hiemstra
 On 09/11/2011 03:42 PM, Jay wrote:
> What R packages are available for performing classification tasks?
> That is, when the predictor has done its job on the dataset (based on
> the training set and a range of variables), feedback about the true
> label will be available and this information should be integrated for
> the next classification round.
>
> //Jay
>

http://lmgtfy.com/?q=R+machine+learning

Paul

-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770



Re: [R] Automated generation of combinations

2011-09-12 Thread Andrej Blejec
Try this

> ltr <- LETTERS[1:3]
> unique(apply(expand.grid(ltr, ltr, ltr), 1, function(x)
+   paste("Var", unique(sort(x)), collapse = "+", sep = "")))
[1] "VarA"           "VarA+VarB"      "VarA+VarC"      "VarA+VarB+VarC"
[5] "VarB"           "VarB+VarC"      "VarC"
>

Andrej

--
Andrej Blejec
National Institute of Biology
Vecna pot 111 POB 141, SI-1000 Ljubljana, SLOVENIA
e-mail: andrej.ble...@nib.si
URL: http://ablejec.nib.si
tel: + 386 (0)59 232 789
fax: + 386 1 241 29 80
--
Organizer:
Applied Statistics 2011 conference
http://conferences.nib.si/AS2011


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Santiago Guallar
> Sent: Monday, September 12, 2011 2:45 PM
> To: r-help@r-project.org
> Subject: [R] Automated generation of combinations
>
> Hello,
>
> I'd like to generate automatically all the possible combinations of a set of
> 8 variables (there are 535, too many to do it by hand). For example:
>
> input: varA, varB, varC
> output: varA+varB+varC
> varA+varB
> varA+varC
> varB+varC
> varA
> varB
> varC
> Is there any function that produces this option?
>
> Thank you
>   [[alternative HTML version deleted]]



Re: [R] findFreqTerms vs minDocFreq in Package 'tm'

2011-09-12 Thread Bettina Gruen

On 09/12/2011 04:28 PM, vioravis wrote:

I am using the 'tm' package for text mining and am facing an issue with
finding the frequently occurring terms. From the definitions it appears that
findFreqTerms and minDocFreq are equivalent commands and that both try to
identify the documents with terms appearing more often than a specified
threshold. However, I am getting drastically different results from the two.
I have given the results from both commands below:

findFreqTerms identifies 3140 words that appear more than 5 times, but
minDocFreq identifies only 659 terms. Can someone please explain the reason
for the difference, or whether I have misunderstood their definitions?


From the help page of termFreq:

‘minDocFreq’ An integer value. Words that appear less often
  in ‘doc’ than this number are discarded. Defaults to ‘1’
  (i.e., every token will be used).

The description for findFreqTerms states:

Find frequent terms in a term-document matrix.

So minDocFreq assesses how often a word appears in a document in order to 
decide if it should be included in the frequency vector of words for this 
document.

By contrast findFreqTerms focuses on the document-term matrix and determines 
how often the word occurs in the matrix. So in fact the whole corpus is used to 
decide on the frequency and if the word should be included or not.

Because one function uses the frequency of words in a single document, while
the other uses the frequency of words in the whole document-term matrix, they
are not equivalent commands. Your results indicate that 3140 words occur at
least 5 times in the whole corpus, i.e., when summing over all documents. By
contrast, 659 words occur at least 5 times in one single document.
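
The distinction can be sketched with a toy count matrix in base R. This
only mimics the two counting rules described above; it does not call tm
and makes no claim about tm's internals. The matrix and the names
corpus_wide and per_document are made up for illustration.

```r
# Rows are documents, columns are terms; entries are term counts.
# matrix() fills column by column, so each triple is one term's counts.
dtm <- matrix(c(5, 0, 0,    # alpha: 5 times in doc1 only
                2, 2, 2,    # beta:  2 times in every document
                1, 0, 1),   # gamma: rare everywhere
              nrow = 3,
              dimnames = list(Docs  = c("doc1", "doc2", "doc3"),
                              Terms = c("alpha", "beta", "gamma")))
threshold <- 5

# Corpus-wide rule: sum the counts over all documents.
corpus_wide <- colnames(dtm)[colSums(dtm) >= threshold]

# Per-document rule: the count must reach the threshold within one document.
per_document <- colnames(dtm)[apply(dtm, 2, max) >= threshold]

corpus_wide
per_document
```

Here "beta" passes the corpus-wide rule (2 + 2 + 2 = 6) but never reaches
5 inside a single document, which is the kind of term that makes the two
counts differ.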

HTH,
Bettina


--
---
Bettina Grün
Institut für Angewandte Statistik / IFAS
Johannes Kepler Universität Linz
Altenbergerstraße 69
4040 Linz, Austria

Tel: +43 732 2468-6829
Fax: +43 732 2468-6800
E-Mail: bettina.gr...@jku.at
www.ifas.jku.at



Re: [R] number of repetition

2011-09-12 Thread Duncan Murdoch

On 12/09/2011 9:03 AM, amir wrote:

Hi,

Is there any function or command in R that shows how many times a
number is repeated in an array?


?table
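
A short sketch of what table() returns, using a made-up vector x:

```r
x <- c(3, 1, 3, 2, 3, 1)

counts <- table(x)   # one count per distinct value
counts

counts[["3"]]        # look up a single value; table names are character
sum(x == 3)          # the same count without building the whole table
```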



[R] number of repetition

2011-09-12 Thread amir

Hi,

Is there any function or command in R that shows how many times a
number is repeated in an array?


Regards,
Amir

--
___
 Amir Darehshoorzadeh |Comp. Architecture Dept.
 PhD Student  |UPC-Campus Nord, C6-221
 Email: a...@ac.upc.edu   |c/ Jordi Girona, 1-3
 Tel: |08034 Barcelona - SPAIN
 http://personals.ac.upc.edu/amir



Re: [R] Multiple t.test

2011-09-12 Thread Mihovil Pletikos
Thank you for your help.

Yes, I wanted to do the t test for all columns except for the grouping
column.
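
For reference, a self-contained sketch of that approach on the example
data from the thread (the names vars and pvals are illustrative; with
three observations per group the p-values are only a demonstration):

```r
example <- data.frame(age     = c(1, 2, 3, 4, 5, 6),
                      height  = c(100, 110, 120, 130, 140, 150),
                      disease = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))

# All columns except the grouping column.
vars <- setdiff(names(example), "disease")

# One Welch t test per column; keep only the p-value.
pvals <- sapply(example[vars], function(x) {
  t.test(x ~ example$disease)$p.value
})
pvals
```

With around 200 variables the same call returns a named vector of 200
p-values, which would usually call for a multiple-testing adjustment such
as p.adjust().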


2011/9/12 Uwe Ligges 

>
>
> On 12.09.2011 13:16, Raphael Saldanha wrote:
>
>> Hi!
>>
>> Try something like this:
>>
>> subset(example, disease==TRUE)
>> subset(example, disease==FALSE)
>>
>
>
> Hmmm, I think the actual answer to the question is something along this
> line:
>
> sapply(example[names(example) != "disease"],
>   function(x) t.test(x ~ example[["disease"]])[[3]])
>
>
> Uwe Ligges
>
>
>
>
>
>> On Mon, Sep 12, 2011 at 4:54 AM, C.H.  wrote:
>>
>>  Dear R experts,
>>>
>>> Suppose I have an data frame likes this:
>>>
>>> example <- data.frame(age = c(1, 2, 3, 4, 5, 6),
>>>   height = c(100, 110, 120, 130, 140, 150),
>>>   disease = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))
>>>
>>> example
>>>   age height disease
>>> 1   1    100    TRUE
>>> 2   2    110    TRUE
>>> 3   3    120    TRUE
>>> 4   4    130   FALSE
>>> 5   5    140   FALSE
>>> 6   6    150   FALSE
>>>
>>> Is there anyway to compare the age and height between those with
>>> disease=TRUE and disease=FALSE using t.test and extract the p-values
>>> quickly?
>>>
>>> I can do this individually
>>>
>>> t.test(example$age ~ example$disease)[3]
>>>
>>> But when the number of variable grow to something like 200 it is not
>>> easy any more.
>>>
>>> Thanks!
>>>
>>> Regards,
>>>
>>> CH
>>>
>>> --
>>> CH Chan
>>>
>>>
>>>
>>
>>
>>
>



Re: [R] how to get xlab and ylab in bold?

2011-09-12 Thread Uwe Ligges



On 12.09.2011 12:30, Nevil Amos wrote:

A very basic query

This code plots OK: the axis values are in bold, but the axis labels are
not. How do I get them in bold too?


Add

 font.lab=2
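
Applied to the call from the question, that looks like this (a sketch;
everything except font.lab = 2 is copied from the original call):

```r
plot(c(1, 1), xlim = c(0, 450), ylim = c(0.7, 1.4),
     xlab = "Distance (cells) from edge of grid",
     ylab = "Resistance distance",
     type = "l", col = "white", lwd = 2,
     font = 2,        # as in the original call
     font.lab = 2,    # bold axis titles (xlab and ylab)
     family = "sans")
```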

Uwe Ligges




thanks

Nevil Amos

plot(c(1,1),xlim=c(0,450),ylim=c(0.7,1.4),xlab="Distance (cells) from
edge of grid",ylab="Resistance distance",
type="l",col="white",lwd=2,font=2,family='sans')





Re: [R] hclust and cutree: identifying branches as classes

2011-09-12 Thread Peter Langfelder
On Mon, Sep 12, 2011 at 4:59 AM, Laurent Fernandez Soldevila
 wrote:
> Good afternoon,
>
>
> After cutting a hierarchical tree using cutree(), how can we check the
> correspondence between classes and branches?
> This is what we do:
>
> # creation of the hierarchical tree
> srndpchc <- hclust(dist(srndpc$x[1:1000, 1:3]), method = "ward")
> plclust(srndpchc, hmin = 2)  # visualisation
>
> srndpchc2 = cutree(srndpchc, h = 2)  # returns 4 classes
> table(srndpchc2)
>
> # assigning classes to objects
> srndclass2 = cbind(srnd@data[1:1000, ], srndpchc2)
> srndcents2 <- aggregate(srndclass2, by = list(srndpchc2), FUN = median)
> matplot(1:36, t(srndcents2[, -c(1, 38)]))
>
>
> But how can we make sure that, for example, class 1 is the first branch in 
> the tree plotted  by plclust() ?
Hi Laurent,

in the WGCNA package we have a function called plotDendroAndColors
that takes, as the minimum, the hierarchical clustering tree and the
clustering, and plots the tree with colors underneath.

you would use something along the lines of

plotDendroAndColors(srndpchc, srndpchc2,
                    "Classes",             # some label for the classes
                    dendroLabels = FALSE)  # do not plot labels for each leaf

An example of the plot, with several different tree cuts, can be seen, e.g., at

http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/Example-Dendrogram-10.png

HTH,

Peter



Re: [R] Automated generation of combinations

2011-09-12 Thread Dimitris Rizopoulos

one option is the following:

varNames <- c("varA", "varB", "varC", "varD")

f <- function (i) {
combn(length(varNames), i,
function (x) paste(varNames[x], collapse = " + "))
}
lapply(seq_along(varNames), f)
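
A sketch of the same idea flattened into a single character vector and
turned into model formulas, assuming a response called y (the response
name and the eight variable names are hypothetical). For n variables
there are 2^n - 1 non-empty subsets, so 8 variables give 255 combinations
rather than the 535 mentioned in the question.

```r
varNames <- paste0("var", LETTERS[1:8])   # varA ... varH

# All non-empty subsets, each pasted into one right-hand side.
combos <- unlist(lapply(seq_along(varNames), function(i) {
  combn(varNames, i, FUN = paste, collapse = " + ")
}))
length(combos)

# Turn each right-hand side into a formula for a hypothetical response y.
formulas <- lapply(paste("y ~", combos), as.formula)
formulas[[9]]
```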


However, in case you're interested in performing a linear regression 
with these subsets, then have also a look at package leaps 
(http://CRAN.R-project.org/package=leaps).


Best,
Dimitris


On 9/12/2011 11:20 AM, Santiago Guallar wrote:

Hello,

I'd like to generate automatically all the possible combinations of a set of 8 
variables (there are 535, too many to do it by hand). For example:

input: varA, varB, varC
output: varA+varB+varC
 varA+varB
 varA+varC
 varB+varC
 varA
 varB
 varC
Is there any function that produces this option?

Thank you


--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/



[R] hclust and cutree: identifying branches as classes

2011-09-12 Thread Laurent Fernandez Soldevila
Good afternoon,


After cutting a hierarchical tree using cutree(), how can we check the
correspondence between classes and branches?
This is what we do:

# creation of the hierarchical tree
srndpchc <- hclust(dist(srndpc$x[1:1000, 1:3]), method = "ward")
plclust(srndpchc, hmin = 2)  # visualisation

srndpchc2 = cutree(srndpchc, h = 2)  # returns 4 classes
table(srndpchc2)

# assigning classes to objects
srndclass2 = cbind(srnd@data[1:1000, ], srndpchc2)
srndcents2 <- aggregate(srndclass2, by = list(srndpchc2), FUN = median)
matplot(1:36, t(srndcents2[, -c(1, 38)]))


But how can we make sure that, for example, class 1 is the first branch in the 
tree plotted  by plclust() ?
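
One way to check this with base R only, as a sketch on simulated data
(the data, k = 3 and the variable names are assumptions, and
method = "ward.D" is the spelling used by current R versions for what the
code above calls "ward"): the component hc$order lists the observations
in the left-to-right order of the plotted leaves, so indexing the
cutree() result by it shows which class each branch carries.

```r
set.seed(1)
# Three well-separated groups of 20 points each.
x <- rbind(matrix(rnorm(40, mean = 0),  ncol = 2),
           matrix(rnorm(40, mean = 5),  ncol = 2),
           matrix(rnorm(40, mean = 10), ncol = 2))

hc <- hclust(dist(x), method = "ward.D")
cl <- cutree(hc, k = 3)

leaf_classes <- cl[hc$order]          # class of each leaf, left to right
branch_order <- unique(leaf_classes)  # classes in the order they are drawn
branch_order
rle(leaf_classes)$lengths             # number of leaves in each branch

plot(hc, labels = FALSE)
rect.hclust(hc, k = 3)                # boxes around the same three branches
```

If branch_order is, say, 2 1 3, then class 2 is the leftmost branch in
the plot; the class numbers themselves follow the order of the
observations in the data, not the order of the branches.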


Regards
Laurent



[R] PerMANOVA of community data

2011-09-12 Thread Markus Lindh
Hi!

How can I run a PerMANOVA in R comparing treatments in a matrix that looks
something like this:

            Treatment 1   Treatment 2   Treatment 3
Species 1       0.6           0.2           0
Species 2       0             0.7           0.3
Species 3       0             0.5           0

I have 16 different species and 12 treatments.

The treatments come in triplicates, so e.g. I want to compare treatments
1-3 with treatments 4-6.

Please help!
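
A minimal sketch of one possible setup, under two assumptions that are
not stated in the post: that columns 1-3, 4-6, 7-9 and 10-12 are the
replicates of four groups, and that the vegan package is available (the
PerMANOVA step is skipped otherwise). The data are simulated and the
group labels A to D are made up.

```r
set.seed(1)
# Simulated stand-in: 16 species (rows) by 12 treatment columns.
comm <- matrix(runif(16 * 12), nrow = 16,
               dimnames = list(paste("Species", 1:16),
                               paste("Treatment", 1:12)))

samples <- t(comm)   # vegan expects samples in rows, species in columns
groups <- data.frame(group = factor(rep(c("A", "B", "C", "D"), each = 3)))

if (requireNamespace("vegan", quietly = TRUE)) {
  res <- vegan::adonis2(samples ~ group, data = groups,
                        method = "bray", permutations = 999)
  print(res)
}
```

To compare only two groups, the rows of samples and groups can be
subset to those six samples before the call.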



[R] how to get xlab and ylab in bold?

2011-09-12 Thread Nevil Amos

A very basic query

This code plots OK: the axis values are in bold, but the axis labels are
not. How do I get them in bold too?


thanks

Nevil Amos

plot(c(1,1),xlim=c(0,450),ylim=c(0.7,1.4),xlab="Distance (cells) from 
edge of grid",ylab="Resistance distance", 
type="l",col="white",lwd=2,font=2,family='sans')




Re: [R] NMDS plot and Adonis (PerMANOVA) of community composition with presence absence and relative intensity

2011-09-12 Thread Markus Lindh
How can I display a heatmap.2 with a column dendrogram without reordering
either the columns or the rows?

library(vegan)
dissimilaritymatrix <- data.matrix(vegdist(step3, method = "bray"))
library(gplots)
heatmap <- heatmap.2(dissimilaritymatrix, dendrogram = "column", Colv = TRUE,
                     Rowv = FALSE, key = TRUE, symkey = FALSE,
                     density.info = "none", trace = "none",
                     col = cm.colors(256))


Please advise!



[R] Display dendrogram in heatmap.2 without reordering col or row

2011-09-12 Thread Markus Lindh
How can I display a heatmap.2 with a column dendrogram without reordering
either the columns or the rows?

library(vegan)
dissimilaritymatrix <- data.matrix(vegdist(step3, method = "bray"))
library(gplots)
heatmap <- heatmap.2(dissimilaritymatrix, dendrogram = "column", Colv = TRUE,
                     Rowv = FALSE, key = TRUE, symkey = FALSE,
                     density.info = "none", trace = "none",
                     col = cm.colors(256))


Please advise!





-- 
Vänliga hälsningar
Kind regards
//Markus Lindh

>Work
Kalmarsundslab
Barlastgatan 11
392 31 KALMAR
Sweden
Tel. +46480-447320 (lab)
Tel. +46706-851242 (cellphone)
Fax +46480447313
markus.li...@lnu.se
markusvli...@gmail.com

>Home
Lindgatan 6
36131, Emmaboda
Sweden
Tel. 0706-851242
Tel. 0471-12710



[R] Automated generation of combinations

2011-09-12 Thread Santiago Guallar
Hello,
 
I'd like to generate automatically all the possible combinations of a set of 8 
variables (there are 535, too many to do it by hand). For example:
 
input: varA, varB, varC
output: varA+varB+varC
varA+varB
varA+varC
varB+varC
varA
varB
varC
Is there any function that produces this option?
 
Thank you


Re: [R] R-help Digest, Vol 103, Issue 11

2011-09-12 Thread mihalicza . peter
I will be out of the office from 12 till 26 September with no access to my 
emails.

In urgent cases please contact Ms. Edit Kárpáti (karpati.e...@gyemszi.hu).

With regards,
Peter Mihalicza



Re: [R] On-line machine learning packages?

2011-09-12 Thread Jay
In my mind this sequential classification task with feedback is
somewhat different from a completely offline, one-off
classification. Am I wrong?
However, it looks like the mentality on this topic is to refer me to
CRAN/Google to look for solutions myself. Obviously I know
about these sources, and as I said, I used rseek.org among other
sources to look for solutions. I did not start this topic for fun; I'm
asking for help finding a suitable machine learning package that
readily incorporates feedback loops and online learning. If somebody
has experience with these kinds of problems in R, please respond.


Or will
"http://cran.r-project.org
Look for 'Task Views'"
be my next piece of advice?

On Sep 12, 11:31 am, Dennis Murphy  wrote:
> http://cran.r-project.org/web/views/
>
> Look for 'machine learning'.
>
> Dennis
>
>
>
> On Sun, Sep 11, 2011 at 11:33 PM, Jay  wrote:
> > If the answer is so obvious, could somebody please spell it out?
>
> > On Sep 11, 10:59 pm, Jason Edgecombe  wrote:
> >> Try this:
>
> >>http://cran.r-project.org/web/views/MachineLearning.html
>
> >> On 09/11/2011 12:43 PM, Jay wrote:
>
> >> > Hi,
>
> >> > I used the rseek search engine to look for suitable solutions, however
> >> > as I was unable to find anything useful, I'm asking for help.
> >> > Anybody have experience with these kinds of problems? I looked into
> >> > dynaTree, but as information is a bit scarce and as I understand it,
> >> > it might not be what I'm looking for..(?)
>
> >> > BR,
> >> > Jay
>
> >> > On Sep 11, 7:15 pm, David Winsemius  wrote:
> >> >> On Sep 11, 2011, at 11:42 AM, Jay wrote:
>
> >> >>> What R packages are available for performing classification tasks?
> >> >>> That is, when the predictor has done its job on the dataset (based on
> >> >>> the training set and a range of variables), feedback about the true
> >> >>> label will be available and this information should be integrated for
> >> >>> the next classification round.
> >> >> You should look at CRAN Task Views. Extremely easy to find from the
> >> >> main R-project page.
>
> >> >> --
> >> >> David Winsemius, MD
> >> >> West Hartford, CT
>



Re: [R] envfit vector labels with ordiplot3d

2011-09-12 Thread Briony
Thank you very much for the suggestion. And while I'm here, thank you for
vegan and the documentation that goes with it.

>Function ordiplot3d uses scatterplot3d, and it returns also all
>scatterplot3d items, like functions xyz.converter and points3d that
>can be used for tuning labels.

I tried ordilabel(pl$arrows) but the labels only seem to be in two
dimensions.

>With ordixyplot I can see no other choice than that you edit the
>function and preferably contribute your edited function to vegan
>(and will be credited with the function help).

I'm not a skilled enough user of R to edit the ordixyplot function - so I'll
pass on that invitation to anyone else who reads this thread?

Thanks again,
Briony 

Briony  gmail.com> writes:

>
> Hi R experts,
>
> I'm looking for some help with plotting vectors from envfit in vegan, onto
> a
> 3d plot using ordiplot3d. So far I have
>
> data.mds <- metaMDS(data, k=3,trace = FALSE)
> vect_data<-envfit(data.mds,vegdata[,3:21],choices=1:3,permu=)
> ordiplot3d(data.mds,envfit=vect_data)
> ordixyplot(data.mds,pch=pts,envfit=vect_data)
>
> (my data's not really called data, I thought it might be easier to
> communicate this way)
>
> These display the vectors as arrows, but what I would really like is for
> the
> arrows to be labelled, like what comes up automatically in ordirgl or with
> a
> 2D ordiplot.
>
> I've gone through the help and tried everything I can work out, but I must
> be missing something important, because nothing's worked so far. I would
> be
> happy to use ordixyplot and show a series of 2D plots, but I can't get
> labels on those arrows either.
>
> Any pointers in the right direction would be gratefully received.
> Briony

Briony,

There really is no way to do this automatically, but if someone fixes the
functions, we are happy to incorporate those changes in vegan.

You may be able to achieve something like that with ordiplot3d, but I am
not sure it looks completely satisfactory. Function ordiplot3d invisibly
returns the plotting object, which contains, among other items, the
coordinates of the arrow heads in the flattened graph. So this could work:

pl <- ordiplot3d(data.mds,envfit=vect_data)
ordilabel(pl$arrows)

Function ordiplot3d uses scatterplot3d, and it returns also all
scatterplot3d items, like functions xyz.converter and points3d that
can be used for tuning labels.

With ordixyplot I can see no other choice than that you edit the
function and preferably contribute your edited function to vegan
(and will be credited with the function help).

Cheers, Jari Oksanen



Re: [R] coxreg vs coxph: time-dependent treatment

2011-09-12 Thread Göran Broström
Dear Ehsan,

the cluster option is not implemented in 'eha', although you obviously get
no error when trying to use it.
I'll fix this. Thanks for the report. (So, use 'coxph' with cluster.)

Göran

On Mon, Sep 12, 2011 at 4:43 AM, Ehsan Karim  wrote:

> Sorry: there was an error in the weight calculation, fixed version is
> the following, but still the final estimates differ as explained in
> the original email:
>
>
> #
>
> require(survival)
> require(eha)
>
> data(heart)
> head(heart)
>
> follow <- heart$stop - heart$start
> fit <- glm(transplant ~ age + surgery + year + follow,
>  data=heart, family = binomial)
> heart$wt <- ifelse(heart$transplant == 0,
>   (1 - predict(fit, type = "response")),
>   (predict(fit, type = "response")))
> heart$iptw <- unlist(tapply(1/heart$wt, heart$id, cumprod))
> summary(heart$iptw)
>
> # no weights
> fit0 <- coxph(Surv(start,stop,event)~transplant, data=heart)
> fit0 # fit with coxph without case-weights
> fit1 <- coxreg(Surv(start,stop,event)~transplant, data=heart)
> fit1 # fit with coxreg from eha without case-weights
>
> # coxph
> fit2 <- coxph(Surv(start,stop,event)~transplant + cluster(id),
>  data=heart, weights = iptw, robust = T)
> fit2 # fit with coxph having robust and cluster option
> fit3 <- coxph(Surv(start,stop,event)~transplant + cluster(id),
>  data=heart, weights = iptw)
> fit3 # fit with coxph having cluster option
> fit4 <- coxph(Surv(start,stop,event)~transplant,
>  data=heart, weights = iptw)
> fit4 # fit with coxph
>
> # coxreg
> fit5 <- coxreg(Surv(start,stop,event)~transplant + cluster(id),
>  data=heart, weights = iptw)
> fit5 # fit with coxreg from eha having cluster option
> fit6 <- coxreg(Surv(start,stop,event)~transplant,
>  data=heart, weights = iptw)
> fit6 # fit with coxreg from eha
>
> exp(coef(fit3))# HR from coxph having cluster option
> exp(coef(fit4))# HR from coxph
> exp(coef(fit5))[1] # HR from coxreg having cluster option
> exp(coef(fit6))[1] # HR from coxreg
>
> #
> > exp(coef(fit3))    # HR from coxph having cluster option
> transplant1
>    17.94681
> > exp(coef(fit4))    # HR from coxph
> transplant1
>    17.94681
> > exp(coef(fit5))[1] # HR from coxreg having cluster option
> transplant1
>    20.06519
> > exp(coef(fit6))[1] # HR from coxreg
> transplant1
>    17.94681
> #
>



-- 
Göran Broström



Re: [R] regression on data subsets in datafile

2011-09-12 Thread Gabor Grothendieck
On Mon, Sep 12, 2011 at 3:42 AM, marcel  wrote:
> I have data of the form
>
> tC <- textConnection("
> Subject Date    parameter1
> bob     3/2/99  10
> bob     4/2/99  10
> bob     5/5/99  10
> bob     6/27/99 NA
> bob     8/35/01 10
> bob     3/2/02  10
> steve   1/2/99  4
> steve   2/2/00  7
> steve   3/2/01  10
> steve   4/2/02  NA
> steve   5/2/03  16
> kevin   6/5/04  24
> ")
> data <- read.table(header=TRUE, tC)
> close.connection(tC)
> rm(tC)
>
> I am trying to calculate rate of change of parameter1 in units/day for each
> person. I think I need something like:

Try this:

data$Date <- as.Date(data$Date, "%m/%d/%y")
fm <- lm(parameter1 ~ Subject / Date - 1, data)
coef(fm)
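
The nested formula fits one intercept and one slope per subject in a single model. The sketch below illustrates that with made-up data; the names dat, days and y are hypothetical, and a numeric day count is used in place of a Date column to keep the example self-contained.

```r
# Made-up data: two subjects with exactly linear trends (slopes 0.1 and -0.2).
dat <- data.frame(
  Subject = factor(rep(c("a", "b"), each = 4)),
  days    = rep(c(0, 10, 20, 30), times = 2),
  y       = c(1, 2, 3, 4, 10, 8, 6, 4)
)

# Subject / days - 1 expands to Subject + Subject:days without a common
# intercept, i.e. one intercept and one slope per subject.
fm <- lm(y ~ Subject / days - 1, data = dat)

# The per-subject slopes are the coefficients whose names end in ":days".
slopes <- coef(fm)[grep(":days$", names(coef(fm)))]
print(slopes)
```

A subject with a single usable observation, like kevin in the data above, has no estimable slope, so its coefficient comes back as NA.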


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Multiple t.test

2011-09-12 Thread Uwe Ligges



On 12.09.2011 13:16, Raphael Saldanha wrote:

Hi!

Try something like this:

subset(example, disease==TRUE)
subset(example, disease==FALSE)



Hmmm, I think the actual answer to the question is something along this 
line:


sapply(example[names(example)!="disease"],
   function(x) t.test(x ~ example[["disease"]])[[3]])
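
Spelled out as a self-contained sketch on the example data from the question, the one-liner loops over every column except the grouping variable and keeps only the p-value. The helper name measures is introduced here for illustration.

```r
example <- data.frame(
  age     = c(1, 2, 3, 4, 5, 6),
  height  = c(100, 110, 120, 130, 140, 150),
  disease = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

# Every column except the grouping variable is tested against it.
measures <- setdiff(names(example), "disease")

# t.test() returns a list of class "htest"; p.value is its third element,
# which is what [[3]] picks out in the one-liner above.
pvals <- sapply(example[measures], function(x) {
  t.test(x ~ example$disease)$p.value
})
print(pvals)
```

The result is a named numeric vector with one p-value per variable, so it scales to 200 columns without further changes.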


Uwe Ligges





On Mon, Sep 12, 2011 at 4:54 AM, C.H.  wrote:


Dear R experts,

Suppose I have a data frame like this:


example <- data.frame(age=c(1,2,3,4,5,6),
height=c(100,110,120,130,140,150), disease=c(TRUE, TRUE, TRUE, FALSE, FALSE,
FALSE))


example

  age height disease
1   1    100    TRUE
2   2    110    TRUE
3   3    120    TRUE
4   4    130   FALSE
5   5    140   FALSE
6   6    150   FALSE

Is there any way to compare the age and height between those with
disease=TRUE and disease=FALSE using t.test and extract the p-values
quickly?

I can do this individually:

t.test(example$age~example$disease)[3]

But when the number of variables grows to something like 200, it is not
easy any more.

Thanks!

Regards,

CH

--
CH Chan










Re: [R] Multiple t.test

2011-09-12 Thread Raphael Saldanha
Hi!

Try something like this:

subset(example, disease==TRUE)
subset(example, disease==FALSE)


On Mon, Sep 12, 2011 at 4:54 AM, C.H.  wrote:

> Dear R experts,
>
> Suppose I have a data frame like this:
>
> > example <- data.frame(age=c(1,2,3,4,5,6),
> height=c(100,110,120,130,140,150), disease=c(TRUE, TRUE, TRUE, FALSE, FALSE,
> FALSE))
>
> > example
>   age height disease
> 1   1    100    TRUE
> 2   2    110    TRUE
> 3   3    120    TRUE
> 4   4    130   FALSE
> 5   5    140   FALSE
> 6   6    150   FALSE
>
> Is there any way to compare the age and height between those with
> disease=TRUE and disease=FALSE using t.test and extract the p-values
> quickly?
>
> I can do this individually:
>
> t.test(example$age~example$disease)[3]
>
> But when the number of variables grows to something like 200, it is not
> easy any more.
>
> Thanks!
>
> Regards,
>
> CH
>
> --
> CH Chan
>



-- 
Kind regards,

Raphael Saldanha
saldanha.plan...@gmail.com



Re: [R] Hourly data with zoo

2011-09-12 Thread Gabor Grothendieck
On Mon, Sep 12, 2011 at 1:58 AM, steven mosher  wrote:
> I have date data as a numeric and hourly data in 0 to 2300 hours in a 
> dataframe.
>
> d  <-  rep(20110101,24)
> h  <-  seq(from =  0, to  =  2300, by  = 100)
>
> df  <-  data.frame(LST_DATE  =  d,  LST_TIME  =  h,  data  =  rnorm(24, 0, 1))
>
> S  <-  chron(dates. = as.character(df$LST_DATE), times. =
> paste(as.character(df$LST_TIME/100), ":0:0", sep  = ""),
>           format  = c(dates  =  "Ymd",  times =  "h:m:s"))
> X  <-  zoo(df$data, order.by = S)
>
> And I want to create a regular zoo series. The above works but it's
> pretty ugly. Is there a more elegant way to do this?

You probably want to create a zooreg object:

library(zoo)
library(chron)

zr <- zooreg(rnorm(24), as.chron("2011-01-01"), frequency = 24)

although if you really do want a zoo object that is not a zooreg
object then you can do it like this:

z <- as.zoo(zr)

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] envfit vector labels with ordiplot3d

2011-09-12 Thread Uwe Ligges
Please quote the prior thread, otherwise readers of this mailing list 
will not get the context.


Uwe Ligges




On 12.09.2011 01:41, Briony wrote:

Thank you very much for the suggestion. And while I'm here, thank you for
vegan and the documentation that goes with it.

I tried ordilabel(pl$arrows) but the labels only seem to be in two
dimensions.

I'm not a skilled enough user of R to edit the ordixyplot function - so I'll
pass on that invitation to anyone else who reads this thread?

Thanks again,
Briony



Re: [R] Solve your R problems

2011-09-12 Thread Patrick Burns

Far be it from me to misquote someone.

On 12/09/2011 09:17, peter dalgaard wrote:


On Sep 12, 2011, at 09:41 , Patrick Burns wrote:


R-help is all about solving R problems.
So here ya go:
http://www.portfolioprobe.com/2011/09/12/solve-your-r-problems/


Grin.

Incidentally, I don't think it is quite true that I called you "that infernal guy" at 
useR. I might have done so (tongue in cheek, of course), but more likely it was "ah, the R 
Inferno guy" upon glancing at your name tag.

Anyways, "Si non e vero, e ben trovato".

;-)
-p



--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')



Re: [R] regression on data subsets in datafile

2011-09-12 Thread Dennis Murphy
Hi:

Here's one approach:

# date typo fixed in record 5 - changed 35 to 5
tC <- textConnection("
Subject Date parameter1
bob 3/2/99  10
bob 4/2/99  10
bob 5/5/99  10
bob 6/27/99 NA
bob 8/5/01 10
bob 3/2/02  10
steve   1/2/99  4
steve   2/2/00  7
steve   3/2/01  10
steve   4/2/02  NA
steve   5/2/03  16
kevin   6/5/04  24
")
dat <- read.table(tC, header=TRUE, stringsAsFactors = FALSE)
close.connection(tC)
rm(tC)
# Convert Date to an object of class Date
dat <- transform(dat, date = as.Date(Date, format = '%m/%d/%y'))

# You could do this with transform() and the by() function, but
# here is another way to use the min date per person as time 0
# using package plyr; mutate is a faster alternative to transform
# and can be used for groupwise operations inside of ddply():
library('plyr')
dat <- ddply(dat, .(Subject), mutate, days = as.numeric(date - min(date)))

# Since Kevin has one record, we want to return NAs for his coefficients.
# The function f returns NA if there are fewer than three observations
# per subgroup; you can change 3 to 2 if you like. Otherwise, it returns
# the coefficients of the least squares line as a data frame.

f <- function(d) {
  if (nrow(d) < 3) {
    return(data.frame(intercept = NA, slope = NA))
  } else {
    p <- coef(lm(parameter1 ~ days, data = d))
    data.frame(intercept = p[1], slope = p[2])
  }
}
# Apply the function to each person's sub-data frame
ddply(dat, .(Subject), f)
  Subject intercept       slope
1     bob 10.000000 0.000000000
2   kevin        NA          NA
3   steve  3.998485 0.007591638
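
The same per-subject slopes can also be computed in base R, close to the split()/lapply() idea in the original question, and then attached to the data as a new column. This is a minimal sketch with hypothetical values and dates; the helper slope_of is introduced here only for illustration.

```r
# Same structure as the data above, with hypothetical values and dates.
dat <- data.frame(
  Subject    = c("bob", "bob", "bob", "steve", "steve", "steve", "kevin"),
  Date       = c("3/2/99", "4/2/99", "5/2/99",
                 "1/2/99", "1/12/99", "1/22/99", "6/5/04"),
  parameter1 = c(10, 10, 10, 4, 7, 10, 24),
  stringsAsFactors = FALSE
)
dat$date <- as.Date(dat$Date, format = "%m/%d/%y")

# Days since each subject's first observation (time 0 per person).
dat$days <- ave(as.numeric(dat$date), dat$Subject,
                FUN = function(d) d - min(d))

# Slope in units/day for one subject; NA when fewer than two non-missing
# observations are available to fit a line through.
slope_of <- function(d) {
  d <- d[!is.na(d$parameter1), ]
  if (nrow(d) < 2) return(NA_real_)
  unname(coef(lm(parameter1 ~ days, data = d))[2])
}
slopes <- sapply(split(dat, dat$Subject), slope_of)

# Attach each subject's slope to every one of that subject's rows.
dat$slope <- unname(slopes[dat$Subject])
print(dat)
```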

Another option is to use the lmList() function in the nlme package.

HTH,
Dennis


On Mon, Sep 12, 2011 at 12:42 AM, marcel  wrote:
> I have data of the form
>
> tC <- textConnection("
> Subject Date    parameter1
> bob     3/2/99  10
> bob     4/2/99  10
> bob     5/5/99  10
> bob     6/27/99 NA
> bob     8/35/01 10
> bob     3/2/02  10
> steve   1/2/99  4
> steve   2/2/00  7
> steve   3/2/01  10
> steve   4/2/02  NA
> steve   5/2/03  16
> kevin   6/5/04  24
> ")
> data <- read.table(header=TRUE, tC)
> close.connection(tC)
> rm(tC)
>
> I am trying to calculate rate of change of parameter1 in units/day for each
> person. I think I need something like:
> "lapply(split(mydata, mydata$ppt), function(x) lm(parameter1 ~ day,
> data=x))"
>
> I am not sure how to handle the dates in order to have the first day for
> each person be time = 0, and the remaining dates to be handled as days since
> time 0. Also, is there a way to add the resulting slopes to the data set as
> a new column?
>
> Thanks,
> Marcel
>



[R] completing missing samples

2011-09-12 Thread Eran Eidinger
Hello,

I have a time-series that has some missing samples.
I was thinking of completing them using either zero-order hold or linear
interpolation.
I am looking for an efficient way (other than a loop...) of identifying the
missing time slots and filling them.

Can you think of any methods that might help here? (obviously
which(diff(time)>min(diff(time))) will give the locations, but what
then?)

Thanks,
Eran.
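
One loop-free possibility in base R is approx(), which evaluates the series on a complete time grid; method = "linear" interpolates and method = "constant" gives a zero-order hold. A minimal sketch, assuming a numeric time axis whose smallest gap is the true sampling interval; the vectors time and value are made-up data.

```r
# Hypothetical series sampled every 1 time unit, with t = 3 and t = 6 missing.
time  <- c(1, 2, 4, 5, 7, 8)
value <- c(10, 20, 40, 50, 70, 80)

# Regular grid at the smallest observed spacing.
step <- min(diff(time))
grid <- seq(min(time), max(time), by = step)

# Linear interpolation between neighbouring samples.
lin <- approx(time, value, xout = grid, method = "linear")$y

# Zero-order hold: method = "constant" with the default f = 0 carries the
# last observed value forward.
zoh <- approx(time, value, xout = grid, method = "constant")$y

print(data.frame(time = grid, linear = lin, hold = zoh))
```

The missing slots themselves are setdiff(grid, time), should they be needed separately.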



Re: [R] Latex + R + sweave

2011-09-12 Thread Liviu Andronic
On Mon, Sep 12, 2011 at 9:21 AM, Twaha Mlwilo  wrote:
>
> Hello all,
> Good day,
> I have problem on how to remove the source code from the pdf output.Here I 
> mean this.
>  code in sweave Rnw files
>
>  <<>>=
>  x<-c(1,2,3,4,5,6)
> x
>  mean(x)
> sd(x)
> @
> then would like it appear as
> mean = 3.5
> sd=1.3
> x=1,2,3,4,5,6
>
Try this:
<<echo=FALSE>>=
x <- c(1,2,3,4,5,6)
x
mean(x)
sd(x)
@

See ?RweaveLatex. Also, try
"mean = \Sexpr{mean(x)}"

in your document. You could also play with ?paste in the code chunk,
or with generating a matrix with appropriate row names, and printing
that.
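
As a rough sketch of the paste-style suggestion, the lines can be assembled as strings inside the chunk and written out with cat(). The object name out and the rounding choices are illustrative assumptions.

```r
x <- c(1, 2, 3, 4, 5, 6)

# Build the lines of text to print; sprintf() controls the rounding.
out <- c(
  sprintf("mean = %.1f", mean(x)),
  sprintf("sd = %.2f", sd(x)),
  paste("x =", paste(x, collapse = ","))
)
cat(out, sep = "\n")
```

Placed in a chunk with the source hidden, only these three lines of output would appear in the document.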

Regards
Liviu



Re: [R] Power analysis in hierarchical models

2011-09-12 Thread ONKELINX, Thierry
Dear Tom,

I think you failed to generate the simulated outcome from the correct model,
hence the zero variance of your random effects. Here is a better working example.

library(lme4)

fake2 <- expand.grid(Bleach = c("Control","Med","High"), Temp = 
c("Cold","Hot"), Rep = factor(seq_len(3)), ID = seq_len(8))
fake2$rep <- fake2$Bleach:fake2$Temp:fake2$Rep

SDnoise <- 0.77
SDrep <- 1
FFBleach <- c(3.27,3.21, 3.64)
RFrep <- rnorm(length(levels(fake2$rep)), sd = SDrep)
fake2$Growth <- with(fake2, FFBleach[Bleach] + RFrep[rep] + rnorm(nrow(fake2), 
sd = SDnoise))

model2 <- lmer(Growth~Bleach*Temp+(1|rep),data=fake2)
str(summary(model2))
summary(model2)@coefs #to extract the t-values
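
The key difference from the original simulation is that each replicate gets one random draw shared by all of its measurements. The base R sketch below isolates that data-generating step so the design can be checked before any model is fitted. It simplifies the example above by using a single overall mean (3.27) instead of the per-treatment means, and the names fake, sd_rep, sd_noise and rep_effect are chosen here for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

fake <- expand.grid(
  Bleach = c("Control", "Med", "High"),
  Temp   = c("Cold", "Hot"),
  Rep    = factor(1:3),
  ID     = 1:8
)
# One label per physical replicate: 3 x 2 x 3 = 18 groups of 8 measurements.
fake$rep <- interaction(fake$Bleach, fake$Temp, fake$Rep, drop = TRUE)

sd_rep   <- 1     # between-replicate standard deviation
sd_noise <- 0.77  # within-replicate standard deviation

# One draw per replicate, shared by its 8 measurements; this shared term is
# what lets a model estimate a non-zero replicate variance.
rep_effect <- rnorm(nlevels(fake$rep), sd = sd_rep)
fake$Growth <- 3.27 + rep_effect[as.integer(fake$rep)] +
  rnorm(nrow(fake), sd = sd_noise)

table(table(fake$rep))  # every replicate should hold 8 rows
```

Drawing all 144 values independently, as in the original script, leaves no replicate-level signal, which is consistent with the near-zero variance estimate reported for rep.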

Best regards,

Thierry

PS R-sig-mixed-models is a better mailing list for this kind of question.

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Tom Wilding
> Sent: Monday, 5 September 2011 16:17
> To: r-help@r-project.org
> Subject: [R] Power analysis in hierarchical models
> 
> Dear All
> I am attempting some power analyses, based on simulated data.
> My experimental set up is thus:
> Bleach: main effect, three levels (control, med, high),  Fixed.
> Temp: main effect, two levels (cold, hot), Fixed.
> Main effect interactions, six levels (fixed)
> For each main-effect combination I have three replicates.
> Within each replicate I can take varying numbers of measurements
> (response variable = Growth of marine worms) but, for this example,
> assume eight. (I'm interested in changing this to see if the
> experimental power changes much.)
> Total size = 3 x 2 x 3 x 8 = 144
> The script thus far goes:
> === start of script =
> library(lme4)
> #Data frame structure
> Bleach=rep(c("Control","Med","High"),each=48)
> Temp=  rep(rep(c("Cold","Hot"),each=24),3)
> Rep=  (rep(rep(rep(c("1","2","3"),each=8),2),3))
> Ind= (rep(rep(rep(c(1:8),3),2),3))#not required for stats
> 
> #Fake data (based on pilot studies), only showing a single main effect (bleach)
> Growth=c( rnorm(48,3.27,0.77),rnorm(48,3.21,0.77),rnorm(48,3.64,1.17))
> fake2=data.frame(Bleach,Temp,Rep,Ind,Growth);head(fake2)
> #generate factor level for lmer as per Crawley, page 649
> fake2$rep=fake2$Bleach:fake2$Temp:fake2$Rep#rep is used in the lmer model
> with(fake2,table(rep))#check that each rep contains 8 measurements
> 
> # run alternative (?equivalent) models
> model1=aov(Growth~Bleach*Temp+Error(Bleach*Temp/Rep),data=fake2);summary(model1)
> model2=lmer(Growth~Bleach*Temp+(1|rep),data=fake2);summary(model2)#note: see above, rep<>Rep!
>  end of script ==
> I'd like to get familiar with using lme4 because it is likely that the
> final results of the experiment will be unbalanced (which precludes the
> use of aov I think).  The df given by model1 seem to make sense.  Any
> guidance on any of the following would be much appreciated:
> 1. Are model1 and model2 equivalent?
> 2. For model1 - is the random component correctly specified and is
> there a (simple) mechanism to get the appropriate F ratios and P
> values?
> 3. For model2 - again, is the random component correct (probably not)
> and why is the random effect (rep) variance and standard deviations so
> low (zero in most iterations)?
> 4. For both models - how do I isolate (so I can tabulate and create
> histograms) the appropriate P and/or t values?  (for model2 - the
> 'mer' object doesn't seem to contain the t values but maybe
> I'm missing something).
> Direction to any more generic sources of information regarding power
> analysis in hierarchical models would be gladly received.
> Thank you
> Tom.
> 
> 
> -
> Tom Wilding, MSc, PhD, Dip. Stat.
> Scottish Association for Marine Science,
> Scottish Marine Institute,
> OBAN
> Argyll.  PA37 1QA
> United Kingdom.
> Phone (+44) (0) 1631 559214
> Fax (+44) (0) 1631 559001
> 

Re: [R] On-line machine learning packages?

2011-09-12 Thread Dennis Murphy
http://cran.r-project.org/web/views/

Look for 'machine learning'.

Dennis

On Sun, Sep 11, 2011 at 11:33 PM, Jay  wrote:
> If the answer is so obvious, could somebody please spell it out?
>
>
> On Sep 11, 10:59 pm, Jason Edgecombe  wrote:
>> Try this:
>>
>> http://cran.r-project.org/web/views/MachineLearning.html
>>
>> On 09/11/2011 12:43 PM, Jay wrote:
>>
>>
>>
>> > Hi,
>>
>> > I used the rseek search engine to look for suitable solutions, however
>> > as I was unable to find anything useful, I'm asking for help.
>> > Anybody have experience with these kinds of problems? I looked into
>> > dynaTree, but as information is a bit scarce and as I understand it,
>> > it might not be what I'm looking for..(?)
>>
>> > BR,
>> > Jay
>>
>> > On Sep 11, 7:15 pm, David Winsemius  wrote:
>> >> On Sep 11, 2011, at 11:42 AM, Jay wrote:
>>
>> >>> What R packages are available for performing classification tasks?
>> >>> That is, when the predictor has done its job on the dataset (based on
>> >>> the training set and a range of variables), feedback about the true
>> >>> label will be available and this information should be integrated for
>> >>> the next classification round.
>> >> You should look at CRAN Task Views. Extremely easy to find from the
>> >> main R-project page.
>>
>> >> --
>> >> David Winsemius, MD
>> >> West Hartford, CT
>>



Re: [R] Solve your R problems

2011-09-12 Thread Liviu Andronic
On Mon, Sep 12, 2011 at 9:41 AM, Patrick Burns  wrote:
> R-help is all about solving R problems.
> So here ya go:
> http://www.portfolioprobe.com/2011/09/12/solve-your-r-problems/
>
Sweet. :)

May I suggest a font change: anything but the default CM should do the
trick. For one I prefer Palatino & Optima (you'll most likely have
access to Palladio or Pagella, and Classico), but it tends to be heavy
on longer documents. You could use Aldus, if you have access to it,
instead of Palatino: it's the book-design version of Palatino. I hear
that Minion Pro fonts are very good (again, if you have access), and
there are some "very" free alternatives such as Libertine & Biolinum.

Although the default CM tend to be very hard to read, I guess they
could fit in handily with the theme of your document. Regards
Liviu



Re: [R] Hourly data with zoo

2011-09-12 Thread Dennis Murphy
Hi Steven:

How about this?

d <- rep(20110101, 24)
# zero-padded HHMM strings: "0000", "0100", ..., "2300"
h <- sprintf('%04d', seq(0, 2300, by = 100))
df <- data.frame(LST_DATE = d, LST_TIME = h, data = rnorm(24, 0, 1))
# combine date and time into a single POSIXct column
df <- transform(df, datetime = as.POSIXct(paste(LST_DATE, LST_TIME),
                                          format = '%Y%m%d %H%M'))

library(zoo)
X <- with(df, zoo(data, datetime))
class(X)
str(X)

HTH,
Dennis

On Sun, Sep 11, 2011 at 10:58 PM, steven mosher  wrote:
> I have date data as a numeric and hourly data in 0 to 2300 hours in a dataframe.
>
> library(chron)
> library(zoo)
>
> d  <-  rep(20110101,24)
> h  <-  seq(from =  0, to  =  2300, by  = 100)
>
> df  <-  data.frame(LST_DATE  =  d,  LST_TIME  =  h,  data  =  rnorm(24, 0, 1))
>
> S  <-  chron(dates. = as.character(df$LST_DATE), times. =
> paste(as.character(df$LST_TIME/100), ":0:0", sep  = ""),
>           format  = c(dates  =  "Ymd",  times =  "h:m:s"))
> X  <-  zoo(df$data, order.by = S)
>
> I want to create a regular zoo series. The above works, but it's
> pretty ugly. Is there a more elegant way to do this?
>



Re: [R] Solve your R problems

2011-09-12 Thread peter dalgaard

On Sep 12, 2011, at 09:41 , Patrick Burns wrote:

> R-help is all about solving R problems.
> So here ya go:
> http://www.portfolioprobe.com/2011/09/12/solve-your-r-problems/

Grin.

Incidentally, I don't think it is quite true that I called you "that infernal 
guy" at useR. I might have done so (tongue in cheek, of course), but more 
likely it was "ah, the R Inferno guy" upon glancing at your name tag. 

Anyways, "Se non è vero, è ben trovato".

;-)
-p

> 
> -- 
> Patrick Burns
> pbu...@pburns.seanet.com
> twitter: @portfolioprobe
> http://www.portfolioprobe.com/blog
> http://www.burns-stat.com
> (home of 'Some hints for the R beginner'
> and 'The R Inferno')

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] regression on data subsets in datafile

2011-09-12 Thread marcel
I have data of the form

tC <- textConnection("
Subject Date    parameter1
bob 3/2/99  10
bob 4/2/99  10
bob 5/5/99  10
bob 6/27/99 NA
bob 8/35/01 10
bob 3/2/02  10
steve   1/2/99  4
steve   2/2/00  7
steve   3/2/01  10
steve   4/2/02  NA
steve   5/2/03  16
kevin   6/5/04  24
")
data <- read.table(header=TRUE, tC)
close.connection(tC)
rm(tC)

I am trying to calculate rate of change of parameter1 in units/day for each
person. I think I need something like:
"lapply(split(mydata, mydata$ppt), function(x) lm(parameter1 ~ day,
data=x))"

I am not sure how to handle the dates in order to have the first day for
each person be time = 0, and the remaining dates to be handled as days since
time 0. Also, is there a way to add the resulting slopes to the data set as
a new column? 
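
A minimal sketch of one way to approach this in base R, assuming the
columns are Subject, Date and parameter1 as posted. The names dat, day,
slope_of and slopes are made up for illustration. Dates are parsed with
as.Date(), so the impossible date 8/35/01 becomes NA and drops out of the
fit, and a subject with fewer than two usable rows gets a slope of NA.

```r
# Reconstruction of the posted data (an assumption about the intended columns).
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Subject Date parameter1
bob 3/2/99 10
bob 4/2/99 10
bob 5/5/99 10
bob 6/27/99 NA
bob 8/35/01 10
bob 3/2/02 10
steve 1/2/99 4
steve 2/2/00 7
steve 3/2/01 10
steve 4/2/02 NA
steve 5/2/03 16
kevin 6/5/04 24
")

# Parse month/day/two-digit-year; the impossible date 8/35/01 becomes NA.
dat$Date <- as.Date(dat$Date, format = "%m/%d/%y")

# Days since each subject's first valid date (time 0).
d <- as.numeric(dat$Date)
dat$day <- d - ave(d, dat$Subject, FUN = function(z) min(z, na.rm = TRUE))

# Slope in units/day for one subject; NA when fewer than two usable rows.
slope_of <- function(x) {
  ok <- complete.cases(x$day, x$parameter1)
  if (sum(ok) < 2) return(NA_real_)
  unname(coef(lm(parameter1 ~ day, data = x))["day"])
}

slopes <- sapply(split(dat, dat$Subject), slope_of)

# Attach each subject's slope to every one of that subject's rows.
dat$slope <- unname(slopes[dat$Subject])
```

Here lm() silently drops rows with a missing day or parameter1 (the
default na.action), which is why the rows with NA stay in dat but do not
affect the slopes.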

Thanks,
Marcel 




[R] Latex + R + sweave

2011-09-12 Thread Twaha Mlwilo

Hello all,
Good day,
I have a problem with removing the source code from the PDF output. Here
is what I mean.
Code in the Sweave Rnw file:

<<>>=
x <- c(1, 2, 3, 4, 5, 6)
x
mean(x)
sd(x)
@

I would then like the output to appear as:
mean = 3.5
sd = 1.87
x = 1,2,3,4,5,6

Thank you in advance.
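
A possible sketch, assuming standard Sweave chunk options: opening the
chunk with <<echo=FALSE>>= instead of <<>>= keeps the source out of the
PDF, and the chunk body can then print exactly the lines wanted. The
variable out and the rounding to two digits are my own choices, not
anything Sweave requires.

```r
# Body of a chunk opened with <<echo=FALSE>>= (echo=FALSE hides the source)
x <- c(1, 2, 3, 4, 5, 6)

# Build the three output lines in the requested order
out <- c(
  paste("mean =", round(mean(x), 2)),
  paste("sd =", round(sd(x), 2)),
  paste("x =", paste(x, collapse = ","))
)

# Print one line per element
cat(out, sep = "\n")
```

Single values can alternatively be placed inline in the LaTeX text with
\Sexpr{mean(x)}, which avoids a printed chunk altogether.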

  



[R] Multiple t.test

2011-09-12 Thread C.H.
Dear R experts,

Suppose I have a data frame like this:

> example <- data.frame(age=c(1,2,3,4,5,6), height=c(100,110,120,130,140,150),
+                       disease=c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))

> example
  age height disease
1   1    100    TRUE
2   2    110    TRUE
3   3    120    TRUE
4   4    130   FALSE
5   5    140   FALSE
6   6    150   FALSE

Is there any way to compare the age and height between those with
disease=TRUE and disease=FALSE using t.test and extract the p-values
quickly?

I can do this individually

t.test(example$age~example$disease)[3]

But when the number of variables grows to something like 200, doing
this one variable at a time is no longer practical.
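
One possible approach, sketched under the assumption that every column
other than disease is numeric: loop over the variable names with
sapply() and keep only the p.value component of each test. The names
vars and pvals are mine. Note that t.test() defaults to Welch's
unequal-variance test.

```r
example <- data.frame(age = c(1, 2, 3, 4, 5, 6),
                      height = c(100, 110, 120, 130, 140, 150),
                      disease = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE))

# All columns except the grouping variable
vars <- setdiff(names(example), "disease")

# One Welch two-sample t-test per variable; keep only the p-value
pvals <- sapply(vars, function(v) {
  t.test(example[[v]][example$disease],
         example[[v]][!example$disease])$p.value
})

# Named numeric vector, one p-value per variable
pvals
```

With around 200 variables it may be worth adjusting for multiple
comparisons afterwards, for example with p.adjust(pvals, method = "holm").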

Thanks!

Regards,

CH

-- 
CH Chan



[R] Solve your R problems

2011-09-12 Thread Patrick Burns

R-help is all about solving R problems.
So here ya go:
http://www.portfolioprobe.com/2011/09/12/solve-your-r-problems/

--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')
