Re: [R] Sweave for inclusion of p value in a sentence of a LaTeX document

2010-08-07 Thread Matthieu Dubois
Dear Julia, 

my way to do that is to attribute the t.test to an object, 
and then refer to its p.value with the function 
\Sexpr

e.g. 
\documentclass{article}
\usepackage{Sweave}

\begin{document}

<>=
x<-cbind(1,2,3)
y<-cbind(3,4,5)
t <- t.test(x,y)
@

The p value for my data was \Sexpr{ round(t$p.value, 3) } 
which is not significant.

\end{document}


Best, 

Matthieu
Matthieu Dubois
Post-doctoral researcher
Department of Psychology and Neural Science, NYU

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame reordering to time series

2010-08-07 Thread steven mosher
In the real data the months are all complete, but the years can be missing.
So years can be missing up front, in the middle, at the end. but if a year
is present than every month has a value or NA.

To create regular R ts I had to plow through the data frame, collect a year
caluculate an index to put it into the final time series.

I had tried zoo out and it handled the irregular spaced data, but a large
data structure of zoo objects had stumped me. espcially since I need to do
matching and selecting
of the zoo objects.

In the real data, there are about 7000 time series of 1500 months and those
7000
get averaged and combined in different ways


On Sat, Aug 7, 2010 at 8:45 PM, Gabor Grothendieck
wrote:

> On Sat, Aug 7, 2010 at 9:18 PM, steven mosher 
> wrote:
> > Very Slick.
> > Gabor this is a Huge speed up for me. Thanks. ha, Now I want to rewrite a
> > bunch of working code.
> >
> >
> >
> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> >  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> > Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> >  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> >
>  
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
> >  Data
> >Index Year Jan  Feb Mar Apr Jun
> > 1  67543 1989  12  6.0  12  12  12
> > 2  67543 1990  14  7.0  NA  NA  14
> > 3  67543 1991  34 17.0  34  34  34
> > 4  67543 1992  21 10.5  21  21  21
> > 5  12345 1991  54 27.0  NA  NA  54
> > 6  12345 1993  65 32.5  65  65  65
> > 7  12345 1994  23 11.5  23  23  23
> > 8  89765 1991  12  6.0  NA  NA  12
> > 9  89765 1992  13  6.5  13  13  13
> > 10 89765 1993  13  6.5  NA  NA  13
> > 11 89765 1994  13  6.5  13  13  13
> > 12 89765 1995  14  7.0  14  14  14
> > #  Gabor's solution
> >  f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
> >  do.call(cbind, by(Data, Data$Index, f))
> >  12345 67543 89765
>
>
> The original data had consecutive months in each series (actually
> there was a missing 1992 in one case but I assumed that was an
> inadvertent omission and the actual data was complete); however, here
> we have missing 6 month chunks in addition.  That makes the series
> non-consecutive so to solve that we could either apply this to the
> data (after putting the missing 1992 year back in):
>
> Data <- cbind(Data, NA, NA, NA, NA, NA, NA)
>
> or we could use a time series class that can handle irregularly spaced
> data:
>
> library(zoo)
> f <- function(x) {
>dat <- x[-(1:2)]
>tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
>zoo(c(as.matrix(dat)), tim)
> }
> do.call(cbind, by(Data, Data$Index, f))
>
> The last line is  unchanged from before.  This code will also handle
> the original situation correctly even if the missing 1992 is truly
> missing.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame reordering to time series

2010-08-07 Thread Gabor Grothendieck
On Sat, Aug 7, 2010 at 9:18 PM, steven mosher  wrote:
> Very Slick.
> Gabor this is a Huge speed up for me. Thanks. ha, Now I want to rewrite a
> bunch of working code.
>
>
>
> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>  Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
>  Data
>    Index Year Jan  Feb Mar Apr Jun
> 1  67543 1989  12  6.0  12  12  12
> 2  67543 1990  14  7.0  NA  NA  14
> 3  67543 1991  34 17.0  34  34  34
> 4  67543 1992  21 10.5  21  21  21
> 5  12345 1991  54 27.0  NA  NA  54
> 6  12345 1993  65 32.5  65  65  65
> 7  12345 1994  23 11.5  23  23  23
> 8  89765 1991  12  6.0  NA  NA  12
> 9  89765 1992  13  6.5  13  13  13
> 10 89765 1993  13  6.5  NA  NA  13
> 11 89765 1994  13  6.5  13  13  13
> 12 89765 1995  14  7.0  14  14  14
> #  Gabor's solution
>  f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
>  do.call(cbind, by(Data, Data$Index, f))
>              12345 67543 89765


The original data had consecutive months in each series (actually
there was a missing 1992 in one case but I assumed that was an
inadvertent omission and the actual data was complete); however, here
we have missing 6 month chunks in addition.  That makes the series
non-consecutive so to solve that we could either apply this to the
data (after putting the missing 1992 year back in):

Data <- cbind(Data, NA, NA, NA, NA, NA, NA)

or we could use a time series class that can handle irregularly spaced data:

library(zoo)
f <- function(x) {
dat <- x[-(1:2)]
tim <- as.yearmon(outer(x$Year, seq(0, length = ncol(dat))/12, "+"))
zoo(c(as.matrix(dat)), tim)
}
do.call(cbind, by(Data, Data$Index, f))

The last line is  unchanged from before.  This code will also handle
the original situation correctly even if the missing 1992 is truly
missing.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to start R

2010-08-07 Thread Xu Wang

Hi satimis, I also use Ubuntu 10.04 and R. I just wanted to say that it is
very hard to get started in R. I'm guessing you know this, but I just wanted
to add some encouragement and mention that if you find yourself getting
frustrated at times or taking a very long time to do things that are
extremely simple, do not give up! Once you struggle through it, you will be
amazed at how much control you have over your data.

Also, I would highly recommend getting a book. I'm not sure what the best
ones are: R in a nutshell is quite good (and can be found at borders and
barnes and noble). I think it's like 40 dollars but if you can spare it it's
worth it in my opinion. The intro pdf is very well written but is too
concise I think.

Best of luck and enjoy the ride!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-start-R-tp2317188p2317539.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sweave for inclusion of p value in a sentence of a LaTeX document

2010-08-07 Thread julia . jacobson
Dear R Users,

I would like to include the p value in the results returned by the t.test 
function in a sentence of a LaTeX document. For this purpose, I use the 
following code (file.Rnw):

\documentclass{article}
\begin{document}
The p value for my data was 
<>=
x<-cbind(1,2,3)
y<-cbind(3,4,5)
t.test(x,y)
@
which is not significant.
\end{document}

I use "R CMD Sweave file.Rnw" and "pdflatex file.tex" to create a PDF document 
of it.
However, the all details of the t-test are included in my document and form a 
new paragraph in another format than the rest of the original sentence.
The sentence should look like this: "The p value for my data was 0.2879 which 
was not significant."
Thanks in advance.

Julia

Wassertemperaturen in Deutschland
Sommer, Sonne, Strand - wer braucht Abkühlung? Die aktuellen Wassertemperaturen 
und Windgeschwindigkeiten für Deutschlands Badeseen gibt´s auf arcor.de.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package for measurement error models

2010-08-07 Thread Christos Argyropoulos

I believe there was a fairly recent exchange (within the last 6 months) about 
linear measurement error models/error-in-variable models/Deming 
regression/total least squares/orthogonal regression.  Terry Therneau provided 
code for an R function that can estimate such models:

http://www.mail-archive.com/r-help@r-project.org/msg85070.html

If your knowledge of Fortran is up to the task, there is a Netlib package 
called ODRpack that can fit such models. Kind of surprising that no one has not 
written a R wrapper for this library (yet).

Christos 


> Date: Sat, 7 Aug 2010 11:21:01 -0400
> From: carrieands...@gmail.com
> To: R-help@r-project.org
> Subject: [R] package for measurement error models
>
> Hi,all,
>
> I posted this question couple of days again, but haven't gotten any answers
> back. I would like to post it again, and if you have any ideas, please let
> me know. Any helps and suggestions are very much appreciated.
>
> The problem is about linear regression with both y and x have measurement,
> and the variance of errors are heterogeneous. The estimated regression
> coefficient and its variance are of interest. Is any R package doing this
> task ?
>
> Thank you
>
> Carrie--
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame reordering to time series

2010-08-07 Thread steven mosher
Very Slick.

Gabor this is a Huge speed up for me. Thanks. ha, Now I want to rewrite a
bunch of working code.




Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
 Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
 Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
 
Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values)
 Data
   Index Year Jan  Feb Mar Apr Jun
1  67543 1989  12  6.0  12  12  12
2  67543 1990  14  7.0  NA  NA  14
3  67543 1991  34 17.0  34  34  34
4  67543 1992  21 10.5  21  21  21
5  12345 1991  54 27.0  NA  NA  54
6  12345 1993  65 32.5  65  65  65
7  12345 1994  23 11.5  23  23  23
8  89765 1991  12  6.0  NA  NA  12
9  89765 1992  13  6.5  13  13  13
10 89765 1993  13  6.5  NA  NA  13
11 89765 1994  13  6.5  13  13  13
12 89765 1995  14  7.0  14  14  14

#  Gabor's solution

 f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
 do.call(cbind, by(Data, Data$Index, f))
 12345 67543 89765
Jan 1989NA  12.0NA
Feb 1989NA   6.0NA
Mar 1989NA  12.0NA
Apr 1989NA  12.0NA
May 1989NA  12.0NA
Jun 1989NA  14.0NA
Jul 1989NA   7.0NA
Aug 1989NANANA
Sep 1989NANANA
Oct 1989NA  14.0NA
Nov 1989NA  34.0NA
Dec 1989NA  17.0NA
Jan 1990NA  34.0NA
Feb 1990NA  34.0NA
Mar 1990NA  34.0NA
Apr 1990NA  21.0NA
May 1990NA  10.5NA
Jun 1990NA  21.0NA
Jul 1990NA  21.0NA
Aug 1990NA  21.0NA
Sep 1990NANANA
Oct 1990NANANA
Nov 1990NANANA
Dec 1990NANANA
Jan 1991  54.0NA  12.0
Feb 1991  27.0NA   6.0
...

On Sat, Aug 7, 2010 at 5:09 PM, steven mosher wrote:

> Thanks Gabor, I probably should have done an example with fewer columns.
>
> i will rework the example and post it up so the next guys who has this
> issue can have a
> clear example with a solution.
>
>
>
> On Sat, Aug 7, 2010 at 5:04 PM, Gabor Grothendieck <
> ggrothendi...@gmail.com> wrote:
>
>> On Sat, Aug 7, 2010 at 4:49 PM, steven mosher 
>> wrote:
>> > Given a data frame, or it could be a matrix if I choose to.
>> > The data consists of an ID, a year, and data for all 12 months.
>> > Missing values are a factor AND missing years.
>> >
>> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>> >  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
>> >  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>> >  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>> >
>>  
>> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
>> > + Oct=Values,Nov=Values,Dec=Values2)
>> >  Data
>> >   Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
>> > 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
>> > 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
>> > 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
>> > 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
>> > 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
>> > 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
>> > 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
>> > 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
>> > 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
>> > 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
>> > 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
>> > 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
>> >
>> >
>> > The Goal is to return a Time series object for each ID. Alternatively
>> one
>> > could return a matrix that I can turn into a Time series.
>> > The final structure would be something like this ( done in matrix form
>> for
>> > illustration)
>> >  1989.0  1989.083
>> >1991 ..19921993. 1994  1995
>> > 67543 12   6.0   12  12  12  4.00  12  12  12  12  12...
>> > .34...21.. NA.NANA
>> > 12345  NA, NA,
>> > NA,.54 27
>> >
>> > Basically the time series will have patches at the front, middle and end
>> > where you may have years of NA
>> > The must be column ordered by time and aligned so that averages for all
>> > series can be computed per month.
>> >
>> > Now I have looping code to do this, where I loop through all the IDs and
>> map
>> > the row of data into the correct
>> > column. and create column names based on the data and row names based on
>> the
>> > ID, but it's painfully
>> > slow. Any wizardry would help.
>>
>> Your email came out a bit garbled so its not clear what you want to
>> get out but this code will produce a multivariate ts series, i.e. an
>> mts series, with one column for each series:
>>
>> f <- function(x) ts(c(t(x[-(1:2

Re: [R] Data frame reordering to time series

2010-08-07 Thread steven mosher
Thanks Gabor, I probably should have done an example with fewer columns.

i will rework the example and post it up so the next guys who has this issue
can have a
clear example with a solution.



On Sat, Aug 7, 2010 at 5:04 PM, Gabor Grothendieck
wrote:

> On Sat, Aug 7, 2010 at 4:49 PM, steven mosher 
> wrote:
> > Given a data frame, or it could be a matrix if I choose to.
> > The data consists of an ID, a year, and data for all 12 months.
> > Missing values are a factor AND missing years.
> >
> > Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
> >  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
> >  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
> >  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
> >
>  
> Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> > + Oct=Values,Nov=Values,Dec=Values2)
> >  Data
> >   Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
> > 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
> > 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
> > 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
> > 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
> > 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
> > 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
> > 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
> > 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
> > 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
> > 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
> > 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
> > 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
> >
> >
> > The Goal is to return a Time series object for each ID. Alternatively one
> > could return a matrix that I can turn into a Time series.
> > The final structure would be something like this ( done in matrix form
> for
> > illustration)
> >  1989.0  1989.083
> >1991 ..19921993. 1994  1995
> > 67543 12   6.0   12  12  12  4.00  12  12  12  12  12...
> > .34...21.. NA.NANA
> > 12345  NA, NA,
> > NA,.54 27
> >
> > Basically the time series will have patches at the front, middle and end
> > where you may have years of NA
> > The must be column ordered by time and aligned so that averages for all
> > series can be computed per month.
> >
> > Now I have looping code to do this, where I loop through all the IDs and
> map
> > the row of data into the correct
> > column. and create column names based on the data and row names based on
> the
> > ID, but it's painfully
> > slow. Any wizardry would help.
>
> Your email came out a bit garbled so its not clear what you want to
> get out but this code will produce a multivariate ts series, i.e. an
> mts series, with one column for each series:
>
> f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
> do.call(cbind, by(Data, Data$Index, f))
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data frame reordering to time series

2010-08-07 Thread Gabor Grothendieck
On Sat, Aug 7, 2010 at 4:49 PM, steven mosher  wrote:
> Given a data frame, or it could be a matrix if I choose to.
> The data consists of an ID, a year, and data for all 12 months.
> Missing values are a factor AND missing years.
>
> Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
>  Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
>  Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
>  Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
>  Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
> + Oct=Values,Nov=Values,Dec=Values2)
>  Data
>   Index Year Jan  Feb Mar Apr Jun      July Aug Sep Oct Nov Dec
> 1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
> 2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
> 3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
> 4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
> 5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
> 6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
> 7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
> 8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
> 9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
> 10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
> 11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
> 12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14
>
>
> The Goal is to return a Time series object for each ID. Alternatively one
> could return a matrix that I can turn into a Time series.
> The final structure would be something like this ( done in matrix form for
> illustration)
>          1989.0  1989.083
>    1991 ..19921993. 1994  1995
> 67543 12       6.0   12  12  12  4.00  12  12  12  12  12...
> .34...21..     NA.NANA
> 12345  NA, NA,
> NA,.54 27
>
> Basically the time series will have patches at the front, middle and end
> where you may have years of NA
> The must be column ordered by time and aligned so that averages for all
> series can be computed per month.
>
> Now I have looping code to do this, where I loop through all the IDs and map
> the row of data into the correct
> column. and create column names based on the data and row names based on the
> ID, but it's painfully
> slow. Any wizardry would help.

Your email came out a bit garbled so its not clear what you want to
get out but this code will produce a multivariate ts series, i.e. an
mts series, with one column for each series:

f <- function(x) ts(c(t(x[-(1:2)])), freq = 12, start = x$Year[1])
do.call(cbind, by(Data, Data$Index, f))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] basic question about t-test with adjusted p value

2010-08-07 Thread John Sorkin
ANOVA will give you an adjusted t-test. For example:
fit<-lm(height~sex+age)
summary(fit)
Will give an age-adjusted t-test of the heights of men and women.
John
John Sorkin
Chief Biostatistics and Informatics
Univ. of Maryland School of Medicine
Division of Gerontology and Geriatric Medicine
jsor...@grecc.umaryland.edu 
-Original Message-
From: Erik Iverson 
To:  
Cc:  

Sent: 8/7/2010 4:24:35 PM
Subject: Re: [R] basic question about t-test with adjusted p value

On 08/07/2010 03:08 PM, josef.kar...@phila.gov wrote:
> I have read the R manual and help archives, sorry but I'm still stuck.
>
> How would I do a t-test with an adjusted p-value?

Please be more specific about what you mean by 'adjusted p-value'. See below...

>
> Suppose that I use t.test ( ) , with the function argument alternative =
> "two.sided",  and data such that degrees of freedom = 20.  The function
> calculates a t-statistic of 2.086, and p-value =0.05
>
> How do I then adjust the p-value?  My thought is to do
> p.adjust (pt(2.086, df=20),"BH")
> but that doesn't change anything (returns 0.975)

A couple things here.

1) You can get the p-value returned from t.test by inspecting the object with 
?str, and noting that it's stored in an element called p-value.

So,

p.val <- t.test(extra ~ group, data = sleep)$p.value

assigns the p-value to an object called p.val.

2) You're calling p.adjust with a single p-value, so what kind of adjustment 
did 
you have in mind? You would normally be adjusting p-values because of a 
multiple 
comparison issue (i.e., multiple tests performed).  You'd pass p.adjust the 
*vector* of p-values, and a method to adjust them by.  Your vector is of length 
1, so in p.adjust's opinion, there is nothing to adjust for.

So, what do *you* mean by 'adjusted p-value'?  `Why are you looking to adjust?` 
is another way of stating that.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot the dependent variable against one of the predictors with other predictors as constant

2010-08-07 Thread Frank Harrell


There are many ways to do this.  Here is one.

install.packages('rms')
require(rms)
dd <- datadist(x, y); options(datadist='dd')
f <- ols(z ~ x + y)
plot(Predict(f))# plot all partial effects
plot(Predict(f, x)) # plot only the effect of x
plot(Predict(f, y)) # plot only the effect of y
f <- ols(z ~ pol(x,2)*pol(y,2) # repeat, not assuming linearity

Frank E Harrell Jr   Professor and ChairmanSchool of Medicine
 Department of Biostatistics   Vanderbilt University

On Sat, 7 Aug 2010, Yi wrote:


Hi, folks,

Happy work in weekends >_<

My question is how to plot the dependent variable against one of the
predictors with other predictors as constant. Not for the original data, but
after prediction. It means y is the predicted value of the dependent
variables. The constane value of the other predictors may be the average or
some fixed value.

###
y=1:10
x=10:1
z=2:11
lin_model=lm(z~x+y)
x_new=11:2
###


How to plot predicted value of z from the regression model with x takes
x_new and y as a constant (let's say y=1)


I am thinking about using 'predict' command to generate the prediction of z
with the new data.frame but there should be a better way.

Thanks all.

Yi

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot the dependent variable against one of the predictors with other predictors as constant

2010-08-07 Thread Yi
Hi, folks,

Happy work in weekends >_<

My question is how to plot the dependent variable against one of the
predictors with other predictors as constant. Not for the original data, but
after prediction. It means y is the predicted value of the dependent
variables. The constane value of the other predictors may be the average or
some fixed value.

###
y=1:10
x=10:1
z=2:11
lin_model=lm(z~x+y)
x_new=11:2
###


How to plot predicted value of z from the regression model with x takes
x_new and y as a constant (let's say y=1)


I am thinking about using 'predict' command to generate the prediction of z
with the new data.frame but there should be a better way.

Thanks all.

Yi

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Data frame reordering to time series

2010-08-07 Thread steven mosher
Given a data frame, or it could be a matrix if I choose to.
The data consists of an ID, a year, and data for all 12 months.
Missing values are a factor AND missing years.

Id<-c(rep(67543,4),rep(12345,3),rep(89765,5))
 Years<-c(seq(1989,1992,by =1),1991,1993,1994,seq(1991,1995,by=1))
 Values2<-c(12,NA,34,21,NA,65,23,NA,13,NA,13,14)
 Values<-c(12,14,34,21,54,65,23,12,13,13,13,14)
 
Data<-data.frame(Index=Id,Year=Years,Jan=Values,Feb=Values/2,Mar=Values2,Apr=Values2,Jun=Values,July=Values/3,Aug=Values2,Sep=Values,
+ Oct=Values,Nov=Values,Dec=Values2)
 Data
   Index Year Jan  Feb Mar Apr Jun  July Aug Sep Oct Nov Dec
1  67543 1989  12  6.0  12  12  12  4.00  12  12  12  12  12
2  67543 1990  14  7.0  NA  NA  14  4.67  NA  14  14  14  NA
3  67543 1991  34 17.0  34  34  34 11.33  34  34  34  34  34
4  67543 1992  21 10.5  21  21  21  7.00  21  21  21  21  21
5  12345 1991  54 27.0  NA  NA  54 18.00  NA  54  54  54  NA
6  12345 1993  65 32.5  65  65  65 21.67  65  65  65  65  65
7  12345 1994  23 11.5  23  23  23  7.67  23  23  23  23  23
8  89765 1991  12  6.0  NA  NA  12  4.00  NA  12  12  12  NA
9  89765 1992  13  6.5  13  13  13  4.33  13  13  13  13  13
10 89765 1993  13  6.5  NA  NA  13  4.33  NA  13  13  13  NA
11 89765 1994  13  6.5  13  13  13  4.33  13  13  13  13  13
12 89765 1995  14  7.0  14  14  14  4.67  14  14  14  14  14


The Goal is to return a Time series object for each ID. Alternatively one
could return a matrix that I can turn into a Time series.
The final structure would be something like this ( done in matrix form for
illustration)
  1989.0  1989.083
1991 ..19921993. 1994  1995
67543 12   6.0   12  12  12  4.00  12  12  12  12  12...
.34...21.. NA.NANA
12345  NA, NA,
NA,.54 27

Basically the time series will have patches at the front, middle and end
where you may have years of NA
The must be column ordered by time and aligned so that averages for all
series can be computed per month.

Now I have looping code to do this, where I loop through all the IDs and map
the row of data into the correct
column. and create column names based on the data and row names based on the
ID, but it's painfully
slow. Any wizardry would help.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] basic question about t-test with adjusted p value

2010-08-07 Thread Erik Iverson

On 08/07/2010 03:08 PM, josef.kar...@phila.gov wrote:

I have read the R manual and help archives, sorry but I'm still stuck.

How would I do a t-test with an adjusted p-value?


Please be more specific about what you mean by 'adjusted p-value'. See below...



Suppose that I use t.test ( ) , with the function argument alternative =
"two.sided",  and data such that degrees of freedom = 20.  The function
calculates a t-statistic of 2.086, and p-value =0.05

How do I then adjust the p-value?  My thought is to do
p.adjust (pt(2.086, df=20),"BH")
but that doesn't change anything (returns 0.975)


A couple things here.

1) You can get the p-value returned from t.test by inspecting the object with 
?str, and noting that it's stored in an element called p-value.


So,

p.val <- t.test(extra ~ group, data = sleep)$p.value

assigns the p-value to an object called p.val.

2) You're calling p.adjust with a single p-value, so what kind of adjustment did 
you have in mind? You would normally be adjusting p-values because of a multiple 
comparison issue (i.e., multiple tests performed).  You'd pass p.adjust the 
*vector* of p-values, and a method to adjust them by.  Your vector is of length 
1, so in p.adjust's opinion, there is nothing to adjust for.


So, what do *you* mean by 'adjusted p-value'?  `Why are you looking to adjust?` 
is another way of stating that.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] basic question about t-test with adjusted p value

2010-08-07 Thread Philipp Pagel
On Sat, Aug 07, 2010 at 04:08:40PM -0400, josef.kar...@phila.gov wrote:
> I have read the R manual and help archives, sorry but I'm still stuck. 
> 
> How would I do a t-test with an adjusted p-value?
> 
> Suppose that I use t.test ( ) , with the function argument alternative = 
> "two.sided",  and data such that degrees of freedom = 20.  The function 
> calculates a t-statistic of 2.086, and p-value =0.05
> 
> How do I then adjust the p-value?  My thought is to do
> p.adjust (pt(2.086, df=20),"BH") 
> but that doesn't change anything (returns 0.975)
> 
> what is the procedure?  I'm sorry if there is a basic concept that I am 
> missing here...

I'm confused - what result where you expecting? p.adjust will need to
know the number of test you are trying to adjust for - either by
giving explicitly giving the number or by handing a vector of p-values
to the function.

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] basic question about t-test with adjusted p value

2010-08-07 Thread Joshua Wiley
On Sat, Aug 7, 2010 at 1:08 PM,   wrote:
> I have read the R manual and help archives, sorry but I'm still stuck.
>
> How would I do a t-test with an adjusted p-value?
>
> Suppose that I use t.test ( ) , with the function argument alternative =
> "two.sided",  and data such that degrees of freedom = 20.  The function
> calculates a t-statistic of 2.086, and p-value =0.05
>
> How do I then adjust the p-value?  My thought is to do
> p.adjust (pt(2.086, df=20),"BH")
> but that doesn't change anything (returns 0.975)
>
> what is the procedure?  I'm sorry if there is a basic concept that I am
> missing here...

These adjustments are generally designed to control a family wise or
experiment wise error rate.  If you only have one p-value, then the
adjustment is equivalent to the unadjusted.  The argument n defaults
to length(p) (which is the length of the first argument).  Here are
some examples:

> p.adjust(p = 0.975, method = "BH")
[1] 0.975
> p.adjust(p = 0.975, method = "BH", n = 1)
[1] 0.975
> p.adjust(p = 0.975, method = "BH", n = 5)
[1] 1

HTH,

Josh

>
>
>
>
>
>
>
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] basic question about t-test with adjusted p value

2010-08-07 Thread Josef . Kardos
I have read the R manual and help archives, sorry but I'm still stuck. 

How would I do a t-test with an adjusted p-value?

Suppose that I use t.test ( ) , with the function argument alternative = 
"two.sided",  and data such that degrees of freedom = 20.  The function 
calculates a t-statistic of 2.086, and p-value =0.05

How do I then adjust the p-value?  My thought is to do
p.adjust (pt(2.086, df=20),"BH") 
but that doesn't change anything (returns 0.975)

what is the procedure?  I'm sorry if there is a basic concept that I am 
missing here...








[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [Q] a dummy variable used with sapply

2010-08-07 Thread Erik Iverson

On 08/07/2010 01:00 PM, GMail (KU) wrote:

Hello,

While learning how to manipulate data with R, I found one example that I
could not understand.

In the following example, the function definition of "maxcor" has an argument
named "i" and I don't understand why. Could someone explain why the maxcor
function definition needs to have this argument?

maxcor = function(i, n = 10, m = 5) { mat = matrix(rnorm(n*m),n,m) corr =
cor(mat) diag(corr) = NA max(corr,na.rm=TRUE) }


maxcors = sapply(1:1000, maxcor, n = 100)


Well, the short answer is that you're instinct is correct, it doesn't need it. 
The `i` parameter is just being used as a counter.  `sapply` is passing the 
numbers from 1 to 1000 to this function each time it is invoked, but the 
function definition does not actually use this number.


I'm guessing it is there because whoever wrote maxcor did not know about, or 
understand, the ?replicate function, and just used the sapply function. (Note 
`replicate` is essentially automatically doing what the above code does.)


The following is untested, but will probably work...

maxcor <- function(n = 10, m = 5)
{
mat <- matrix(rnorm(n*m), n, m)
corr <- cor(mat)
diag(corr) <- NA
max(corr, na.rm = TRUE)
}

maxcors <- replicate(n = 1000, maxcor(n = 100))

where the first n represents the number of replicates to perform,
and the second n is the value passed to maxcor. It's just a coincidence that 
they are named the same thing.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [Q] a dummy variable used with sapply

2010-08-07 Thread KU
Hello,

While learning how to manipulate data with R, I found one example that I could 
not understand. 

In the following example, the function definition of "maxcor" has an argument 
named "i" and I don't understand why. 
Could someone explain why the maxcor function definition needs to have this 
argument?

maxcor = function(i, n = 10, m = 5)
{ 
mat = matrix(rnorm(n*m),n,m) 
corr = cor(mat) 
diag(corr) = NA 
max(corr,na.rm=TRUE)
}

> maxcors = sapply(1:1000, maxcor, n = 100)

Thanks in advance.

Young-Jin Lee
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: quantmod Example-google data download-problems

2010-08-07 Thread Velappan Periasamy
-- Forwarded message --
From: Velappan Periasamy 
Date: Sat, Aug 7, 2010 at 11:20 PM
Subject: quantmod Example-google data download-problems
To: r-sig-fina...@stat.math.ethz.ch


getSymbols("YHOO",src="google") is working

getSymbols("NSE:RCOM",src="google") is not working.

then how to download the stock data for the  above symbol?.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Colours on conditional levelplots in package "lattice"

2010-08-07 Thread Dieter Menne



gordon.morrison wrote:
> 
> I am having difficulty in getting the colours on a conditional levelplot
> in
> the package lattice to work as I want them to - I wonder if someone could
> help me.
> 

I did not understand exactly what you wanted because the example was a bit
complext, but the following comes closer to your requests:

require(lattice)

#col1 <-
c(1.37,2.35,0.20,1.85,0.274,1.56,-0.76,-0.03,-1.39,-0.95,-0.50,0.91,1.58,0.17,-0.34,-0.48,1.50,-0.68,1.05)
col2 <-
c("A","B","C","F","G","H","B","C","E","G","H","A","B","C","D","E","F","G","H")
col3 <-
c("X","Y","X","Y","X","Y","Y","X","X","X","Y","X","Y","X","Y","X","Y","X","Y")
col4 <-
c("1","1","1","1","1","1","2","2","2","2","2","3","3","3","3","3","3","3","3")
dat1 <- data.frame(X1=col1, var1=col2, var2=col3, var3=col4)
dat1$X1 = 1:nrow(dat1)-nrow(dat1)/2

maxAbsValue <- max(abs(as.vector(dat1[, "X1"])))
colorkeyAt <- pretty(c(-maxAbsValue, maxAbsValue), n = 10)
boxColours <- c("#8B", "#A83500", "#C46B00", "#E2A100", "#FFD700",
"#ADD8E6", "#81A2CF", "#566CB8", "#2B36A1", "#8B")


levelplot(X1 ~ var1 * var2 | var3, data = dat1,
  col.regions = boxColours,
  colorkey = list(at = colorkeyAt,
  labels = list(labels = colorkeyAt, at = colorkeyAt, 
cex =0.6)),
   at =  pretty(c(-maxAbsValue, maxAbsValue), n = 10),
  panel = function(x, y, z, ...) {
panel.fill(col = "darkgrey")
panel.levelplot(x, y, z,...)
  }) 
  

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Colours-on-conditional-levelplots-in-package-lattice-tp2316446p2317330.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply family functions

2010-08-07 Thread Jeff Newmiller
Sapply is not significantly faster than a for loop. Vectorization 
generally is, but you pay for it in RAM.


> dat.oc <- oc[dat$Class,]
> dat$Flag <- ifelse(with(dat.oc,(Open<=dat$Close_date & 
dat$Close_date<=Close) | (Open1<=dat$Close_date & 
dat$Close_date<=Close1)),"Valid","Invalid")


If you are really hurting for RAM, you might take a rather less 
computationally efficient approach:


> dat$Flag <- sapply(1:length(dat$Class), function(idx,ddat,toc){cl <- 
ddat[idx,"Class"]

cld <- ddat[idx,"Close_date"]
if ( (toc[cl,"Open"]<=cld && cld<=toc[cl,"Close"]) || 
(toc[cl,"Open1"]<=cld && cld<=toc[cl,"Close1"])) {result <- "Valid"} 
else {result <- "Invalid"}

c(Flag=result) }, ddat=dat, toc=oc )


Steven Kang wrote:

ini <- as.Date("2010/1/1", "%Y/%m/%d")
# Generate arbitrary data frame consisting of date values
oc <- data.frame(Open = seq(ini, ini + 6, 1), Close = seq(ini + 365, ini +
365 + 6, 1), Open1 = seq(ini + 365*2, ini + 365*2 + 6, 1), Close1 = seq(ini
+ 365*3, ini + 365*3 + 6, 1), Open2 = seq(ini + 365*4, ini + 365*4 + 6, 1),
Close2 = seq(ini + 365*5, ini + 365*5 + 6, 1))
rownames(oc) <- c("AAA", "C", "AA", "A", "CC", "BB", "B")

  

oc


  Open  Close  Open1Close1
Open2Close2
AAA  2010-01-01  2011-01-01  2012-01-01  2012-12-31  2013-12-31  2014-12-31
C  2010-01-02  2011-01-02  2012-01-02  2013-01-01  2014-01-01
2015-01-01
AA2010-01-03  2011-01-03  2012-01-03  2013-01-02  2014-01-02  2015-01-02
A  2010-01-04  2011-01-04  2012-01-04  2013-01-03  2014-01-03
2015-01-03
CC2010-01-05  2011-01-05  2012-01-05  2013-01-04  2014-01-04  2015-01-04
BB2010-01-06  2011-01-06  2012-01-06  2013-01-05  2014-01-05  2015-01-05
B 2010-01-07   2011-01-07  2012-01-07  2013-01-06  2014-01-06
2015-01-06

dat <- data.frame(Class = c("AAA", "C", "CC", "BB", "B", "A"), Close_date =
c(ini, ini, ini, ini+109, ini+39, ini+24), stringsAsFactors = FALSE)
ind <- sapply(dat$Class, function(x) match(x, rownames(oc)))

for (i in length(ind))  {
dat[["Flag"]] <- sapply(dat[["Close_date"]], function(x) ifelse((x >=
oc[ind[[i]], 1] & x < oc[ind[[i]], 2]) | (x >= oc[ind[[i]], 3] & x <
oc[ind[[i]], 4]) | (x >= oc[ind[[i]], 5] & x < oc[ind[[i]], 6]),
"Valid", "Invalid"))
}
  

dat


 Class   Close_dateFlag
*1   AAA2010-01-01   Invalid*
2 C  2010-01-01   Invalid
3CC2010-01-01Invalid
4BB2010-04-20Valid
5 B 2010-02-09Valid
6 A 2010-01-25Valid
The first record (highlighted in yellow) is flagged as "Invalid" where it
should really be "Valid".

Any suggestions on resolving this would be great.

Many thanks.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] package for measurement error models

2010-08-07 Thread Charles C. Berry

On Sat, 7 Aug 2010, Carrie Li wrote:


Hi,all,

I posted this question couple of days again, but haven't gotten any answers
back. I would like to post it again, and if you have any ideas, please let
me know. Any helps and suggestions are very much appreciated.


Start by reading the Posting Guide.

It will tell you to

   *) Do RSiteSearch("keyword") with different keywords (at the R prompt)
  to search R functions, contributed packages and R-Help postings.

and

RSiteSearch("measurement error")

gives a number of hits that might be of interest.

If these hits do not pan out, you should let us know why none of the 
available packages that have 'measurement error' in their descriptions 
satisfy your needs.


HTH,

Chuck



The problem is about linear regression with both y and x have measurement,
and the variance of errors are heterogeneous. The estimated regression
coefficient and its variance are of interest. Is any R package doing this
task ?

Thank you

Carrie--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] eval-parse and lme in a loop

2010-08-07 Thread Dieter Menne


Connolly, Colm wrote:
> 
> Hi everybody,
> 
> I'm having trouble getting an eval-parse combination to work with lme in a
> for loop.
> 
> 

I have not checked in detail, but it looks like another case for what we
call "Ripley's game":

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/16599.html

Dieter



-- 
View this message in context: 
http://r.789695.n4.nabble.com/eval-parse-and-lme-in-a-loop-tp2315622p2317292.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply family functions

2010-08-07 Thread Wu Gong

Hi Steven,

You code has two problems. One is loop structure which should be for(i in
1:length). The second is that you loop the process twice (for and sapply).

Hope followed code will work.

for (i in 1:length(ind))  {
x <- dat$Close_date[i]
dat[["Flag"]][i] <- ifelse((x >= oc[ind[[i]], 1] & x < oc[ind[[i]], 2])
| (x >= oc[ind[[i]], 3] & x < oc[ind[[i]], 4])
| (x >= oc[ind[[i]], 5] & x < oc[ind[[i]], 6]),
"Valid", "Invalid")
}

Yours,

Wu



-
A R learner.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/apply-family-functions-tp2315905p2317287.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to start R

2010-08-07 Thread Stephen Liu
Hi Erik,

Thanks for your advice and URL

q() is on the last line of the printout on starting R
Type 'q()' to quit R.

Sorry I overlook it

B.R.
Stephen




- Original Message 
From: Erik Iverson 
To: Stephen Liu 
Cc: Steve Lianoglou ; R-help@r-project.org
Sent: Sat, August 7, 2010 11:45:44 PM
Subject: Re: [R] How to start R

On 08/07/2010 10:29 AM, Stephen Liu wrote:
> Hi steve,
> 
> After starting R which command shall I use to exit it rather than close the
> console.  exit/quit did not work.  Thanks

What does "did not work" mean?

help.search("quit")

leads you to:

?quit


> quit()

or

> q()

should terminate the R session.

I suggest you read "An Introduction to R" at 
http://cran.r-project.org/doc/manuals/R-intro.html, where q() is first 
mentioned 
in section 1.5.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to start R

2010-08-07 Thread Erik Iverson

On 08/07/2010 10:29 AM, Stephen Liu wrote:

Hi steve,

After starting R which command shall I use to exit it rather than close the
console.  exit/quit did not work.  Thanks


What does "did not work" mean?

help.search("quit")

leads you to:

?quit


> quit()

or

> q()

should terminate the R session.

I suggest you read "An Introduction to R" at 
http://cran.r-project.org/doc/manuals/R-intro.html, where q() is first mentioned 
in section 1.5.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to start R

2010-08-07 Thread Stephen Liu
Hi steve,

After starting R which command shall I use to exit it rather than close the 
console.  exit/quit did not work.  Thanks

B.R.
Stephen



- Original Message 
From: Steve Lianoglou 
To: Stephen Liu 
Cc: Tim Gruene ; R-help@r-project.org
Sent: Sat, August 7, 2010 9:18:42 PM
Subject: Re: [R] How to start R

Hi,

On Sat, Aug 7, 2010 at 8:56 AM, Stephen Liu  wrote:
> Hi Tim,
>
> I got it.  Thanks for reminding me.  It should be R (capital letter) not r.

Indeed, you were typing in a "little r", as opposed to a capital R.
Now, note the package list you installed:

r-base
littler
r-cran-plotrix
r-cran-qtl
r-cran-rggobi

One of which is "littler" -- aka "little R", which is a scripting
front end for R :-)

http://dirk.eddelbuettel.com/code/littler.html

Enjoy,
-steve
-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] package for measurement error models

2010-08-07 Thread Carrie Li
Hi,all,

I posted this question couple of days again, but haven't gotten any answers
back. I would like to post it again, and if you have any ideas, please let
me know. Any helps and suggestions are very much appreciated.

The problem is about linear regression with both y and x have measurement,
and the variance of errors are heterogeneous. The estimated regression
coefficient and its variance are of interest. Is any R package doing this
task ?

Thank you

Carrie--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Ben Bolker
  manchester.ac.uk> writes:

> 
> On 07-Aug-10 09:29:41, Michael Bedward wrote:
> > Thanks for that clarification Peter - much appreciated.
> > 
> > Is there an R function that you'd recommend for calculating
> > more valid CIs ?
> > Michael
> 
> It depends on what you want to mean by "more valid"! If you have
> a 95% CI for the linear predictor (say L(x) at X=x), then the
> probability that the CI will include the true value of L(x)
> is 95% (more or less accurately, depending on what approximation,
> if any, was used to obtain the CI). Thus, if A(Y) and B(Y) are the
> lower and upper limits of a 95% CI for L(x) as functions of the data Y,

  [snip]
 
> So you can't have everything at once, and it depends on what you
> want to mean by "valid"!
 
[snip]

   Yow.  Nothing like r-help on a Saturday morning!
  Practically speaking, I think the previous recommendations (confint() 
and boot.ci()) are probably best.  I prefer equal probabilities
in tails to a symmetric confidence interval. (You could also try
for Bayesian credible intervals, which are symmetric in the probability
cutoff for each side ...)
  The other thing to keep in mind is that once you get down to
this level of rigor, it's extremely likely that the major source
of error/uncertainty in your results will be some other
systematic error or violation of the assumptions. Are your data
*really* distributed the way you think?  Linear on the scale of the
linear predictor, independent, homogeneous probabilities/exchangeability
within groups, ... ?
  Or maybe that's just my excuse for not worrying too much.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] image plot but data not on grid.

2010-08-07 Thread Michael Bedward
> If it's regular, but incomplete, that method will work.  If it's
> irregular, then no image method will work without first interpolating
> a regular grid.

Thanks Hadley. As I suspected,  but ggplot2 is so very clever that I
thought it was worth asking :)

Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] image plot but data not on grid.

2010-08-07 Thread Hadley Wickham
On Sat, Aug 7, 2010 at 2:54 AM, Michael Bedward
 wrote:
> On 7 August 2010 06:26, Hadley Wickham wrote:
>
>> library(ggplot2)
>> qplot(x, y, fill = z, data = df, geom = "tile")
>
> Hi Hadley,
>
> I read the original question as being about irregularly spaced data.
> The above method doesn't seem to for me in such a case but perhaps I'm
> constructing my example incorrectly.

If it's regular, but incomplete, that method will work.  If it's
irregular, then no image method will work without first interpolating
a regular grid.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to start R

2010-08-07 Thread Steve Lianoglou
Hi,

On Sat, Aug 7, 2010 at 8:56 AM, Stephen Liu  wrote:
> Hi Tim,
>
> I got it.  Thanks for reminding me.  It should be R (capital letter) not r.

Indeed, you were typing in a "little r", as opposed to a capital R.
Now, note the package list you installed:

r-base
littler
r-cran-plotrix
r-cran-qtl
r-cran-rggobi

One of which is "littler" -- aka "little R", which is a scripting
front end for R :-)

http://dirk.eddelbuettel.com/code/littler.html

Enjoy,
-steve
-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to start R

2010-08-07 Thread Stephen Liu
Hi Tim,

I got it.  Thanks for reminding me.  It should be R (capital letter) not r.

On console:-

~$ R

R version 2.10.1 (2009-12-14)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.


B.R.
Stephen




- Original Message 
From: Tim Gruene 
To: Stephen Liu 
Cc: R-help@r-project.org
Sent: Sat, August 7, 2010 8:45:04 PM
Subject: Re: [R] How to start R

What exactly do you mean with 'just hanging there'? Do you expect anything else?
When I start R on Debian squeeze (with the command 'R'), I see
R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 
and I can start working from there.

Tim

On Sat, Aug 07, 2010 at 05:30:29AM -0700, Stephen Liu wrote:
> Hi folks,
> 
> Just finished installing R on Ubuntu 10.04, running on a VM, download 
> following 
>
> packages on repo;
> 
> r-base
> littler
> r-cran-plotrix
> r-cran-qtl
> r-cran-rggobi
> 
> But I could not get R started. r is on /usr/bin/r
> 
> On console evoking it just hanging there.  Any additional packages I need to 
> install?  Thanks
> 
> B.R.
> 
> 
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to start R

2010-08-07 Thread Tim Gruene
What exactly do you mean with 'just hanging there'? Do you expect anything else?
When I start R on Debian squeeze (with the command 'R'), I see
R version 2.11.1 (2010-05-31)
Copyright (C) 2010 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 
and I can start working from there.

Tim

On Sat, Aug 07, 2010 at 05:30:29AM -0700, Stephen Liu wrote:
> Hi folks,
> 
> Just finished installing R on Ubuntu 10.04, running on a VM, download 
> following 
> packages on repo;
> 
> r-base
> littler
> r-cran-plotrix
> r-cran-qtl
> r-cran-rggobi
> 
> But I could not get R started. r is on /usr/bin/r
> 
> On console evoking it just hanging there.  Any additional packages I need to 
> install?  Thanks
> 
> B.R.
> 
> 
> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A



signature.asc
Description: Digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to start R

2010-08-07 Thread Stephen Liu
Hi folks,

Just finished installing R on Ubuntu 10.04, running on a VM, download following 
packages on repo;

r-base
littler
r-cran-plotrix
r-cran-qtl
r-cran-rggobi

But I could not get R started. r is on /usr/bin/r

On console evoking it just hanging there.  Any additional packages I need to 
install?  Thanks

B.R.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Peter Dalgaard
Michael Bedward wrote:
> On 7 August 2010 19:56, Martin Maechler wrote:
> 
>> I'm coming late to the thread,
>> but it seems that nobody has yet given the advice which I would
>> very *strongly* suggest to anyone asking for confidence
>> intervals in GLMs:
>>
>> Use  confint()
> 
> confint was actually mentioned in the second post on the thread :)
> But it's not what the OP is after.

Yep. The problem is that we cannot (yet? easily?) do confint on a
transformation of the parameters, hence not on the predicted values (on
whatever scale).

Another thing is that confint() isn't quite a cure-all because it relies
on the distribution of the (signed square root of) the LRT, and if that
is substantially different from chi-square(1) (or N(0,1)), then you
might still not obtain improved coverage, etc.

That being said, one should probably say that it is quite common to base
CI's on the linear predictor scale, and there's nothing REALLY wrong
with that procedure. As will have transpired by now, once you leave this
rather well-trodden road, you can easily find yourself in a rather
impenetrable thicket...

As far as I recall the asymptotics, all of the "+/- 2*se" procedures
have approximation errors of the same order as that caused by the
discrete distribution of the response, so it is not like one method is
going to "win" by orders of magnitude.

> Michael


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] several figures from one Sweave chunk? [solved]

2010-08-07 Thread Liviu Andronic
(on-list)
Hello Cameron

On Fri, 6 Aug 2010 19:02:22 +0100
Liviu Andronic  wrote:
> On Fri, 6 Aug 2010 11:30:59 -0600
> Cameron Bracken  wrote:
> > There is no real good way to deal with this in Sweave. Sweave does
> > not actually know anything about the graphics it is creating, all
> > it knows it that a graphic is or is not being created. The problem
> > lies with the graphic devices, each has its own way of creating a
> > new "page." That is when you create a new plot without closing the
> > graphics device. For example, pdf() createS a new page in the pdf
> > document, png() just overwrites the old image and tikz() creates a
> > whole new tikzpicture. These are all done without Sweave knowing a
> > bit about it.
> > 
> > So the answer is no, pgfSweave nor cacheSweave have any way of
> > dealing with this.  In my experience, you are much better off
> > either creating separate chunks for each image, or using something
> > like layout:
> > 
> > <>=
> > layout(matrix(c(1,3,2,4),nrow=2))
> > for (i in 1:4) plot(rnorm(100)+i)
> > @
> > 
Unfortunately neither creating separate chunks for each image nor using
layout are options in my case: I need to automate the repetitive task
of describing the variables in my data set (15 for one df, about 45
for the second), using graphs and summary stats. This would require to
loop over mixed R/LaTeX code, something not allowed  by Sweave. 

Fortunately, I found a way to work around, based on the Sweave FAQ [1].
If you need not only graphs, but some LaTeX markup as well, Dieter
Menne proposed a very elegant solution that involved defining a
\newcommand containing most LaTeX code and calling it in the R loop
via a cat() statement [2]. Another alternative is to put all LaTeX code
in cat() statements, but this tends to get messy. 

The final hurdle was to output some 'verbatim' text from within a
<>= 
@

code chunk, which contained the entire loop. This can
be done by printing the object within 
cat("\\begin{verbatim}\n")
[..]
cat("\\end{verbatim}\n")

I would have wanted to include this in the \newcommand call, but since
my object was a list, it couldn't be passed via cat(). I would love to
hear a workaround to this. 

Here are the links to the LyX [3] and Sweave [4]  versions and the PDF
[5] result of my file, extending Dieter's example. 

Another potential solution, that I haven't yet fully investigated, is
brew. Contrary to my first impressions, its syntax is simpler than that
of Sweave. Moreover, it "natively" handles looping over mixed R/LaTeX
code and is thus perfect for generating repetitive reports. Combined
with a new 'weave' method [6], it seems that brew could easily be used
instead of Sweave. 

To echo Dieter's comments, I had some headaches when thinking
about the hen and the egg. Hope this helps
Liviu

[1] http://www.stat.uni-muenchen.de/~leisch/Sweave/FAQ.html#x1-11000A.9
[2] https://stat.ethz.ch/pipermail/r-help/2008-June/164783.html
[3] http://s000.tinyupload.com/index.php?file_id=22646520644600938098
[4] http://s000.tinyupload.com/index.php?file_id=73594116524789598605
[5] http://s000.tinyupload.com/index.php?file_id=00010585455841885948
[6] http://biostatmatt.com/archives/573/comment-page-1#comment-770


> > Hope that helps. If you have a specific application where you need
> > this type of functionality, I may be able so suggest a workaround.
> > 
> > -Cameron
> >

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Ted Harding
On 07-Aug-10 09:29:41, Michael Bedward wrote:
> Thanks for that clarification Peter - much appreciated.
> 
> Is there an R function that you'd recommend for calculating
> more valid CIs ?
> Michael

It depends on what you want to mean by "more valid"! If you have
a 95% CI for the linear predictor (say L(x) at X=x), then the
probability that the CI will include the true value of L(x)
is 95% (more or less accurately, depending on what approximation,
if any, was used to obtain the CI). Thus, if A(Y) and B(Y) are the
lower and upper limits of a 95% CI for L(x) as functions of the data Y,

  P(A(Y) < L(x) < B(Y)) = 0.95 (to within approximation)

and this may be asymmetrical in that we may have
  P((A(Y) > L(x)) = 1 - P(B(Y) < L(x)) != 0.025
(e.g. it may come out as a 1%:4% split of the 5%).

The response probability P(Y=1 | X=x) will be a monotonic function
F(L(x)) of x -- e.g. for the logistic exp(L)/(1+exp(L)), increasing
from 0 to 1. Then {F(A(Y)), F(B(Y)} is a 95% CI for P = F(x),
since P[A(Y) < L(x) < B(Y)] = P[F(A(Y)) < F(L(x))=P(x) < F(B(Y))]. Also,
  P[A(Y) > L(x)] = P(F(A(Y)) > F(L(x) = P(x)] and
  P[B(Y) < L(x)] = P(F(B(Y)) < F(L(x) = P(x)]
for exactly the same reason (monotonicity of F). Hence the split
of the 5% between left tail and right tail ion the response scale
P(x) = F(L(x)) is exactly the same as the split on the linear
predictor scale L(x).

Therefore, on that front (comparison of probabilities of coverage),
the CI transformed to the response scale {F(A(Y)), F(B(Y)} is exactly
as valid as the CI {A(Y),B(Y)} on the original linear predictor scale.
In particular, if the latter is "equi-tailed" (2.5% on either side)
then the former will be too. If that is what you mean by "valid",
then you're finished.

However, possibly you may want "valid" to mean "extending to equal
distances on either side of the point estimate" -- e.g. as you
do with Estimate +/- 1.96*SE. It may be that, on the linear predictor
scale, you achieve this and also equi-tailed (2.5% either way).

But then, when you transform to the response scale, you will lose
that symmetry: F(Est - 1.96*SE) and F(Est + 1.96*SE) will not
be equidistant from F(Est) (though the equi-tailed 2.5%:2.5%
of the tail probability will be preserved).

If you have a reason for wanting to, you can start with a 95% CI
for L(x) which is not equi-tailed, but does have the property of
symmetry in the response scale: F(Est - 1.96*SE) and F(Est + 1.96*SE)
will be equidistant from F(Est). So you could set up the CI for
L(x) as {A(x) = Est - c0(x)*SE, B(x) = Est + c1(x)*SE} where c0(x)
and c1(x) (which in general depend on x) are chosen so that
you get symmetry on the response scale. But then you will lose
the equi-tailed property on the linear predictor scale, hence also
on the response scale.

So you can't have everything at once, and it depends on what you
want to mean by "valid"!

However, in the case of response being the probability of Y=1,
you might want to be careful about symmetry on the response
scale, since that could result in a CI which goes above 1 or
below 0, which would not be "valid" ...

For large samples, asymptotically all these issues tend to dwindle
into near-irrelevance, since locally the reponse is close to linear
and whatever you achieve on one scale will be (close to) achieved
on the other scale.

Hoping this helps,
Ted.







> On 7 August 2010 18:37, Peter Dalgaard  wrote:
>>
>> Probably, neither is optimal, although any transformed scale is
>> asymptotically equivalent. E.g., neither the probability scale
>> nor the logit scale stabilizes the variance of a simple proportion
>> (the arcsine transform does), so test-based CIs should really be
>> asymmetric in both cases rather than just +/- 1.96se.
>>
>> However, working on the linear predictor scale has the advantage
>> that CIs by definition will not cross the boundaries of the
>> parameter space. (For the "usual" link functions: logit, probit,
>> cloglog, that is; it's not true for the identity link, obviously.)
>> --
>> Peter Dalgaard
>> Center for Statistics, Copenhagen Business School
>> Phone: (+45)38153501
>> Email: pd@cbs.dk _Priv: pda...@gmail.com


E-Mail: (Ted Harding) 
Fax-to-email: +44 (0)870 094 0861
Date: 07-Aug-10   Time: 11:42:09
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Michael Bedward
On 7 August 2010 19:56, Martin Maechler wrote:

> I'm coming late to the thread,
> but it seems that nobody has yet given the advice which I would
> very *strongly* suggest to anyone asking for confidence
> intervals in GLMs:
>
> Use  confint()

confint was actually mentioned in the second post on the thread :)
But it's not what the OP is after.

Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Martin Maechler
> "PD" == Peter Dalgaard 
> on Sat, 07 Aug 2010 10:37:49 +0200 writes:

PD> Michael Bedward wrote:
>>> I was aware of this option.  I was assuming it was not ok to do fit +/- 
1.96
>>> se when you requested probabilities.  If this is legitimate then all the
>>> better.
>> 
>> I don't think it is.  I understood that you should do the calculation
>> in the scale of the linear predictor and then transform to
>> probabilities.
>> 
>> Happy to be corrected if that's wrong.

PD> Probably, neither is optimal, although any transformed scale is
PD> asymptotically equivalent. E.g., neither the probability scale nor the
PD> logit scale stabilizes the variance of a simple proportion (the arcsine
PD> transform does), so test-based CIs should really be asymmetric in both
PD> cases rather than just +/- 1.96se.

PD> However, working on the linear predictor scale has the advantage that
PD> CIs by definition will not cross the boundaries of the parameter space.
PD> (For the "usual" link functions: logit, probit, cloglog, that is; it's
PD> not true for the identity link, obviously.)

I'm coming late to the thread, 
but it seems that nobody has yet given the advice which I would
very *strongly* suggest to anyone asking for confidence
intervals in GLMs:

Use  confint()
which (for "glm"s) will load the MASS package and use likelihood
- profiling, giving (much) more reliable confidence intervals,
notably in the case of the infamous Hauck-Donner phenomenon,
which has been mentioned many times "here", and notably in 
MASS (the book!), I think even in its first edition.

Even more reliable (probably) would be to use the (recommended)
'boot' package and use bootstrap confidence intervals, i.e.,
boot.ci() there.

Martin Maechler, ETH Zurich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rJava question

2010-08-07 Thread Wincent
In a HTML file, a java applet can be launch by


What is the equivalent rJava command(s) to do the same? Thank you

Best

-- 
Wincent Rong-gui HUANG
Doctoral Candidate
Dept of Public and Social Administration
City University of Hong Kong
http://asrr.r-forge.r-project.org/rghuang.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Michael Bedward
Thanks for that clarification Peter - much appreciated.

Is there an R function that you'd recommend for calculating more valid CIs ?

Michael

On 7 August 2010 18:37, Peter Dalgaard  wrote:
>
> Probably, neither is optimal, although any transformed scale is
> asymptotically equivalent. E.g., neither the probability scale nor the
> logit scale stabilizes the variance of a simple proportion (the arcsine
> transform does), so test-based CIs should really be asymmetric in both
> cases rather than just +/- 1.96se.
>
> However, working on the linear predictor scale has the advantage that
> CIs by definition will not cross the boundaries of the parameter space.
> (For the "usual" link functions: logit, probit, cloglog, that is; it's
> not true for the identity link, obviously.)
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Peter Dalgaard
Michael Bedward wrote:
>> I was aware of this option.  I was assuming it was not ok to do fit +/- 1.96
>> se when you requested probabilities.  If this is legitimate then all the
>> better.
> 
> I don't think it is.  I understood that you should do the calculation
> in the scale of the linear predictor and then transform to
> probabilities.
> 
> Happy to be corrected if that's wrong.

Probably, neither is optimal, although any transformed scale is
asymptotically equivalent. E.g., neither the probability scale nor the
logit scale stabilizes the variance of a simple proportion (the arcsine
transform does), so test-based CIs should really be asymmetric in both
cases rather than just +/- 1.96se.

However, working on the linear predictor scale has the advantage that
CIs by definition will not cross the boundaries of the parameter space.
(For the "usual" link functions: logit, probit, cloglog, that is; it's
not true for the identity link, obviously.)

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] image plot but data not on grid.

2010-08-07 Thread Michael Bedward
On 7 August 2010 06:26, Hadley Wickham wrote:

> library(ggplot2)
> qplot(x, y, fill = z, data = df, geom = "tile")

Hi Hadley,

I read the original question as being about irregularly spaced data.
The above method doesn't seem to for me in such a case but perhaps I'm
constructing my example incorrectly.

Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Troy S
Stefano,

I was aware of this option.  I was assuming it was not ok to do fit +/- 1.96
se when you requested probabilities.  If this is legitimate then all the
better.

Thanks!
Troy

On 6 August 2010 22:25, Guazzetti Stefano wrote:

> a closer look to the help on predict.glm will reveal that the function
> accepts a 'type' argument.
> In you case 'type = response' will give you the results in probabilities
> (that it seems to be what you are looking for).
> There also is an example on use of the 'type' argument at the end of the
> page.
>
> Stefano
>
> -Messaggio originale-
> Da: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org]per conto di Troy S
> Inviato: Friday, August 06, 2010 6:31 PM
> A: Michael Bedward
> Cc: r-help@r-project.org
> Oggetto: Re: [R] Confidence Intervals for logistic regression
>
>
> Michael,
>
> Thanks for the reply.  I believe Aline was sgiving me CI's on coefficients
> as well.
>
> So c(pred$fit + 1.96 * pred$se.fit, pred$fit - 1.96 *
> pred$se.fit) gives me the CI on the logits if I understand correctly?
>  Maybe
> the help on predict.glm can be updated.
>
> Thanks!
>
> On 6 August 2010 01:46, Michael Bedward  wrote:
>
> > Sorry about earlier reply - didn't read your email properly (obviously :)
> >
> > You're suggestion was right, so as well as method for Aline below,
> > another way of doing the same thing is:
> >
> > pred <- predict(y.glm, newdata= something, se.fit=TRUE)
> > ci <- matrix( c(pred$fit + 1.96 * pred$se.fit, pred$fit - 1.96 *
> > pred$se.fit), ncol=2 )
> >
> > lines( something, plogis( ci[,1] ) )
> > lines( something, plogis( ci[,2] ) )
> >
> >
> >
> > On 6 August 2010 18:39, aline uwimana  wrote:
> > > Dear Troy,
> > > use this commend, your will get IC95% and OR.
> > >
> > >  logistic.model <- glm(formula =y~ x1+x2, family = binomial)
> > > summary(logistic.model)
> > >
> > > sum.coef<-summary(logistic.model)$coef
> > >
> > > est<-exp(sum.coef[,1])
> > > upper.ci<-exp(sum.coef[,1]+1.96*sum.coef[,2])
> > > lower.ci<-exp(sum.coef[,1]-1.96*sum.coef[,2])
> > >
> > > cbind(est,upper.ci,lower.ci)
> > >
> > > regards.
> > >
> > > 2010/8/6 Troy S 
> > >
> > >> Dear UseRs,
> > >>
> > >> I have fitted a logistic regression using glm and want a 95%
> confidence
> > >> interval on a response probability.  Can I use
> > >>
> > >> predict(model, newdata, se.fit=T)
> > >>
> > >> Will fit +/- 1.96se give me a 95% of the logit?  And then
> > >> exp(fit +/- 1.96se) / (exp(fit +/- 1.96se) +1) to get the
> probabilities?
> > >>
> > >> Troy
> > >>
> > >>[[alternative HTML version deleted]]
> > >>
> > >> __
> > >> R-help@r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> > >>
> > >
> > >[[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> Rispetta l'ambiente: Se non ti è necessario, non stampare questa mail.
>
>
> "Le informazioni contenute nel presente messaggio di posta elettronica e in
> ogni suo allegato sono da considerarsi riservate e il destinatario della
> email è l'unico autorizzato
> ad usarle, copiarle e, sotto la propria responsabilità, divulgarle.
> Chiunque riceva questo messaggio per errore senza esserne il destinatario
> deve immediatamente rinviarlo
> al mittente cancellando l'originale. Eventuali dati personali e sensibili
> contenuti nel presente messaggio e/o suoi allegati vanno trattati nel
> rispetto della normativa
> in materia di privacy ( DLGS n.196/'03)".
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Confidence Intervals for logistic regression

2010-08-07 Thread Michael Bedward
> I was aware of this option.  I was assuming it was not ok to do fit +/- 1.96
> se when you requested probabilities.  If this is legitimate then all the
> better.

I don't think it is.  I understood that you should do the calculation
in the scale of the linear predictor and then transform to
probabilities.

Happy to be corrected if that's wrong.

Michael

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.