Re: [R] rgl.snapshot() : no longer works?

2010-12-29 Thread Yihui Xie
Hi,

Is there any progress so far? It seems R 2.12.1 under Windows still
does not have the rgl.snapshot() support.

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Tue, Nov 2, 2010 at 7:27 PM, Duncan Murdoch  wrote:
> On 02/11/2010 8:24 PM, Remko Duursma wrote:
>>
>> Hi all,
>>
>>> library(rgl)
>>> plot3d(1,1,1)
>>> snapshot3d("somefile.png")
>>
>> Error in rgl.snapshot(...) :
>>   pixmap save format not supported in this build
>>
>>
>> Why does this no longer work?
>
> The build for 2.12.0 on CRAN doesn't have png support built in.  I'm
> currently working with Uwe to fix this.
>
> Duncan Murdoch
>
>>
>> thanks,
>> Remko
>>
>>> sessionInfo()
>>
>> R version 2.12.0 (2010-10-15)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
>> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_Australia.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] YPLANTER2_0.1   LeafAngle_1.0.3 gpclib_1.5-1    geometry_0.1-7
>> rgl_0.92.794
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.12.0
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] Writing a single output file

2010-12-29 Thread Amy Milano
Dear sir,

First, I sincerely apologize for replying late; I was out of the office.
Thank you for your guidance in response to my earlier mail regarding
"Writing a single output file", where I was trying to read multiple output
files and create a single output data.frame. However, things are not working
for me, as described below -


# Your code

setwd('/temp')
fileNames <- list.files(pattern = "file.*.csv")

input <- do.call(rbind, lapply(fileNames, function(.name)
{
.data <- read.table(.name, header = TRUE, as.is = TRUE)
.data$file <- .name
.data
}))


# This produces the following output, which contains only two columns; moreover,
# date and yield_rate are combined into a single column.


 
 date.yield_rate  file
1   12/23/10,5.25 file1.csv
2   12/22/10,5.19 file1.csv
3   12/23/10,4.16 file2.csv
4   12/22/10,4.59 file2.csv
5   12/23/10,6.15 file3.csv
6   12/22/10,6.41 file3.csv
7   12/23/10,8.15 file4.csv
8   12/22/10,8.68 file4.csv


# and NOT the kind of output given below, where date and yield_rate are in
# separate columns.

> input
        date yield_rate      file
1 12/23/2010       5.25 file1.csv
2 12/22/2010       5.19 file1.csv
3 12/23/2010       5.25 file2.csv
4 12/22/2010       5.19 file2.csv
5 12/23/2010       5.25 file3.csv
6 12/22/2010       5.19 file3.csv
7 12/23/2010       5.25 file4.csv
8 12/22/2010       5.19 file4.csv

So when I tried the following code to produce the required result, it throws
an error.

require(reshape)

in.melt <- melt(input, measure = 'yield_rate')
> in.melt <- melt(input, measure = 'yield_rate')
Error: measure variables not found in data: yield_rate

# So I tried 

in.melt <- melt(input, measure = 'date.yield_rate')


cast(in.melt, date.yield_rate ~ file)

> cast(in.melt, date ~ file)
Error: Casting formula contains variables not found in molten data: date

# If I try to change it as 

cast(in.melt, date.yield_rate ~ file)    # Gives following error.
Error: Casting formula contains variables not found in molten data: 
date.yield_rate

Sir, it would be a great help if you could guide me. Once again, I sincerely
apologize for replying so late.

Regards

Amy
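
A note on the likely cause, offered as a sketch rather than a confirmed diagnosis: values such as "12/23/10,5.25" suggest the files are comma-separated, while read.table() splits on whitespace by default, so each line arrives as a single field. Assuming that is the case, read.csv() (or read.table(..., sep = ",")) should give separate date and yield_rate columns. The temporary directory, the two sample files, and the use of base reshape() in place of reshape::cast() are assumptions made only for illustration.

```r
# Hypothetical sample files in a temporary directory (not the poster's data)
tmp <- tempfile("yield")
dir.create(tmp)
writeLines(c("date,yield_rate", "12/23/2010,5.25", "12/22/2010,5.19"),
           file.path(tmp, "file1.csv"))
writeLines(c("date,yield_rate", "12/23/2010,4.16", "12/22/2010,4.59"),
           file.path(tmp, "file2.csv"))

fileNames <- list.files(tmp, pattern = "^file.*\\.csv$", full.names = TRUE)

# read.csv() uses sep = "," so date and yield_rate become separate columns
input <- do.call(rbind, lapply(fileNames, function(.name) {
  .data <- read.csv(.name, as.is = TRUE)
  .data$file <- basename(.name)   # record which file each row came from
  .data
}))

# One yield_rate column per file, keyed by date (base R alternative to cast)
wide <- reshape(input, idvar = "date", timevar = "file", direction = "wide")
print(wide)
```

With the columns separated, melt(input, measure = 'yield_rate') from the earlier reply should find its measure variable.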


--- On Thu, 12/23/10, jim holtman  wrote:

From: jim holtman 
Subject: Re: [R] Writing a single output file
To: "Amy Milano" 
Cc: r-help@r-project.org
Date: Thursday, December 23, 2010, 1:39 PM

This should get you close:

> # get file names
> setwd('/temp')
> fileNames <- list.files(pattern = "file.*.csv")
> fileNames
[1] "file1.csv" "file2.csv" "file3.csv" "file4.csv"
> input <- do.call(rbind, lapply(fileNames, function(.name){
+     .data <- read.table(.name, header = TRUE, as.is = TRUE)
+     # add file name to the data
+     .data$file <- .name
+     .data
+ }))
> input
        date yield_rate      file
1 12/23/2010       5.25 file1.csv
2 12/22/2010       5.19 file1.csv
3 12/23/2010       5.25 file2.csv
4 12/22/2010       5.19 file2.csv
5 12/23/2010       5.25 file3.csv
6 12/22/2010       5.19 file3.csv
7 12/23/2010       5.25 file4.csv
8 12/22/2010       5.19 file4.csv
> require(reshape)
> in.melt <- melt(input, measure = 'yield_rate')
> cast(in.melt, date ~ file)
        date file1.csv file2.csv file3.csv file4.csv
1 12/22/2010      5.19      5.19      5.19      5.19
2 12/23/2010      5.25      5.25      5.25      5.25
>


On Thu, Dec 23, 2010 at 8:07 AM, Amy Milano  wrote:
> Dear R helpers!
>
> Let me first wish all of you "Merry Christmas and Very Happy New year 2011"
>
> "Christmas day is a day of Joy and Charity,
> May God make you rich in both" - Phillips Brooks
>
> ## 
> 
>
> I have a process which generates number of outputs. The R code for the same 
> is as given below.
>
> for(i in 1:n)
> {
> write.csv(output[i], file = paste("output", i, ".csv", sep = ""), row.names = FALSE)
> }
>
> Depending on value of 'n', I get different output files.
>
> Suppose n = 3, that means I am having three output csv files viz. 
> 'output1.csv', 'output2.csv' and 'output3.csv'
>
> output1.csv
> date   yield_rate
> 12/23/2010    5.25
> 12/22/2010    5.19
> .
> .
>
> output2.csv
> date   yield_rate
> 12/23/2010    4.16
> 12/22/2010    4.59
> .
> .
>
> output3.csv
> date   yield_rate
> 12/23/2010    6.15
> 12/22/2010    6.41
> .
> .
>
>
> Thus all the output files have same column names viz. Date and yield_rate. 
> Also, I do need these files individually too.
>
> My further requirement is to have a single dataframe as given below.
>
> Date yield_rate1    yield_rate2    yield_rate3
> 12/23/2010   5.25  4.16 

Re: [R] access a column of a dataframe without qualifying the name of the column

2010-12-29 Thread Bill.Venables
Here is an alternative approach that is closer to that used by lm and friends.

> df <- data.frame(x=1:10,y=11:20)
> test <- function(col, dat) eval(substitute(col), envir = dat)
> test(x, df)
 [1]  1  2  3  4  5  6  7  8  9 10
> test(y, df)
 [1] 11 12 13 14 15 16 17 18 19 20
> 

There is a slight added bonus this way

> test(x+y+1, df)
 [1] 13 15 17 19 21 23 25 27 29 31
> 

(Well, I did say 'slight'.)

Bill Venables.
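
One scoping detail may be worth spelling out. When envir is a data frame, eval() looks up names that are not columns in its enclos argument, which by default resolves to the frame of the function calling eval() (here, test itself). The sketch below passes enclos = parent.frame() explicitly so that such names are looked up where test() was called; the variable k and the wrapper f are hypothetical names used only for illustration.

```r
# Names not found in dat are resolved in the caller of test()
test <- function(col, dat) {
  eval(substitute(col), envir = dat, enclos = parent.frame())
}

df <- data.frame(x = 1:3, y = 4:6)

k <- 10
res_global <- test(x + y + k, df)   # k comes from the global environment

f <- function() {
  k <- 100                          # local k, visible because enclos is f's frame
  test(x + k, df)
}
res_local <- f()

print(res_global)
print(res_local)
```

This is roughly how modelling functions let a formula mix data-frame columns with variables defined in the calling environment.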



From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
David Winsemius [dwinsem...@comcast.net]
Sent: 30 December 2010 10:44
To: John Sorkin
Cc: r-help@r-project.org
Subject: Re: [R] access a column of a dataframe without qualifying the name 
of the column

On Dec 29, 2010, at 7:11 PM, John Sorkin wrote:

> I am trying to write a function that will access a column of a data
> frame without having to qualify the name of the data frame column as
> long as the name of the dataframe is passed to the function. As can
> be seen from the code below, my function is not working:

Not sure what the verb "qualify" means in programming. Quoting?

>
> df <- data.frame(x=1:10,y=11:20)
> df
>
> test <- function(column,data) {
>  print(data$column)
> }
>
> test(x,df)
>
> I am trying to model my function after the way that lm works where
> one needs not qualify column names, i.e.


 > df <- data.frame(x=1:10,y=11:20)
 > test <- function(column,dat) { print(colname <- deparse(substitute(column)))
+  dat[[colname]]
+ }
 >
 > test(x,df)
[1] "x"
  [1]  1  2  3  4  5  6  7  8  9 10
 >

--
David.


>
>
> fit1<- lm(y~x,data=df)
>
>
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>

David Winsemius, MD
West Hartford, CT



Re: [R] icon for an R package

2010-12-29 Thread Michael Friendly

Wow! thanks John, David and Marc
and Happy New Year to all R-helpRs

-Michael

On 12/29/2010 7:00 PM, John Fox wrote:

Hi Michael,

I've attached my attempt at an R-package logo.

Best,
  John


John Fox
Senator William McMaster
   Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Michael Friendly
Sent: December-29-10 12:32 PM
To: David Winsemius
Cc: r-help@r-project.org
Subject: Re: [R] icon for an R package

On 12/29/2010 11:02 AM, David Winsemius wrote:

On Dec 29, 2010, at 10:03 AM, Michael Friendly wrote:


I'm looking for an icon to represent an R package.  Perhaps something like

http://cdn2.iconfinder.com/data/icons/DarkGlass_Reworked/128x128/apps/package.png


but with the R logo rather than KDE.

Can't you just get the location of an "R" at CRAN?

http://cran.r-project.org/Rlogo.jpg


Sorry for not being clearer.  What I want is an icon for a *package*,
such as I gave in the iconfinder link
above, but with the R logo *superposed*. Such 3D icons are common in the
Mac world, but I couldn't
find anything similar for R, so thought I'd ask before trying my (poor)
hand with PhotoShop or something
similar.


--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA





--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



[R] Bivariate weighted fit methods of Williamson-York in R?

2010-12-29 Thread Jooil Kim
Hello everyone,

I've been looking for an R function to calculate bivariate weighted fits of
my data set, preferably using methods of Williamson-York.

The improvements offered by bivariate weighted fitting over conventional
linear least-squares fitting were recently described in a paper by Cantrell
(http://www.atmos-chem-phys.org/8/5477/2008/acp-8-5477-2008.pdf), in which the
methods of Williamson-York were highlighted among the other bivariate methods
available in the literature.

Searching through the CRAN archives does reveal some packages offering
weighted fitting methods, but, with my limited knowledge of statistics
(mathematics), I can't tell if the methods are similar to the
Williamson-York method.

Of course, other weighted methods may be just as applicable for my intended
use (determining a fit between two concurrent measurements of atmospheric
greenhouse gases), and I'm open to suggestions for other methods.

Any help in this matter is greatly appreciated.

Thanks,

Jooil

-- 
#
Jooil Kim
Graduate Student
School of Earth and Environmental Sciences,
Seoul National University
Gwanak Gu Shillim 9 Dong San 56-1
Seoul National Univ. Bld#501, Rm 503
Seoul, Rep. of Korea 151-742
kji2...@gmail.com
tel) 82-2-877-6741
fax) 82-2-885-7164
#



Re: [R] filling up holes

2010-12-29 Thread analys...@hotmail.com


On Dec 28, 10:27 pm,  wrote:
> Dear 'analyst41' (it would be a courtesy to know who you are)
>
> Here is a low-level way to do it.  
>
> First create some dummy data
>
> > allDates <- seq(as.Date("2010-01-01"), by = 1, length.out = 50)
> > client_ID <- sample(LETTERS[1:5], 50, rep = TRUE)
> > value <- 1:50
> > date <- sample(allDates)
> > clientData <- data.frame(client_ID, date, value)
>
> At this point clientData has 50 rows, with 5 clients, each with a sample of
> dates.  Everything is in random order except "value".
>
> Now write a little function to fill out a subset of the data consisting of 
> one client's data only:
>
> > fixClient <- function(cData) {
> +   dateRange <- range(cData$date)
> +   dates <- seq(dateRange[1], dateRange[2], by = 1)
> +   fullSet <- data.frame(client_ID = as.character(cData$client_ID[1]),
> +                         date = dates, value = NA)
> +
> +   fullSet$value[match(cData$date, dates)] <- cData$value
> +   fullSet  
> + }
>
> Now split up the data, apply the fixClient function to each section and 
> re-combine them again:
>
> > allData <- do.call(rbind,
> +                    lapply(split(clientData, clientData$client_ID), fixClient))
>
> Check:
>
> > head(allData)
>     client_ID       date value
> A.1         A 2010-01-04    36
> A.2         A 2010-01-05    18
> A.3         A 2010-01-06    NA
> A.4         A 2010-01-07    NA
> A.5         A 2010-01-08    NA
> A.6         A 2010-01-09    49
>
>
>
> Seems OK.  At this point the data are in sorted order by client and date, but 
> that should not matter.
>
> Bill Venables.
>
>

It is of course a great honor to receive a reply from you (but please
allow me to continue to be an anonymous source of bits and bytes over
the net).

This is a neat solution, but please watch this space to see my dumber
version (the code might eventually need to be ported to a procedural
language).

Thank you.
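
For comparison, the same gap-filling can be sketched with merge(): build the complete run of dates for each client, then left-join the observed values so that missing dates come out as NA. This is only an alternative under stated assumptions, not the method from the reply above; the small clientData frame and the names grid and filled are made up for illustration.

```r
# Hypothetical data: client A is missing 2010-01-03, client B is missing 2010-01-11
clientData <- data.frame(
  client_ID = c("A", "A", "A", "B", "B"),
  date      = as.Date(c("2010-01-01", "2010-01-04", "2010-01-02",
                        "2010-01-10", "2010-01-12")),
  value     = 1:5,
  stringsAsFactors = FALSE
)

# Every date between each client's min and max date
grid <- do.call(rbind, lapply(split(clientData, clientData$client_ID), function(d) {
  data.frame(client_ID = d$client_ID[1],
             date      = seq(min(d$date), max(d$date), by = "day"),
             stringsAsFactors = FALSE)
}))

# Left join: dates with no observation get value = NA
filled <- merge(grid, clientData, by = c("client_ID", "date"), all.x = TRUE)
print(filled)
```

The result is sorted by client and date because merge() sorts on the key columns by default.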
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
> Behalf Of analys...@hotmail.com
> Sent: Wednesday, 29 December 2010 10:45 AM
> To: r-h...@r-project.org
> Subject: [R] filling up holes
>
> I have a data frame with three columns
>
> client ID | date | value
>
> For each client ID I want to determine the min date and max date, and for
> any dates in between that are missing I want to insert a row
>
> Client ID | date| NA
>
> Any help would be appreciated.
>


[R] R course in Santiago, Chile

2010-12-29 Thread Jose Bustos Melo
Dear all,

To everyone who is interested and based in Santiago, Chile, I have very good
news. For some weeks now, Alex (an epidemiologist) and I have been sitting
down to discuss the need to develop an R course for beginners. Not only
because it is necessary to come together around a project such as R, but also
because it is an opportunity for many scientists (like us) who need to manage
their databases better.

So, using this same medium, I have taken the liberty of inviting all of you
(those who are in Santiago) to take part in this R course. Initially it will
run for 5 days, from a Monday to a Friday. We will hand out books in PDF
format and manuals. We are arranging a venue at the Universidad Católica to
work in and are awaiting confirmation. The course will be free of charge,
because we want to develop a node of researchers who use R here in Chile and
eventually work on a project of this kind.

I attach the course program here so you can take a look.

Depending on participation and turnout, we will run other R courses based on
different topics in statistics and/or epidemiology. Using social networks such
as Facebook we can.

I look forward to your comments.

José Bustos
Biostatistician
Escuela de Enfermeria (School of Nursing)
Pontificia Universidad Católica de Chile
Mobile 95939144

http://www.facebook.com/?sk=messages&tid=1248634633579#!/note.php?note_id=475603118010&id=49677888734


  


Re: [R] access a column of a dataframe without qualifying the name of the column

2010-12-29 Thread John Sorkin
Thank you,
John




John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Bert Gunter  12/29/2010 8:17 PM >>>
?substitute

test <- function(col,frm) {
  eval(substitute(col),frm)
}

test2 <- function(col,frm){
  cname<- deparse(substitute(col))
  frm[[cname]]
}

 z <- data.frame(x=1:3,y=letters[1:3])

test(x, z)

test2(x, z)


-- Bert

On Wed, Dec 29, 2010 at 4:44 PM, David Winsemius  wrote:
>
> On Dec 29, 2010, at 7:11 PM, John Sorkin wrote:
>
>> I am trying to write a function that will access a column of a data frame
>> without having to qualify the name of the data frame column as long as the
>> name of the dataframe is passed to the function. As can be seen from the
>> code below, my function is not working:
>
> Not sure what the verb "qualify" means in programming. Quoting?
>
>>
>> df <- data.frame(x=1:10,y=11:20)
>> df
>>
>> test <- function(column,data) {
>>  print(data$column)
>> }
>>
>> test(x,df)
>>
>> I am trying to model my function after the way that lm works where one
>> needs not qualify column names, i.e.
>
>
>> df <- data.frame(x=1:10,y=11:20)
>> test <- function(column,dat) { print(colname <- deparse(substitute(column)))
> +  dat[[colname]]
> + }
>>
>> test(x,df)
> [1] "x"
>  [1]  1  2  3  4  5  6  7  8  9 10
>>
>
> --
> David.
>
>
>>
>>
>> fit1<- lm(y~x,data=df)
>>
>>
>> John David Sorkin M.D., Ph.D.
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>



-- 
Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] access a column of a dataframe without qualifying the name of the column

2010-12-29 Thread Bert Gunter
?substitute

test <- function(col,frm) {
  eval(substitute(col),frm)
}

test2 <- function(col,frm){
  cname<- deparse(substitute(col))
  frm[[cname]]
}

 z <- data.frame(x=1:3,y=letters[1:3])

test(x, z)

test2(x, z)


-- Bert
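
The two functions agree when a bare column name is passed, but they differ once the argument is an expression. The sketch below, which reuses the definitions above with a few extra result names (a, b, c1, c2) that are only for illustration, shows that eval(substitute()) evaluates the expression inside the data frame, while the deparse() version looks for a column whose name is the literal text of the expression.

```r
test <- function(col, frm) {
  eval(substitute(col), frm)        # evaluates the expression using frm's columns
}

test2 <- function(col, frm) {
  cname <- deparse(substitute(col)) # turns the argument into a string
  frm[[cname]]                      # looks up a column with exactly that name
}

z <- data.frame(x = 1:3, y = letters[1:3])

a  <- test(x, z)        # 1 2 3
b  <- test2(x, z)       # 1 2 3
c1 <- test(x * 2, z)    # 2 4 6
c2 <- test2(x * 2, z)   # NULL: there is no column named "x * 2"

print(c1)
print(is.null(c2))
```

So test2 is the safer choice when only column names should be accepted, and test when expressions are wanted.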

On Wed, Dec 29, 2010 at 4:44 PM, David Winsemius  wrote:
>
> On Dec 29, 2010, at 7:11 PM, John Sorkin wrote:
>
>> I am trying to write a function that will access a column of a data frame
>> without having to qualify the name of the data frame column as long as the
>> name of the dataframe is passed to the function. As can be seen from the
>> code below, my function is not working:
>
> Not sure what the verb "qualify" means in programming. Quoting?
>
>>
>> df <- data.frame(x=1:10,y=11:20)
>> df
>>
>> test <- function(column,data) {
>>  print(data$column)
>> }
>>
>> test(x,df)
>>
>> I am trying to model my function after the way that lm works where one
>> needs not qualify column names, i.e.
>
>
>> df <- data.frame(x=1:10,y=11:20)
>> test <- function(column,dat) { print(colname <- deparse(substitute(column)))
> +  dat[[colname]]
> + }
>>
>> test(x,df)
> [1] "x"
>  [1]  1  2  3  4  5  6  7  8  9 10
>>
>
> --
> David.
>
>
>>
>>
>> fit1<- lm(y~x,data=df)
>>
>>
>> John David Sorkin M.D., Ph.D.
>> Chief, Biostatistics and Informatics
>> University of Maryland School of Medicine Division of Gerontology
>> Baltimore VA Medical Center
>> 10 North Greene Street
>> GRECC (BT/18/GR)
>> Baltimore, MD 21201-1524
>> (Phone) 410-605-7119
>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>



-- 
Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] logistic regression with response 0,1

2010-12-29 Thread Dennis Murphy
Hi:

I think you created a problem for yourself in the way you generated your
data.

y<-rbinom(2000,1,.7)
euro <- rnorm(2000, m = 300 * y + 50 * (1 - y), s = 20 * y + 12 * (1 - y))
# Create a 2000 x 2 matrix of probabilities
prmat <- cbind(0.8 * y + 0.2 * (1 - y), 0.2 * y + 0.8 * (1 - y))
# sample sex from each row of prmat, with each row supplying the
# sampling distribution
sex <- apply(prmat, 1, function(x) sample(c('m', 'f'), 1, prob = x))

df <- data.frame(euro, sex, y)

# Histogram of euro: notice the separation in distributions
hist(euro, nclass = 50)
# Generate an indicator between the two clusters of euro
spl <- euro > 150
# Now show a table of that split vs. response
table(spl, y)

This is what I get for my simulation:

table(spl, y)
       y
spl        0    1
  FALSE  572    0
  TRUE     0 1428

which in turn leads to

m <- glm(y ~ euro + sex, data = df, family = binomial)
Warning messages:
1: glm.fit: algorithm did not converge
2: glm.fit: fitted probabilities numerically 0 or 1 occurred

This is what is known as 'complete data separation' in the logistic
regression literature. Basically, you've generated data so that all the
successes are associated with a N(300, 20) distribution and all the failures
with a N(50, 12) distribution. If this is a classification problem,
congratulations - the margin on the support vector will be huge :)  OTOH, if
you're trying to fit a logistic model for purposes of explanation, you've
created a problem, especially with respect to prediction.

...and it does matter whether this is a regression problem or a
classification problem. In the latter, separation is a good thing; in the
former, it creates convergence problems.
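
As a rough illustration of the point above, the following sketch regenerates a smaller version of the data and checks for separation on euro directly, then fits the model while collecting any warnings glm() raises. The sample size of 200, the seed, and the names separated and msgs are assumptions chosen for the example; the exact warning text can vary between R versions.

```r
set.seed(42)
n <- 200
y <- rbinom(n, 1, 0.7)
euro <- rnorm(n, mean = ifelse(y == 1, 300, 50), sd = ifelse(y == 1, 20, 12))

# Complete separation on one predictor: the two groups' ranges do not overlap
separated <- max(euro[y == 0]) < min(euro[y == 1])

# Fit the model, recording warnings instead of printing them
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ euro, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(separated)
print(msgs)
```

A check like this only covers a single predictor; separation can also arise from a combination of predictors, which this sketch would not detect.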

Since you have a continuous predictor in the model, there is an additional
complication: the logistic regression null deviance does not have an
asymptotic chi-square distribution, so tests involving reductions of
deviance from the null model are not guaranteed to have asymptotic
chi-square distributions *when the predictor x is truly continuous*.

More below.

On Wed, Dec 29, 2010 at 9:48 AM, Federico Bonofiglio wrote:

> Dear Masters,
> first I'd like to wish you all a great 2011 and happy holidays,
>
> second (here comes the boring stuff) I have a question which I hope you
> will answer:
>
> I run a logistic regression by glm() on the following data type
> (y1=1,x1=x1); (y2=0,x2=x2);..(yn=0,xn=xn), where the response (y) is
> a binary outcome on 0,1 and x is any explanatory variable (continuous or
> not) observed with the i-th outcome.
>
> This is indeed one of the most frequent cases when dealing with binary
> responses, though I know R manages such responses slightly differently (a
> vector for the success counts and one for the failures) and I'm not sure
> whether my summary.glm gives me any sensible answer at all
>
> for this purpose I have tried to exaggerate the numbers so as to obtain
> significant results
>
> y<-rbinom(2000,1,.7)#response
>
> for(i in 1:2000){
> euro[i]<-if(y[i]==1){rnorm(1,300,20)#explanatory 1
> }else{rnorm(1,50,12)}
> }
>
> for(i in 1:2000){
> sex[i]<-if(y[i]==1){sample(c("m","f"),1,prob=c(.8,.2))#explanatory 2
> }else{sample(c("m","f"),1,prob=c(.2,.8))}
> }
>
>
>
> m<-glm(y~euro+factor(sex),family=binomial)
>
> summary(m)
>
>
>
>
> My worries:
>
>   - are the estimates correct?
>

The people who wrote the glm() routine were smart enough to anticipate the
data separation case and are warning you of potential instability in the
model estimates/predictions as a result of producing predictions of exactly
0 or 1. This is a warning to take seriously -  your generated data produced
these based on x alone.

>   - degrees of freedom grow dramatically (one per cell), so do I
>   risk never obtaining a significant result?
>

In non-pathological situations, comparisons between nested models will be the
same whether the data are grouped or ungrouped.
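
A small sketch of that equivalence, under made-up data (five distinct x values, 300 observations, a fixed seed): the ungrouped 0/1 fit and the grouped successes/failures fit should give the same coefficients and the same drop in deviance from the null model, even though their residual deviances and degrees of freedom differ. All object names here are illustrative.

```r
set.seed(1)
x <- sample(1:5, 300, replace = TRUE)
y <- rbinom(300, 1, plogis(-1 + 0.4 * x))

# Ungrouped: one row per 0/1 observation
m_ungrouped <- glm(y ~ x, family = binomial)

# Grouped: one row per distinct x, with success and failure counts
succ <- as.vector(tapply(y, x, sum))
fail <- as.vector(tapply(1 - y, x, sum))
xg   <- sort(unique(x))
m_grouped <- glm(cbind(succ, fail) ~ xg, family = binomial)

# Reduction in deviance relative to the null model for each fit
drop_ungrouped <- m_ungrouped$null.deviance - deviance(m_ungrouped)
drop_grouped   <- m_grouped$null.deviance - deviance(m_grouped)

print(rbind(ungrouped = coef(m_ungrouped), grouped = coef(m_grouped)))
print(c(ungrouped = drop_ungrouped, grouped = drop_grouped))
```

The residual degrees of freedom do differ (298 versus 3 here), which is why the residual deviance itself should not be compared across the two forms.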

>
> I also take the chance to ask whether you know of any implemented method to
> plot logistic curves directly out of a glm() model
>


The following is an example to illustrate some of the questions you raised.


# Example to illustrate the difference between grouped and ungrouped
# logistic regression analyses

library(reshape2)
library(lattice)

# Sample 50 distinct x values 300 times
x <- sample(1:50, 300, replace = TRUE)
# P(Y = 1 | X)  increases with x
y <- rbinom(300, 1, (10 + x)/80)
ind <- x > 25
# males sampled more heavily when x > 25
p <- cbind(0.7 * ind + 0.3 * (1 - ind), 0.3 * ind + 0.7 * (1 - ind))
sex <- apply(p, 1, function(x) sample(c('m', 'f'), 1, prob = x))
df <- data.frame(sex, x, y)

# Ungrouped logistic regression
# treat x as a continuous covariate
m1 <- glm(y ~ sex + x, data = df, family = binomial)

# Group the data by x * sex combinations
u <- as.data.frame(xtabs(~ y + sex + x, data = df))
# cast() reshapes the data so that the 0/1 frequencies become separate columns
# cast() comes from package reshape(2)
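
The example above is cut off in this archive. As a rough sketch of where
it was presumably heading, the grouped fit can be built with base R
alone. Here aggregate() stands in for the reshape step, the sex variable
is simulated in a simplified way, and the names fail, grouped, m_ungrouped
and m_grouped are assumptions rather than the original author's code.

```r
set.seed(42)
x <- sample(1:50, 300, replace = TRUE)
y <- rbinom(300, 1, (10 + x) / 80)
# males sampled more heavily when x > 25
sex <- ifelse(runif(300) < ifelse(x > 25, 0.7, 0.3), "m", "f")
df <- data.frame(sex, x, y)
df$fail <- 1 - df$y

# Ungrouped fit: one row per observation
m_ungrouped <- glm(y ~ sex + x, data = df, family = binomial)

# Grouped fit: successes and failures per sex * x cell
grouped <- aggregate(cbind(y, fail) ~ sex + x, data = df, FUN = sum)
m_grouped <- glm(cbind(y, fail) ~ sex + x, data = grouped, family = binomial)

# Coefficients agree up to the convergence tolerance
rbind(ungrouped = coef(m_ungrouped), grouped = coef(m_grouped))

# The residual deviances differ, but nested-model comparisons do not
lr_ungrouped <- deviance(update(m_ungrouped, . ~ . - sex)) - deviance(m_ungrouped)
lr_grouped   <- deviance(update(m_grouped,   . ~ . - sex)) - deviance(m_grouped)
c(ungrouped = lr_ungrouped, grouped = lr_grouped)
```

The residual degrees of freedom are much smaller for the grouped fit, but
the difference in deviance between nested models is what the test uses,
and that is the same either way.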

Re: [R] access a column of a dataframe without qualifying the name of the column

2010-12-29 Thread David Winsemius


On Dec 29, 2010, at 7:11 PM, John Sorkin wrote:

I am trying to write a function that will access a column of a data  
frame without having to qualify the name of the data frame column as  
long as the name of the dataframe is passed to the function. As can  
be seen from the code below, my function is not working:


Not sure what the verb "qualify" means in programming. Quoting?



df <- data.frame(x=1:10,y=11:20)
df

test <- function(column,data) {
 print(data$column)
}

test(x,df)

I am trying to model my function after the way that lm works where  
one needs not qualify column names, i.e.



> df <- data.frame(x=1:10,y=11:20)
> test <- function(column,dat) { print(colname <- deparse(substitute(column)))
+  dat[[colname]]
+ }
>
> test(x,df)
[1] "x"
 [1]  1  2  3  4  5  6  7  8  9 10
>
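
The original test() fails because data$column looks for a column that is
literally named "column". Below is a minimal sketch of two ways around
that; get_col and get_col_eval are illustrative names, not functions from
this thread or from any package.

```r
df <- data.frame(x = 1:10, y = 11:20)

# data$column does not substitute the argument: it asks for a column
# called "column", which does not exist, so the result is NULL
is.null(df$column)

# Variant 1: turn the unevaluated argument into a string, then index
get_col <- function(column, data) {
  colname <- deparse(substitute(column))
  data[[colname]]
}

# Variant 2: evaluate the unevaluated expression inside the data frame,
# which is closer to how lm() treats the variables in its formula
get_col_eval <- function(column, data) {
  eval(substitute(column), data, parent.frame())
}

get_col(x, df)
get_col_eval(x + y, df)
```

The second variant also accepts expressions such as x + y, while the
first only works for a bare column name.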

--
David.





fit1<- lm(y~x,data=df)


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}




David Winsemius, MD
West Hartford, CT



[R] prediction intervals for (mcgv) gam objects

2010-12-29 Thread Julian
As I understand it, predict.lm(l, newdata=nd, interval="confidence") yields
confidence bands for the predicted mean of new observations and predict.lm(l,
newdata=nd, interval="prediction") yields bands for the new observations
themselves, given an lm object l.

However with regard to {mgcv}, although predict.gam(g, se.fit=TRUE,
interval="prediction") computes without protest, it only seems to yield the
predicted mean and the standard error of the predicted mean, given an {mgcv}
gam object g.

Is there an easy way to get confidence bands for new observations for {mgcv}
gam objects, analogous to those available for lm objects?
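
For the lm half of the question, the difference between the two interval
types can be sketched as follows. The data are simulated for
illustration, and the names fit, nd, conf_int, pred_int and manual are
assumptions. The last step rebuilds the prediction interval by hand from
se.fit and the residual scale; whether the same construction is
appropriate for a given gam fit is not checked here.

```r
set.seed(1)
d <- data.frame(x = 1:30)
d$y <- 2 + 0.5 * d$x + rnorm(30)
fit <- lm(y ~ x, data = d)
nd <- data.frame(x = c(5, 15, 25))

# Band for the mean response versus band for a new observation
conf_int <- predict(fit, newdata = nd, interval = "confidence")
pred_int <- predict(fit, newdata = nd, interval = "prediction")

# Same prediction interval by hand: the variance of a new observation is
# the variance of the fitted mean plus the residual variance
p <- predict(fit, newdata = nd, se.fit = TRUE)
half <- qt(0.975, p$df) * sqrt(p$se.fit^2 + p$residual.scale^2)
manual <- cbind(fit = p$fit, lwr = p$fit - half, upr = p$fit + half)

conf_int
pred_int
manual
```

The prediction band is always wider than the confidence band because of
the extra residual variance term.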


[R] access a column of a dataframe without qualifying the name of the column

2010-12-29 Thread John Sorkin
I am trying to write a function that will access a column of a data frame 
without having to qualify the name of the data frame column as long as the name 
of the dataframe is passed to the function. As can be seen from the code below, 
my function is not working:

df <- data.frame(x=1:10,y=11:20)
df

test <- function(column,data) {
  print(data$column)
}

test(x,df)

 I am trying to model my function after the way that lm works where one needs 
not qualify column names, i.e.


fit1<- lm(y~x,data=df)


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}



Re: [R] icon for an R package

2010-12-29 Thread John Fox
Hi Michael,

I've attached my attempt at an R-package logo.

Best,
 John


John Fox
Senator William McMaster 
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
> Behalf Of Michael Friendly
> Sent: December-29-10 12:32 PM
> To: David Winsemius
> Cc: r-help@r-project.org
> Subject: Re: [R] icon for an R package
> 
> On 12/29/2010 11:02 AM, David Winsemius wrote:
> >
> > On Dec 29, 2010, at 10:03 AM, Michael Friendly wrote:
> >
> >> I'm looking for an icon to represent an R package.  Perhaps something
> >> like
> >>
> >>
> >> http://cdn2.iconfinder.com/data/icons/DarkGlass_Reworked/128x128/apps/package.png
> >>
> >>
> >> but with the R logo rather than KDE.
> >
> > Can't you just get the location of an "R" at CRAN?
> >
> > http://cran.r-project.org/Rlogo.jpg
> >>
> >
> >
> Sorry for not being clearer.  What I want is an icon for a *package*,
> such as I gave in the iconfinder link
> above, but with the R logo *superposed*. Such 3D icons are common in the
> Mac world, but I couldn't
> find anything similar for R, so thought I'd ask before trying my (poor)
> hand with PhotoShop or something
> similar.
> 
> 
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele StreetWeb:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA
> 
> 
>   [[alternative HTML version deleted]]
> 


Re: [R] subset question

2010-12-29 Thread Sarah Goslee
Details of *what* didn't work would be helpful, like for example
error messages.

Regardless, I'd do it like this:

subd <- d[d$gene %in% c("i1","i2","i3"), ]


> d
  gene 1  2  3
1   i1 1  6 11
2   i5 2  7 12
3   i2 3  8 13
4   i3 4  9 14
5   i1 5 10 15

> d[d$gene  %in% c("i1","i2","i3"), ]
  gene 1  2  3
1   i1 1  6 11
3   i2 3  8 13
4   i3 4  9 14
5   i1 5 10 15
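
A small self-contained sketch of the same selection, assuming the column
names gene, s1, s2, s3 from the original question and made-up intensity
values. It also shows what happens when the Ids are left unquoted; the
names keep, by_index, by_subset and unquoted are illustrative.

```r
d <- data.frame(gene = c("i1", "i5", "i2", "i3", "i1"),
                s1 = 1:5, s2 = 6:10, s3 = 11:15,
                stringsAsFactors = FALSE)
keep <- c("i1", "i2", "i3")

# Logical row index and subset() select the same rows
by_index  <- d[d$gene %in% keep, ]
by_subset <- subset(d, gene %in% keep)
all.equal(by_index, by_subset)

# Unquoted i1, i2, i3 are looked up as R objects, not as gene Ids,
# so this call fails unless such objects happen to exist
unquoted <- tryCatch(
  subset(d, gene %in% c(i1, i2, i3)),
  error = function(e) conditionMessage(e)
)
unquoted
```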

Sarah

On Wed, Dec 29, 2010 at 5:29 PM, ANJAN PURKAYASTHA
 wrote:
> nope, that did not work.
> thanks though.
> Anjan
>
> On Wed, Dec 29, 2010 at 5:02 PM, Jonathan Flowers <
> jonathanmflow...@gmail.com> wrote:
>
>> Try subd <- d[, "gene" == c("i1","i2","i3")]
>>
>> On Wed, Dec 29, 2010 at 4:55 PM, ANJAN PURKAYASTHA <
>> anjan.purkayas...@gmail.com> wrote:
>>
>>> Hi,
>>> I'm having a problem with a step that should be pretty simple.
>>> I have a dataframe, d,  with column names : gene s1 s2 s3. The column
>>> "gene"
>>> stores an Id; the rest of the columns store intensity data.
>>> I would like to extract the rows for gene Ids i1, i2, i3 ( I know a priori
>>> that those rows exist).
>>> So I do this:
>>> subset(d, gene %in% c(i1, i2, i3)).
>>> This does not give me the required data.
>>> Any ideas where I am going wrong?
>>> TIA,
>>> Anjan
>>>

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] subset question

2010-12-29 Thread Jorge Ivan Velez
Hi Anjan,

Try

subset(d, gene %in% c("i1", "i2", "i3"))

HTH,
Jorge


On Wed, Dec 29, 2010 at 4:55 PM, ANJAN PURKAYASTHA <> wrote:

> Hi,
> I'm having a problem with a step that should be pretty simple.
> I have a dataframe, d,  with column names : gene s1 s2 s3. The column
> "gene"
> stores an Id; the rest of the columns store intensity data.
> I would like to extract the rows for gene Ids i1, i2, i3 ( I know a priori
> that those rows exist).
> So I do this:
> subset(d, gene %in% c(i1, i2, i3)).
> This does not give me the required data.
> Any ideas where I am going wrong?
> TIA,
> Anjan
>
> --
> ===
> anjan purkayastha, phd.
> research associate
> fas center for systems biology,
> harvard university
> 52 oxford street
> cambridge ma 02138
> phone-703.740.6939
> ===
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] subset question

2010-12-29 Thread ANJAN PURKAYASTHA
nope, that did not work.
thanks though.
Anjan

On Wed, Dec 29, 2010 at 5:02 PM, Jonathan Flowers <
jonathanmflow...@gmail.com> wrote:

> Try subd <- d[, "gene" == c("i1","i2","i3")]
>
> On Wed, Dec 29, 2010 at 4:55 PM, ANJAN PURKAYASTHA <
> anjan.purkayas...@gmail.com> wrote:
>
>> Hi,
>> I'm having a problem with a step that should be pretty simple.
>> I have a dataframe, d,  with column names : gene s1 s2 s3. The column
>> "gene"
>> stores an Id; the rest of the columns store intensity data.
>> I would like to extract the rows for gene Ids i1, i2, i3 ( I know a priori
>> that those rows exist).
>> So I do this:
>> subset(d, gene %in% c(i1, i2, i3)).
>> This does not give me the required data.
>> Any ideas where I am going wrong?
>> TIA,
>> Anjan
>>
>> --
>> ===
>> anjan purkayastha, phd.
>> research associate
>> fas center for systems biology,
>> harvard university
>> 52 oxford street
>> cambridge ma 02138
>> phone-703.740.6939
>> ===
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>


-- 
===
anjan purkayastha, phd.
research associate
fas center for systems biology,
harvard university
52 oxford street
cambridge ma 02138
phone-703.740.6939
===



Re: [R] How to create an array of lists of multiple components?

2010-12-29 Thread Gabor Grothendieck
On Wed, Dec 29, 2010 at 4:58 PM, Marius Hofert  wrote:
> Dear Jim,
>
> thanks for your quick response. Here is what I try to achieve:
>
> ## list containing some data
> l <- list(
>          list(
>               list(
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2)
>                    ),
>               list(
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2)
>                    ),
>               list(
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2)
>                    )
>               ),
>          list(
>               list(
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2)
>                    ),
>               list(
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2)
>                    ),
>               list(
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2),
>                    list(a = 1, b = "b", c = 2)
>                    )
>               )
>          )
>
> ## now (try to) build an array of lists of the form list(a = 1, b = "b", c = 
> 2)
> n1 <- 2
> n2 <- 3
> n3 <- 4
> res <- array(rep(list(NULL,NULL,NULL), n1*n2*n3), dim = c(n1,n2,n3))
> for(i in 1:n1){
>    for(j in 1:n2){
>        for(k in 1:n3){
>            res[i,j,k] <- l[[i]][[j]][[k]]
>        }
>    }
> }
>


Try this:

array(sapply(sapply(l, c), c), c(4, 3, 2))
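
Note that the one-liner fills the array with the innermost index varying
fastest, so its dimensions are c(4, 3, 2) rather than c(2, 3, 4). The
sketch below compares it with an explicit loop that keeps the (i, j, k)
order. It assumes a nested list with distinct values of a so that
positions can be checked; the names res and alt are illustrative.

```r
# Nested list l[[i]][[j]][[k]] with i in 1:2, j in 1:3, k in 1:4
l <- lapply(1:2, function(i) lapply(1:3, function(j) lapply(1:4, function(k)
  list(a = 100 * i + 10 * j + k, b = "b", c = 2))))

n1 <- 2; n2 <- 3; n3 <- 4

# An array whose cells are list elements; each cell can hold any object
res <- array(vector("list", n1 * n2 * n3), dim = c(n1, n2, n3))
for (i in seq_len(n1)) for (j in seq_len(n2)) for (k in seq_len(n3)) {
  # [[<- stores the whole three-component list in one cell;
  # res[i, j, k] <- ... would try to spread three items over one slot
  res[[i, j, k]] <- l[[i]][[j]][[k]]
}
str(res[[2, 3, 4]])

# The one-liner gives the same cells with the index order reversed
alt <- array(sapply(sapply(l, c), c), c(4, 3, 2))
str(alt[[4, 3, 2]])
```

With single brackets, res[2, 3, 4] returns a list of length one that
wraps the cell; double brackets return the cell itself.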



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Counting number of datasets and appending them

2010-12-29 Thread Jorge Ivan Velez
Hi Grace,

Try something along the lines of

do.call(rbind, lapply(1:maxi, function(x) get(paste('data', x, sep = ""))))
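
If the data still live in .rda files, the intermediate data1, data2, ...
objects can be skipped altogether. This is a minimal sketch, assuming
each file holds exactly one data frame; the temporary directory, the
three toy files and the helper load_one are made up for illustration.

```r
# Create three small .rda files to stand in for gradeFileData1..20
tmp <- tempfile("grades")
dir.create(tmp)
for (i in 1:3) {
  d <- data.frame(id = i, score = c(10, 20) * i)
  save(d, file = file.path(tmp, paste0("gradeFileData", i, ".rda")))
}

# Load one file into its own environment and return the object it holds
load_one <- function(i) {
  e <- new.env()
  nm <- load(file.path(tmp, paste0("gradeFileData", i, ".rda")), envir = e)
  get(nm[1], envir = e)
}

# Read every file into a list, then stack the pieces once
pieces <- lapply(1:3, load_one)
all_data <- do.call(rbind, pieces)
all_data
```

Keeping the pieces in a list avoids having to construct variable names
with paste() and look them up again with get().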

HTH,
Jorge


On Wed, Dec 29, 2010 at 5:06 PM, Li, Grace <> wrote:

> Hi there,
>
> I have a question on how to  read a bunch of dataset, assign each of the
> dataset to a matrix in the  memory, and append them.
>
> Suppose I have 20 dataset saved to different .rda files named
> gradeFileData1, gradeFileData2,, gradeFileData20. And I would like to
> read them each into a dataset in the memory, then combine them. I wrote
> something like:
>
> e1<-new.env(parent=.GlobalEnv)
> maxi <- 20
> i <- 1
> while (i<=maxi) {
> e1$d <-1
> datanam <- paste("data",i,sep="")
> data <- e1$d
> names(data)[length(data)] <- datanam
> i <- i+1
> }
>
> The function "names(data)[length(data)]" doesn't seem to work. I need it to
> be named like data1,data2,data20.
>
> Also to append them into a big dataset, I think there should be something
> simpler than
> all <-
> rbind(data1,data2,data3,data4,data5,data6,data7,data8.data20)
>
> can you help me on this? I hope this is not some simplest R question. I
> am a beginner.
>
> Thanks a ton!
>
> Grace
>
>
> Confidentiality Notice: This e-mail message including at...{{dropped:12}}
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] Counting number of datasets and appending them

2010-12-29 Thread Li, Grace
Hi there,

I have a question on how to  read a bunch of dataset, assign each of the 
dataset to a matrix in the  memory, and append them.

Suppose I have 20 dataset saved to different .rda files named gradeFileData1, 
gradeFileData2,, gradeFileData20. And I would like to read them each into a 
dataset in the memory, then combine them. I wrote something like:

e1<-new.env(parent=.GlobalEnv)
maxi <- 20
i <- 1
while (i<=maxi) {
e1$d <-1
datanam <- paste("data",i,sep="")
data <- e1$d
names(data)[length(data)] <- datanam
i <- i+1
}

The function "names(data)[length(data)]" doesn't seem to work. I need it to be 
named like data1,data2,data20.

Also to append them into a big dataset, I think there should be something 
simpler than
all <- rbind(data1,data2,data3,data4,data5,data6,data7,data8.data20)

can you help me on this? I hope this is not some simplest R question. I am 
a beginner.

Thanks a ton!

Grace


Confidentiality Notice: This e-mail message including at...{{dropped:12}}



Re: [R] subset question

2010-12-29 Thread Jonathan Flowers
Try subd <- d[, "gene" == c("i1","i2","i3")]
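
For reference, a short sketch of what this expression evaluates to,
assuming a toy data frame with the column names from the question. The
names cmp, wrong and right are illustrative.

```r
d <- data.frame(gene = c("i1", "i5", "i2", "i3", "i1"),
                s1 = 1:5, s2 = 6:10, s3 = 11:15,
                stringsAsFactors = FALSE)

# The literal string "gene" is compared with each Id, not the column
cmp <- "gene" == c("i1", "i2", "i3")
cmp

# Used in the column position, an all-FALSE index selects no columns
wrong <- d[, cmp]
dim(wrong)

# Rows are selected by a logical vector in the first position instead
right <- d[d[["gene"]] %in% c("i1", "i2", "i3"), ]
right
```

Using == against a vector of several Ids would also be fragile even on
the column itself, because the shorter vector is recycled element by
element; %in% tests membership instead.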

On Wed, Dec 29, 2010 at 4:55 PM, ANJAN PURKAYASTHA <
anjan.purkayas...@gmail.com> wrote:

> Hi,
> I'm having a problem with a step that should be pretty simple.
> I have a dataframe, d,  with column names : gene s1 s2 s3. The column
> "gene"
> stores an Id; the rest of the columns store intensity data.
> I would like to extract the rows for gene Ids i1, i2, i3 ( I know a priori
> that those rows exist).
> So I do this:
> subset(d, gene %in% c(i1, i2, i3)).
> This does not give me the required data.
> Any ideas where I am going wrong?
> TIA,
> Anjan
>
> --
> ===
> anjan purkayastha, phd.
> research associate
> fas center for systems biology,
> harvard university
> 52 oxford street
> cambridge ma 02138
> phone-703.740.6939
> ===
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] How to create an array of lists of multiple components?

2010-12-29 Thread Marius Hofert
Dear Jim,

thanks for your quick response. Here is what I try to achieve:

## list containing some data
l <- list(
  list(
   list(
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2)
),
   list(
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2)
),
   list(
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2)
)
   ),
  list(
   list(
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2)
),
   list(
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2)
),
   list(
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2),
list(a = 1, b = "b", c = 2)
)
   )
  )

## now (try to) build an array of lists of the form list(a = 1, b = "b", c = 2)
n1 <- 2
n2 <- 3
n3 <- 4
res <- array(rep(list(NULL,NULL,NULL), n1*n2*n3), dim = c(n1,n2,n3))
for(i in 1:n1){
for(j in 1:n2){
for(k in 1:n3){
res[i,j,k] <- l[[i]][[j]][[k]]
}
}
}

So the list "l" should be converted to an array of lists. I tried your approach 
and the modified (given above), but both do not work. I always obtain something 
like:
Error in res[i, j, k] <- l[[i]][[j]][[k]] : 
  number of items to replace is not a multiple of replacement length

Cheers,

Marius



On 2010-12-29, at 22:19 , jim holtman wrote:

> Is this what you want:
> 
>> n1 <- 2
>> n2 <- 4
>> n3 <- 5
>> res <- array(rep(list(list(NULL,NULL,NULL)), n1*n2*n3), dim = c(n1,n2,n3))
>> res[1,1,1] # is not a list with three components...
> [[1]]
> [[1]][[1]]
> NULL
> 
> [[1]][[2]]
> NULL
> 
> [[1]][[3]]
> NULL
> 
> 
>> str(res)
> List of 40
> $ :List of 3
>  ..$ : NULL
>  ..$ : NULL
>  ..$ : NULL
> $ :List of 3
>  ..$ : NULL
>  ..$ : NULL
>  ..$ : NULL
> $ :List of 3
>  ..$ : NULL
>  ..$ : NULL
>  ..$ : NULL
> $ :List of 3
>  ..$ : NULL
>  ..$ : NULL
>  ..$ : NULL
> $ :List of 3
>  ..$ : NULL
>  ..$ : NULL
>  ..$ : NULL
> 
> 
> On Wed, Dec 29, 2010 at 3:25 PM, Marius Hofert  wrote:
>> Hi,
>> 
>> how can I create an array of lists of three components?
>> This approach does not work:
>> 
>> n1 <- 2
>> n2 <- 4
>> n3 <- 5
>> res <- array(rep(vector("list",3), n1*n2*n3), dim = c(n1,n2,n3))
>> res[1,1,1] # is not a list with three components...
>> 
>> The goal is that res[1,1,1] is a list with three components. Also, appending 
>> the
>> components didn't work. For example, I tried:
>> component <- list(a = 4, b = "some text", c = 1)
>> for(i in 1:3) res[1,1,1] <- c(res[1,1,1], component[[i]])
>> 
>> Cheers,
>> 
>> Marius
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> 
> 
> -- 
> Jim Holtman
> Data Munger Guru
> 
> What is the problem that you are trying to solve?



[R] subset question

2010-12-29 Thread ANJAN PURKAYASTHA
Hi,
I'm having a problem with a step that should be pretty simple.
I have a dataframe, d,  with column names : gene s1 s2 s3. The column "gene"
stores an Id; the rest of the columns store intensity data.
I would like to extract the rows for gene Ids i1, i2, i3 ( I know a priori
that those rows exist).
So I do this:
subset(d, gene %in% c(i1, i2, i3)).
This does not give me the required data.
Any ideas where I am going wrong?
TIA,
Anjan

-- 
===
anjan purkayastha, phd.
research associate
fas center for systems biology,
harvard university
52 oxford street
cambridge ma 02138
phone-703.740.6939
===



Re: [R] Removing rows with earlier dates

2010-12-29 Thread William Dunlap

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Ali Salekfard
> Sent: Wednesday, December 29, 2010 6:25 AM
> To: r-help@r-project.org
> Subject: Re: [R] Removing rows with earlier dates
> 
> Thanks to everyone. Joshua's response seemed the most concise 
> one, but it
> used up so much memory that my R just gave error. I checked the other
> replies and all in all I came up with this, and thought to 
> share it with
> others and get comments.
> 
> My structure was as follows:
> 
> ACCOUNT   RULE  DATE
> A1  2010-01-01
> A2  2007-05-01
> A2  2007-05-01
>  A2  2005-05-01
> A2  2005-05-01
>  A1  2009-01-01

This printout is not really sufficient to tell us what
is in your dataset.  E.g., I tried to convert it to
a data.frame with the following code

my.mapping.Date <- read.table(header=TRUE,
  colClasses=c("character","character","Date"),
  textConnection("
ACCOUNT RULE  DATE
A1  Rule1 2010-01-01
A2  Rule2 2007-05-01
A2  Rule3 2007-05-01
A2  Rule4 2005-05-01
A2  Rule5 2005-05-01
A1  Rule6 2009-01-01")
)

and your processing code failed in the as.Date(a,"%Y-%m-%d")
step because tapply() corrupts things of class Date (it
turns them into integers).  tapply() often has problems
dealing with nontrivial data classes.

If I read in the DATE column as character data then your
code doesn't crash.  (I did not try it with the default
factors for all columns.)

my.mapping.character <- read.table(header=TRUE,
  colClasses=c("character","character","character"),
  textConnection("
ACCOUNT RULE  DATE
A1  Rule1 2010-01-01
A2  Rule2 2007-05-01
A2  Rule3 2007-05-01
A2  Rule4 2005-05-01
A2  Rule5 2005-05-01
A1  Rule6 2009-01-01")
)

f0 <- function (my.mapping) 
{
    # your code converted to a function so it doesn't
    # overwrite its input and so it can be easily compared
    # with other functions.
    a <- tapply(my.mapping$DATE, my.mapping$ACCOUNT, max)
    a <- data.frame(ACCOUNT = names(a), DT = as.Date(a, "%Y-%m-%d"))
    my.mapping <- merge(x = my.mapping, y = a, by.x = "ACCOUNT", 
        by.y = "ACCOUNT")
    my.mapping <- cbind(my.mapping, TAKE = my.mapping$DATE == 
        my.mapping$DT)
    my.mapping <- my.mapping[my.mapping$TAKE == TRUE, ]
    my.mapping
}

> f0(my.mapping.character)
  ACCOUNT  RULE   DATE DT TAKE
1  A1 Rule1 2010-01-01 2010-01-01 TRUE
3  A2 Rule2 2007-05-01 2007-05-01 TRUE
4  A2 Rule3 2007-05-01 2007-05-01 TRUE

In your original post you wrote
  > What I would like to do is to create a data frame
  > with only the most recent  rule for each account.
but your code gives 2 rules for account A2, because
there is a tie in the dates.  Is that what you want?

It makes things much simpler for R-helpers if a request
for help includes details on how to make a typical
input object and exactly what is wanted to be done.

In the runs-based approach I suggested, ties are broken
by the original order of the file.  Returning all rules
for the maximum date would be more complicated using this
approach.

isLastInRun <- function (x, ...)
{
    # TRUE at the last element of each run of identical values in x;
    # a run also ends wherever any of the additional vectors changes.
    retval <- c(x[-1] != x[-length(x)], TRUE)
    for (y in list(...)) {
        stopifnot(length(x) == length(y))
        retval <- retval | c(y[-1] != y[-length(y)], TRUE)
    }
    retval
}
f2 <- function(data) {
    o <- order(data[, "ACCOUNT"], data[, "DATE"])
    tmp <- logical(length(o))
    tmp[o] <- isLastInRun(data[o, "ACCOUNT"])
    data[tmp,]
}

f2() works on either class of DATE column.  It returns
the same class of DATE as the input class, because it
just returns a subset of the rows of the original
data.frame.  The row names/numbers in the output show
which rows of the input were selected.
> f2(my.mapping.Date)
  ACCOUNT  RULE   DATE
1  A1 Rule1 2010-01-01
3  A2 Rule3 2007-05-01
> f2(my.mapping.character)
  ACCOUNT  RULE   DATE
1  A1 Rule1 2010-01-01
3  A2 Rule3 2007-05-01
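
For comparison, here is a sketch of two base-R alternatives on the same
six rows, one that keeps all rules tied for the latest date and one that
keeps a single row per account. This is not the runs-based code above;
the names latest, ties_kept, o and one_per are illustrative.

```r
d <- data.frame(
  ACCOUNT = c("A1", "A2", "A2", "A2", "A2", "A1"),
  RULE = paste0("Rule", 1:6),
  DATE = as.Date(c("2010-01-01", "2007-05-01", "2007-05-01",
                   "2005-05-01", "2005-05-01", "2009-01-01")),
  stringsAsFactors = FALSE
)

# ave() returns the per-account maximum aligned with the original rows;
# the dates are converted to numbers to sidestep class handling
latest <- ave(as.numeric(d$DATE), d$ACCOUNT, FUN = max)
ties_kept <- d[as.numeric(d$DATE) == latest, ]
ties_kept

# One row per account: sort by account and date, then keep the last row
# of each account; ties fall to the later row in the original order
o <- order(d$ACCOUNT, d$DATE)
one_per <- d[o, ][!duplicated(d$ACCOUNT[o], fromLast = TRUE), ]
one_per
```

Neither variant calls merge(), which is likely where much of the time
and memory goes in the merge-based version on a large table.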

I generated a random dataset with 1 million rows (and
c. 2.3 rules/account) with

gen <- function (n, Date = FALSE) {
set.seed(1)
d <- data.frame(stringsAsFactors = FALSE,
ACCOUNT = paste(sep = "", "A",
   sample(floor(n/2), size = n, replace = TRUE)),
RULE = paste(sep = "", "Rule", 1:n),
DATE = sprintf("%04d-%02d-%02d",
   sample(1995:2010, size = n, replace = TRUE),
   sample(1:12, size = n, replace = TRUE), 
   sample(1:28, size = n, replace = TRUE))
)
if (Date) {
d$DATE <- as.Date(d$DATE)
}
d
}
d6 <- gen(n=10^6, Date=FALSE)

and got the following processing times

> system.time(r0 <- f0(d6))
   user  system elapsed 
  79.96    0.36   73.94 
> system.time(r2 <- f2(d6))
   user  system elapsed 
  19.810.

Re: [R] How to create an array of lists of multiple components?

2010-12-29 Thread jim holtman
Is this what you want:

> n1 <- 2
> n2 <- 4
> n3 <- 5
> res <- array(rep(list(list(NULL,NULL,NULL)), n1*n2*n3), dim = c(n1,n2,n3))
> res[1,1,1] # is not a list with three components...
[[1]]
[[1]][[1]]
NULL

[[1]][[2]]
NULL

[[1]][[3]]
NULL


> str(res)
List of 40
 $ :List of 3
  ..$ : NULL
  ..$ : NULL
  ..$ : NULL
 $ :List of 3
  ..$ : NULL
  ..$ : NULL
  ..$ : NULL
 $ :List of 3
  ..$ : NULL
  ..$ : NULL
  ..$ : NULL
 $ :List of 3
  ..$ : NULL
  ..$ : NULL
  ..$ : NULL
 $ :List of 3
  ..$ : NULL
  ..$ : NULL
  ..$ : NULL


On Wed, Dec 29, 2010 at 3:25 PM, Marius Hofert  wrote:
> Hi,
>
> how can I create an array of lists of three components?
> This approach does not work:
>
> n1 <- 2
> n2 <- 4
> n3 <- 5
> res <- array(rep(vector("list",3), n1*n2*n3), dim = c(n1,n2,n3))
> res[1,1,1] # is not a list with three components...
>
> The goal is that res[1,1,1] is a list with three components. Also, appending 
> the
> components didn't work. For example, I tried:
> component <- list(a = 4, b = "some text", c = 1)
> for(i in 1:3) res[1,1,1] <- c(res[1,1,1], component[[i]])
>
> Cheers,
>
> Marius
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?



Re: [R] Trying to extract an algorithm from a function

2010-12-29 Thread David Winsemius


On Dec 29, 2010, at 3:31 PM, CALEF ALEJANDRO RODRIGUEZ CUEVAS wrote:

Hi, I'm using package "vars" and I'm trying to extract the algorithm that
function "predict" contained in that package in order to understand how does
it work.

When I type function "VAR" then all its algorithm appears in R, however if I
try to do the same with "predict" nothing happens...Is there any possible
way to extract the algorithm?


With the package loaded, do this to the object you are passing to predict:

class(object)

Then do:

methods(predict)

If your .predict function has a "*" by it then do this:

getAnywhere(.predict)

And if you are using a package that uses S4 methods, good luck.
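
The steps above can be sketched with an lm fit, which needs no extra
packages. For an object returned by VAR() the class is presumably
"varest", so the method to look up would be predict.varest; that name is
an assumption here and should be confirmed with class(object).

```r
fit <- lm(dist ~ speed, data = cars)

# Step 1: find the class that predict() dispatches on
cls <- class(fit)
cls

# Step 2: list the available predict methods; entries marked with an
# asterisk when printed are not exported from their namespace
all_methods <- methods(predict)

# Step 3: fetch the method source, exported or not
src <- getS3method("predict", cls[1])
info <- getAnywhere("predict.lm")
head(deparse(src), 5)
```

For S4 generics the methods package offers showMethods() and getMethod()
for the same purpose.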
--

David Winsemius, MD
West Hartford, CT



Re: [R] icon for an R package

2010-12-29 Thread Marc Schwartz

On Dec 29, 2010, at 2:08 PM, Michael Friendly wrote:

> On 12/29/2010 1:01 PM, Marc Schwartz wrote:
>> Michael,
>> Are you referring to an icon that would be displayed for an R package when 
>> browsing in a file manager, such as Nautilus, Konqueror or Finder?
> Well, my initial query was just for an icon that I could use in a 
> presentation to represent several R packages
> and the relations among them.

I see, so a network tree of sorts

David's image file did not come thru the list, so I am not sure if that or 
something similar meets your needs.

Simon has some image icons that he uses in R.app for the OSX GUI. These are 
available from:

  https://svn.r-project.org/R-packages/trunk/Mac-GUI/images/


>> If so, those icons are typically associated with a file using MIME types and 
>> are based upon the file type in question having a unique extension and 
>> perhaps being associated with a particular application that is installed.
>> 
> You raise the more interesting question of OS-specific icons for R-related 
> file types (.R, .Rd, .RData, .Rnw),
> but the only ones I know of are the .RData icon under Windows, and icons used 
> in eclipse/StatET.

Presumably if a motivated person were to create and offer some under an 
appropriate and compatible license, they could be made generally available. Of 
course the actual utilization of them by the various file managers across the 
OS's in the manner I describe would require additional steps. Either via an 
install program that would automate the registration of the file types and 
icons, or via manual configuration.

Regards,

Marc



[R] Trying to extract an algorithm from a function

2010-12-29 Thread CALEF ALEJANDRO RODRIGUEZ CUEVAS
Hi, I'm using the package "vars" and I'm trying to extract the algorithm behind
the function "predict" contained in that package, in order to understand how
it works.

When I type the function name "VAR", all of its code appears in R; however, if I
try to do the same with "predict", nothing happens. Is there any possible
way to extract the algorithm?
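
One possible way to look at the code, as a minimal sketch: predict() is an S3
generic, so typing its name only shows the dispatcher, and the real work is
done by a class-specific method. The example uses predict.lm from base R so
that it runs without extra packages. For "vars" the method is presumably
predict.varest; that name is an assumption based on the class of the object
returned by VAR(), so check class(fit) first.

```r
# predict() is an S3 generic: printing it shows only the call to UseMethod().
print(predict)

# List the methods registered for the generic; non-exported ones are
# marked with an asterisk in the printed output.
print(methods("predict"))

# Retrieve one method by generic and class name, exported or not.
# "lm" is used here only because it ships with base R.
f <- getS3method("predict", "lm")
print(head(deparse(f), 5))

# getAnywhere() searches the loaded namespaces by function name.
found <- getAnywhere("predict.lm")
```

With "vars" loaded, the analogous calls would be getS3method("predict", "varest")
or vars:::predict.varest, again assuming the fitted object has class "varest".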

Thanks a lot.

Regards



[R] How to create an array of lists of multiple components?

2010-12-29 Thread Marius Hofert
Hi,

how can I create an array of lists of three components?
This approach does not work:

n1 <- 2
n2 <- 4
n3 <- 5
res <- array(rep(vector("list",3), n1*n2*n3), dim = c(n1,n2,n3))
res[1,1,1] # is not a list with three components...

The goal is that res[1,1,1] is a list with three components. Also, appending the
components didn't work. For example, I tried:
component <- list(a = 4, b = "some text", c = 1)
for(i in 1:3) res[1,1,1] <- c(res[1,1,1], component[[i]])
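
A minimal sketch of one way to get this: build a list of length n1*n2*n3, give
it a dim attribute, and use double brackets to read or write the content of a
single cell. Single brackets return a length-one list that wraps the cell,
which is probably why the attempts above did not behave as expected.

```r
n1 <- 2; n2 <- 4; n3 <- 5

# A list with a dim attribute: each of the n1*n2*n3 cells can hold any object.
res <- array(vector("list", n1 * n2 * n3), dim = c(n1, n2, n3))

component <- list(a = 4, b = "some text", c = 1)

# Use [[ ]] to assign to (or read) the content of a single cell.
res[[1, 1, 1]] <- component

# Fill every cell with the same three-component list via linear indexing.
for (i in seq_along(res)) res[[i]] <- component

str(res[[1, 1, 1]])   # the three-component list itself
str(res[1, 1, 1])     # a length-one list wrapping that cell
```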

Cheers,

Marius


Re: [R] icon for an R package

2010-12-29 Thread Michael Friendly

On 12/29/2010 1:01 PM, Marc Schwartz wrote:

> Michael,
>
> Are you referring to an icon that would be displayed for an R package when 
> browsing in a file manager, such as Nautilus, Konqueror or Finder?

Well, my initial query was just for an icon that I could use in a 
presentation to represent several R packages
and the relations among them.

> If so, those icons are typically associated with a file using MIME types and 
> are based upon the file type in question having a unique extension and perhaps 
> being associated with a particular application that is installed.

You raise the more interesting question of OS-specific icons for 
R-related file types (.R, .Rd, .RData, .Rnw),
but the only ones I know of are the .RData icon under Windows, and icons 
used in eclipse/StatET.

> Since R packages are either .zip files or tar archive files (.tar.gz or .tgz), 
> they will use the default theme icons for the OS/File Manager in use for those 
> extensions, possibly for the archiving application associated with those file 
> types.
>
> I don't know of any way off-hand, to differentiate R packages from other files 
> that have the same extensions and therefore have a unique icon displayed in the 
> respective file manager application.
>
> Regards,
>
> Marc Schwartz




--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [R] Duplicated date values aren't duplicates

2010-12-29 Thread odocoileus55

Hi all,

I am experiencing a similar issue with "non unique dates for a given burst". 
This is more of a practice session for me, I have data from 1 collared
animal and for practice, assigned this individual 2 bursts (wk1 and wk2).

This is what I have:
   ID DATETIME BURST  X   Y
1   12021  11/28/2006 0:01   wk1 6xx 30x
2   12021  11/28/2006 1:01   wk1 6xx 30x
3   12021  11/28/2006 2:01   wk1 6xx 30x

The code is:

data <- read.csv("test.csv", header = TRUE)
xy <- data[, c("X", "Y")]
## parse the timestamps into POSIXct
da <- as.POSIXct(strptime(as.character(data$DATETIME), "%m/%d/%Y %H:%M"))
litr <- as.ltraj(xy, da, id = data$ID, burst = data$BURST, typeII = TRUE)
## error is displayed
## checking for duplicates
dupz <- which(duplicated(da))
## results show "55 210", which cannot be right since duplicated times are
## impossible
## checking for duplicates in xy
dupz <- which(duplicated(xy))
## results show "0"
## One thing I noticed is that xy is in 1 column and da is displayed in 7
## columns.  Could that be the issue?  Perhaps da needs to be in one column
## to match up with the xy?
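
A way to narrow this down, sketched under assumptions: the data frame below is
a hypothetical stand-in for test.csv with one timestamp deliberately repeated.
How da prints on screen (wrapped over several columns) should not matter, since
it is still a single vector. Repeated or unparseable (NA) timestamps within a
burst are the things to look for. Using tz = "UTC" is an assumption made here to
rule out daylight-saving ambiguities.

```r
# Hypothetical stand-in for test.csv; row 3 repeats the timestamp of row 2.
data <- data.frame(
  ID = 12021,
  DATETIME = c("11/28/2006 0:01", "11/28/2006 1:01",
               "11/28/2006 1:01", "11/28/2006 2:01"),
  BURST = "wk1",
  stringsAsFactors = FALSE
)

# Parse in a fixed time zone so no local clock change can alter the values.
da <- as.POSIXct(strptime(data$DATETIME, "%m/%d/%Y %H:%M", tz = "UTC"))

# Timestamps that failed to parse become NA, and several NAs count as duplicates.
n_unparsed <- sum(is.na(da))

# Show every row whose burst/time combination occurs more than once.
key <- paste(data$BURST, format(da, "%Y-%m-%d %H:%M"))
dup_rows <- data[key %in% key[duplicated(key)], ]

print(n_unparsed)
print(dup_rows)
```

Looking at the rows reported by which(duplicated(da)) in the real data (55 and
210 above) in the same way should show whether the times are truly repeated in
the file or were changed by the parsing step.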

Thanks in advance.


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Duplicated-date-values-aren-t-duplicates-tp897178p3167487.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] icon for an R package

2010-12-29 Thread David Winsemius



Attached is a png file that is the superposition of an R logo and a  
Mac-ish package icon. Best I can do with my limited tools.



On Dec 29, 2010, at 12:32 PM, Michael Friendly wrote:

> On 12/29/2010 11:02 AM, David Winsemius wrote:
>>
>> On Dec 29, 2010, at 10:03 AM, Michael Friendly wrote:
>>
>>> I'm looking for an icon to represent an R package.  Perhaps
>>> something like
>>>
>>> http://cdn2.iconfinder.com/data/icons/DarkGlass_Reworked/128x128/apps/package.png
>>>
>>> but with the R logo rather than KDE.
>>
>> Can't you just get the location of an "R" at CRAN?
>>
>> http://cran.r-project.org/Rlogo.jpg
>
> Sorry for not being clearer.  What I want is an icon for a package,
> such as I gave in the iconfinder link
> above, but with the R logo superposed. Such 3D icons are common in
> the Mac world, but I couldn't
> find anything similar for R, so thought I'd ask before trying my
> (poor) hand with PhotoShop or something
> similar.
>
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept.
> York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
> 4700 Keele StreetWeb:   http://www.datavis.ca
> Toronto, ONT  M3J 1P3 CANADA


David Winsemius, MD
West Hartford, CT



Re: [R] icon for an R package

2010-12-29 Thread Marc Schwartz

On Dec 29, 2010, at 11:32 AM, Michael Friendly wrote:

> On 12/29/2010 11:02 AM, David Winsemius wrote:
>> 
>> On Dec 29, 2010, at 10:03 AM, Michael Friendly wrote:
>> 
>>> I'm looking for an icon to represent an R package.  Perhaps something 
>>> like
>>> 
>>> http://cdn2.iconfinder.com/data/icons/DarkGlass_Reworked/128x128/apps/package.png
>>>  
>>> 
>>> 
>>> but with the R logo rather than KDE.
>> 
>> Can't you just get the location of an "R" at CRAN?
>> 
>> http://cran.r-project.org/Rlogo.jpg
>>> 
>> 
>> 
> Sorry for not being clearer.  What I want is an icon for a *package*, 
> such as I gave in the iconfinder link
> above, but with the R logo *superposed*. Such 3D icons are common in the 
> Mac world, but I couldn't
> find anything similar for R, so thought I'd ask before trying my (poor) 
> hand with PhotoShop or something
> similar.


Michael,

Are you referring to an icon that would be displayed for an R package when 
browsing in a file manager, such as Nautilus, Konqueror or Finder?

If so, those icons are typically associated with a file using MIME types and 
are based upon the file type in question having a unique extension and perhaps 
being associated with a particular application that is installed.

Since R packages are either .zip files or tar archive files (.tar.gz or .tgz), 
they will use the default theme icons for the OS/File Manager in use for those 
extensions, possibly for the archiving application associated with those file 
types.

I don't know of any way off-hand, to differentiate R packages from other files 
that have the same extensions and therefore have a unique icon displayed in the 
respective file manager application.

Regards,

Marc Schwartz



[R] logistic regression with response 0,1

2010-12-29 Thread Federico Bonofiglio
Dear Masters,
first I'd like to wish you all a great 2011 and happy holidays,

second (here comes the boring stuff) I have a question which I hope you
can answer:

I ran a logistic regression with glm() on data of the following type:
(y1=1,x1=x1); (y2=0,x2=x2);..(yn=0,xn=xn), where the response (y) is
a binary outcome in {0,1} and x is any explanatory variable (continuous or not)
observed with the i-th outcome.

This is indeed one of the most frequent cases when dealing with binary
responses, though I know R can also handle such responses slightly differently (a
vector for the success counts and one for the failures), and I'm not sure
whether my summary.glm gives me any sensible answer at all.

For this purpose I have tried to exaggerate the numbers so as to obtain
significant results:

n <- 2000
y <- rbinom(n, 1, .7)  # response

euro <- numeric(n)     # preallocate explanatory 1
for (i in 1:n) {
  euro[i] <- if (y[i] == 1) rnorm(1, 300, 20) else rnorm(1, 50, 12)
}

sex <- character(n)    # preallocate explanatory 2
for (i in 1:n) {
  sex[i] <- if (y[i] == 1) {
    sample(c("m", "f"), 1, prob = c(.8, .2))
  } else {
    sample(c("m", "f"), 1, prob = c(.2, .8))
  }
}



m<-glm(y~euro+factor(sex),family=binomial)

summary(m)




My worries:

   - are the estimates correct?
   - the degrees of freedom increase dramatically (one per cell), so do I
   risk never obtaining a significant result?

I also take the chance to ask whether you know of any implemented method to plot
logistic curves directly from a glm() model.
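
A minimal sketch of one way to draw the fitted curve, under assumptions:
glm() with family = binomial accepts a 0/1 response with one row per
observation, and predict(..., type = "response") returns fitted probabilities
that can be drawn over a grid. The simulated values below are hypothetical and
were chosen so that the two outcome groups overlap; with group means as far
apart as 300 and 50, the outcome is almost perfectly separated by euro, and
glm() typically warns about fitted probabilities of 0 or 1.

```r
set.seed(42)
n <- 500
y <- rbinom(n, 1, 0.7)   # 0/1 response, one row per observation

# One continuous explanatory variable whose distribution overlaps between
# the two outcome groups (assumed values, for illustration only).
euro <- rnorm(n, mean = ifelse(y == 1, 120, 100), sd = 20)

m <- glm(y ~ euro, family = binomial)

# Fitted probabilities over an evenly spaced grid of the explanatory variable.
grid <- data.frame(euro = seq(min(euro), max(euro), length.out = 200))
grid$p <- predict(m, newdata = grid, type = "response")

# Draw the observed 0/1 points and the fitted logistic curve.
if (interactive()) {
  plot(euro, y, xlab = "euro", ylab = "P(y = 1)")
  lines(grid$euro, grid$p, lwd = 2)
}
```

With a factor such as sex in the model, the same idea applies with one grid
(and one curve) per factor level.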


By the way, I would like to thank you all.

Federico Bonofiglio



Re: [R] as.object: function doesn't exist but I wish it did

2010-12-29 Thread Uwe Ligges

Not sure what you really want, my best guess is you are looking for get():


best <- get(best)
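
A small sketch of how this could look in practice, with lm fits on mtcars
standing in for the lmer objects (the model formulas and names are only
placeholders). It also shows an alternative that avoids the lookup by name:
keeping the models in a named list and indexing it.

```r
# Stand-ins for lmer1..lmer5: three ordinary linear models.
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)
fit3 <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Option 1: look the object up by its name with get().
bic <- BIC(fit1, fit2, fit3)   # data frame with one row per model
best_name <- rownames(bic)[which.min(bic$BIC)]
best_by_name <- get(best_name)

# Option 2: keep the models in a named list and index it.
models <- list(fit1 = fit1, fit2 = fit2, fit3 = fit3)
best_from_list <- models[[names(which.min(sapply(models, BIC)))]]

coef(best_by_name)
```

The list version tends to scale better when the number of candidate models
grows, since sapply() and lapply() can then be applied to all of them at once.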

Uwe Ligges



On 29.12.2010 18:31, Patrick McKann wrote:

> I seem to come to this problem alot, and I can find my way out of it with a
> loop, but I wish, and wonder if there is a better way.  Here's an example
> (lmer1-5 are a series of lmer objects):
>
>   bs=data.frame(bic=BIC(lmer1,lmer2,lmer3,lmer4,lmer5)$BIC)
>   rownames(bs)=c('lmer1','lmer2','lmer3','lmer4','lmer5')
>   best=rownames(bs)[bs==min(bs)]
>
>> best
> [1] "lmer5"
>
> This tells me that lmer5 is the model with the lowest BIC.  I want to start
> working with lmer5 as the best model, such as fixef(best) to get the fixed
> effect estimates from lmer5.  I tried best=as.object('lmer5') but of course
> this doesn't work because that is not a real function.
>
> Does anybody see what I'm getting at?  If so, do you know a way to do this
> without a loop or series of if statements?
>
> Thank you!



Re: [R] as.object: function doesn't exist but I wish it did

2010-12-29 Thread Gabor Grothendieck
On Wed, Dec 29, 2010 at 12:31 PM, Patrick McKann  wrote:
> I seem to come to this problem alot, and I can find my way out of it with a
> loop, but I wish, and wonder if there is a better way.  Here's an example
> (lmer1-5 are a series of lmer objects):
>
>  bs=data.frame(bic=BIC(lmer1,lmer2,lmer3,lmer4,lmer5)$BIC)
>  rownames(bs)=c('lmer1','lmer2','lmer3','lmer4','lmer5')
>  best=rownames(bs)[bs==min(bs)]
>
>> best
> [1] "lmer5"
>
> This tells me that lmer5 is the model with the lowest BIC.  I want to start
> working with lmer5 as the best model, such as fixef(best) to get the fixed
> effect estimates from lmer5.  I tried best=as.object('lmer5') but of course
> this doesn't work because that is not a real function.
>
> Does anybody see what I'm getting at?  If so, do you know a way to do this
> without a loop or series of if statements?
>

If you are asking how to turn a string into a variable, it's a FAQ!

http://cran.r-project.org/doc/FAQ/R-FAQ.html#How-can-I-turn-a-string-into-a-variable_003f


-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] icon for an R package

2010-12-29 Thread Michael Friendly
On 12/29/2010 11:02 AM, David Winsemius wrote:
>
> On Dec 29, 2010, at 10:03 AM, Michael Friendly wrote:
>
>> I'm looking for an icon to represent an R package.  Perhaps something 
>> like
>>
>> http://cdn2.iconfinder.com/data/icons/DarkGlass_Reworked/128x128/apps/package.png
>>  
>>
>>
>> but with the R logo rather than KDE.
>
> Can't you just get the location of an "R" at CRAN?
>
> http://cran.r-project.org/Rlogo.jpg
>>
>
>
Sorry for not being clearer.  What I want is an icon for a *package*, 
such as I gave in the iconfinder link
above, but with the R logo *superposed*. Such 3D icons are common in the 
Mac world, but I couldn't
find anything similar for R, so thought I'd ask before trying my (poor) 
hand with PhotoShop or something
similar.


-- 
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA




[R] as.object: function doesn't exist but I wish it did

2010-12-29 Thread Patrick McKann
I seem to come to this problem alot, and I can find my way out of it with a
loop, but I wish, and wonder if there is a better way.  Here's an example
(lmer1-5 are a series of lmer objects):

 bs=data.frame(bic=BIC(lmer1,lmer2,lmer3,lmer4,lmer5)$BIC)
 rownames(bs)=c('lmer1','lmer2','lmer3','lmer4','lmer5')
 best=rownames(bs)[bs==min(bs)]

> best
[1] "lmer5"

This tells me that lmer5 is the model with the lowest BIC.  I want to start
working with lmer5 as the best model, such as fixef(best) to get the fixed
effect estimates from lmer5.  I tried best=as.object('lmer5') but of course
this doesn't work because that is not a real function.

Does anybody see what I'm getting at?  If so, do you know a way to do this
without a loop or series of if statements?

Thank you!



Re: [R] Windows editor suggestions - autosave

2010-12-29 Thread Jonathan P Daily
I can also suggest the following:

Notepad++ has an AutoSave plugin, a slew of other useful features, and a 
Scintilla editor. If you use other languages, it is also extensible and 
has almost no learning curve. NppToR is a background application that 
allows you to pass lines or files straight from your editor to an R 
console.

Notepad++: notepad-plus-plus.org
NppToR: sourceforge.net/projects/npptor/ 

I can also suggest Geany, another Scintilla-powered editor. In the plugins 
list, Save Actions has an autosave feature. While Geany doesn't have a 
native R interface in Windows (I wrote one for myself in AutoHotKey in 
about 2 mins), it has a ton of other options and is cross-platform.

Geany: www.geany.org
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
 - Jubal Early, Firefly

r-help-boun...@r-project.org wrote on 12/29/2010 11:32:40 AM:

> [R] Windows editor suggestions - autosave
> 
> Michael Conklin 
> 
> to:
> 
> R-help
> 
> 12/29/2010 11:35 AM
> 
> Sent by:
> 
> r-help-boun...@r-project.org
> 
> I am looking for advice on an editor to use with R (windows) that 
> has an autosave feature.  I typically write scripts using the RGui 
> (and tried TinnR yesterday) but I am having continuing problems with
> BSODs (non R related) and have in the past have had issues with R 
> crashes and would really like a system that does not require me to 
> remember to hit the save button on my script every 10 minutes so 
> that I can avoid redoing everything.
> 
> W. Michael Conklin
> Chief Methodologist
> Google Voice: (612) 56STATS
> 
> MarketTools, Inc. | www.markettools.com
> 6465 Wayzata Blvd | Suite 170 |  St. Louis Park, MN 55426.  PHONE: 
> 952.417.4719 | CELL: 612.201.8978   
> This email and attachment(s) may contain confidential and/or 
> proprietary information and is intended only for the intended 
> addressee(s) or its authorized agent(s). Any disclosure, printing, 
> copying or use of such information is strictly prohibited. If this 
> email and/or attachment(s) were received in error, please 
> immediately notify the sender and delete all copies


Re: [R] Windows editor suggestions - autosave

2010-12-29 Thread Gabor Grothendieck
On Wed, Dec 29, 2010 at 11:32 AM, Michael Conklin
 wrote:
> I am looking for advice on an editor to use with R (windows) that has an 
> autosave feature.  I typically write scripts using the RGui (and tried TinnR 
> yesterday) but I am having continuing problems with BSODs (non R related) and 
> have in the past have had issues with R crashes and would really like a 
> system that does not require me to remember to hit the save button on my 
> script every 10 minutes so that I can avoid redoing everything.
>

I would imagine that most of the popular editors support something
along those lines.

I use vim and the way it does this is slightly different than what you
describe but IMHO better.   The problem with auto save is that if you
make some mistakes and then discover them you have already saved the
mistakes.  The way vim works is that rather than auto save the file it
auto saves **changes** to a swap file.  You can easily get back to the
last manual save by just quitting without saving and starting vim
again or you can recover to the last auto-saved point after a crash by
using the recovery flag: vim -r myfile.txt

vim also has the :DiffOrig command which will show you what has
changed since the last manual save.

It's also possible to implement a vim script that does true auto save,
and you can find such scripts on the net so you don't have to write one
yourself, but I suspect the people who wrote them did not really
understand that vim already has a superior approach to this.  The
functionality of vim is so vast, including not only vim itself but
also 3000+ user-contributed scripts, that it's quite easy to overlook a
feature.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Removing rows with earlier dates

2010-12-29 Thread Ali Salekfard
David,

Thanks a lot. Your code worked fine on the whole dataset (no memory error
as I had with the other ideas). I do like the style - especially the fact
that it is all in one line - , but for large datasets it takes longer than
what I wrote. I ran it on the same machine with the same set of 144,643
rules, and your code takes 81.50 seconds.

> a<-my.mapping[ with(my.mapping, DATE == ave( DATE, ACCOUNT,FUN=max )), ]

 Description Duration
1 Max.Date for Mappings   81.498

I guess the running time of your algorithm is exponential in the number of
rows.

Ali

On Wed, Dec 29, 2010 at 3:24 PM, David Winsemius wrote:

>
> On Dec 29, 2010, at 9:24 AM, Ali Salekfard wrote:
>
> Thanks to everyone. Joshua's response seemed the most concise one, but it
>> used up so much memory that my R just gave error. I checked the other
>> replies and all in all I came up with this, and thought to share it with
>> others and get comments.
>>
>> My structure was as follows:
>>
>> ACCOUNT   RULE  DATE
>> A1  2010-01-01
>> A2  2007-05-01
>> A2  2007-05-01
>> A2  2005-05-01
>> A2  2005-05-01
>> A1  2009-01-01
>>
>> The most efficient solution I came across involves the following steps:
>>
>> 1. Find the latest date for each account, and convert it to a data frame:
>>
>> a<-tapply(my.mapping$DATE,my.mapping$ACCOUNT,max)
>> a<-data.frame(ACCOUNT=names(a),DT=as.Date(a,"%Y-%m-%d"))
>> 2. merge the set with the original data
>>
>> my.mapping<-merge(x=my.mapping,y=a,by.x="ACCOUNT",by.y="ACCOUNT")
>>
>> 3. Create a take column, which is to confirm if the date of the row is the
>> maximum date for the account.
>> my.mapping<-cbind(my.mapping,TAKE=my.mapping$DATE==my.mapping$DT)
>> 4. Filter out all lines except those with TAKE==TRUE.
>>
>> my.mapping<-my.mapping[my.mapping$TAKE==TRUE,]
>> The running time for my whole list was 4.5 sec which is far better than
>> any
>> other ways I tried. Let me have your thoughts on that.
>>
>
> My first thought is that you should use more spaces in your code. It looks
> quite a bit more complex than the method I suggested (and my benchmark says
> mine was maybe 50% faster, but with Maechler's improvements is now about 4
> times faster. I guess I shouldn't throw too many stones about coding style.)
>
> my.mapping[ with(my.mapping, DATE == ave( DATE,
>  ACCOUNT,
>  FUN=max ) ), ]
> #--
> require(rbenchmark)
> ave.method = function(df, acc, dt)
>   {df[with( df, dt == ave(dt, acc, FUN=max)), ]}
> merge.method = function(df, acc, dt) {
>   a<- tapply(df[[dt]], df[[acc]],max)
>   a  <- data.frame(ACCOUNT=names(a), DT=a)
>   df <- merge(x=df, y=a, by.x=acc, by.y="ACCOUNT")
>   df <- cbind(df, TAKE=df[dt]==df$DT)
> df <- df[df$TAKE==TRUE,]}
> benchmark(
>   rep=ave.method(airquality, "Month", "Day"),
>   pat=merge.method(airquality, "Month", "Day"),
>   replications=1000,
>   order=c('replications', 'elapsed'))
> #-
>   test replications elapsed relative user.self sys.self user.child sys.child
> 1  rep         1000   2.523     1.00     2.512    0.018          0         0
> 2  pat         1000   7.847 3.110186     7.773    0.092          0         0
>
>
> It does give the same answers when tested on airquality, though. That says
> something for it I suppose. (Had you offered a sensible test dataset in your
> first posting , I would have offered a solution using your column names, but
> as it was I figured you should have been able to make the mappings.)
>
>
> --
> David.
>
>
>
>> Ali
>>
>
>
> David Winsemius, MD
> West Hartford, CT
>
>



Re: [R] JGR installation problem

2010-12-29 Thread Uwe Ligges

Do you have Java installed? If so, please ask on the JGR mailing list.

Uwe Ligges



On 29.12.2010 11:30, SNV Krishna wrote:

> Hi All,
>
> I am trying to install JGR GUI for R (windows xp) but facing the problem.
> The following error message is displayed when I click on JGR.exe
>
> "Cannot find Java/R Interface (JRI) library (jri.dll)
> Please make sure you start JGR by double clicking the JGR.exe program"
>
> I know this is R help forum, but trying to get help from experts who are
> using JGR.
>
> Any help or idea will be highly appreciated.
>
> thanks and regards,
>
> SNVK




Re: [R] Windows editor suggestions - autosave

2010-12-29 Thread Joshua Wiley
Hi,

Take a look at Emacs + ESS (http://ess.r-project.org/).  In Emacs you
can set up autosaves and control the time interval between them, and it
has a number of other wonderful features as a text editor.  For
example:

http://www.emacswiki.org/emacs/AutoSave

Cheers,

Josh

On Wed, Dec 29, 2010 at 8:32 AM, Michael Conklin
 wrote:
> I am looking for advice on an editor to use with R (windows) that has an 
> autosave feature.  I typically write scripts using the RGui (and tried TinnR 
> yesterday) but I am having continuing problems with BSODs (non R related) and 
> have in the past have had issues with R crashes and would really like a 
> system that does not require me to remember to hit the save button on my 
> script every 10 minutes so that I can avoid redoing everything.
>
> W. Michael Conklin
> Chief Methodologist
> Google Voice: (612) 56STATS
>
> MarketTools, Inc. | www.markettools.com
> 6465 Wayzata Blvd | Suite 170 |  St. Louis Park, MN 55426.  PHONE: 
> 952.417.4719 | CELL: 612.201.8978
> This email and attachment(s) may contain confidential and/or proprietary 
> information and is intended only for the intended addressee(s) or its 
> authorized agent(s). Any disclosure, printing, copying or use of such 
> information is strictly prohibited. If this email and/or attachment(s) were 
> received in error, please immediately notify the sender and delete all copies



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



[R] Windows editor suggestions - autosave

2010-12-29 Thread Michael Conklin
I am looking for advice on an editor to use with R (windows) that has an 
autosave feature.  I typically write scripts using the RGui (and tried TinnR 
yesterday) but I am having continuing problems with BSODs (non R related) and 
have in the past have had issues with R crashes and would really like a system 
that does not require me to remember to hit the save button on my script every 
10 minutes so that I can avoid redoing everything.

W. Michael Conklin
Chief Methodologist
Google Voice: (612) 56STATS

MarketTools, Inc. | www.markettools.com
6465 Wayzata Blvd | Suite 170 |  St. Louis Park, MN 55426.  PHONE: 952.417.4719 
| CELL: 612.201.8978   
This email and attachment(s) may contain confidential and/or proprietary 
information and is intended only for the intended addressee(s) or its 
authorized agent(s). Any disclosure, printing, copying or use of such 
information is strictly prohibited. If this email and/or attachment(s) were 
received in error, please immediately notify the sender and delete all copies



Re: [R] Removing rows with earlier dates

2010-12-29 Thread David Winsemius

On Dec 29, 2010, at 11:03 AM, Ali Salekfard wrote:

> David,
>
> Thanks alot. Your code is worked fine on the whole dataset (no  
> memory error as I had with the other ideas). I do like the style -  
> especialy the fact that it is all in one line - , but for large  
> datasets it takes longer than what I wrote. I ran it on the same  
> machine with the same set of rules of 144,643 your code takes 81.50  
> seconds.
>
> > a<-my.mapping[ with(my.mapping, DATE == ave( DATE,  
> ACCOUNT,FUN=max )), ]
>
>  Description Duration
> 1 Max.Date for Mappings   81.498
>
> I guess the running time of your algorithm is exponential to the  
> number of rows.

If the large database has a large number of columns there might be  
improvement from just using the necessary columns.

  a<-my.mapping[ with(my.mapping[ , c("DATE", "ACCOUNT")] , DATE ==  
ave( DATE, ACCOUNT,FUN=max )), ]

Or using subset.
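
A small self-contained sketch of the ave() idea, including a subset() variant.
The data frame is a hypothetical miniature of my.mapping, and the dates are
converted to numbers before calling ave(); that conversion is a choice made for
this sketch to keep the comparison purely numeric, not something the approach
requires.

```r
# Hypothetical miniature of my.mapping.
my.mapping <- data.frame(
  ACCOUNT = c("A", "A", "B", "B", "B"),
  RULE    = c(1, 2, 2, 3, 1),
  DATE    = as.Date(c("2010-01-01", "2009-01-01",
                      "2007-05-01", "2007-05-01", "2005-05-01")),
  stringsAsFactors = FALSE
)

# ave() returns, for every row, the maximum date within that row's account.
date_num <- as.numeric(my.mapping$DATE)
latest <- my.mapping[date_num == ave(date_num, my.mapping$ACCOUNT, FUN = max), ]

# The same filter written with subset().
latest2 <- subset(my.mapping,
                  as.numeric(DATE) == ave(as.numeric(DATE), ACCOUNT, FUN = max))

print(latest)
```

Rows that tie for the latest date within an account are all kept, as account B
shows here.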

It occurs to me that this may be applicable to a problem I have on my  
to-do list, so if I run into problems on my dataset, which is about 30  
times longer than yours, I will have a backup plan.

Best;
David.

>
> Ali
>
> On Wed, Dec 29, 2010 at 3:24 PM, David Winsemius  > wrote:
>
> On Dec 29, 2010, at 9:24 AM, Ali Salekfard wrote:
>
> Thanks to everyone. Joshua's response seemed the most concise one,  
> but it
> used up so much memory that my R just gave error. I checked the other
> replies and all in all I came up with this, and thought to share it  
> with
> others and get comments.
>
> My structure was as follows:
>
> ACCOUNT   RULE  DATE
> A1  2010-01-01
> A2  2007-05-01
> A2  2007-05-01
> A2  2005-05-01
> A2  2005-05-01
> A1  2009-01-01
>
> The most efficient solution I came across involves the following  
> steps:
>
> 1. Find the latest date for each account, and convert it to a data  
> frame:
>
> a<-tapply(my.mapping$DATE,my.mapping$ACCOUNT,max)
> a<-data.frame(ACCOUNT=names(a),DT=as.Date(a,"%Y-%m-%d"))
> 2. merge the set with the original data
>
> my.mapping<-merge(x=my.mapping,y=a,by.x="ACCOUNT",by.y="ACCOUNT")
>
> 3. Create a take column, which is to confirm if the date of the row  
> is the
> maximum date for the account.
> my.mapping<-cbind(my.mapping,TAKE=my.mapping$DATE==my.mapping$DT)
> 4. Filter out all lines except those with TAKE==TRUE.
>
> my.mapping<-my.mapping[my.mapping$TAKE==TRUE,]
> The running time for my whole list was 4.5 sec which is far better  
> than any
> other ways I tried. Let me have your thoughts on that.
>
> My first thought is that you should use more spaces in your code. It  
> looks quite a bit more complex than the method I suggested (and my  
> benchmark says mine was maybe 50% faster, but with Maechler's  
> improvements is now about 4 times faster. I guess I shouldn't throw  
> too many stones about coding style.)
>
> my.mapping[ with(my.mapping, DATE == ave( DATE,
>  ACCOUNT,
>  FUN=max ) ), ]
> #--
> require(rbenchmark)
> ave.method = function(df, acc, dt)
>   {df[with( df, dt == ave(dt, acc, FUN=max)), ]}
> merge.method = function(df, acc, dt) {
>   a<- tapply(df[[dt]], df[[acc]],max)
>   a  <- data.frame(ACCOUNT=names(a), DT=a)
>   df <- merge(x=df, y=a, by.x=acc, by.y="ACCOUNT")
>   df <- cbind(df, TAKE=df[dt]==df$DT)
> df <- df[df$TAKE==TRUE,]}
> benchmark(
>   rep=ave.method(airquality, "Month", "Day"),
>   pat=merge.method(airquality, "Month", "Day"),
>   replications=1000,
>   order=c('replications', 'elapsed'))
> #-
>   test replications elapsed relative user.self sys.self user.child sys.child
> 1  rep         1000   2.523     1.00     2.512    0.018          0         0
> 2  pat         1000   7.847 3.110186     7.773    0.092          0         0
>
>
> It does give the same answers when tested on airquality, though.  
> That says something for it I suppose. (Had you offered a sensible  
> test dataset in your first posting , I would have offered a solution  
> using your column names, but as it was I figured you should have  
> been able to make the mappings.)
>
>
> -- 
> David.
>
>
>
> Ali
>
>
> David Winsemius, MD
> West Hartford, CT
>
>

David Winsemius, MD
West Hartford, CT




[R] HELP for repeated measure ANCOVA with varying covariate

2010-12-29 Thread Dong Xie
Dear All,
I am a researcher studying plant growth, and I have a statistical
problem that I have not been able to solve. Recently, I conducted an
experiment on plants growing in sediments with three different nutrient
levels. I harvested plants every three weeks (three harvests in all) and
recorded several growth traits (e.g. total biomass, leaf biomass and
stem biomass). In addition, I found that total plant biomass may also
influence these traits, so I am going to use total biomass as a
covariate in the statistical analysis. Because this is a repeated
design, I guess I have to use a repeated-measures analysis (repeated
measures ANCOVA with a varying covariate) to compare the growth traits
between nutrient levels. I've tried the gls and lme functions in the
nlme library, with a correlation structure applied to the time series,
with great help from Dr. Yuanye Zhang. However, I still can't figure out
the "between-subject" and "within-subject" parts.

Do you know if R can do repeated measures ANCOVA with a varying
covariate? If yes, how?

Thanks for any help. Much appreciated.

Dong Xie




Re: [R] Referring to an object name from within a function

2010-12-29 Thread zerfetzen

Excellent, thanks, and sorry about the test(x) goof.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Referring-to-an-object-name-from-within-a-function-tp3167147p3167174.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] icon for an R package

2010-12-29 Thread David Winsemius


On Dec 29, 2010, at 10:03 AM, Michael Friendly wrote:

> I'm looking for an icon to represent an R package.  Perhaps something like
>
> http://cdn2.iconfinder.com/data/icons/DarkGlass_Reworked/128x128/apps/package.png
>
> but with the R logo rather than KDE.


Can't you just get the location of an "R" at CRAN?

http://cran.r-project.org/Rlogo.jpg





--

David Winsemius, MD
West Hartford, CT



Re: [R] Removing rows with earlier dates

2010-12-29 Thread David Winsemius


On Dec 29, 2010, at 9:24 AM, Ali Salekfard wrote:

> Thanks to everyone. Joshua's response seemed the most concise one, but it
> used up so much memory that my R just gave error. I checked the other
> replies and all in all I came up with this, and thought to share it with
> others and get comments.
>
> My structure was as follows:
>
> ACCOUNT   RULE  DATE
> A1  2010-01-01
> A2  2007-05-01
> A2  2007-05-01
> A2  2005-05-01
> A2  2005-05-01
> A1  2009-01-01
>
> The most efficient solution I came across involves the following steps:
>
> 1. Find the latest date for each account, and convert it to a data frame:
>
> a<-tapply(my.mapping$DATE,my.mapping$ACCOUNT,max)
> a<-data.frame(ACCOUNT=names(a),DT=as.Date(a,"%Y-%m-%d"))
>
> 2. merge the set with the original data
>
> my.mapping<-merge(x=my.mapping,y=a,by.x="ACCOUNT",by.y="ACCOUNT")
>
> 3. Create a take column, which is to confirm if the date of the row is the
> maximum date for the account.
>
> my.mapping<-cbind(my.mapping,TAKE=my.mapping$DATE==my.mapping$DT)
>
> 4. Filter out all lines except those with TAKE==TRUE.
>
> my.mapping<-my.mapping[my.mapping$TAKE==TRUE,]
>
> The running time for my whole list was 4.5 sec which is far better than any
> other ways I tried. Let me have your thoughts on that.


My first thought is that you should use more spaces in your code. It  
looks quite a bit more complex than the method I suggested (and my  
benchmark says mine was maybe 50% faster, but with Maechler's  
improvements is now about 4 times faster. I guess I shouldn't throw  
too many stones about coding style.)


my.mapping[ with(my.mapping, DATE == ave(DATE,
                                         ACCOUNT,
                                         FUN=max)), ]
#--
require(rbenchmark)
# acc and dt are column names passed as character strings
ave.method <- function(df, acc, dt) {
  df[df[[dt]] == ave(df[[dt]], df[[acc]], FUN = max), ]
}
merge.method <- function(df, acc, dt) {
  a  <- tapply(df[[dt]], df[[acc]], max)
  a  <- data.frame(ACCOUNT = names(a), DT = as.vector(a))
  df <- merge(x = df, y = a, by.x = acc, by.y = "ACCOUNT")
  df$TAKE <- df[[dt]] == df$DT
  df[df$TAKE, ]
}
benchmark(
   rep=ave.method(airquality, "Month", "Day"),
   pat=merge.method(airquality, "Month", "Day"),
   replications=1000,
   order=c('replications', 'elapsed'))
#-
  test replications elapsed relative user.self sys.self user.child sys.child
1  rep         1000   2.523 1.000000     2.512    0.018          0         0
2  pat         1000   7.847 3.110186     7.773    0.092          0         0



It does give the same answers when tested on airquality, though. That
says something for it, I suppose. (Had you offered a sensible test
dataset in your first posting, I would have offered a solution using
your column names, but as it was I figured you should have been able
to make the mappings.)
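
For readers who want to try the ave() approach with the column names from
the original post, here is a minimal sketch. The six-row data frame and its
RULE values are made up for illustration (the posted table lost its rule
column), and DATE is assumed to be of class Date.

```r
# Hypothetical mapping table; RULE values are placeholders, not real data.
my.mapping <- data.frame(
  ACCOUNT = c("A1", "A2", "A2", "A2", "A2", "A1"),
  RULE    = c("r1", "r2", "r3", "r4", "r5", "r6"),
  DATE    = as.Date(c("2010-01-01", "2007-05-01", "2007-05-01",
                      "2005-05-01", "2005-05-01", "2009-01-01")),
  stringsAsFactors = FALSE
)

# For every row, compute the latest date within that row's account.
# Dates are compared as day counts so ave() works on a plain numeric vector.
latest_date <- with(my.mapping, ave(as.numeric(DATE), ACCOUNT, FUN = max))

# Keep all rows that carry the latest date of their account (ties included).
latest <- my.mapping[as.numeric(my.mapping$DATE) == latest_date, ]
print(latest)
```

Note that FUN has to be passed by name: any unnamed argument after the
first is treated by ave() as a grouping variable.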



--
David.




Ali



David Winsemius, MD
West Hartford, CT



Re: [R] Problem applying Chi-square in R and Cochran's Recommendations

2010-12-29 Thread Johannes Huesing
Manoj Aravind  [Wed, Dec 29, 2010 at 03:59:16PM CET]:
> Sir,
> 
> I have a problem applying the chi-square test to the data below. I
> tested for significance using three different free statistical
> packages: R, EpiInfo and OpenEpi.
> 
> *Only OpenEpi accepts the test, based on Cochran's Recommendations.*
> R says "Chi-squared approximation may be incorrect."
> Does that mean the same as EpiInfo saying "Chi square is not valid"?

Yes. Take confidence from the fact that arithmetically all three
programs arrive at the same result (anything but surprising).
The recommendations when to trust Chi-Square are similar. R lets
you look at the source though, so if you type

> chisq.test

you get a result containing the following lines:

sr <- rowSums(x)
sc <- colSums(x)
E <- outer(sr, sc, "*")/n

(so E contains the expected values for the cell entries)

and

names(PARAMETER) <- "df"
if (any(E < 5) && is.finite(PARAMETER)) 
    warning("Chi-squared approximation may be incorrect")

so it seems that R, which warns as soon as any expected value is below 5,
is fussier about the quality of the approximation than OpenEpi:

[...]
> --
> Chi Square = 43.81   Degrees of Freedom = 2   p-value = <0.001
>
> Cochran recommends accepting the chi square if:
>   1. No more than 20% of cells have expected < 5.
>   2. No cell has an expected value < 1.
> In this table:
>   17% of 6 cells have expected values < 5.
>   No cells have expected values < 1.
> *Using these criteria, this chi square can be accepted.*
>
> Expected value = row total*column total/grand total
> Rosner, B. Fundamentals of Biostatistics. 5th ed. Duxbury Thompson
> Learning. 2000; p. 395
> 

Note that these are recommendations which you are free to heed or ignore.
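
The two rules can be compared directly on the table from this thread. The
sketch below recomputes the expected counts the same way chisq.test() does
and then applies both criteria; the variable names are mine, and the 20%
and 1 thresholds are simply those quoted in the OpenEpi output.

```r
# Observed counts as entered in the original post
std <- cbind(c(39, 22, 1), c(5, 1, 12))

# Expected counts: row total * column total / grand total
expected <- outer(rowSums(std), colSums(std)) / sum(std)
print(round(expected, 3))

# R's rule: warn if any expected count is below 5
r_warns <- any(expected < 5)

# Cochran's rule as quoted by OpenEpi: at most 20% of cells below 5,
# and no cell below 1
share_below_5 <- mean(expected < 5)
cochran_ok <- share_below_5 <= 0.20 && !any(expected < 1)

cat("R warns:", r_warns, "\n")
cat("Share of cells with expected < 5:", round(share_below_5, 2), "\n")
cat("Acceptable under Cochran's rule:", cochran_ok, "\n")
```

If the approximation is a concern, fisher.test(std) or
chisq.test(std, simulate.p.value = TRUE) are base R alternatives that do
not rely on it.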

-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:johan...@huesing.name  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, "Life on the Mississippi")



Re: [R] helps on upgrading R in Mac OS

2010-12-29 Thread Berend Hasselman


Mao Jianfeng wrote:
> 
> Dear R-helpers,
> 
> I intend to upgrade R in Mac OS with updated R version and updated Mac
> OS version.
> 
> I think my Mac notebook is produced with Mac x86_64, darwin9.8.0. I
> have updated my Mac OS to  Mac OS X version 10.6.5. But, when I
> installed R 2.12.1, the "version" function still gave me information
> that R is based on old Mac OS. I need to know how can I update R to
> let it to fit for the updated Mac OS.
> 
> Could you please give me any direction on that? Thanks in advance.
> 
>> version
>_
> platform   x86_64-apple-darwin9.8.0
> arch   x86_64
> os darwin9.8.0
> system x86_64, darwin9.8.0
> status
> major  2
> minor  12.1
> year   2010
> month  12
> day16
> svn rev53855
> language   R
> version.string R version 2.12.1 (2010-12-16)
>> sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
> 
> Jian-Feng, Mao
> 


This query belongs to the R-SIG-Mac list.
I don't think you should worry too much.
R is reporting the system it was built under.
The version you are using is for Leopard and higher.
Same as what I'm using on 10.6.5.

As far as I can see you are using the correct version for 10.6.5

See ?version, which says: "Do not use R.version$os to test the platform
the code is running on".
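
A small sketch of the difference, assuming only base R: R.version
describes the system R was built on, while .Platform and Sys.info()
describe the system it is running on.

```r
# Build-time information: fixed when the R binary was compiled
build_os <- R.version$os
cat("Built for:", build_os, "\n")

# Run-time test recommended by the help page: "unix" or "windows"
os_type <- .Platform$OS.type
cat("OS type:", os_type, "\n")

# Details of the running system (may be unavailable on some platforms)
info <- Sys.info()
if (!is.null(info)) {
  cat("Running on:", info[["sysname"]], info[["release"]], "\n")
}
```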

Berend

-- 
View this message in context: 
http://r.789695.n4.nabble.com/helps-on-upgrading-R-in-Mac-OS-tp3167005p3167230.html
Sent from the R help mailing list archive at Nabble.com.



[R] icon for an R package

2010-12-29 Thread Michael Friendly

I'm looking for an icon to represent an R package.  Perhaps something like

http://cdn2.iconfinder.com/data/icons/DarkGlass_Reworked/128x128/apps/package.png

but with the R logo rather than KDE.


--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



[R] Problem applying Chi-square in R and Cochran's Recommendations

2010-12-29 Thread Manoj Aravind
Sir,

I have a problem applying the chi-square test to the data below. I
tested for significance using three different free statistical
packages: R, EpiInfo and OpenEpi.

*Only OpenEpi accepts the test, based on Cochran's Recommendations.*
R says "Chi-squared approximation may be incorrect."
Does that mean the same as EpiInfo saying "Chi square is not valid"?

Regards,
Dr. B. Manoj Aravind.

*Table for analysis*

                          Number of STDs Identified
Type of Health worker        <2      2 or >

ASHA                         39         5
AWW                          22         1
ANM                           1        12
..
*In R the ouput was like this*

> std<- cbind(c(39,22,1),c(5,1,12))
> std
     [,1] [,2]
[1,]   39    5
[2,]   22    1
[3,]    1   12
> chisq.test(std)

Pearson's Chi-squared test

data:  std
X-squared = 43.8055, df = 2, p-value = 3.074e-10

Warning message:
In chisq.test(std) : *Chi-squared approximation may be incorrect*

In EpiInfo the output was this

Analysis of Single Table
An expected cell is <5.  *Chi square is not valid.*
  Chi square = 43.81
  2 degrees of freedom
  P<0. <--
.
In OpenEpi the output was like this

Single Table Analysis

             Var 2
Var 1      39     5    44
           22     1    23
            1    12    13
           62    18    80

Chi Square for R by C Table
--
Chi Square = 43.81   Degrees of Freedom = 2   p-value = <0.001

Cochran recommends accepting the chi square if:
  1. No more than 20% of cells have expected < 5.
  2. No cell has an expected value < 1.
In this table:
  17% of 6 cells have expected values < 5.
  No cells have expected values < 1.
*Using these criteria, this chi square can be accepted.*

Expected value = row total*column total/grand total
Rosner, B. Fundamentals of Biostatistics. 5th ed. Duxbury Thompson
Learning. 2000; p. 395



Re: [R] Problem applying McNemar's - Different values in SPSS and R

2010-12-29 Thread Johannes Huesing
Marc Schwartz  [Wed, Dec 29, 2010 at 03:28:56PM CET]:
> 
> On Dec 29, 2010, at 6:48 AM, Manoj Aravind wrote:
> 
> > Thank you Marc :)
> > It Certainly helped me to get the exact value of P. 
> > How to understand when to apply mcnemar.exact or just mcnemar.test?
[...]
> 

> Generally speaking, exact tests are used for "small-ish" sample
> sizes. Frequently when n <100 and in many cases, much lower (eg. <50
> or <30). The methods tend to become computationally impractical on
> "larger" data sets.

Sorry for chiming in again here, but binomial tests are computationally
cheap:

> system.time(binom.test(48000, 10))
   user  system elapsed 
  0.072   0.000   0.077 

You are certainly correct on Fisher's Exact Test with larger tables
or Wilcoxon's Signed Rank test.
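
To make the connection concrete: the exact McNemar test only looks at the
discordant pairs, so it amounts to a two-sided binomial test with success
probability 0.5. The sketch below uses the 2x2 table from this thread; the
variable names are mine.

```r
tvshsc <- matrix(c(40, 11, 3, 7), nrow = 2,
                 dimnames = list(TVS = c("ABN", "NE"),
                                 HSC = c("ABN", "NE")))

# Discordant cells: the two off-diagonal counts
n_b <- tvshsc["ABN", "NE"]   # 3
n_c <- tvshsc["NE", "ABN"]   # 11

# Exact test: binomial test on the discordant pairs
exact_p <- binom.test(n_b, n_b + n_c, p = 0.5)$p.value

# Asymptotic test with continuity correction (the default in mcnemar.test)
asymptotic_p <- mcnemar.test(tvshsc)$p.value

cat("exact p-value:     ", signif(exact_p, 4), "\n")
cat("asymptotic p-value:", signif(asymptotic_p, 4), "\n")
```

The exact p-value agrees with the mcnemar.exact() output and the SPSS
value of .057 reported in the thread; the asymptotic one is the 0.06137
from mcnemar.test().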

[...]
> One exception to the above comment, is the use of Fisher's Exact Test (FET), 
> which is typically advocated by folks as an alternative to a chi-square test 
> when **expected** cell counts are <5. However, much has been written in 
> recent times relative to just how conservative the FET is. One resource is:
> 
>   http://www.iancampbell.co.uk/twobytwo/twobytwo.htm
> 

That's only because people shy away from the randomized version :-)

-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:johan...@huesing.name  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, "Life on the Mississippi")



[R] RGtk2 compilation problem

2010-12-29 Thread Shige Song
Dear All,

I am trying to compile&install the package "RGtk2" on my Ubuntu 10.04
box. I did not have problem with earlier versions, but with the new
version, I got the following error message :
-
* installing *source* package ‘RGtk2’ ...
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
checking for INTROSPECTION... no
checking for GTK... yes
checking for GTHREAD... yes
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for uintptr_t... yes
configure: creating ./config.status
config.status: creating src/Makevars
** libs
gcc -std=gnu99 -I/usr/local/lib/R/include -g -D_R_=1 -pthread
-D_REENTRANT -I/usr/include/gtk-2.0 -I/usr/lib/gtk-2.0/include
-I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pango-1.0
-I/usr/include/gio-unix-2.0/ -I/usr/include/glib-2.0
-I/usr/lib/glib-2.0/include -I/usr/include/pixman-1
-I/usr/include/freetype2 -I/usr/include/directfb
-I/usr/include/libpng12   -I.  -DHAVE_UINTPTR_T  -I/usr/local/include
  -fpic  -g -O2 -c RGtkDataFrame.c -o RGtkDataFrame.o
In file included from RGtk2/gtk.h:19,
 from RGtkDataFrame.h:1,
 from RGtkDataFrame.c:1:
./RGtk2/gdkClasses.h:4:23: error: RGtk2/gdk.h: No such file or directory
make: *** [RGtkDataFrame.o] Error 1
ERROR: compilation failed for package ‘RGtk2’
* removing ‘/usr/local/lib/R/library/RGtk2’
* restoring previous ‘/usr/local/lib/R/library/RGtk2’

The downloaded packages are in
‘/tmp/RtmprSWbka/downloaded_packages’
Updating HTML index of packages in '.Library'
Warning message:
In install.packages("RGtk2", dep = T) :
  installation of package 'RGtk2' had non-zero exit status


I noticed the requirement for the package
(http://cran.r-project.org/web/packages/RGtk2/index.html) saying
"...GTK+ (>= 2.8.0)..." The latest GTK+ is 2.20, could this be the
problem?

Many thanks.

Best,
Shige



Re: [R] Removing rows with earlier dates

2010-12-29 Thread Martin Maechler
> David Winsemius 
> on Fri, 24 Dec 2010 11:47:05 -0500 writes:

> On Dec 24, 2010, at 11:04 AM, David Winsemius wrote:

>> 
>> On Dec 24, 2010, at 8:45 AM, Ali Salekfard wrote:
>> 
>>> Hi all,
>>> 
>>> I'm new to the list but have benfited from it quite extensively.  
>>> Straight to
>>> my rather strange question:
>>> 
>>> I have a data frame that contains mapping rules in this way:
>>> 
>>> ACCOUNT, RULE COLUMNS, Effective Date
>>> 
>>> 
>>> The dataframe comes from a database that stores all dates. What I  
>>> would like
>>> to do is to create a data frame with only the most recent rule for  
>>> each
>>> account. In traditional programming languages I would loop through  
>>> each
>>> account find the most recent rule(s) and fill up my updated data  
>>> frame.
>>> 
>>> Does anyone have any better idea to use R's magic (Its syntax is  
>>> still
>>> magical to me) for this problem?
>> 
>> It's going to remain magic until you start thinking about what is  
>> needed. In this case the need is for a good understanding of the  
>> structure of the data object and the str function is the usual way  
>> to examine such AND to then communicate with the list. Read the  
>> Posting Guide again and the references it cites, please.
>> 
>>> 
>> 
>> Here would have been my first attempt, assuming a dataframe named  
>> dfrm:
>> #make sure the most recent is on top
>> dfrm <- dfrm[ order(dfrm["Effective Date"], decreasing=TRUE), ]
>> # then pull the first record within ACCOUNT
>> tapply(dfrm, dfrm$ACCOUNT , FUN= "[", 1 , )
>> 
>> 
>>> By the way  the list of rules is quite extensive (144643 lines to be
>>> precise), and there are usually 1-3 most recent rules (rows) for each
>>> account.
>> 
>> That is a bit different than the initial problem statement in which  
>> you asked for the "only the most recent" within each account. How  
>> are we supposed to get 3 _most_ recent rules? I think you are  
>> expecting us to read your mind regarding how you are thinking about  
>> this problem and pull all the records with the maximum date within  
>> an account.
>> 
>> Perhaps this effort to create a logical vector would be in the right  
>> direction:
>> 
>> dfrm[ ave(dfrm["Effective Date"], dfrm[ , "ACCOUNT"], function(x) x  
>> == max(x), ]
>> 
>> It should pull all records for which the Effective Date is equal to  
>> the maximum within ACCOUNT. It is going to depend on whether  
>> "Effective Date" of of a class that can be properly compared with  
>> max(). Both Date and character representations of dates in standard  
>> y-m-d form would qualify. Other date formats might not:
>> > max("01-02-2011", "02-01-2010")
>> [1] "02-01-2010"
>> 

> When I used the strategy on the airquality dataset I do not get the  
> results I expected, but a modification did succeed:

>> airquality[ airquality$Day == ave(airquality$Day, airquality$Month,  
> FUN=function(x){ max(x)} ), ]
>     Ozone Solar.R Wind Temp Month Day
> 31     37     279  7.4   76     5  31
> 61     NA     138  8.0   83     6  30
> 92     59     254  9.2   81     7  31
> 123    85     188  6.3   94     8  31
> 153    20     223 11.5   68     9  30

Hmm, yes, but   " FUN = function(x) { max(x) } "
is so ugly that it hurts my R-eyes.
Just use  'FUN = max'  .. please ..

and as we are in making things more readable,
I'd like to propose using with() in these cases -->

 > airquality[with(airquality, Day == ave(Day, Month, FUN=max)),]

     Ozone Solar.R Wind Temp Month Day
 31     37     279  7.4   76     5  31
 61     NA     138  8.0   83     6  30
 92     59     254  9.2   81     7  31
 123    85     188  6.3   94     8  31
 153    20     223 11.5   68     9  30


Regards,
Martin Maechler, ETH Zurich



> I do suspect it requires that the dataframe be sorted to get the  
> joint  conditions lined up correctly. The earlier method should have  
> used an as.logical() wrapper and would then not have needed pre- 
> sorting the dataframe, so try instead:

> dfrm[ as.logical(ave(dfrm[["Effective Date"]], dfrm[ , "ACCOUNT"],
>                      FUN = function(x) x == max(x))), ]


>> 
>> 
>> -- 
>> David Winsemius, MD
>> West Hartford, CT



Re: [R] Referring to an object name from within a function

2010-12-29 Thread David Winsemius


On Dec 29, 2010, at 9:18 AM, zerfetzen wrote:



> Can anyone show me how to refer to an object name that is passed to a
> function, from within the function?

deparse(substitute(x))

> For example:
>
> MyModel <- 1
>
> test <- function(x) {
>   if(x == 1) {cat("x is a valid object.\n")}
> }
>
> test(x)

Well, you don't want to test(x), since x has not been defined. You want
to test(MyModel):

MyModel <- 1

test <- function(x) {
  xname <- deparse(substitute(x))
  if (x == 1) {cat(xname, "is a valid object.\n")}
}

test(MyModel)
# MyModel is a valid object.
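
One caveat worth a short sketch: substitute() reports the expression used
in the immediate call, so a wrapper function changes what is seen. The
function names describe() and wrapper() are made up for this illustration.

```r
describe <- function(x) {
  # Capture the expression supplied by the caller for argument x
  xname <- deparse(substitute(x))
  if (isTRUE(x == 1)) {
    paste(xname, "is a valid object.")
  } else {
    paste(xname, "failed the check.")
  }
}

MyModel <- 1
msg <- describe(MyModel)
print(msg)

# Called through a wrapper, describe() sees the wrapper's argument name
wrapper <- function(obj) describe(obj)
wrapped_msg <- wrapper(MyModel)
print(wrapped_msg)
```

If the original name is needed inside a wrapper, it has to be captured in
the wrapper itself and passed along as a character string.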


> What I would like this to do is pass MyModel to function test, and if it
> passes a test, be able to print "MyModel is a valid object."
>
> Thanks.



David Winsemius, MD
West Hartford, CT



Re: [R] Problem applying McNemar's - Different values in SPSS and R

2010-12-29 Thread Marc Schwartz

On Dec 29, 2010, at 6:48 AM, Manoj Aravind wrote:

> Thank you Marc :)
> It Certainly helped me to get the exact value of P. 
> How to understand when to apply mcnemar.exact or just mcnemar.test?
> I'm a beginner to biostatistics.
> 
> Manoj Aravind


Generally speaking, exact tests are used for "small-ish" sample sizes. 
Frequently when n <100 and in many cases, much lower (eg. <50 or <30). The 
methods tend to become computationally impractical on "larger" data sets.

Since you are coming from SPSS, you might find this document helpful in 
providing a general framework:

  
http://support.spss.com/productsext/spss/documentation/spssforwindows/otherdocs/SPSS%20Exact%20Tests%207.0.pdf

The document is written by Mehta and Patel of Cytel/StatXact, who are 
historical advocates of the techniques.

That being said and as I noted in my reply to Johannes, I am not typically 
involved in situations where exact tests make sense, thus am probably not the 
best resource. I would steer you towards using various reference texts on 
analyzing categorical data (eg. Agresti) for more information. 

One exception to the above comment, is the use of Fisher's Exact Test (FET), 
which is typically advocated by folks as an alternative to a chi-square test 
when **expected** cell counts are <5. However, much has been written in recent 
times relative to just how conservative the FET is. One resource is:

  http://www.iancampbell.co.uk/twobytwo/twobytwo.htm

Another reference is:

How conservative is Fisher's exact test?
A quantitative evaluation of the two-sample comparative binomial trial
Gerald G. Crans, Jonathan J. Shuster 
Stat Med. 2008 Aug 15;27(18):3598-611.
http://onlinelibrary.wiley.com/doi/10.1002/sim.3221/abstract


So you might want to consider those resources as arguments against using the 
FET under situations that are likely more commonly observed in day to day 
practice.

HTH,

Marc



Re: [R] Removing rows with earlier dates

2010-12-29 Thread Ali Salekfard
Thanks to everyone. Joshua's response seemed the most concise one, but it
used up so much memory that my R just gave error. I checked the other
replies and all in all I came up with this, and thought to share it with
others and get comments.

My structure was as follows:

ACCOUNT   RULE  DATE
A1  2010-01-01
A2  2007-05-01
A2  2007-05-01
A2  2005-05-01
A2  2005-05-01
A1  2009-01-01

The most efficient solution I came across involves the following steps:

1. Find the latest date for each account, and convert it to a data frame:

a<-tapply(my.mapping$DATE,my.mapping$ACCOUNT,max)
a<-data.frame(ACCOUNT=names(a),DT=as.Date(a,"%Y-%m-%d"))

2. merge the set with the original data

my.mapping<-merge(x=my.mapping,y=a,by.x="ACCOUNT",by.y="ACCOUNT")

3. Create a take column, which is to confirm if the date of the row is the
maximum date for the account.

my.mapping<-cbind(my.mapping,TAKE=my.mapping$DATE==my.mapping$DT)

4. Filter out all lines except those with TAKE==TRUE.

my.mapping<-my.mapping[my.mapping$TAKE==TRUE,]

The running time for my whole list was 4.5 sec which is far better than any
other ways I tried. Let me have your thoughts on that.

Ali



[R] Panel Data Analysis in R

2010-12-29 Thread taby gathoni
Dear All,

Can anyone provide me with reference notes(or steps) towards analysis of  
(un)balanced panel data in R.


Thank you!







Kind regards,
  Tabitha Mundia ,  Project Management  Office,   Equity Bank 
Limited,P.O. Box 75104-00200
   Head Office, Upper  hill,   NHIF BLDG, 14th  Floor,   Nairobi, Kenya 
 Direct Extension : +254732112721  Mobile: +254722309538  
 Email: tabitha.mun...@equitybank.co.keskype ID: twamaeYahoo ID: 
tabygath...@yahoo.com


An idea not coupled with action will never get any bigger than the brain cell 
it occupied.
Arnold Glasgow
..
"Attempt something large enough that failure is guaranteed…unless God steps 
in!"


--- On Wed, 12/29/10, Manoj Aravind  wrote:

From: Manoj Aravind 
Subject: Re: [R] Problem applying McNemar's - Different values in SPSS and R
To: "Marc Schwartz" 
Cc: r-help@r-project.org
Date: Wednesday, December 29, 2010, 2:48 PM

Thank you Marc :)
It Certainly helped me to get the exact value of P.
How to understand when to apply mcnemar.exact or just mcnemar.test?
I'm a beginner to biostatistics.

Manoj Aravind

On Tue, Dec 28, 2010 at 11:00 PM, Marc Schwartz wrote:

>
> On Dec 28, 2010, at 11:05 AM, Manoj Aravind wrote:
>
> > Hi friends,
> > I get different values for McNemar's test in R and SPSS. Which one should I
> > rely on when the p values differ?
> > I came across this problem when I started learning R and seriously give up
> > on SPSS or any other proprietary software.
> > Thank you in advance
> >
> > Output in SPSS follows
> >
> > *Crosstab*
> >
> >                          hsc
> >                      ABN       NE        Total
> > tvs    ABN   Count   40        3         43
> >              Row %   93.0%     7.0%      100.0%
> >              COL%    78.4%     30.0%     70.5%
> >        NE    Count   11        7         18
> >              Row %   61.1%     38.9%     100.0%
> >              COL%    21.6%     70.0%     29.5%
> > Total        Count   51        10        61
> >              Row %   83.6%     16.4%     100.0%
> >              COL%    100.0%    100.0%    100.0%
> >
> > *Chi-Square Tests*
> >
> >                      Value     Exact Sig. (2-sided)
> > McNemar Test                   .057(a)
> > N of Valid Cases     61
> >
> >   a Binomial distribution used.
> >
> >
> > Output from R is as follows
> >
> >> tvshsc<-
> > + matrix(c(40,11,3,7),
> > + nrow=2,
> > + dimnames=list("TVS"=c("ABN","NE"),
> > + "HSC"=c("ABN","NE")))
> >> tvshsc
> >      HSC
> > TVS   ABN NE
> >   ABN  40  3
> >   NE   11  7
> >> mcnemar.test(tvshsc)
> >
> >         McNemar's Chi-squared test with continuity correction
> >
> > data:  tvshsc
> > McNemar's chi-squared = 3.5, df = 1, p-value = 0.06137
> >
> >
> > Regards
> >
> > Dr. B Manoj Aravind
>
>
> The SPSS test appears to be an exact test, whereas the default R function
> does not perform an exact test, so you are not comparing Apples to Apples...
>
> Try this using the 'exact2x2' CRAN package:
>
> > require(exact2x2)
> Loading required package: exact2x2
> Loading required package: exactci
>
> > mcnemar.exact(matrix(c(40, 11, 3, 7), 2, 2))
>
>        Exact McNemar test (with central confidence intervals)
>
> data:  matrix(c(40, 11, 3, 7), 2, 2)
> b = 3, c = 11, p-value = 0.05737
> alternative hypothesis: true odds ratio is not equal to 1
> 95 percent confidence interval:
>  0.04885492 1.03241985
> sample estimates:
> odds ratio
>  0.2727273
>
>
> HTH,
>
> Marc Schwartz
>
>



[R] Referring to an object name from within a function

2010-12-29 Thread zerfetzen

Can anyone show me how to refer to an object name that is passed to a
function, from within the function?

For example:

MyModel <- 1

test <- function(x) {
 if(x == 1) {cat("x is a valid object.\n")}
}

test(x)

What I would like this to do is pass MyModel to function test, and if it
passes a test, be able to print "MyModel is a valid object."

Thanks.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Referring-to-an-object-name-from-within-a-function-tp3167147p3167147.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Any functions to manipulate (merge, cut, remove) hclust objects? (maybe through phylo?)

2010-12-29 Thread Martin Maechler
> Tal Galili 
> on Wed, 29 Dec 2010 14:08:26 +0200 writes:

> Hello Martin,
> Thank you for the reference to the "cut" option in the dendrogram help page!
> I guess I was too focused on finding a solution for hclust objects
> to consider that such a method existed for dendrograms.

> The cut.dendrogram  doesn't solve my problem yet, since what I'm looking for
> is the output of something like:
> cutree(hc.object, k = 3)

> which is a vector indicating to which cluster each item belongs.

indeed; and that's only indirectly the result of  a cut(*, h= .) call.

BTW: cutree() internally translates a 
 'h = *' specification into a  'k = *' one.
  ...
  ...
  which is actually a bit peculiar, as a cut at a given height is well-defined, 
  but a cut into a given number of clusters may *NOT* be well
  defined in the case where two sub branches have the exact same
  height 'h'; such that going from  h  to  'h - eps'  leads to
  addition of *two* new clusters, i.e., a step  k --> k+2  
  such that cutree(*, k+1) is not really well defined.
  The cutree() internal algorithm will use the (somewhat)
  arbitrary order of the merges to define the grouping.
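
A small sketch of the tie situation described above, using made-up one-dimensional data (the values are an assumption chosen so that two merges happen at exactly the same height):

```r
# Toy data: the pairs (0, 1) and (10, 11) both merge at height 1
hc <- hclust(dist(c(0, 1, 10, 11, 30)))

by.h.low  <- cutree(hc, h = 0.5)  # below the tied height: every point on its own
by.h.tie  <- cutree(hc, h = 1)    # at the tied height: both pairs are merged
by.k.four <- cutree(hc, k = 4)    # "in between": relies on the merge order

length(unique(by.h.low))   # number of clusters just below the tie
length(unique(by.h.tie))   # number of clusters at the tie
by.k.four
```

Cutting by height moves straight from 5 to 3 clusters here, so the k = 4 grouping depends only on which of the two tied merges happens to be listed first.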

Given all the above, I now tend to think that yes, indeed,
it may be most fruitful to provide
a  as.hclust.dendrogram() method, rather than just implementing
a cut() - based cutree method for dendrograms.
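
For readers on a current R: as far as I know, later releases of the stats package (after this thread) do provide an as.hclust() method for dendrograms, so the round trip can be sketched as below. The data set and the choice k = 3 are assumptions for illustration only.

```r
hc   <- hclust(dist(USArrests[1:10, ]))
dend <- as.dendrogram(hc)

# Convert back to "hclust" so that cutree() can be applied
hc2    <- as.hclust(dend)
groups <- cutree(hc2, k = 3)
groups
```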

> And for some reason I can't seem to understand the structure of the
> dendrogram object using "str".

Yes;  there's a str.dendrogram() method which very nicely
shows the structure of a dendrogram, 
however, if you really want to see the internal structure, you need
  str(unclass( . ))
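
A minimal sketch of the difference, on an assumed example data set:

```r
hc   <- hclust(dist(USArrests[1:6, ]))
dend <- as.dendrogram(hc)

str(dend)             # compact tree view from the str() method for dendrograms
inner <- unclass(dend)
str(inner)            # raw nested list with "members", "midpoint", "height" attributes
```

A binary dendrogram is, underneath, a list of two sub-branches, each of which is again such a list (or a leaf), with the bookkeeping stored in attributes.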

> But I'll read some more and write back if I can't solve it.

> p.s: If I'll succeed in writing something useful, it will be
> my pleasure and honor to contribute it back to the r-project :)

Cool.
Actually, now I think the merge() is the much easier part than
the cutree() / as.hclust.dendrogram() one.
But also that should not be very hard.

As I'm officially in vacation at the moment, I may have some fun
helping with these...

Martin





> On Wed, Dec 29, 2010 at 1:49 PM, Martin Maechler 
> wrote:

>> > Tal Galili 
>> > on Wed, 29 Dec 2010 13:32:26 +0200 writes:
>> 
>> > Hello Martin,
>> > Thank you for replying.
>> 
>> > I have two needs:
>> 
>> > 1) To merge two dendrograms into one.
>> 
>> > 2) To then run cutree on it (which works on hclust, but
>> >not on dendrogram).
>> 
>> Well, but cut() does and is prominently mentioned on the
>> dendrogram help page (and its examples)
>> 
>> > I guess that if I knew how to perform both steps I would be able to do
>> what
>> > I'm trying to do on my data.
>> > If nothing like this currently exists, I guess I'll simply implement a
>> > method of cutree for a dendrogram, and see how to merge two
>> dendrograms
>> > together.
>> 
>> so you only need to program the merge / join part.
>> 
>> I did not take the time to understand what exactly you mean with
>> that, but as there is no function to do that with "hclust" either,
>> I'm convinced you should rather write one for "dendrogram"
>> indeed; as merge() is already "S3 generic", I'd call it
>> merge.dendrogram()
>> 
>> If you end up finding it useful and are willing to write a help
>> page (including examples!) for it, you may consider donating it
>> back to the R-project ... ;-)
>> 
>> Regards, Martin
>>



Re: [R] Problem applying McNemar's - Different values in SPSS and R

2010-12-29 Thread Marc Schwartz

On Dec 28, 2010, at 4:13 PM, Johannes Huesing wrote:

> Marc Schwartz  [Tue, Dec 28, 2010 at 07:14:49PM CET]:
> [...]
>>> An old question of mine: Is there any reason not to use binom.test()
>>> other than historical reasons?
>> 
> 
> (I meant "in lieu of the McNemar approximation", sorry if some
> misunderstanding ensued).


After I posted, I had a thought that this might be the case. Apologies for the 
digression then.


>> I may be missing the context of your question, but I frequently see
>> exact binomial tests being used when one is comparing the
>> presumptively known probability of some dichotomous characteristic
>> versus that which is observed in an independent sample. For example,
>> in single arm studies where one is comparing an observed event rate
>> against a point estimate for a presumptive historical control.
> 
> In the McNemar context (as used by SPSS) the null hypothesis is p=0.5.


Yes, from what I can tell from a brief Google search, it appears that there are 
some software packages offering an exact variant of McNemar's, that will 
automatically shift to performing an exact binomial test if the sample size is 
say, <25.

I rarely use exact tests in general practice (I am not typically involved with 
"smallish" data sets), so do not come across this situation frequently. That 
being said, back to your original query, if one is using these techniques, one 
might find that the exact binomial test is actually being used as noted and 
therefore should be aware of the documentation for the package, especially if 
the results that are output are not clear on the effective shift in methodology.

So historical issues notwithstanding, the functional equivalent of 
binom.test() is used elsewhere in current practice under certain conditions.
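
To make the equivalence concrete, here is a small sketch using the 2x2 table from earlier in this thread: the exact McNemar p-value is the two-sided exact binomial test on the discordant pairs with null probability 0.5. The variable names are assumptions for the example.

```r
tvshsc <- matrix(c(40, 11, 3, 7), nrow = 2,
                 dimnames = list(TVS = c("ABN", "NE"), HSC = c("ABN", "NE")))

n.b <- tvshsc["ABN", "NE"]   # discordant pairs, one direction
n.c <- tvshsc["NE", "ABN"]   # discordant pairs, other direction

# Exact test: binomial on the discordant pairs only, H0: p = 0.5
p.exact  <- binom.test(n.b, n.b + n.c, p = 0.5)$p.value

# Default R test: chi-squared approximation with continuity correction
p.approx <- mcnemar.test(tvshsc)$p.value

c(exact = p.exact, approx = unname(p.approx))
```

The exact value agrees with the p-value reported by mcnemar.exact() and by SPSS above, while the approximation reproduces the mcnemar.test() output.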



>> I also see the use of exact binomial (Clopper-Pearson) confidence
>> intervals being used when one wants to have conservative CI's, given
>> that the nominal coverage of these are at least as large as
>> requested. That is, 95% exact CI's will be at least that large, but
>> in reality can tend to be well above that, depending upon various
>> factors. This is well documented in various papers.
> 
> Confidence intervals are not that regularly used in the McNemar context, as
> the conditional probability "a > b given they are unequal" is not that much
> an interpretable quantity as is the event probability in a single arm study.
>
>> I generally tend to use Wilson CI's for binomial proportions when
>> reporting analyses. I have my own code but these are implemented in
>> various R functions, including Frank's binconf() in Hmisc.
> 
> Thanks for the hint.
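
For comparison, a short sketch of both intervals using base R only; the counts x and n are made up for illustration. binom.test() returns the Clopper-Pearson interval, and prop.test() without continuity correction returns the Wilson score interval.

```r
x <- 7    # assumed number of events
n <- 18   # assumed sample size

cp     <- binom.test(x, n)$conf.int                # exact (Clopper-Pearson)
wilson <- prop.test(x, n, correct = FALSE)$conf.int  # Wilson score

rbind(clopper.pearson = as.numeric(cp), wilson = as.numeric(wilson))
```

With these counts the exact interval is the wider of the two, in line with the conservativeness discussed above.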


Happy to help.

Regards,

Marc



[R] Simulating data and imputation

2010-12-29 Thread Sarah

Hi,

I wrote a script in order to simulate data, which I will use for evaluating
missing data and imputation. However, I'm having trouble with the last part
of my script, in which a dataframe is constructed without missing values. 

This is my script:
y1 <- rnorm(10,0,3)
y2 <- rnorm(10,3,3)
y3 <- rnorm(10,3,3)
y4 <- rnorm(10,6,3)
y <- c(y1,y2,y3,y4)
a1 <-rep(1,20)
a2 <-rep(2,20)
a <- c(a1,a2)
b1 <- gl(2,10,20)
b2 <- gl(2,10,20)
b <- c(b1,b2)
x1 <- 1+2*y1+ rnorm(10,0,8)
x2 <- 1+2*y2+ rnorm(10,0,8)
x3 <- 1+2*y3+ rnorm(10,0,8)
x4 <- 1+2*y4+ rnorm(10,0,8)
x <- c(x1,x2,x3,x4)
#Create missing data dependent on factor A:
mar.y <- rep(NA,40)
df <- data.frame(y=y, mar.y=mar.y, a=a, b=b, x=x)
for (j in 1:40)
{
# Create missingness at random dependent on A:
df$mar.y[which(df$a==1)] <- replicate(length(which(df$a==1)),
rbinom(1,1,0.20))
df$mar.y[which(df$a==2)] <- replicate(length(which(df$a==2)),
rbinom(1,1,0.10))
}
if (length(which(df$mar.y==0))>34) { 
df <- df[sample(which(df$mar.y==0),34), ]
 } else {
 df <- df[c(which(df$mar.y==0),
sample(which(df$mar.y==1),34-length(which(df$mar.y==0)))), ]
}
 
(I would like the total number of randomly removed values to be 15% of the
total sample size, which in this case are 6 values. In other scripts I'm
using different values.)
 
At this point, I would like to impute missing values. However, my dataframe
only contains the 34 'observed' values (which seemed okay in the beginning
of my study). Now, I would like my dataframe to contain 34 observed values
(y=0) AND the 6 'missing' or deleted values (y=1). Unfortunately, the
missing values are deleted from the data set with 'sample', so imputation is
not possible at the moment (i.e., there are no NA's to impute)
Does anyone know how to rewrite the last bit of the script
(if...else...-part), in order to keep the 6 'deleted/missing' values in the
data set, and give them a value mar.y=1 (or NA, or any other value),
together with the 34 'observed ones' (mar.y=0)? In this way, I can impute
the missing values in my data set.
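
One possible direction, sketched under assumptions and not taken from a reply on the list: keep all 40 rows, pick exactly 6 rows to mark as missing with selection weights that depend on a, and store the observed outcome in a separate column with NA for the marked rows. Note that this swaps the independent Bernoulli draws of the script above for weighted sampling without replacement, so that the number of missing values is fixed. The simplified data frame, the seed, and the column name y.obs are assumptions.

```r
set.seed(1)                      # assumed seed, for reproducibility only
n  <- 40
df <- data.frame(y = rnorm(n), a = rep(1:2, each = 20))

n.miss <- 6                                  # 15% of 40
w      <- ifelse(df$a == 1, 0.20, 0.10)      # missingness weight depends on a
drop   <- sample(n, n.miss, prob = w)        # rows to mark as missing

df$mar.y       <- 0
df$mar.y[drop] <- 1                          # 1 = missing, 0 = observed
df$y.obs       <- ifelse(df$mar.y == 1, NA, df$y)   # outcome with NA's to impute
```

The complete y column is kept alongside y.obs, which makes it easy to compare imputed values with the true ones afterwards.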

Thanks in advance,
Sarah.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Simulating-data-and-imputation-tp3167119p3167119.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] another superscript problem

2010-12-29 Thread Peter Ehlers

On 2010-12-28 11:17, Tyler Dean Rudolph wrote:

Part of the reason I was having difficulty is that I'm trying to add a
legend with more than one element:

plot(1,1)
obv = 5
txt = "Pop mean"

# this works
legend("topleft", legend=bquote(.(txt) == .(obv)*degree))

# but this doesn't
legend("topleft", legend=c(bquote(.(txt) == .(obv)*degree), "Von Mises
distribution"))

How can I go about using multiple legend elements with
mathematical/latin annotation in both?

Tyler


[...snip...]

If you want the secondary text on the same line,
here are 3 ways to do that:

 txt2 <- "(Von Mises distribution)"
 txt3 <- "Von Mises distribution"

 plot(1:10, type='n')
 legend(4,2, legend =
   bquote(.(txt) == .(obv)*degree~~.(txt2)))
 legend(4,4, legend =
   bquote(.(txt) == .(obv)*degree~~group( "(", list(.(txt3)), ")" )))
 legend(4,6, legend =
   bquote(.(txt) == .(obv)*degree~~bgroup( "(", list(.(txt3)), ")" )))

The second and third just produce slightly nicer (I think) parentheses.
For the two-line display use Baptiste's suggestion.
(Note that you can't use \n in plotmath expressions, as per help page.)
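
For separate legend entries (one per line of the legend box), a sketch of one idiom that I believe works: collect the bquote() result and the plain string in a list and convert the list to an expression vector. This is an illustration and not necessarily the suggestion referred to above; the name leg is an assumption.

```r
obv <- 5
txt <- "Pop mean"

# A list of a plotmath call and a plain string, coerced to an expression vector
leg <- as.expression(list(
  bquote(.(txt) == .(obv) * degree),
  "Von Mises distribution"
))

plot(1, 1)
legend("topleft", legend = leg)
```

Using c() directly on the bquote() result splits the call into its components, which is why the original two-element attempt fails.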

Peter Ehlers



Re: [R] Problem applying McNemar's - Different values in SPSS and R

2010-12-29 Thread Manoj Aravind
Thank you Marc :)
It certainly helped me to get the exact value of P.
How do I decide when to apply mcnemar.exact rather than just mcnemar.test?
I'm a beginner in biostatistics.

Manoj Aravind

On Tue, Dec 28, 2010 at 11:00 PM, Marc Schwartz wrote:

>
> On Dec 28, 2010, at 11:05 AM, Manoj Aravind wrote:
>
> > Hi friends,
> > I get different values for McNemar's test in R and SPSS. Which one should
> i
> > rely on when the p values differ.
> > I came across this problem when i started learning R and seriously give
> up
> > on SPSS or any other proprietary software.
> > Thank u in advance
> >
> > Output in SPSS follows
> >
> > *Crosstab*
> >
> >                          hsc
> >                       ABN       NE    Total
> > tvs    ABN   Count     40        3       43
> >              Row %   93.0%     7.0%  100.0%
> >              COL%    78.4%    30.0%   70.5%
> >        NE    Count     11        7       18
> >              Row %   61.1%    38.9%  100.0%
> >              COL%    21.6%    70.0%   29.5%
> > Total        Count     51       10       61
> >              Row %   83.6%    16.4%  100.0%
> >              COL%   100.0%   100.0%  100.0%
> >
> > * Chi-Square Tests*
> >
> >                     Value   Exact Sig. (2-sided)
> > McNemar Test                .057(a)
> > N of Valid Cases    61
> >
> >   a Binomial distribution used.
> >
> > Output from R is as follows
> >
> >> tvshsc<-
> > + matrix(c(40,11,3,7),
> > + nrow=2,
> > + dimnames=list("TVS"=c("ABN","NE"),
> > + "HSC"=c("ABN","NE")))
> >> tvshsc
> >      HSC
> > TVS   ABN NE
> >   ABN  40  3
> >   NE   11  7
> >> mcnemar.test(tvshsc)
> >
> > McNemar's Chi-squared test with continuity correction
> >
> > data:  tvshsc
> > McNemar's chi-squared = 3.5, df = 1, p-value = 0.06137
> >
> > Regards
> >
> > Dr. B Manoj Aravind
>
>
> The SPSS test appears to be an exact test, whereas the default R function
> does not perform an exact test, so you are not comparing Apples to Apples...
>
> Try this using the 'exact2x2' CRAN package:
>
> > require(exact2x2)
> Loading required package: exact2x2
> Loading required package: exactci
>
> > mcnemar.exact(matrix(c(40, 11, 3, 7), 2, 2))
>
>Exact McNemar test (with central confidence intervals)
>
> data:  matrix(c(40, 11, 3, 7), 2, 2)
> b = 3, c = 11, p-value = 0.05737
> alternative hypothesis: true odds ratio is not equal to 1
> 95 percent confidence interval:
>  0.04885492 1.03241985
> sample estimates:
> odds ratio
>  0.2727273
>
>
> HTH,
>
> Marc Schwartz
>
>



Re: [R] Any functions to manipulate (merge, cut, remove) hclust objects? (maybe through phylo?)

2010-12-29 Thread Tal Galili
Hello Martin,

Thank you for the reference to the "cut" option in the dendrogram help page!
I guess I was too focused on finding a solution for hclust objects
to consider that such a method existed for dendrograms.

The cut.dendrogram  doesn't solve my problem yet, since what I'm looking for
is the output of something like:
cutree(hc.object, k = 3)
which is a vector indicating to which cluster each item belongs.

And for some reason I can't seem to understand the structure of the
dendrogram object using "str".
But I'll read some more and write back if I can't solve it.

p.s: If I succeed in writing something useful, it will be
my pleasure and honor to contribute it back to the r-project :)

With regards,
Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Dec 29, 2010 at 1:49 PM, Martin Maechler  wrote:

> > Tal Galili 
> > on Wed, 29 Dec 2010 13:32:26 +0200 writes:
>
>> Hello Martin,
>> Thank you for replying.
>
>> I have two needs:
>
>> 1) To merge two dendrograms into one.
>
>> 2) To then run cutree on it (which works on hclust, but
>>not on dendrogram).
>
> Well, but cut() does and is prominently mentioned on the
> dendrogram help page (and its examples)
>
>> I guess that if I knew how to perform both steps I would be able to do
> what
>> I'm trying to do on my data.
>> If nothing like this currently exists, I guess I'll simply implement a
>> method of cutree for a dendrogram, and see how to merge two
> dendrograms
>> together.
>
> so you only need to program the merge / join part.
>
> I did not take the time to understand what exactly you mean with
> that, but as there is no function to do that with "hclust" either,
> I'm convinced you should rather write one for "dendrogram"
> indeed; as merge() is already "S3 generic", I'd call it
>merge.dendrogram()
>
> If you end up finding it useful and are willing to write a help
> page (including examples!) for it, you may consider donating it
> back to the R-project ... ;-)
>
> Regards, Martin
>
>



Re: [R] Any functions to manipulate (merge, cut, remove) hclust objects? (maybe through phylo?)

2010-12-29 Thread Martin Maechler
> Tal Galili 
> on Wed, 29 Dec 2010 13:32:26 +0200 writes:

> Hello Martin,
> Thank you for replying.

> I have two needs:

> 1) To merge two dendrograms into one.

> 2) To then run cutree on it (which works on hclust, but
>not on dendrogram).

Well, but cut() does and is prominently mentioned on the
dendrogram help page (and its examples)

> I guess that if I knew how to perform both steps I would be able to do 
what
> I'm trying to do on my data.
> If nothing like this currently exists, I guess I'll simply implement a
> method of cutree for a dendrogram, and see how to merge two dendrograms
> together.

so you only need to program the merge / join part.

I did not take the time to understand what exactly you mean with
that, but as there is no function to do that with "hclust" either,
I'm convinced you should rather write one for "dendrogram"
indeed; as merge() is already "S3 generic", I'd call it
merge.dendrogram()

If you end up finding it useful and are willing to write a help
page (including examples!) for it, you may consider donating it
back to the R-project ... ;-)

Regards, Martin



Re: [R] R-code to generate random rotation matrix for rotation testing

2010-12-29 Thread Martin Maechler
> Martin Krautschke 
> on Mon, 27 Dec 2010 22:47:26 +0100 writes:

> I am looking for an implementation of random rotation
> matrix generation in R to do a rotation test: I want to
> use the matrices to create random multivariate normal
> matrices with common covariance structure and mean based
> on an observed data matrix.

> The rRotationMatrix-function in the mixAK-package is an
> option, but as far as I can tell I need to draw rotation
> matrices with determinant -1 as well. Roast and Romer in
> the limma-bioconductor package appear to have implemented
> something similar, which seems not to be general enough
> for my purposes, however. Inspired by the code in the
> ffmanova-rotationtest function I thought of the following,
> but it appears to me that there only the covariance, not
> the mean, is preserved:

> #
> # a given Y has independent, multivariate normal rows
> library(mvtnorm)
> Y <- rmvnorm(4,mean=1:10,sigma=diag(1:10)+3)

> # Generation of a set of random matrices Z
> for (i in 1:10) {
> # R is random matrix of independent standard-normal entries
> R <- matrix(rnorm(16),ncol=4)
> R <- qr.Q(qr(R, LAPACK = TRUE))
> # Z shall be a random matrix with the same mean and covariance structure as Y
> Z <- crossprod(R,Y)
> }
> #

> A suggestion for the procedure exists (in Dorum et al.
> http://www.bepress.com/sagmb/vol8/iss1/art34/ , end of chapter 2.1), but a hint
> to a (fast) implementation would be greatly appreciated.


> Best regards and a happy new year,
> Martin Krautschke


> ---
> Martin Krautschke
> Student at University of Vienna

and this is not a homework problem?
Just in case, I don't give you the complete solution in R,
but in words:

Think geometrically: 
Rotation in the above sense only preserves the mean when that is
the zero vector. 
Consequently: Your procedure must rather be

  1)  Y0 <- Y  - mY
  2)  Z0 <- Q' %*% Y0
  3)   Z <- Z0 + mY

and to make this work with data matrices Y, Z,
the mean vector  mY  must either be a matrix with constant
columns or the result of as.vector()ing such a matrix.
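
For illustration only, the three steps could be sketched as follows. The simulated Y, the use of the sample column means as mY, and the way Q is drawn (Q factor of a Gaussian matrix) are all assumptions of this sketch, not part of the outline above.

```r
set.seed(42)                      # assumed seed
n <- 4; p <- 10
# Assumed data: n independent rows, column j has mean j
Y <- matrix(rnorm(n * p, mean = rep(1:p, each = n)), nrow = n)

# Mean as a matrix with constant columns (here: the sample column means)
mY <- matrix(colMeans(Y), nrow = n, ncol = p, byrow = TRUE)

# Random orthogonal matrix from the QR decomposition of a Gaussian matrix
Q <- qr.Q(qr(matrix(rnorm(n * n), ncol = n)))

Y0 <- Y - mY                # 1) remove the mean
Z0 <- crossprod(Q, Y0)      # 2) Q' %*% Y0
Z  <- Z0 + mY               # 3) add the mean back
```

Because Q is orthogonal, the cross-product matrix of the centred data is unchanged: crossprod(Z - mY) equals crossprod(Y - mY) up to rounding.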


Regards,
Martin Maechler, ETH Zurich



Re: [R] linear regression for grouped data

2010-12-29 Thread entropy
Thanks a lot for the quick responses.
I have some additional questions related to this topic. My
intention was to be able to answer questions such as: what percentage of
the regressions have p-values below a certain threshold, what do the
residuals look like, what do the plots of y vs. x look like, etc.
I tried the following commands and found that the second line (and
similar ones) does not work for extracting certain statistics.

regress=lapply(split(egfr, as.factor(egfr$P_ID)), function(df)
{anova(lm(VALUE ~ LAB_DT, data=df)) })
regress[1]$residuals; regress[1]$fstatistic[1]

So, is it possible to record statistics of each regression such as
p_value, F-value, residuals, etc. as a vector?
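
A possible sketch, not a reply from the list: store the lm fits themselves (rather than the anova tables) and pull the statistics out with sapply()/lapply(). The column names follow the commands above; the small egfr data frame is made up for the example, and the 0.05 threshold is an assumption.

```r
# Assumed toy data with the same column names as in the post
egfr <- data.frame(
  P_ID   = rep(c(76, 121, 168), each = 4),
  LAB_DT = rep(1:4, times = 3),
  VALUE  = c(15.8, 66.9, 65.6, 70.1, 16, 15, 9.6, 7, 21.9, 29.7, 97.4, 80.2)
)

# One lm fit per patient; keep the fitted objects
fits <- lapply(split(egfr, egfr$P_ID),
               function(d) lm(VALUE ~ LAB_DT, data = d))

# Slope p-value and F statistic per patient, as named vectors
p.values <- sapply(fits, function(f) summary(f)$coefficients["LAB_DT", "Pr(>|t|)"])
f.values <- sapply(fits, function(f) summary(f)$fstatistic[["value"]])

# Residuals differ in length per patient, so keep them as a list
resids <- lapply(fits, residuals)

# Share of regressions with a slope p-value below the threshold
mean(p.values < 0.05)
```

Elements of a list are extracted with double brackets, for example fits[["76"]] or fits[[1]]; single brackets return a one-element list, which is why regress[1]$residuals comes back empty.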

Thanks,


On Dec 28, 6:23 pm, Entropi ntrp  wrote:
> Hi,
> I have been examining large data and need to do simple linear regression
> with the data which is grouped based on the values of a particular
> attribute. For instance, consider three columns : ID, x, y,  and  I need to
> regress x on y for each distinct value of ID. Specifically, for the set of
> data corresponding to each of the 4 values of ID (76,111,121,168) in the
> below data, I should invoke linear regression 4 times. The challenge is
> that, the length of the ID vector is around 2 and therefore linear
> regression must be done automatically for each distinct value of ID.
>
>    ID      x      y
>    76  36476   15.8
>    76  36493   66.9
>    76  36579   65.6
>   111  35465   10.3
>   111  35756    4.8
>   121  38183   16
>   121  38184   15
>   121  38254    9.6
>   121  38255    7
>   168  37727   21.9
>   168  37739   29.7
>   168  37746   97.4
> I was wondering whether there is an easy way to group data based on the
> values of ID in R  so that linear regression can be done easily for each
> group determined by each value of ID. Or, is the only way to construct
> loops  with 'for' or 'while'  in which a matrix is generated for each
> distinct value of ID  that stores corresponding values of x and y by
> screening the entire ID vector?
>
> Thanks in advance,
>
> Yasin
>
>         [[alternative HTML version deleted]]
>


Re: [R] Any functions to manipulate (merge, cut, remove) hclust objects? (maybe through phylo?)

2010-12-29 Thread Tal Galili
Hello Martin,
Thank you for replying.

I have two needs:
1) To merge two dendrograms into one.
2) To then run cutree on it (which works on hclust, but not on dendrogram).

I guess that if I knew how to perform both steps I would be able to do what
I'm trying to do on my data.
If nothing like this currently exists, I guess I'll simply implement a
method of cutree for a dendrogram, and see how to merge two dendrograms
together.

Best,
Tal




Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Wed, Dec 29, 2010 at 1:23 PM, Martin Maechler  wrote:

> > "TG" == Tal Galili 
> > on Mon, 27 Dec 2010 23:34:33 +0200 writes:
>
>TG> Hello all, I'm now working with hclust objects and was
>TG> hoping to perform some basic editing on them like:
>
>TG>- Joining = the merging of two hclust objects (so
>TG> they will share one root) - Splicing = So to cut/extract
>TG> a branch out of an hclust object - that by itself will
>TG> be an hclust object.
>
>TG> I noticed I could extract one element of an hclust
>TG> object by turning it into a dendrogram, but that doesn't
>TG> enable me to turn it back into an hclust object.
>
> Why should you "turn it back" ?
> What do you want to use them for
>
> The intent of the "dendrogram" has been that it is more flexible
> (and more general) than "hclust" and can be printed, plotted,
> manipulated, ... better than hclust ones.
>
> Regards,
> Martin Maechler, ETH Zurich
>
>
>TG> Are there any functions that can aid with this?  Maybe
>TG> through the ape package and the phylo objects?
>
>TG> Thanks, Tal
>



Re: [R] Any functions to manipulate (merge, cut, remove) hclust objects? (maybe through phylo?)

2010-12-29 Thread Martin Maechler
> "TG" == Tal Galili 
> on Mon, 27 Dec 2010 23:34:33 +0200 writes:

TG> Hello all, I'm now working with hclust objects and was
TG> hoping to perform some basic editing on them like:

TG>- Joining = the merging of two hclust objects (so
TG> they will share one root) - Splicing = So to cut/extract
TG> a branch out of an hclust object - that by itself will
TG> be an hclust object.

TG> I noticed I could extract one element of an hclust
TG> object by turning it into a dendrogram, but that doesn't
TG> enable me to turn it back into an hclust object.

Why should you "turn it back" ?
What do you want to use them for

The intent of the "dendrogram" has been that it is more flexible
(and more general) than "hclust" and can be printed, plotted,
manipulated, ... better than hclust ones.

Regards,
Martin Maechler, ETH Zurich
 

TG> Are there any functions that can aid with this?  Maybe
TG> through the ape package and the phylo objects?

TG> Thanks, Tal



[R] helps on upgrading R in Mac OS

2010-12-29 Thread Mao Jianfeng
Dear R-helpers,

I intend to upgrade R in Mac OS with updated R version and updated Mac
OS version.

I think my Mac notebook is produced with Mac x86_64, darwin9.8.0. I
have updated my Mac OS to  Mac OS X version 10.6.5. But, when I
installed R 2.12.1, the "version" function still gave me information
that R is based on old Mac OS. I need to know how can I update R to
let it to fit for the updated Mac OS.

Could you please give me any direction on that? Thanks in advance.

> version
               _
platform       x86_64-apple-darwin9.8.0
arch           x86_64
os             darwin9.8.0
system         x86_64, darwin9.8.0
status
major          2
minor          12.1
year           2010
month          12
day            16
svn rev        53855
language       R
version.string R version 2.12.1 (2010-12-16)
> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
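
As a rough illustration (not part of the original post): as far as I understand, the platform string printed by version describes the system the R binary was built for, so it need not match the OS release that is currently running. The running system can be queried separately; the variable names below are assumptions.

```r
built.for <- R.version$platform                  # build target of this R binary
running   <- Sys.info()[c("sysname", "release")] # system R is running on right now

cat("R was built for:     ", built.for, "\n")
cat("Currently running on:", paste(running, collapse = " "), "\n")
```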

Jian-Feng, Mao



[R] JGR installation problem

2010-12-29 Thread SNV Krishna
Hi All,
 
I am trying to install JGR GUI for R (windows xp) but facing the problem.
The following error message is displayed when I click on JGR.exe
 
"Cannot find Java/R Interface (JRI) library (jri.dll)
Please make sure you start JGR by double clicking the JGR.exe program"
 
I know this is R help forum, but trying to get help from experts who are
using JGR.
 
Any help or idea will be highly appreciated. 
 
thanks and regards,
 
SNVK
 

and provide commented, minimal, self-contained, reproducible code.