Re: [R] How to double integrate a function in R

2013-07-26 Thread Hans W Borchers
Tiago V. Pereira  mbe.bio.br> writes:

> I am trying to double integrate the following expression:
> 
> #  expression
> (1/(2*pi))*exp(-y2/2)*sqrt((y1/(y2-y1)))
> 
> for y2>y1>0.
> 
> I am trying the following approach
> 
> # first attempt
> 
>  library(cubature)
> fun <- function(x)   { (1/(2*pi))*exp(-x[2]/2)*sqrt((x[1]/(x[2]-x[1])))}
> adaptIntegrate(fun, lower = c(0,0), upper =c(5, 6), tol=1e-8)
> 
> However, I don't know how to constrain the integration so that y2>y1>0.
> 
> Any ideas?
> Tiago

You could use integral2() in package 'pracma'. It implements the
"TwoD" algorithm and has the following properties:

(1) The boundaries of the second variable y can be functions of the first
  variable x;
(2) it can handle singularities on the boundaries (to a certain extent).

> library(pracma)
> fun <- function(y1, y2) (1/(2*pi))*exp(-y2/2)*sqrt((y1/(y2-y1)))

> integral2(fun, 0, 5, function(x) x, 6, singular=TRUE)
$Q
[1] 0.7706771

$error
[1] 7.890093e-11

The relative error is a bit optimistic, the absolute error here is < 0.5e-6.
The computation time is 0.025 seconds.

Hans Werner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readTiff - Sorry can't handle images with 32-bit samples

2013-07-26 Thread Prof Brian Ripley

It is a little more complex than that.

readTiff is not part of R.  It is in an unstated package, and there are 
instances in packages biOps and rtiff.


There are also readTIFF in packages tiff and beadArray.  It is 
tiff::readTIFF that I would recommend.  Its help says it can read 32-bit 
tiffs.


rtiff::readTiff is old, limited and orphaned.

Secondly, all of these are interfaces to libtiff, and that is an OS 
library.  So what images they can read mainly depends on what version of 
libtiff and what capabilities it was configured to provide.  Athough we 
were not told (see the posting guide) it looks like the OP was using 
Windows.  There I compiled libtiff from source and it is a fairly 
minimal build (not least as most people with complex image requirements 
do not use Windows).


There are other ways to read TIFF files, e.g. rgdal and EBImage.  See 
the R manuals, specifically 
http://cran.r-project.org/doc/manuals/r-release/R-data.html#Image-files 
.  E.g. some versions of rgdal have support for 12-bit TIFFs that 
libtiff cannot handle.


In short: use a better tool.


On 26/07/2013 23:08, Peter Langfelder wrote:

Disclaimer: I haven't seen your tif file and I know nothing about
readTiff... but here go some general comments.

TIF files can use different bit depths (number of bits to store each
pixel (or each color for each pixel). Most common software outputs 8-
or 16-bits, but your file probably has a higher bit depth of 32 bits
per sample. Apparently readTiff cannot handle such bit depth.

You may need to convert the 32-bit delth file(s) into 16-bit depth (or
whatever readTiff can handle). My suggestion would be to look at
ImageMagick, but you may also be able to use some image editing
applications to do that,

Peter

On Fri, Jul 26, 2013 at 1:51 PM, wwreith  wrote:

I tried using readTiff() and got the error message "Sorry can't handle images
with 32-bit samples"

line of code

x <- readTiff("C:/Users/550062/Desktop/Data/example1.tif")

So far I have not had any luck finding this error message on google. Any
guess at what it means and how to get the code to work?

Thanks!



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combine multiple random forests contained in a list

2013-07-26 Thread Bert Gunter
I would say that the use of Reduce in this context is bad practice.

from ?Reduce :

"Reduce uses a binary function to successively combine the elements of
a given vector and a possibly given initial value."

combine() is obviously not a binary function. do.call() seems to be
THE appropriate idiom.

-- Bert

On Fri, Jul 26, 2013 at 9:26 PM, arun  wrote:
> HI,
> Using the example in ?combine
> library(randomForest)
> rf1 <- randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE)
>   rf2 <- randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE)
>   rf3 <- randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE)
>   rf.all <- combine(rf1, rf2, rf3)
> lst1<- list(rf1,rf2,rf3)
>
> rf.allL<- do.call(`combine`,lst1)
> #or
> rf.allL<- Reduce(`combine,lst1)
>  identical(rf.all,rf.allL)
> #[1] TRUE
> A.K.
>
>
> Is there a quick and easy way to pass randomForest objects contained in a 
> list into the combine() function?
>
> As a result of calling randomForest through lapply(), I now have 10 
> randomForests in a list (rfors)
>
> I want to combine all 10 of them. Understandably combine(rfors)
> doesn't work as it doesn't recognise the individual forests within the
> list. I have spend quite sometime messing around with unlist(), lapply()
>  and apply() to try and extract the information in a suitable format but
>  to no avail. The only thing that works is combine(rfors[[1]],
> rfors[[2]] ...etc).
>
> This is a bit cumbersome though, not least because the number of
>  random forests I'll need to combine is likely to change. Any sleek and
> elegant solution to this someone can suggest?
>
> Thanks in advance for any help.
>
> Anna
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with ldpaths and new R

2013-07-26 Thread David Winsemius

On Jul 26, 2013, at 9:45 PM, Erin Hodgess wrote:

> Hello!
> 
> I have just installed R on an Ubutnu machine (13.04)

Did you follow directions given here:

http://cran.us.r-project.org/bin/linux/ubuntu/

"Users who need to compile R packages from source [e.g. package maintainers, or 
anyone installing packages with install.packages()] should also install the 
r-base-dev package:

   sudo apt-get install r-base-dev
"

I ask because one ot the other directions that you did not follow was:

"The best place to report problems with these packages or ask R questions 
specific to Ubuntu is the R-SIG-Debian mailing list. See

   https://stat.ethz.ch/mailman/listinfo/r-sig-debian
"

-- 
David.

> and keep getting the
> following:
> 
> erin@erin-Lenovo-IdeaPad-Y480:~$ R
> /usr/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory
> 
> R version 3.0.1 (2013-05-16) -- "Good Sport"
> Copyright (C) 2013 The R Foundation for Statistical Computing
> Platform: x86_64-pc-linux-gnu (64-bit)
> 
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
> 
>  Natural language support but running in an English locale
> 
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> 
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> 
>> install.packages("pbdMPI",depen=TRUE)
> Installing package into Œ/home/erin/lib/R/library‚
> (as Œlib‚ is unspecified)
> --- Please select a CRAN mirror for use in this session ---
> also installing the dependency Œrlecuyer‚
> 
> trying URL '
> http://cran.revolutionanalytics.com/src/contrib/rlecuyer_0.3-3.tar.gz'
> Content type 'application/x-gzip' length 11756 bytes (11 Kb)
> opened URL
> ==
> downloaded 11 Kb
> 
> trying URL '
> http://cran.revolutionanalytics.com/src/contrib/pbdMPI_0.1-8.tar.gz'
> Content type 'application/x-gzip' length 417547 bytes (407 Kb)
> opened URL
> ==
> downloaded 407 Kb
> 
> /usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or
> directory
> /usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or
> directory
> Error in file(con, "r") : cannot open the connection
> Calls:  -> sub -> grep -> readLines -> file
> In addition: Warning message:
> In file(con, "r") :
>  cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory
> /usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or
> directory
> /usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or
> directory
> Error in file(con, "r") : cannot open the connection
> Calls:  -> sub -> grep -> readLines -> file
> In addition: Warning message:
> In file(con, "r") :
>  cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory
> 
> The downloaded source packages are in
>Œ/tmp/RtmpAtAxNL/downloaded_packages‚
> Warning messages:
> 1: In install.packages("pbdMPI", depen = TRUE) :
>  installation of package Œrlecuyer‚ had non-zero exit status
> 2: In install.packages("pbdMPI", depen = TRUE) :
>  installation of package ŒpbdMPI‚ had non-zero exit status
>> 
> 
> 
> I uninstalled it, deleted the /etc/R and /usr/lib/R directories, and
> re-installed.  (Actually, I followed that process twice).
> 
> I'm still stuck.  Does anyone have any suggestions, please?
> 
> Thanks,
> Erin
> 
> 
> -- 
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: erinm.hodg...@gmail.com
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with ldpaths and new R

2013-07-26 Thread Erin Hodgess
Hello!

I have just installed R on an Ubutnu machine (13.04) and keep getting the
following:

erin@erin-Lenovo-IdeaPad-Y480:~$ R
/usr/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or directory

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> install.packages("pbdMPI",depen=TRUE)
Installing package into ‘/home/erin/lib/R/library’
(as ‘lib’ is unspecified)
--- Please select a CRAN mirror for use in this session ---
also installing the dependency ‘rlecuyer’

trying URL '
http://cran.revolutionanalytics.com/src/contrib/rlecuyer_0.3-3.tar.gz'
Content type 'application/x-gzip' length 11756 bytes (11 Kb)
opened URL
==
downloaded 11 Kb

trying URL '
http://cran.revolutionanalytics.com/src/contrib/pbdMPI_0.1-8.tar.gz'
Content type 'application/x-gzip' length 417547 bytes (407 Kb)
opened URL
==
downloaded 407 Kb

/usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or
directory
/usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or
directory
Error in file(con, "r") : cannot open the connection
Calls:  -> sub -> grep -> readLines -> file
In addition: Warning message:
In file(con, "r") :
  cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory
/usr/lib/R/bin/R: line 140: /usr/lib/R/etc/ldpaths: No such file or
directory
/usr/lib/R/bin/R: line 236: /usr/lib/R/etc/ldpaths: No such file or
directory
Error in file(con, "r") : cannot open the connection
Calls:  -> sub -> grep -> readLines -> file
In addition: Warning message:
In file(con, "r") :
  cannot open file '/usr/lib/R/etc/Makeconf': No such file or directory

The downloaded source packages are in
‘/tmp/RtmpAtAxNL/downloaded_packages’
Warning messages:
1: In install.packages("pbdMPI", depen = TRUE) :
  installation of package ‘rlecuyer’ had non-zero exit status
2: In install.packages("pbdMPI", depen = TRUE) :
  installation of package ‘pbdMPI’ had non-zero exit status
>


I uninstalled it, deleted the /etc/R and /usr/lib/R directories, and
re-installed.  (Actually, I followed that process twice).

I'm still stuck.  Does anyone have any suggestions, please?

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combine multiple random forests contained in a list

2013-07-26 Thread arun
HI,
Using the example in ?combine
library(randomForest)
rf1 <- randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE)
  rf2 <- randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE)
  rf3 <- randomForest(Species ~ ., iris, ntree=50, norm.votes=FALSE)
  rf.all <- combine(rf1, rf2, rf3)
lst1<- list(rf1,rf2,rf3)

rf.allL<- do.call(`combine`,lst1)
#or
rf.allL<- Reduce(`combine,lst1)
 identical(rf.all,rf.allL)
#[1] TRUE
A.K.


Is there a quick and easy way to pass randomForest objects contained in a list 
into the combine() function? 

As a result of calling randomForest through lapply(), I now have 10 
randomForests in a list (rfors) 

I want to combine all 10 of them. Understandably combine(rfors) 
doesn't work as it doesn't recognise the individual forests within the 
list. I have spend quite sometime messing around with unlist(), lapply()
 and apply() to try and extract the information in a suitable format but
 to no avail. The only thing that works is combine(rfors[[1]], 
rfors[[2]] ...etc). 

This is a bit cumbersome though, not least because the number of
 random forests I'll need to combine is likely to change. Any sleek and 
elegant solution to this someone can suggest? 

Thanks in advance for any help. 

Anna

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Duplicated function with conditional statement

2013-07-26 Thread arun


On some slightly different datasets:
tt1<-structure(list(subj = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 
6, 6, 6, 7, 7, 7, 8, 8, 8), response = structure(c(2L, 2L, 1L, 
2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 
1L, 2L, 1L), .Label = c("buy", "sample"), class = "factor"), 
    product = c(1, 2, 3, 2, 2, 3, 2, 1, 1, 4, 4, 2, 2, 4, 5, 
    5, 4, 3, 4, 5, 4, 2)), .Names = c("subj", "response", "product"
), class = "data.frame", row.names = c(NA, 22L))

tt2<- structure(list(subj = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 
6, 6, 6, 7, 7, 7, 8, 8, 8), response = structure(c(2L, 2L, 1L, 
2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 
1L, 2L, 2L), .Label = c("buy", "sample"), class = "factor"), 
    product = c(1, 2, 3, 2, 2, 3, 2, 1, 1, 4, 1, 4, 5, 1, 4, 
    2, 3, 3, 2, 5, 3, 4)), .Names = c("subj", "response", "product"
), class = "data.frame", row.names = c(NA, 22L))

tt3<- structure(list(subj = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 
6, 6, 6, 7, 7, 7, 8, 8, 8), response = structure(c(2L, 2L, 1L, 
2L, 1L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 
1L, 1L, 2L), .Label = c("buy", "sample"), class = "factor"), 
    product = c(1, 2, 3, 2, 2, 3, 2, 1, 1, 4, 1, 1, 3, 5, 2, 
    2, 2, 2, 4, 3, 2, 5)), .Names = c("subj", "response", "product"
), class = "data.frame", row.names = c(NA, 22L))


#Tried David's solution:
tt1$rown <- rownames(tt1)
as.numeric ( apply(tt1, 1, function(x) {
    x['product'] %in% tt1[ rownames(tt1) < x['rown'] & tt1$response == "buy", 
"product"]  } ) )
  #gave inconsistent results especially since the first 10 rows were from `tt`
# [1] 0 1 1 1 1 1 1 0 1 0 1 0 0 1 0 0 1 0 1 0 1 1

#similarly for `tt2` and `tt3`.


##Created this function.  It seems to work in the tested cases, though it is 
not tested extensively.
fun1<- function(dat,colName,newColumn){
  indx<- which(dat[,colName]=="buy")
  dat[,newColumn]<-0
  dat[unlist(lapply(seq_along(indx),function(i){
            x1<- if(i==length(indx)){
                seq(indx[i],nrow(dat))
             }
            else if((indx[i+1]-indx[i])==1){
            indx[i]
            }
            else {
            seq(indx[i]+1,indx[i+1]-1)
             }
            x2<- dat[unique(c(indx[i:1],x1)),]
            x3<- subset(x2,response=="sample")
            x4<- subset(x2,response=="buy")
            if(nrow(x3)!=0) {
                row.names(x3)[x3$product%in% x4$product]
                       }
                                    
            })),newColumn]<-1
    dat

    }
fun1(tt,"response","newCol")
#   subj response product rown newCol
#1 1   sample   1    1  0
#2 1   sample   2    2  0
#3 1  buy   3    3  0
#4 2   sample   2    4  0
#5 2  buy   2    5  0
#6 3   sample   3    6  1
#7 3   sample   2    7  1
#8 3  buy   1    8  0
#9 4   sample   1    9  1
#10    4  buy   4   10  0

fun1(tt1,"response","newCol")
#   subj response product newCol
#1 1   sample   1  0
#2 1   sample   2  0
#3 1  buy   3  0
#4 2   sample   2  0
#5 2  buy   2  0
#6 3   sample   3  1
#7 3   sample   2  1
#8 3  buy   1  0
#9 4   sample   1  1
#10    4  buy   4  0
#11    5  buy   4  0
#12    5   sample   2  1
#13    5  buy   2  0
#14    6  buy   4  0
#15    6   sample   5  0
#16    6   sample   5  0
#17    7   sample   4  1
#18    7  buy   3  0
#19    7  buy   4  0
#20    8  buy   5  0
#21    8   sample   4  1
#22    8  buy   2  0
#Also
 fun1(tt2,"response","newCol")
 fun1(tt3,"response","newCol")
A.K.

P.S.  Below is OP's clarification regarding the conditional statement in a 
private message:

I am sorry i didnt question it very clearly, let me change the 
conditional statement, I hope you can understand. i will explain by 
example

as you can see, almost every number is duplicated, but only in row 6th,7th,and 
9th the value on column is 1.

on row4th, the value is duplicated( 2 already occurred on 2nd row),but 
since the value is considered as duplicated only if the value is 
duplicated where the response is 'buy' than the value on column, on 
row4th still zero. 

On row 6th, where the value product column is 3. 3 is already occurred 
in 3rd row where the value on response is 'buy', so the value on column 
should be 1

I hope it can understand the conditional statement. 








- Original Message -
From: David Winsemius 
To: David Winsemius 
Cc: R-help@r-project.org; Uwe Ligges 
Sent: Friday, July 26, 2013 5:16 PM
Subject: Re: [R] Duplicated function with conditional statement


On Jul 26, 2013, at 2:06 PM, David Winsemius wrote:

> 
> On Jul 26, 2013, at 11:51 AM, Uwe Ligges wrote:
> 
>> 
>> 

[R] matching columns of model matrix to those in original data.frame

2013-07-26 Thread Ross Boylan
What is a reliable way to go from a column of a model matrix back to the column 
(or columns) of the original data source used to make the model 
matrix?  I can come up with a method that seems to work, but I don't see 
guarantees in the documentation that it will.

In particular, does the order of the term.labels match the order of columns for 
factors in a terms object?  The documentation says the model.matrix 
assign attribute uses the ordering of terms.labels.

If anyone can tell me if this approach is reliable, or of one that is, I would 
appreciate it.

Ross Boylan

Proposed function and a little example follow.

# return a vector v such that data[,v[i]] contributed to mm[,i]
# mm = model matrix produced by
# form = formula
# data = data
reverse.map <- function(mm, form, data){
tt <- terms(form, data=data)
ttf <- attr(tt, "factors")
mmi <- attr(mm, "assign")
# this depends on assign using same order as columns of factors
# entries in mmi that are 0 (the intercept) are silently dropped
ttf2 <- ttf[,mmi]
# take the first row that contributes
r <- apply(ttf2, 2, function(is) rownames(ttf)[is > 0][1])
match(r, colnames(data))
}

> ### experiment with mapping model matrix to original columns
> df <- sp2b[sample(nrow(sp2b), 8), c("pEthnic", "ethnic_sg", "rac_gay")]
> form <- ~pEthnic+ethnic_sg*rac_gay
> mm <- model.matrix(form, df)
> tt <- terms(form, data=df)
> ttf <- attr(tt, "factors")
> mmi <- attr(mm, "assign")
> df
  pEthnic ethnic_sg rac_gay
1366 Afr Amer  Afr Amer3.25
3052 Afr Amer  Afr Amer1.75
3012   Latino  Afr Amer2.00
369  Afr Amer  Asian/PI2.00
529 White  Asian/PI2.00
194  Asian/PI  Asian/PI3.25
126 White  Asian/PI2.25
2147   LatinoLatino2.75
> colnames(mm)
 [1] "(Intercept)"   "pEthnicAsian/PI"  
 [3] "pEthnicLatino" "pEthnicOther" 
 [5] "pEthnicWhite"  "ethnic_sgAsian/PI"
 [7] "ethnic_sgLatino"   "rac_gay"  
 [9] "ethnic_sgAsian/PI:rac_gay" "ethnic_sgLatino:rac_gay"  
> ttf  # term "factors"
  pEthnic ethnic_sg rac_gay ethnic_sg:rac_gay
pEthnic 1 0   0 0
ethnic_sg   0 1   0 1
rac_gay 0 0   1 1
> mmi  #model matrix "assign"
 [1] 0 1 1 1 1 2 2 3 4 4
> reverse.map(mm, form, df)
[1] 1 1 1 1 2 2 3 2 2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GGplot 2 – cannot get histogram and box plot axis to match.

2013-07-26 Thread John W. Clow
Dear Thierry,

Thank you for your help. My code now works the way I wanted it to.

To other readers out there, I think the reason why the two functions work 
differently is because extend_limits only ensures that the axis includes the 
coordinates supplied, and does not guarantee that limits will be the 
coordinates you supplied to the function (they might extend further in some 
graphs like the heatmap)

Thierry, if any of the above is wrong or incomplete, feel free to correct me or 
add more details.

John Clow
UCSB Student

From: ONKELINX, Thierry [thierry.onkel...@inbo.be]
Sent: Friday, July 26, 2013 1:03 AM
To: John W. Clow; r-help@r-project.org
Subject: RE: GGplot 2 – cannot get histogram and box plot axis to match.

Dear John,

Use xlim() and ylim() instead of expand_limits()

library(ggplot2)

#sample data from ggplot2
data(Cars93, package = "MASS")
dataSet <- Cars93

#variables to calculate the range to extend the axis dataVector <- 
unlist(dataSet[,"MPG.city"])

dataRange <- diff(range(dataSet$MPG.city))

graphRange <- c(min(dataSet$MPG.city) - dataRange/5,
max(dataSet$MPG.city) + dataRange/5)

#making the box plot
theBoxPlot <- ggplot(dataSet,aes_string(x = "MPG.city", y = "MPG.city"))

theBoxPlot <-
  theBoxPlot  + geom_boxplot() + coord_flip() + ylim(limits = graphRange)
print(theBoxPlot)


#making the histogram
thePlot <- ggplot(dataSet,aes_string(x = "MPG.city"))
thePlot <- thePlot + geom_histogram()  + xlim(graphRange)
print(thePlot)



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
John W. Clow
Verzonden: donderdag 25 juli 2013 20:03
Aan: r-help@r-project.org
Onderwerp: [R] GGplot 2 – cannot get histogram and box plot axis to match.

Problem:
I am trying to get the histogram and box plot x axis to match. I’ve tried using 
the expand_limits function to make the axis match but that didn’t make the axis 
match. The histogram’s axis are still consistently larger than the ones for the 
box plot (though the function did help). Does anyone have a suggestion as to 
what I should do instead?


Background:
I am building a Shiny app that displays a histogram below a bar chart for a set 
of data that a user uploads to the app. If you want to see the app, go here 
http://spark.rstudio.com/jclow/Archive20130725HistogramApp/
To run the app, select “Use Sample Data” , then select  “MPG.city” under choose 
a column, then finally select box plot.


Sample code:
Below is a snippet of my code to demonstrate the problems I have.

library(ggplot2)

#sample data from ggplot2
data(Cars93, package = "MASS")
dataSet <- Cars93

#variables to calculate the range to extend the axis dataVector <- 
unlist(dataSet[,"MPG.city"])

dataRange <- max(dataVector) - min(dataVector)

graphRange <- c(min(dataVector) - dataRange/5,
max(dataVector) + dataRange/5)

#making the box plot
theBoxPlot <- ggplot(dataSet,aes_string(x = "MPG.city",y = "MPG.city"))

theBoxPlot = theBoxPlot  + geom_boxplot() + expand_limits(y= graphRange) + 
coord_flip()
print(theBoxPlot)


#making the histogram
thePlot <- ggplot(dataSet,aes_string(x = "MPG.city")) thePlot <-thePlot + 
geom_histogram()  + expand_limits(x= graphRange)

print(thePlot)


Thank you for taking the time to read this.

John Clow
UCSB Student

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en 
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is 
door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the 
writer and may not be regarded as stating an official position of INBO, as long 
as the message is not confirmed by a duly signed document.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://ww

[R] Boxcox transformation error

2013-07-26 Thread Miller Ruiz
Hello

I'm trying to run the script below for making a boxcox transformation of
some variables contained on  an excel file, but i can't get it. I ever have
the same message :
error : $ operator is invalid for atomic vectors

One of the names of the variables is "Ec30" and it's the variable I put as
example.

require(RODBC)
require(fBasics)
require(e1071)
require(MASS)
setwd("C:\\estadisticafijo")
dir()
temp=odbcConnectExcel("prueba35r")
rs<-sqlFetch(temp,"Hoja1")
close(temp)
fix(rs)
boxcox(rs$"Ec30")

Thaks a lot for your help.

Miller Ruiz
Assistant Research
Cenipalma
Colombia

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is it possible (/reasonable) to write an as.RefClass function for R "list" class?

2013-07-26 Thread Jeff Newmiller
Reference classes are implemented with environment objects, which are mutable. 
Once you convert to list, the converted object is not mutable, since there is 
no such thing as a mutable list. So I would say your request is not possible.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Tal Galili  wrote:
>Hello all.
>
>
>I am interested in creating a mutable "list" object in R.
>
>Ideally I would do it through something like this:
>
>x <- list(a = list(1:4), b = "miao", c = 4:1)
>x_RC <- as.RefClass(x)
>attr(x_RC[[2]], "I am") <- "string"
>x_RC[[3]] <- x_RC[[3]]+1
>x_new <- as.list(x_RC)
>
>Is that reasonably possible to create? Or does that make
>little/no-sense?
>
>Thanks.
>
>
>
>
>Contact
>Details:---
>Contact me: tal.gal...@gmail.com |
>Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew)
>|
>www.r-statistics.com (English)
>--
>
>   [[alternative HTML version deleted]]
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readTiff - Sorry can't handle images with 32-bit samples

2013-07-26 Thread Peter Langfelder
Disclaimer: I haven't seen your tif file and I know nothing about
readTiff... but here go some general comments.

TIF files can use different bit depths (number of bits to store each
pixel (or each color for each pixel). Most common software outputs 8-
or 16-bits, but your file probably has a higher bit depth of 32 bits
per sample. Apparently readTiff cannot handle such bit depth.

You may need to convert the 32-bit delth file(s) into 16-bit depth (or
whatever readTiff can handle). My suggestion would be to look at
ImageMagick, but you may also be able to use some image editing
applications to do that,

Peter

On Fri, Jul 26, 2013 at 1:51 PM, wwreith  wrote:
> I tried using readTiff() and got the error message "Sorry can't handle images
> with 32-bit samples"
>
> line of code
>
> x <- readTiff("C:/Users/550062/Desktop/Data/example1.tif")
>
> So far I have not had any luck finding this error message on google. Any
> guess at what it means and how to get the code to work?
>
> Thanks!
>
>
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/readTiff-Sorry-can-t-handle-images-with-32-bit-samples-tp4672465.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] readTiff - Sorry can't handle images with 32-bit samples

2013-07-26 Thread Jeff Newmiller
Try reading a different file. The error message says the existing code cannot 
read that file.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

wwreith  wrote:
>I tried using readTiff() and got the error message "Sorry can't handle
>images
>with 32-bit samples"
>
>line of code
>
>x <- readTiff("C:/Users/550062/Desktop/Data/example1.tif")
>
>So far I have not had any luck finding this error message on google.
>Any
>guess at what it means and how to get the code to work?
>
>Thanks!
>
>
>
>--
>View this message in context:
>http://r.789695.n4.nabble.com/readTiff-Sorry-can-t-handle-images-with-32-bit-samples-tp4672465.html
>Sent from the R help mailing list archive at Nabble.com.
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Is it possible (/reasonable) to write an as.RefClass function for R "list" class?

2013-07-26 Thread Tal Galili
Hello all.


I am interested in creating a mutable "list" object in R.

Ideally I would do it through something like this:

x <- list(a = list(1:4), b = "miao", c = 4:1)
x_RC <- as.RefClass(x)
attr(x_RC[[2]], "I am") <- "string"
x_RC[[3]] <- x_RC[[3]]+1
x_new <- as.list(x_RC)

Is that reasonably possible to create? Or does that make little/no-sense?

Thanks.




Contact
Details:---
Contact me: tal.gal...@gmail.com |
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Duplicated function with conditional statement

2013-07-26 Thread David Winsemius

On Jul 26, 2013, at 2:06 PM, David Winsemius wrote:

> 
> On Jul 26, 2013, at 11:51 AM, Uwe Ligges wrote:
> 
>> 
>> 
>> On 25.07.2013 21:05, vanessa van der vaart wrote:
>>> Hi everybody,,
>>> I have a question about R function duplicated(). I have spent days try to
>>> figure this out,but I cant find any solution yet. I hope somebody can help
>>> me..
>>> this is my data:
>>> 
>>> subj=c(1,1,1,2,2,3,3,3,4,4)
>>> response=c('sample','sample','buy','sample','buy','sample','
>>> sample','buy','sample','buy')
>>> product=c(1,2,3,2,2,3,2,1,1,4)
>>> tt=data.frame(subj, response, product)
>>> 
>>> the data look like this:
>>> 
>>> subj response product
>>> 1 1   sample   1
>>> 2 1   sample   2
>>> 3 1  buy  3
>>> 4 2   sample   2
>>> 5 2 buy   2
>>> 6 3   sample   3
>>> 7 3   sample   2
>>> 8 3 buy   1
>>> 9 4  sample   1
>>> 10   4   buy4
>>> 
>>> I want to create new  column based on the value on response and product
>>> column. if the value on product is duplicated, then  the value on new column
>>> is 1, otherwise is 0.
>> 
>> 
>> According to your description:
>> 
> 
> Agree that the description did not match the output. I tried to match the 
> output using a rule that could be expressed as: 
> 
> if( a "buy"- associated "product" value precedes the current "product" 
> value){1}else{0}
> 

So this delivers the specified output:

tt$rown <- rownames(tt)
as.numeric ( apply(tt, 1, function(x) { 
 x['product'] %in% tt[ rownames(tt) < x['rown'] & tt$response == "buy", 
"product"]  } ) )

# [1] 0 0 0 0 0 1 1 0 1 0

> -- 
> David.
> 
>> tt$newcolumn <- as.integer(duplicated(tt$product) & tt$response=="buy")
>> 
>> which is different from what you show us below, where I cannot derive any 
>> systematic rule from.
>> 
>> Uwe Ligges
>> 
>>> but I want to add conditional statement that the value on product column
>>> will only be considered as duplicated if the value on response column is
>>> 'buy'.
>>> for illustration, the table should look like this:
>>> 
>>> subj response product newcolumn
>>> 1 1   sample   1  0
>>> 2 1   sample   2  0
>>> 3 1  buy  3  0
>>> 4 2   sample   2  0
>>> 5 2 buy   2  0
>>> 6 3   sample   3  1
>>> 7 3   sample   2   1
>>> 8 3 buy   1   0
>>> 9 4  sample   11
>>> 10   4   buy   4 0
>>> 
>>> 
>>> can somebody help me?
>>> any help will be appreciated.
>>> I am new in this mailing list, so forgive me in advance, If I did not  ask
>>> the question appropriately.
>>> 
>>> [[alternative HTML version deleted]]
>>> 
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius
> Alameda, CA, USA
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Duplicated function with conditional statement

2013-07-26 Thread David Winsemius

On Jul 26, 2013, at 11:51 AM, Uwe Ligges wrote:

> 
> 
> On 25.07.2013 21:05, vanessa van der vaart wrote:
>> Hi everybody,,
>> I have a question about R function duplicated(). I have spent days try to
>> figure this out,but I cant find any solution yet. I hope somebody can help
>> me..
>> this is my data:
>> 
>> subj=c(1,1,1,2,2,3,3,3,4,4)
>> response=c('sample','sample','buy','sample','buy','sample','
>> sample','buy','sample','buy')
>> product=c(1,2,3,2,2,3,2,1,1,4)
>> tt=data.frame(subj, response, product)
>> 
>> the data look like this:
>> 
>>  subj response product
>> 1 1   sample   1
>> 2 1   sample   2
>> 3 1  buy  3
>> 4 2   sample   2
>> 5 2 buy   2
>> 6 3   sample   3
>> 7 3   sample   2
>> 8 3 buy   1
>> 9 4  sample   1
>> 10   4   buy4
>> 
>> I want to create new  column based on the value on response and product
>> column. if the value on product is duplicated, then  the value on new column
>> is 1, otherwise is 0.
> 
> 
> According to your description:
> 

Agree that the description did not match the output. I tried to match the 
output using a rule that could be expressed as: 

 if( a "buy"- associated "product" value precedes the current "product" 
value){1}else{0}

-- 
David.

> tt$newcolumn <- as.integer(duplicated(tt$product) & tt$response=="buy")
> 
> which is different from what you show us below, where I cannot derive any 
> systematic rule from.
> 
> Uwe Ligges
> 
>> but I want to add conditional statement that the value on product column
>> will only be considered as duplicated if the value on response column is
>> 'buy'.
>> for illustration, the table should look like this:
>> 
>> subj response product newcolumn
>> 1 1   sample   1  0
>> 2 1   sample   2  0
>> 3 1  buy  3  0
>> 4 2   sample   2  0
>> 5 2 buy   2  0
>> 6 3   sample   3  1
>> 7 3   sample   2   1
>> 8 3 buy   1   0
>> 9 4  sample   11
>> 10   4   buy   4 0
>> 
>> 
>> can somebody help me?
>> any help will be appreciated.
>> I am new in this mailing list, so forgive me in advance, If I did not  ask
>> the question appropriately.
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readTiff - Sorry can't handle images with 32-bit samples

2013-07-26 Thread wwreith
I tried using readTiff() and got the error message "Sorry can't handle images
with 32-bit samples"

line of code

x <- readTiff("C:/Users/550062/Desktop/Data/example1.tif")

So far I have not had any luck finding this error message on google. Any
guess at what it means and how to get the code to work?

Thanks!



--
View this message in context: 
http://r.789695.n4.nabble.com/readTiff-Sorry-can-t-handle-images-with-32-bit-samples-tp4672465.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] add different regression lines for groups on ggplot

2013-07-26 Thread Ye Lin
Hey All,

I need to apply different regression lines to different group on my ggplot,
and here is the code I use:

qplot(x=Var1,y=Var2,data=df,color=SiteID,group=SiteID)+geom_point()+geom_smooth(method='lm',formula=log(y)~I(1/x),se=FALSE,size=2)

However the regression for different groups is as below:

AL1/AL2: log(y)~I(1/x)
AL3: log(y)~log(x)

How can I apply each regression equation on the same ggplot?

Also I have an issue that if I use the code above, the regression lines are
not overlapped on top of my data points.

Thanks for your help!
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SpatialPolygonsDataFrame and unique()

2013-07-26 Thread Nicola Rossi
Thank you very much for the help, that's what I was looking for!

Also my apologies for sending an html, I thought I turned it off
(hopefully this time works).

Best regards,

Nicola

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Duplicated function with conditional statement

2013-07-26 Thread Uwe Ligges



On 25.07.2013 21:05, vanessa van der vaart wrote:

Hi everybody,,
I have a question about R function duplicated(). I have spent days try to
figure this out,but I cant find any solution yet. I hope somebody can help
me..
this is my data:

subj=c(1,1,1,2,2,3,3,3,4,4)
response=c('sample','sample','buy','sample','buy','sample','
sample','buy','sample','buy')
product=c(1,2,3,2,2,3,2,1,1,4)
tt=data.frame(subj, response, product)

the data look like this:

  subj response product
1 1   sample   1
2 1   sample   2
3 1  buy  3
4 2   sample   2
5 2 buy   2
6 3   sample   3
7 3   sample   2
8 3 buy   1
9 4  sample   1
10   4   buy4

I want to create new  column based on the value on response and product
column. if the value on product is duplicated, then  the value on new column
is 1, otherwise is 0.



According to your description:

tt$newcolumn <- as.integer(duplicated(tt$product) & tt$response=="buy")

which is different from what you show us below, where I cannot derive 
any systematic rule from.


Uwe Ligges


but I want to add conditional statement that the value on product column
will only be considered as duplicated if the value on response column is
'buy'.
for illustration, the table should look like this:

subj response product newcolumn
1 1   sample   1  0
2 1   sample   2  0
3 1  buy  3  0
4 2   sample   2  0
5 2 buy   2  0
6 3   sample   3  1
7 3   sample   2   1
8 3 buy   1   0
9 4  sample   11
10   4   buy   4 0


can somebody help me?
any help will be appreciated.
I am new in this mailing list, so forgive me in advance, If I did not  ask
the question appropriately.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pairwise comparison between columns, logic

2013-07-26 Thread biobee
Hi Arun,

Many thanks for following this through. Based on your earlier  suggestion
with CEBPA, I also edited the code and now it seems work (I removed the non
exiting names).  I looked into the current code that you have sent below
and I  learnt a few more things. So thank you again.
Your skills in R is very good. Could you suggest some resources for advance
R programming. I am fairly comfortable with R, but need to keep abreast
with the latest R libraries that can make life a lot simpler.

Thanks again,
-M

On Fri, Jul 26, 2013 at 2:20 PM, arun kirshna [via R] <
ml-node+s789695n4672460...@n4.nabble.com> wrote:

> Hi,
> Using the example code without removing CEBPA:
> gset<- read.table("Names.txt",header=TRUE,stringsAsFactors=FALSE)
>  temp1<- read.table("Data.txt",header=TRUE,stringsAsFactors=FALSE)
> lst1<-split(temp1,temp1$Names)
> mat1<-combn(gset[,1],2)
> library(plyr)
> lst2<-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]})
>
> lst3<-lapply(lst2[lapply(lst2,length)==2],function(x) {x1<-
> join_all(x,by="patient_id",type="inner");x2<-x1["patient_id"];row.names(x2)<-if(nrow(x1)!=0)
> paste(x1[,1],x1[,3],1:nrow(x1),sep="_") else NULL;x2 })
>  Reduce(rbind,lst3)
> #   patient_id
> #DNMT3A_FLT3_1 LAML-AB-2811-TB
> #DNMT3A_FLT3_2 LAML-AB-2816-TB
> #DNMT3A_FLT3_3 LAML-AB-2818-TB
> #DNMT3A_IDH1_1 LAML-AB-2802-TB
> #DNMT3A_IDH1_2 LAML-AB-2822-TB
> #DNMT3A_NPM1_1 LAML-AB-2802-TB
> #DNMT3A_NPM1_2 LAML-AB-2809-TB
> #DNMT3A_NPM1_3 LAML-AB-2811-TB
> #DNMT3A_NPM1_4 LAML-AB-2816-TB
> #DNMT3A_NRAS_1 LAML-AB-2816-TB
> #FLT3_NPM1_1   LAML-AB-2811-TB
> #FLT3_NPM1_2   LAML-AB-2812-TB
> #FLT3_NPM1_3   LAML-AB-2816-TB
> #FLT3_NRAS_1   LAML-AB-2816-TB
> #IDH1_NPM1_1   LAML-AB-2802-TB
> #NPM1_NRAS_1   LAML-AB-2816-TB
>
>
>
>
> From your original dataset:
> gset<- read.table("SampleGenes.txt",header=TRUE,stringsAsFactors=FALSE)
> temp0<-
> read.table("LAML-TB.final_analysis_set.txt",header=TRUE,stringsAsFactors=FALSE,sep="\t")
>
>  temp1<- temp0[,c("Hugo_Symbol","firehose_patient_id")]
>  str(temp1)
> #'data.frame':2221 obs. of  2 variables:
> # $ Hugo_Symbol: chr  "TBX15" "TCHHL1" "DNMT3A" "IDH1" ...
> # $ firehose_patient_id: chr  "LAML-AB-2802-TB" "LAML-AB-2802-TB"
> "LAML-AB-2802-TB" "LAML-AB-2802-TB" ...
> lst1<-split(temp1,temp1$Hugo_Symbol)
>  length(lst1)
> #[1] 1607
> mat1<-combn(gset[,1],2) # Generate all
> lst2<-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]})
>
>  length(lst2)
> #[1] 105
>
>
>  lst3<-lapply(lst2[lapply(lst2,length)==2],function(x) {x1<-
> join_all(x,by="firehose_patient_id",type="inner");x2<-x1["firehose_patient_id"];row.names(x2)<-if(nrow(x1)!=0)
> paste(x1[,1],x1[,3],1:nrow(x1),sep="_") else NULL;x2 })
> res<-Reduce(rbind,lst3)
>  nrow(res)
> #[1] 234
> head(res)
> #firehose_patient_id
> #NPM1_FLT3_1 LAML-AB-2811-TB
> #NPM1_FLT3_2 LAML-AB-2812-TB
> #NPM1_FLT3_3 LAML-AB-2816-TB
> #NPM1_FLT3_4 LAML-AB-2818-TB
> #NPM1_FLT3_5 LAML-AB-2825-TB
> #NPM1_FLT3_6 LAML-AB-2836-TB
>
>
>
>
> Regarding your second question:
> setdiff(gset[,1],unique(temp1[,1])) # CEBPA was not found in the temp1[,1]
> #[1] "CEBPA"
> mat2<- combn(gset[-5,1],2)
> vec1<- apply(mat2,2,paste,collapse="_")
> vec2<-unique(gsub("(.*\\_.*)\\_.*","\\1",row.names(res)))
> setdiff(vec1,vec2)
>  #[1] "NPM1_TP53"   "NPM1_EZH2"   "NPM1_RUNX1"  "NPM1_ASXL1"  "NPM1_KDM6A"
>  #[6] "FLT3_TP53"   "FLT3_EZH2"   "FLT3_KRAS"   "FLT3_ASXL1"  "FLT3_KDM6A"
> #[11] "IDH1_TP53"   "IDH1_KRAS"   "NRAS_IDH2"   "NRAS_KRAS"   "NRAS_ASXL1"
> #[16] "NRAS_KDM6A"  "TP53_EZH2"   "TP53_IDH2"   "TP53_RUNX1"  "TP53_KRAS"
> #[21] "TP53_WT1""TP53_ASXL1"  "TP53_KDM6A"  "EZH2_IDH2"   "EZH2_WT1"
> #[26] "EZH2_ASXL1"  "EZH2_KDM6A"  "IDH2_TET2"   "IDH2_KDM6A"
> "RUNX1_KDM6A"
> #[31] "KRAS_WT1""KRAS_KDM6A"  "WT1_ASXL1"   "WT1_TET2""WT1_KDM6A"
> #[36] "ASXL1_TET2"  "ASXL1_KDM6A" "TET2_KDM6A"
> A.K.
>
>
> - Original Message -
> From: arun <[hidden 
> email]>
>
> To: Manisha <[hidden 
> email]>
>
> Cc: R help <[hidden 
> email]>
>
> Sent: Friday, July 26, 2013 11:18 AM
> Subject: Re: Pairwise comparison between columns, logic
>
> Hi Manisha,
> I didn't run your dataset as I am on the way to college.  But, from the
> error reported, I think it will be due to some missing combinations in one
> of the dataset.  For ex. if you run my previous code without removing
> CEBPA:
> ie.
> mat1<- combn(gset[,1],2)
>
>
> lst2<-lapply(split(mat1,col(mat1)),function(x)
> {x1<-join_all(lst1[x],by="patient_id",type="inner");x1["patient_id"] } )
> #Error: All inputs to rbind.fill must be data.frames
>
>
> So, check whether all the combinations are available in the `lst1`.
>
> 2. I will get back to you once I run it.
> A.K.
>
>
>
>
>
> 
> From: Mani

Re: [R] Pairwise comparison between columns, logic

2013-07-26 Thread arun
Hi,
Using the example code without removing CEBPA:
gset<- read.table("Names.txt",header=TRUE,stringsAsFactors=FALSE)
 temp1<- read.table("Data.txt",header=TRUE,stringsAsFactors=FALSE)
lst1<-split(temp1,temp1$Names)
mat1<-combn(gset[,1],2) 
library(plyr)
lst2<-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]})
lst3<-lapply(lst2[lapply(lst2,length)==2],function(x) {x1<- 
join_all(x,by="patient_id",type="inner");x2<-x1["patient_id"];row.names(x2)<-if(nrow(x1)!=0)
 paste(x1[,1],x1[,3],1:nrow(x1),sep="_") else NULL;x2 })
 Reduce(rbind,lst3)
#   patient_id
#DNMT3A_FLT3_1 LAML-AB-2811-TB
#DNMT3A_FLT3_2 LAML-AB-2816-TB
#DNMT3A_FLT3_3 LAML-AB-2818-TB
#DNMT3A_IDH1_1 LAML-AB-2802-TB
#DNMT3A_IDH1_2 LAML-AB-2822-TB
#DNMT3A_NPM1_1 LAML-AB-2802-TB
#DNMT3A_NPM1_2 LAML-AB-2809-TB
#DNMT3A_NPM1_3 LAML-AB-2811-TB
#DNMT3A_NPM1_4 LAML-AB-2816-TB
#DNMT3A_NRAS_1 LAML-AB-2816-TB
#FLT3_NPM1_1   LAML-AB-2811-TB
#FLT3_NPM1_2   LAML-AB-2812-TB
#FLT3_NPM1_3   LAML-AB-2816-TB
#FLT3_NRAS_1   LAML-AB-2816-TB
#IDH1_NPM1_1   LAML-AB-2802-TB
#NPM1_NRAS_1   LAML-AB-2816-TB




From your original dataset:
gset<- read.table("SampleGenes.txt",header=TRUE,stringsAsFactors=FALSE) 
temp0<- 
read.table("LAML-TB.final_analysis_set.txt",header=TRUE,stringsAsFactors=FALSE,sep="\t")
 
 temp1<- temp0[,c("Hugo_Symbol","firehose_patient_id")]
 str(temp1)
#'data.frame':    2221 obs. of  2 variables:
# $ Hugo_Symbol    : chr  "TBX15" "TCHHL1" "DNMT3A" "IDH1" ...
# $ firehose_patient_id: chr  "LAML-AB-2802-TB" "LAML-AB-2802-TB" 
"LAML-AB-2802-TB" "LAML-AB-2802-TB" ...
lst1<-split(temp1,temp1$Hugo_Symbol) 
 length(lst1)
#[1] 1607
mat1<-combn(gset[,1],2) # Generate all
lst2<-lapply(split(mat1,col(mat1)),function(x){lst1[x][all(lapply(lst1[x],length)==2)]})
 
 length(lst2)
#[1] 105


 lst3<-lapply(lst2[lapply(lst2,length)==2],function(x) {x1<- 
join_all(x,by="firehose_patient_id",type="inner");x2<-x1["firehose_patient_id"];row.names(x2)<-if(nrow(x1)!=0)
 paste(x1[,1],x1[,3],1:nrow(x1),sep="_") else NULL;x2 })
res<-Reduce(rbind,lst3)
 nrow(res)
#[1] 234
head(res)
#    firehose_patient_id
#NPM1_FLT3_1 LAML-AB-2811-TB
#NPM1_FLT3_2 LAML-AB-2812-TB
#NPM1_FLT3_3 LAML-AB-2816-TB
#NPM1_FLT3_4 LAML-AB-2818-TB
#NPM1_FLT3_5 LAML-AB-2825-TB
#NPM1_FLT3_6 LAML-AB-2836-TB




Regarding your second question:
setdiff(gset[,1],unique(temp1[,1])) # CEBPA was not found in the temp1[,1]
#[1] "CEBPA"
mat2<- combn(gset[-5,1],2)
vec1<- apply(mat2,2,paste,collapse="_")
vec2<-unique(gsub("(.*\\_.*)\\_.*","\\1",row.names(res)))
setdiff(vec1,vec2)
 #[1] "NPM1_TP53"   "NPM1_EZH2"   "NPM1_RUNX1"  "NPM1_ASXL1"  "NPM1_KDM6A" 
 #[6] "FLT3_TP53"   "FLT3_EZH2"   "FLT3_KRAS"   "FLT3_ASXL1"  "FLT3_KDM6A" 
#[11] "IDH1_TP53"   "IDH1_KRAS"   "NRAS_IDH2"   "NRAS_KRAS"   "NRAS_ASXL1" 
#[16] "NRAS_KDM6A"  "TP53_EZH2"   "TP53_IDH2"   "TP53_RUNX1"  "TP53_KRAS"  
#[21] "TP53_WT1"    "TP53_ASXL1"  "TP53_KDM6A"  "EZH2_IDH2"   "EZH2_WT1"   
#[26] "EZH2_ASXL1"  "EZH2_KDM6A"  "IDH2_TET2"   "IDH2_KDM6A"  "RUNX1_KDM6A"
#[31] "KRAS_WT1"    "KRAS_KDM6A"  "WT1_ASXL1"   "WT1_TET2"    "WT1_KDM6A"  
#[36] "ASXL1_TET2"  "ASXL1_KDM6A" "TET2_KDM6A" 
A.K.


- Original Message -
From: arun 
To: Manisha 
Cc: R help 
Sent: Friday, July 26, 2013 11:18 AM
Subject: Re: Pairwise comparison between columns, logic

Hi Manisha,
I didn't run your dataset as I am on the way to college.  But, from the error 
reported, I think it will be due to some missing combinations in one of the 
dataset.  For ex. if you run my previous code without removing CEBPA:
ie.
mat1<- combn(gset[,1],2)


lst2<-lapply(split(mat1,col(mat1)),function(x) 
{x1<-join_all(lst1[x],by="patient_id",type="inner");x1["patient_id"] } )
#Error: All inputs to rbind.fill must be data.frames


So, check whether all the combinations are available in the `lst1`.

2. I will get back to you once I run it.
A.K.






From: Manisha 
To: arun  
Sent: Friday, July 26, 2013 11:09 AM
Subject: Re: Pairwise comparison between columns, logic



Hi Arun,
I ran the script on a larger dataset and I seem to be running into this 
following error:
Error: All inputs to rbind.fill must be data.frames
after the step;
lst2<-lapply(split(mat1,col(mat1)),function(x) 
{x1<-join_all(lst1[x],by="firehose_patient_id",type="inner");x1["firehose_patient_id"]})
 
I tried a few things to solve the issue but I am not able to. The format of 
input files and data are same as in the code you posted.
Could you suggest me something?
I have attached my input files on which I am trying to run the code. See 
attached code as well. Minor changes have been made by me.

2. I have another question. From your code how do also capture those pairs of 
names that donot have any common patient id?

Thanks again,
-M


On Fri, Jul 26, 2013 at 9:29 AM, arun  wrote:

Hi M,
>No problem.
>Regards,
>Arun
>
>
>
>
>- Original Message -
>From: "manishab...@gmail.com" 
>To: smartpink...@yahoo.com
>

[R] Error: Line starting ' ...' is malformed!

2013-07-26 Thread Ross Boylan
A DESCRIPTION file begins with 0xFFFE and
$ file DESCRIPTION 
DESCRIPTION: Little-endian UTF-16 Unicode text, with CRLF, CR line terminators

I think it was created on Windows.

In R (2,15,1 on Debian GNU/Linux), using roxygen2, I get
> roxygenize("../GitHub/mice")
Error: Line starting '��P ...' is malformed!

Enter a frame number, or 0 to exit   

1: roxygenize("../GitHub/mice")
2: read.description(DESCRIPTION)
3: read.dcf(file)

Selection: 3
Called from: read.description(DESCRIPTION)
Browse[1]> Q

I'm not sure if the first 2 characters after line starting ', which are octal 
377, 376, will survive email; I stripped them out of the subject line.

The files (DESCRIPTION isn't the only one) have also caused trouble for git 
(even on  Windows 7), since it thinks they are binary.

Any advice about what to do?

I'm reluctant to change the format of the files because it's not my package.

Ross Boylan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Externalptr class to character class from (web) scrape

2013-07-26 Thread Duncan Murdoch

By the way, here's a quicker (but slightly more dangerous) way to find the

print.XMLInternalDocument


function:  just call debug(print) before printing the object.  Two steps 
take you to the method, and let you see what it's doing. The "danger" 
comes because now print() will always trigger the debugger, which can be 
a little confusing!  Remember undebug(print) at the end.


Duncan Murdoch


On 26/07/2013 1:26 PM, Duncan Murdoch wrote:

On 26/07/2013 12:43 PM, Nick McClure wrote:
> I'm hitting a wall. When I use the 'scrape' function from the package
> 'scrapeR' to get the pagesource from a web page, I do the following:
> (as an example)
>
> website.doc = parse("http://www.google.com";)
>
> When I look at it, it seems fine:
>
> website.doc[[1]]
>
> This seems to have the information I need.  Then when I try to get it
> into a character vector,
>
> character.website = as.character(website.doc[[1]])
>
> I get the error:
>
> Error in as.vector(x, "character") :
> cannot coerce type 'externalptr' to vector of type 'character'
>
> I'm trying very very hard to wrap my head around how to get this
> external pointer to a character, but after reading many help files, I
> cannot understand how to do this. Any ideas?

You should use str() in cases like this. When I look at
str(website.doc[[1]]) (after producing website.doc with scrape(), not
parse()), I see

  > str(website.doc[[1]])
Classes 'HTMLInternalDocument', 'HTMLInternalDocument',
'XMLInternalDocument', 'XMLAbstractDocument' 
- attr(*, "headers")= Named chr [1:2] "\n302
Moved\n"| __truncated__ ""
..- attr(*, "names")= chr [1:2] "\n302
Moved\n"| __truncated__ ""

So it is an external pointer with a number of classes. One or more of
those will have a print method. methods(print) will list all the print
methods, and I see there's a (hidden) print.XMLInternalDocument method
somewhere. Then

  > getAnywhere("print.XMLInternalDocument")
A single object matching ‘print.XMLInternalDocument’ was found
It was found in the following places
registered S3 method for print from namespace XML
namespace:XML
with value

function (x, ...)
{
cat(as(x, "character"), "\n")
}


shows that the as() generic should work, even though as.character()
doesn't, and indeed as(website.doc[[1]], "character") does display
something.

Duncan Murdoch





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Externalptr class to character class from (web) scrape

2013-07-26 Thread Duncan Murdoch

On 26/07/2013 12:43 PM, Nick McClure wrote:

I'm hitting a wall. When I use the 'scrape' function from the package
'scrapeR' to get the pagesource from a web page, I do the following:
(as an example)

website.doc = parse("http://www.google.com";)

When I look at it, it seems fine:

website.doc[[1]]

This seems to have the information I need.  Then when I try to get it
into a character vector,

character.website = as.character(website.doc[[1]])

I get the error:

Error in as.vector(x, "character") :
cannot coerce type 'externalptr' to vector of type 'character'

I'm trying very very hard to wrap my head around how to get this
external pointer to a character, but after reading many help files, I
cannot understand how to do this. Any ideas?


You should use str() in cases like this. When I look at 
str(website.doc[[1]]) (after producing website.doc with scrape(), not 
parse()), I see


> str(website.doc[[1]])
Classes 'HTMLInternalDocument', 'HTMLInternalDocument', 
'XMLInternalDocument', 'XMLAbstractDocument' 
- attr(*, "headers")= Named chr [1:2] "http-equiv=\"content-type\" 
content=\"text/html;charset=utf-8\">\n302 
Moved\n"| __truncated__ ""
..- attr(*, "names")= chr [1:2] "http-equiv=\"content-type\" 
content=\"text/html;charset=utf-8\">\n302 
Moved\n"| __truncated__ ""


So it is an external pointer with a number of classes. One or more of 
those will have a print method. methods(print) will list all the print 
methods, and I see there's a (hidden) print.XMLInternalDocument method 
somewhere. Then


> getAnywhere("print.XMLInternalDocument")
A single object matching ‘print.XMLInternalDocument’ was found
It was found in the following places
registered S3 method for print from namespace XML
namespace:XML
with value

function (x, ...)
{
cat(as(x, "character"), "\n")
}


shows that the as() generic should work, even though as.character() 
doesn't, and indeed as(website.doc[[1]], "character") does display 
something.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to double integrate a function in R

2013-07-26 Thread David Winsemius

On Jul 26, 2013, at 9:33 AM, David Winsemius wrote:

> fun2 <- function(x,y)   { (x0)* (1/(2*pi))*exp(-y/2)* sqrt((x/(y-x)))}
> persp(outer(0:5, 0:6, fun2) )

There does seem to be some potential pathology at the edges of the range, 
Restricting it to x >= 0.03 removes most of that concern. 

fun2 <- function(x,y)   { (x0)* (1/(2*pi))*exp(-y/2)* sqrt((x/(y-x)))}
persp(outer(seq(0.01,5,by=.01), seq(.02,6,by=.01), fun2) ,ticktype="detailed")


> fun <- function(x)   { (x[1]0)* 
> (1/(2*pi))*exp(-x[2]/2)*if(x[1]>x[2]){0}else{ sqrt((x[1]/(x[2]-x[1])) )}}
>   adaptIntegrate(fun, lower = c(0.03,0.03), upper =c(5, 6), tol=1e-2 )
$integral
[1] 0.7605703

$error
[1] 0.00760384

$functionEvaluations
[1] 190859

$returnCode
[1] 0

I tried decreasing the tolerance to 1e-3 but the wait exceeds the patience I 
have allocated to the problem.

-- 
David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can't figure out why short figure won't work

2013-07-26 Thread Richard M. Heiberger
lattice graphics allow you to specify units in cm or inches, and then the
absolute vertical positioning can be controlled.

On Fri, Jul 26, 2013 at 12:58 PM, Frank Harrell wrote:

> [Sorry I can't quote past messages as I get mail on Nabble and have been
> told I can't reply through Nabble].
>
> Thanks Kennel for recommended a narrowing range for usr y-limits.  That
> does help quite a bit.
>
> But I found a disappointing aspect of the graphics system: When you change
> height= on the device call you have to change the y-coordinates to keep the
> same absolute vertical positioning.
>
> Frank
>
>
> --
> Frank E Harrell Jr Professor and Chairman  School of Medicine
>Department of Biostatistics Vanderbilt University
>
> __**
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/**listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/**
> posting-guide.html 
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multiple interaction terms in GAMM-model

2013-07-26 Thread Gavin Simpson
On Thu, 2013-07-25 at 04:56 -0700, Jeroen wrote:
> Dear all,
> 
> I am trying to correlate a variable tau1 to a set of other variables (x1,
> x2, x3, x4), taking into account an interaction with time ('doy') and place
> ('region'), and taking into account dependency of data in time per object
> ID. My dataset looks like:



> Since the data is dependent in time per objectID, I use a GAMM model with an
> autocorrelation function. Since each variable x1, x2, etc. is dependent on
> time and place, I should incorporate this as well. 
> 
> Therefore I am wondering if the following gamm-model is correct for my
> situation:
> 
> model <- gamm( tau1 ~ te( x1, by= doy ) + te( x1, by= factor( region ) ) +
> ... + te( x4, by= doy ) + te( x4, by= factor( region ) ) + factor( region ),
> correlation= corAR1(form= ~ doy|objectID ), na.action= na.omit ).
> 
> Does anyone know if this is ok? 

The model looks a little odd - the way you've written it you don't need
`te()` smooths. For the interaction with doy, I would try

te(x1, doy, bs = c("cr", "cc"))

assuming x1 and doy are continuous. This specifies cubic splines and
cyclic cubic splines respectively for x1 and doy. I use a cyclic spline
for doy as we might not expect Jan 1st and Dec 31st to be much
different. By separating the interaction between x1 and doy and the one
for x1 and region you are saying that the way the effect of x1 varies
through the year does not change between regions. Is that reasonable?

I think you need something for the objectID - a random effect or a fixed
effect will account for differences in mean values of taul between
objects. A random effect seems most useful otherwise you'll eat into
your degrees of freedom, but you have plenty of those.

The correlation part assumes that the residuals follow an AR(1) applied
within the objects, with each AR(1) having the same coefficient.
Autocorrelation decreases exponentially with lag/separation. It is a
reasonable start. Fitting will be much quicker without this. Could you
try fitting without and then extract the normalised residuals to check
for residual correlation?

I hope you have a lot of memory and a lot of spare time available; my
experience of fitting similarly sized (though not as complex - I had one
*very* long series) was that gamm() used large amounts of RAM (I had
16GB and most was used for a single model) and took many days to fit.
You may find the random effect for object fights with the AR(1) so you
could end up with an odd fit if it fits at all.

You are going to want to turn on verbosity when fitting so you can see
how the optimisation is proceeding. Something like

require(nlme)
## need a control object
ctrl <- lmeControl(msVerbose = TRUE,
   maxIter = 400,
   msMaxIter = 400,
   niterEM = 0, ## this is VERY important for gamm()!
   tolerance = 1e-8,
   msTol = 1e-8,
   msMaxEval = 400)

then add `control = ctrl` to the `gamm()` call.

I would approach the expecting the model to fail to fit so be prepared
to simplify it etc.

Good luck,

G

> Or should I use a model which also includes terms like " te( x1 ) + ... +
> te( x4 )".
> And is the correlation function correct?
> 
> Thanks so much!!
> 
> Jeroen
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Multiple-interaction-terms-in-GAMM-model-tp4672297.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Gavin Simpson, PhD  [t] +1 306 337 8863
Adjunct Professor, Department of Biology[f] +1 306 337 2410
Institute of Environmental Change & Society [e] gavin.simp...@uregina.ca
523 Research and Innovation Centre  [tw] @ucfagls
University of Regina
Regina, SK S4S 0A2, Canada

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can't figure out why short figure won't work

2013-07-26 Thread Frank Harrell
[Sorry I can't quote past messages as I get mail on Nabble and have been 
told I can't reply through Nabble].


Thanks Kennel for recommended a narrowing range for usr y-limits.  That 
does help quite a bit.


But I found a disappointing aspect of the graphics system: When you 
change height= on the device call you have to change the y-coordinates 
to keep the same absolute vertical positioning.


Frank

--
Frank E Harrell Jr Professor and Chairman  School of Medicine
   Department of Biostatistics Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Externalptr class to character class from (web) scrape

2013-07-26 Thread Nick McClure
I'm hitting a wall. When I use the 'scrape' function from the package
'scrapeR' to get the pagesource from a web page, I do the following:
(as an example)

website.doc = parse("http://www.google.com";)

When I look at it, it seems fine:

website.doc[[1]]

This seems to have the information I need.  Then when I try to get it
into a character vector,

character.website = as.character(website.doc[[1]])

I get the error:

Error in as.vector(x, "character") :
cannot coerce type 'externalptr' to vector of type 'character'

I'm trying very very hard to wrap my head around how to get this
external pointer to a character, but after reading many help files, I
cannot understand how to do this. Any ideas?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to double integrate a function in R

2013-07-26 Thread David Winsemius

On Jul 26, 2013, at 8:44 AM, Tiago V. Pereira wrote:

> I am trying to double integrate the following expression:
> 
> #  expression
> (1/(2*pi))*exp(-y2/2)*sqrt((y1/(y2-y1)))
> 
> for y2>y1>0.
> 
> I am trying the following approach
> 
> # first attempt
> 
> library(cubature)
>fun <- function(x)   { (1/(2*pi))*exp(-x[2]/2)*sqrt((x[1]/(x[2]-x[1])))}
>adaptIntegrate(fun, lower = c(0,0), upper =c(5, 6), tol=1e-8)
> 
> However, I don't know how to constrain the integration so that y2>y1>0.

Generally incorporating boundaries is accomplished by multiplying the integrand 
with logical vectors that encapsulate what are effectively two conditions: 
Perhaps:

 fun <- function(x)   { (x[1]0)* (1/(2*pi))*exp(-x[2]/2)* 
sqrt((x[1]/(x[2]-x[1])))}

That was taking quite a long time and I interrupted it. There were quite a few 
warnings of the sort
1: In sqrt((x[1]/(x[2] - x[1]))) : NaNs produced
2: In sqrt((x[1]/(x[2] - x[1]))) : NaNs produced

Thinking the NaNs might sabotage the integration process, I added a conditional 
to the section of that expression that was generating the NaNs. I don't really 
know whether NaN's are excluded from the summation process in adaptIntegrate:

 fun <- function(x)   { (x[1]0)* (1/(2*pi))*exp(-x[2]/2)*
  if(x[1]>x[2]){ 0 }else{ sqrt((x[1]/(x[2]-x[1])) 
)} }
 adaptIntegrate(fun, lower = c(0,0), upper =c(5, 6) )

I still didn't have the patience to wait for an answer, but I did plot the 
function:

fun2 <- function(x,y)   { (x0)* (1/(2*pi))*exp(-y/2)* sqrt((x/(y-x)))}
persp(outer(0:5, 0:6, fun2) )

So at least the function is finite over most of its domain.

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Saving multiple rda-files as one rda-file

2013-07-26 Thread jim holtman
What you need to ensure is that you have sufficient physical memory for the
operations that you want to do.  I would suggest at least 3X the size of
the object you want to create.  If you have RData files that will result
(after the rbind) into a 40GB object, then you will need over 100GB of
physical memory since the rbind operation will be creating a new object of
40GB from the separate files that total 40GB, so that is 80GB right there.
 If you then want to do some operations on an object that large, you might
be making copies, so you need the memory.

Maybe you should consider keeping the data in a relational database and
then use the SELECT operator to get just the data you need.  Also the
aggregation operators might be useful to reduce the size of physical memory
that you need.


On Fri, Jul 26, 2013 at 11:04 AM, David Winsemius wrote:

>
> On Jul 25, 2013, at 7:17 AM, Dark wrote:
>
> > Hi,
> >
> > Yes maybe I should have been more clear on my problem.
> > I want to append the different data-frames back into one variable (
> rbind )
> > and save it as one R Data file.
> >
>
> Indeed. That was the operation I had in mind when I made my suggestions.
> Perhaps you need to create a set of toy dataframes with similar structure
> and then the audience can propose solutions. That's the usual process
> around these parts.
>
> --
> David.
>
>
> > Regards Derk
> >
> >
> >
> > --
> > View this message in context:
> http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672313.html
> > Sent from the R help mailing list archive at Nabble.com.
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maintaining data order in factanal with missing data

2013-07-26 Thread David Carlson
When you use complete.cases(), it creates a logical vector
that selects cases with no missing values, but it does not
change the rownames in the data.frame and those are carried
through to the factor scores so you could link them up that
way. Alternatively, you could use na.exclude as the na.action
in the call to factanal() instead of complete.cases()? That
should pad the output with NAs for the cases that have missing
data.

You have to use the formula version of factanal():

> set.seed(42)
> example <- data.frame(x=rnorm(15), y=rnorm(15), z=rnorm(15))
> to.na <- cbind(sample.int(15, 3), sample.int(3, 3))
> example[to.na] <- NA
> out <- factanal(~x+y+z, 1, data=example, na.action=na.omit,
scores="regression")
> out$scores   # With na.omit the cases with missing values
are gone as indicated
   Factor1 # by the missing row numbers
2  -0.92604879
4   0.10731539
5  -0.24370504
6   0.07357697
7   0.69905895
8  -0.17646575
9   1.58430095
10 -0.35934769
12  1.07671299
13 -1.47487960
14 -0.30235156
15 -0.05816682
> out <- factanal(~x+y+z, 1, data=example,
na.action=na.exclude, scores="regression")
> out$scores   # With na.exclude, the cases are kept out of
the analysis but the rows are
   Factor1 # preserved in the factor scores output
1   NA
2  -0.92604879
3   NA
4   0.10731539
5  -0.24370504
6   0.07357697
7   0.69905895
8  -0.17646575
9   1.58430095
10 -0.35934769
11  NA
12  1.07671299
13 -1.47487960
14 -0.30235156
15 -0.05816682

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

Original Message-
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of PIKAL Petr
Sent: Friday, July 26, 2013 9:06 AM
To: s00123...@myacu.edu.au; 'Justin Delahunty';
r-help@r-project.org
Subject: Re: [R] Maintaining data order in factanal with
missing data

Hi

There are probably better options but

merge(data.frame(x=1:154),data.frame(x=names(ab.1.fa[[1]]),
y=ab.1.fa[[1]]), all.x=T)

gives you data frame with NA when there was missing value in
the first data.frame.

You probably can automate the process a bit with nrow
function.

Regards
Petr



> -Original Message-
> From: Justin Delahunty [mailto:a...@genius.net.au]
> Sent: Friday, July 26, 2013 3:34 PM
> To: PIKAL Petr; 'Justin Delahunty'; 'Justin Delahunty';
r-help@r-
> project.org
> Subject: RE: [R] Maintaining data order in factanal with
missing data
> 
> Hi Petr,
> 
> So sorry, I accidentally attached the complete data set
rather than the
> one with missing values. I've attached the correct file to
this email.
> RE:
> init.dfs() being local, I hadn't even thought of that. I've
been away
> from OOP for close to 15 years now, so it might be time to
revise!
> 
> The problem I have is that with missing values the list of
factor
> scores returned (ab.w1.fa$factor.scores) does not map onto
the
> originating data frame (ab.w1.df) as it no longer includes
the cases
> which had missing values. So while the original data set for
ab.w1.df
> contains 154 ordered cases, the factor analysis contains
only 150.
> 
> I am seeking a way to map the values derived from the factor
analysis
> (ab.w1.fa$factor.scores) back to their original ordered
position, so
> that these factor score variables may be merged back into
the master
> data frame (ab.df). A unique ID for each case is available
($dmid)
> which I had thought to use when merging the new variables,
however I
> don't know how to implement this.
> 
> 
> Thanks for your help,
> 
> Justin
> 
> 
> -Original Message-
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Friday, 26 July 2013 10:59 PM
> To: Justin Delahunty; Justin Delahunty; r-help@r-project.org
> Subject: RE: [R] Maintaining data order in factanal with
missing data
> 
> Hi
> 
> Well, the function init.dfs does nothing as all data frames
created
> inside it does not propagate to global environment and there
is nothing
> what the function returns.
> 
> Tha last line (when used outside a function) gives warnings
but there
> is no sign of error.
> 
> When
> 
> > head(ab.1.df)
>   dmid   g5oab2  g53  g54  g55   g5ovb1
> 11 1.418932 1.805227 2.791152 3.624116 3.425586
> 22 2.293907 1.187830 1.611237 1.748526 3.816533
> 33 2.836536 2.679523 1.279639 2.674986 2.452395
> 44 1.872259 3.278359 1.785872 2.458315 1.146480
> 55 1.467195 1.180747 3.564127 3.007682 2.109506
> 66 3.098512 3.151974 3.969379 3.750571 1.497358
> > head(ab.2.df)
>   dmid   w2oab3  w22  w23  w24   w2ovb1
> 11 4.831362 5.522764 7.809366 6.969172 7.398385
> 22 6.706346 4.101742 1.434697 5.266775 5.357641
> 33 3.653806 2.666885 1.209326 5.125556 4.963374
> 44 7.221255 7.649152 6.540398 6.648506 2.576081
> 55 1.848023 5.044314 2.761881 3.307220 1.454234
> 66 7.606429 4.911766 2.034813 2.638573 2.818834
> > head(ab.3.df)
>   dmid   w3oab3   w3oab4   w3oab7   w3oab8   

Re: [R] X matrix deemed to be singular and cbind

2013-07-26 Thread Soumitro Dey
Thanks for your response. You are right - I am new to R and it's
terminologies. I will follow up on your suggestions.




On Fri, Jul 26, 2013 at 11:22 AM, Bert Gunter wrote:

> Soumitro:
>
> Have you read "An Introduction to R." If not, do so, as some of your
> confusion appears related to basic concepts (e.g. of factors)
> explained there.
>
> 1. Presumably your categorical variables are factors, not character.
> If so, when you cbind() them, you cbind their integer codes, yielding
> numerical variables. This produces an in incorrect design matrix in
> fitting -- 1 df per categorical variable instead of 1 less than the
> number of levels. Also see ?cbind.
>
> 2. Produces the correct design matrix, but you are overfitting,
> presumably because of many different levels for your categorical
> variables. I suggest you consult with a local statistician to decide
> how best to handle this, as you seem to be out of your depth with
> regard to model fitting.
>
> ... unless I have misunderstood, of course.
>
> Cheers,
> Bert
>
> On Fri, Jul 26, 2013 at 7:55 AM, Soumitro Dey 
> wrote:
> > Hi list,
> >
> > While the "X matrix deemed to be singular" question has been answered in
> > the list for quite a few times, I have a twist to it.
> >
> > I am using the coxph model for survival analysis on a dataset containing
> > over 160,000 instances and 46 independent variables and I have 2
> scenarios:
> >
> > 1. If I use cbind on the 46 independent variables (many of which are
> > categorical), coxph runs without any frills. The problem however is that
> it
> > won't report which of the categorical variables (e.g. VERY HIGH, HIGH,
> > NEUTRAL, LOW or VERY LOW) are actually meaningful/significant(e.g. XHIGH
> > ***, XLOW ., etc). Is there any way to check this?
> >
> > 2. If I don't use cbind, assuming it'll give me the details I am looking
> > for in the previous step, it throws me the "X matrix deemed to be
> > singular", more precisely: "X matrix deemed to be singular; variable 130
> > 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148
> 149
> > 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167
> 168
> > 169 170 171 172 173 174 175 176 177 178 179 180 181"
> >
> > Could anyone please elaborate on how to get around problem #1 or #2?
> >
> > Thanks!
> > SD
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
>
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cannot install XML package - getting "cannot open URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' error

2013-07-26 Thread Jeff Newmiller
I have no trouble accessing either of those files. Keep in mind that servers do 
not have 100% uptime (and the people responsible for uptime do not monitor this 
forum), and any local network problems you may be having cannot be addressed 
here. So, in general you need to try more servers, confirm the file you want 
does exist, verify that your local network is functioning, and practice a 
little patience.
---
Jeff NewmillerThe .   .  Go Live...
DCN:Basics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

"Grach, Alexander"  wrote:
>Hello All, 
>
> 
>
>I am not able to install/update XML package, getting the following:
>
> 
>
>> install.packages("XML", lib = .libPaths()[3])
>
>trying URL
>'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip'
>
>Error in download.file(url, destfile, method, mode = "wb", ...) : 
>
>  cannot open URL
>'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip'
>
>In addition: Warning message:
>
>In download.file(url, destfile, method, mode = "wb", ...) :
>
>  cannot open: HTTP status was '503 Service Unavailable'
>
>Warning in download.packages(pkgs, destdir = tmpd, available =
>available,  :
>
>  download of package 'XML' failed
>
> 
>
>Also have tried to download  the XML_3.98-1.1.zip from
>http://cran.r-project.org/web/packages/XML/index.html - getting an
>error:
>
>"Unable to download XML_3.98-1.1.zip from cran.r-project.org
>
>Unable to open this internet site. The requested site is either
>unavailable or cannot be found. Please try again later."
>
> 
>
>Using R 3.0.1 on windows and able to install/update other packages
>fine.
>
> 
>
>Please help,
>
> 
>
>Many thanks,
>
>Alex
>
> 
>
> 
>
>
>=
>
>This material has been prepared by individual sales and/or trading
>personnel and does not 
>constitute investment research.  Please follow the attached hyperlink
>to an important disclaimer: 
>http://www.credit-suisse.com/americas/legal/salestrading 
>=
>
>
>
>===
>
>Please access the attached hyperlink for an important
>el...{{dropped:8}}
>
>__
>R-help@r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] transform dataframe with look-up table

2013-07-26 Thread Juan Antonio Balbuena

   Hello
   First thank you very much to Jean, Bill, Brian and David for the answers and
   code. I very extremely grateful.
   I am eventually going to adapt Brian's code with a very minor alteration. If
   one follows the original syntax
End <- merge(merge(Start, transformer, by.x="Left", by.y="input", 
all.x=TRUE),
  transformer, by.x="Right", by.y="input", all.x=TRUE)

   the "Left" variables are listed right and vice versa, which seems odd. Just
   swapping "Left" and "Right" will put it right (pun intentional):
End <- merge(merge(Start, transformer, by.x="Right", by.y="input", 
all.x=TRUE),
  transformer, by.x="Left", by.y="input", all.x=TRUE)

   Best wishes
   Juan Antonio

   El 25/07/2013 18:02, Brian Diggs escribió:

On 7/25/2013 8:13 AM, Juan Antonio Balbuena wrote:

Hello
I hope that there is a simple solution to this apparently complex problem.
Any help will be much appreciated:
I have a dataframe with Left and Right readings (that is, elements in each
row are paired). For instance,
Left Right
 [1]  98
 [2]  43
 [3]  21
 [4]  65
 [5]  31
 [6]  41
 [7]  32
 [8]  42
 [9]  10   8
[10]  9   10
I  need  to  produce a new data frame where the values are transformed
according to a look-up table such as
inputoutput
 [1] 5  1
 [2]10 1
 [3] 4  2
 [4] 8  3
 [5] 6  5
 [6] 5  6
 [7] 7  6
 [8] 2  7
 [9] 9  7
[10]107
[11] 2 8
So  [1, ] in the new dataframe would be 7 3. Quite simple so far, but what
makes things complicated is the multiple outputs for a single input. In thi
s
example, 10 corresponds to 1 and 7 so [9, ] in the input dataframe must
yield two rows in its output counterpart: 1 3 and 7 3. Likewise the output
for  [10, ] should be 7 1 and 7 7. In addition, given that 3 and 1 are
missing as inputs the output for [5, ] should be NA NA.
Thank you very much for your time.
Juan Antonio Balbuena

merge can handle both of these requirements.

First, making the two datasets reproducible:

Start <- data.frame(Left=c(9,4,2,6,3,4,3,4,10,9),
 Right=c(8,3,1,5,1,1,2,2,8,10))

transformer <- data.frame(input=c(5,10,4,8,6,5,7,2,9,10,2),
   output=c(1,1,2,3,5,6,6,7,7,7,8))

Then add a marker of the original row numbers so that the work can be 
checked more easily later (not really needed for the calculations):

Start$rownum <- seq_len(nrow(Start))

Two merge statements with the columns specified and all.x set to TRUE 
(to keep cases even without a match):

End <- merge(merge(Start, transformer, by.x="Left", by.y="input", 
all.x=TRUE),
  transformer, by.x="Right", by.y="input", all.x=TRUE)

Then we can look at the output, resorted by the original row numbers:

End[order(End$rownum),]

which gives

Right Left rownum output.x output.y
12 89  173
9  34  22   NA
1  12  37   NA
2  12  38   NA
10 56  456
11 56  451
3  13  5   NA   NA
4  14  62   NA
5  23  7   NA7
6  23  7   NA8
7  24  827
8  24  828
13 8   10  913
14 8   10  973
15109 1071
16109 1077


--

Dr. Juan A. Balbuena
Marine Zoology Unit
Cavanilles Institute of Biodiversity and Evolutionary Biology
University of
Valencia
[1][1]http://www.uv.es/~balbuena
P.O. Box 22085
[2][2]http://www.uv.es/cavanilles/zoomarin/index.htm
46071 Valencia, Spain
[3][3]http://cetus.uv.es/mullpardb/index.html
e-mail: [[4]4]j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543
 733

NOTE! For shipments by EXPRESS COURIER use the following street address:
C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.


References

1. [5]http://www.uv.es/%7Ebalbuena
2. [6]http://www.uv.es/cavanilles/zoomarin/index.htm
3. [7]http://cetus.uv.es/mullpardb/index.html
4. [8]mailto:j.a.balbu...@uv.es


   --

   Dr. Juan A. Balbuena
   Marine Zoology Unit
   Cavanilles Institute of Biodiversity and Evolutionary Biology
   University of
   Valencia
   [9]http://www.uv.es/~balbuena
   P.O. Box 22085
   [10]http://www.uv.es/cavanilles/zoomarin/index.htm
   46071 Valencia, Spain
   [11]http://cetus.uv.es/mullpardb/index.html
   e-mail: [12]j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543
   733
   +

[R] How to double integrate a function in R

2013-07-26 Thread Tiago V. Pereira
Hello, R users!

I am trying to double integrate the following expression:

#  expression
(1/(2*pi))*exp(-y2/2)*sqrt((y1/(y2-y1)))

for y2>y1>0.

I am trying the following approach

# first attempt

 library(cubature)
fun <- function(x)   { (1/(2*pi))*exp(-x[2]/2)*sqrt((x[1]/(x[2]-x[1])))}
adaptIntegrate(fun, lower = c(0,0), upper =c(5, 6), tol=1e-8)

However, I don't know how to constrain the integration so that y2>y1>0.

Any ideas?

Tiago

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Holt-Winters problem

2013-07-26 Thread Berend Hasselman

On 26-07-2013, at 16:18, Przemek Gawin  wrote:

> This post has NOT been accepted by the mailing list yet.
> This post was updated on Jul 26, 2013; 4:40pm.
> Hello,
> 
> Two days ago I started my journey with R, and right now I'm struck on
> Holtwinters, and have no idea, what I'm doing wrong. So now, I'll show
> step by step what I've been doing. Actually it is a part of "Introducy
> Time Series with R".
> 
> 1) I download data from
> http://www.massey.ac.nz/~pscowper/ts/motororg.dat due to port
> blockage. Save it as motororg.txt
> 
> 2) Inserting this code:
> Motor.dat <- read.table("motororg.txt", header = T)
> attach(Motor.dat)
> Comp.hw1 <- HoltWinters(complaints, beta = 0, gamma = 0)
> 
> 3) Baang! Error here.
> "Error in decompose(ts(x[1L:wind], start = start(x), frequency = f),
> seasonal) :
> time series has no or less than 2 periods"
> 

Your complaints vector is not a timeseries. 

>From the section Arguments of HoltWinters:

gamma   parameter used for the seasonal component. If set to FALSE, an 
non-seasonal model is fitted.

So use gamma=FALSE.

Don't use attach().
Use Motor.dat$complaints in the HoltWinter call 

or

Comp.hw1 <- with(Motor.dat, HoltWinters(complaints,  beta = 0, gamma = FALSE))


Berend

> 4) Actually it has, and I turned it on with attach command?
> 
> Motor.dat
>   complaints
> 1  27
> 2  34
> 3  31
> 4  24
> 5  18
> 6  19
> 7  17
> 8  12
> 9  26
> 10 14
> 11 18
> 12 33
> 13 31
> 14 31
> 15 19
> 16 17
> 17 10
> 18 25
> 19 26
> 20 18
> 21 18
> 22 10
> 23  4
> 24 20
> 25 28
> 26 21
> 27 18
> 28 23
> 29 19
> 30 16
> 31 10
> 32 24
> 33 11
> 34 11
> 35 13
> 36 27
> 37 18
> 38 20
> 39 21
> 40  6
> 41  9
> 42 29
> 43 12
> 44 19
> 45 14
> 46 19
> 47 16
> 48 23
> 
> Can you tell me what I am doing wrong?
> 
> PS
> 
> I tried to make complaints.ts as well but I didn't work either.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hmisc ctable rotate option obsolete?

2013-07-26 Thread Simon Zehnder
Dear R-Users and R-Devels,

I may have found a deprecated option for the 'latex' function in the Hmisc 
package. I am working with Hmisc and knitr and tried the following code:

 \documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage{amsmath, amssymb}
\usepackage{ctable}
%\usepackage{booktabs}
\begin{document}
<>=
library(Hmisc)
iris.t <- head(iris)
iris.t[seq(2, NROW(iris.t), by = 2),] <- format(iris.t[seq(2, NROW(iris.t), by 
= 2),], scientific = TRUE)
texMat <- matrix("", ncol = 5, nrow = 6)
texMat[seq(2,nrow(texMat), by = 2), ] <- "scriptsize"
latex(iris.t, 
file = '', 
landscape = TRUE,
dcolumn = TRUE,
col.just = c('r','c', 'r','c', 'l'),
cdec = c(0, 0, 1, 1, 0),
na.blank = TRUE,
rowname = '',
rowlabel = '', 
cellTexCmd = texMat,
ctable = TRUE, 
cgroup = c('Observations', ''),
n.cgroup = c(4, 1),
rgroup = c('',''),
n.rgroup = c(3, 3),
caption = 'iris'
)
@
\end{document}

Everything runs fine but the 'landscape' option. It says in the help for 
'latex' that if option 'ctable' is set to TRUE the 'rotate' option for ctable 
is used if 'nadscape' is set TRUE. Looking at the ctable documentary 
(http://texdoc.net/texmf-dist/doc/latex/ctable/ctable.pdf) in section Change 
History, I get for version v1.07: General: Added option sideways, option rotate 
now obsolete. Hasn't this been updated in the Hmisc package?

Best

Simon

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Repeated measures Cox regression ??coxph??

2013-07-26 Thread John Sorkin
Marc,
Thank you for your comments. The data has been previously collected, so the 
study is a non-concurrent prospective analysis, i.e. retrospective analysis.
John


John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 
>>> Marc Schwartz  07/26/13 11:13 AM >>>
Thanks Terry. 

Good points. I recalled last night some exchanges on r-sig-mixed-models 
regarding a reasonable number of 'replications' for the estimation of random 
effects and it occurred to me that with this study, you will have 0, 1 or 2 
events per subject, depending upon the subject risk profiles for hip 
replacement and length of follow up. 

It was not clear to me if John's cohort study is retrospective or prospective. 
If the former, then he will have some insights into the event distribution. If 
the latter and he needs to pre-specify the analytic method, a GEE style 
approach using coxph() may make more sense here given the unknowns.

Regards,

Marc
 
On Jul 26, 2013, at 7:02 AM, Terry Therneau  wrote:

> Two choices.
> If this were a linear model, do you like the GEE approach or a mixed effects 
> approach?  Assume that "subject" is a variable containing a per-subject 
> identifier.
> 
> GEE approach: add "+ cluster(subject)" to the model statement in coxph
> Mixed models approach: Add " + (1|subject)" to the model statment in coxme.
> 
> When only a very few subjects have multiple events, the mixed model (random 
> effect) approach may not be reliable, however.  Multiple events per group are 
> the fuel for estimation of the variance of the random effect, and with few of 
> these the profile likelihood of the random effect will be very flat.  You can 
> get esssentially a random estimate of the variance of the "subject effect".  
> I'm still getting my arms around this issue, and it has taken me a long time.
> 
> "Frailty" is an alternate label for "random effects when all we have is a 
> random intercept".  Multiple labels for the same idea adds confusion, but 
> nothing else.
> 
> Terry Therneau
> 
> On 07/25/2013 08:14 PM, Marc Schwartz wrote:
>> On Jul 25, 2013, at 4:45 PM, David Winsemius  wrote:
>> 
>>> On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote:
>>> 
 On Jul 25, 2013, at 2:11 PM, John Sorkin  
 wrote:
 
> Colleagues,
> Is there any R package that will allow one to perform a repeated measures 
> Cox Proportional Hazards regression? I don't think coxph is set up to 
> handle this type of problem, but I would be happy to know that I am not 
> correct.
> I am doing a study of time to hip joint replacement. As each person has 
> two hips, a given person can appear in the dataset twice, once for the 
> left hip and once for the right hip, and I need to account for the 
> correlation of data from a single individual.
> Thank you,
> John
 
 
 John,
 
 See Terry's 'coxme' package:
 
 http://cran.r-project.org/web/packages/coxme/index.html
 
>>> When I looked over the description of coxme, I was concerned it was not 
>>> really designed with this in mind. Looking at Therneau and Grambsch, I 
>>> thought section 8.4.2 in the 'Multiple Events per Subject' Chapter fit the 
>>> analysis question well. There they compared the use of coxph( 
>>> ...+cluster(ID),,...)  withcoxph( ...+strata(ID),,...). Unfortunately I 
>>> could not tell for sure which one was being described as superio but I 
>>> think it was the cluster() alternative. I seem to remember there are 
>>> discussions in the archives.
>> 
>> David,
>> 
>> I think that you raise a good point. The example in the book (I had to wait 
>> to get home to read it) is potentially different however, in that the 
>> subject's eye's were randomized to treatment or control, which would seem to 
>> suggest comparable baseline characteristics for each pair of eyes, as well 
>> as an active intervention on one side where a difference in treatment effect 
>> between each eye is being analyzed.
>> 
>> It is not clear from John's description above if there is one hip that will 
>> be treated versus one as a control and whether the extent of disease at 
>> baseline is similar in each pair of hips. Presumably the timing of hip 
>> replacements will be staggered at some level, even if there is comparable 
>> disease, simply due to post-op recovery time and surgical risk. In cases 
>> where the disease between each hip is materially different, that would be 
>> another factor to consider, however I would defer to orthopaedic 
>> physicians/surgeons from a subject matter expertise consideration. It is 
>> possible that the bilateral hip replacement data might be more of a parallel 
>> to bilateral breast cancer data, if each breast w

Re: [R] X matrix deemed to be singular and cbind

2013-07-26 Thread Bert Gunter
Soumitro:

Have you read "An Introduction to R." If not, do so, as some of your
confusion appears related to basic concepts (e.g. of factors)
explained there.

1. Presumably your categorical variables are factors, not character.
If so, when you cbind() them, you cbind their integer codes, yielding
numerical variables. This produces an in incorrect design matrix in
fitting -- 1 df per categorical variable instead of 1 less than the
number of levels. Also see ?cbind.

2. Produces the correct design matrix, but you are overfitting,
presumably because of many different levels for your categorical
variables. I suggest you consult with a local statistician to decide
how best to handle this, as you seem to be out of your depth with
regard to model fitting.

... unless I have misunderstood, of course.

Cheers,
Bert

On Fri, Jul 26, 2013 at 7:55 AM, Soumitro Dey  wrote:
> Hi list,
>
> While the "X matrix deemed to be singular" question has been answered in
> the list for quite a few times, I have a twist to it.
>
> I am using the coxph model for survival analysis on a dataset containing
> over 160,000 instances and 46 independent variables and I have 2 scenarios:
>
> 1. If I use cbind on the 46 independent variables (many of which are
> categorical), coxph runs without any frills. The problem however is that it
> won't report which of the categorical variables (e.g. VERY HIGH, HIGH,
> NEUTRAL, LOW or VERY LOW) are actually meaningful/significant(e.g. XHIGH
> ***, XLOW ., etc). Is there any way to check this?
>
> 2. If I don't use cbind, assuming it'll give me the details I am looking
> for in the previous step, it throws me the "X matrix deemed to be
> singular", more precisely: "X matrix deemed to be singular; variable 130
> 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149
> 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168
> 169 170 171 172 173 174 175 176 177 178 179 180 181"
>
> Could anyone please elaborate on how to get around problem #1 or #2?
>
> Thanks!
> SD
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pairwise comparison between columns, logic

2013-07-26 Thread arun
Hi Manisha,
I didn't run your dataset as I am on the way to college.  But, from the error 
reported, I think it will be due to some missing combinations in one of the 
dataset.  For ex. if you run my previous code without removing CEBPA:
ie.
mat1<- combn(gset[,1],2)


lst2<-lapply(split(mat1,col(mat1)),function(x) 
{x1<-join_all(lst1[x],by="patient_id",type="inner");x1["patient_id"] } )
#Error: All inputs to rbind.fill must be data.frames


So, check whether all the combinations are available in the `lst1`.

2. I will get back to you once I run it.
A.K.






From: Manisha 
To: arun  
Sent: Friday, July 26, 2013 11:09 AM
Subject: Re: Pairwise comparison between columns, logic



Hi Arun,
I ran the script on a larger dataset and I seem to be running into this 
following error:
Error: All inputs to rbind.fill must be data.frames
after the step;
lst2<-lapply(split(mat1,col(mat1)),function(x) 
{x1<-join_all(lst1[x],by="firehose_patient_id",type="inner");x1["firehose_patient_id"]})
 
I tried a few things to solve the issue but I am not able to. The format of 
input files and data are same as in the code you posted.
Could you suggest me something?
I have attached my input files on which I am trying to run the code. See 
attached code as well. Minor changes have been made by me.

2. I have another question. From your code how do also capture those pairs of 
names that donot have any common patient id?

Thanks again,
-M


On Fri, Jul 26, 2013 at 9:29 AM, arun  wrote:

Hi M,
>No problem.
>Regards,
>Arun
>
>
>
>
>- Original Message -
>From: "manishab...@gmail.com" 
>To: smartpink...@yahoo.com
>Cc:
>Sent: Friday, July 26, 2013 9:27 AM
>Subject: Re: Pairwise comparison between columns, logic
>
>Thanks for the code. It is elegant and does what I need. Learnt some new 
>things.
>-M
>
>
>_
>Sent from http://r.789695.n4.nabble.com
>  

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Repeated measures Cox regression ??coxph??

2013-07-26 Thread Marc Schwartz
Thanks Terry. 

Good points. I recalled last night some exchanges on r-sig-mixed-models 
regarding a reasonable number of 'replications' for the estimation of random 
effects and it occurred to me that with this study, you will have 0, 1 or 2 
events per subject, depending upon the subject risk profiles for hip 
replacement and length of follow up. 

It was not clear to me if John's cohort study is retrospective or prospective. 
If the former, then he will have some insights into the event distribution. If 
the latter and he needs to pre-specify the analytic method, a GEE style 
approach using coxph() may make more sense here given the unknowns.

Regards,

Marc
 
On Jul 26, 2013, at 7:02 AM, Terry Therneau  wrote:

> Two choices.
> If this were a linear model, do you like the GEE approach or a mixed effects 
> approach?  Assume that "subject" is a variable containing a per-subject 
> identifier.
> 
> GEE approach: add "+ cluster(subject)" to the model statement in coxph
> Mixed models approach: Add " + (1|subject)" to the model statment in coxme.
> 
> When only a very few subjects have multiple events, the mixed model (random 
> effect) approach may not be reliable, however.  Multiple events per group are 
> the fuel for estimation of the variance of the random effect, and with few of 
> these the profile likelihood of the random effect will be very flat.  You can 
> get esssentially a random estimate of the variance of the "subject effect".  
> I'm still getting my arms around this issue, and it has taken me a long time.
> 
> "Frailty" is an alternate label for "random effects when all we have is a 
> random intercept".  Multiple labels for the same idea adds confusion, but 
> nothing else.
> 
> Terry Therneau
> 
> On 07/25/2013 08:14 PM, Marc Schwartz wrote:
>> On Jul 25, 2013, at 4:45 PM, David Winsemius  wrote:
>> 
>>> On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote:
>>> 
 On Jul 25, 2013, at 2:11 PM, John Sorkin  
 wrote:
 
> Colleagues,
> Is there any R package that will allow one to perform a repeated measures 
> Cox Proportional Hazards regression? I don't think coxph is set up to 
> handle this type of problem, but I would be happy to know that I am not 
> correct.
> I am doing a study of time to hip joint replacement. As each person has 
> two hips, a given person can appear in the dataset twice, once for the 
> left hip and once for the right hip, and I need to account for the 
> correlation of data from a single individual.
> Thank you,
> John
 
 
 John,
 
 See Terry's 'coxme' package:
 
 http://cran.r-project.org/web/packages/coxme/index.html
 
>>> When I looked over the description of coxme, I was concerned it was not 
>>> really designed with this in mind. Looking at Therneau and Grambsch, I 
>>> thought section 8.4.2 in the 'Multiple Events per Subject' Chapter fit the 
>>> analysis question well. There they compared the use of coxph( 
>>> ...+cluster(ID),,...)  withcoxph( ...+strata(ID),,...). Unfortunately I 
>>> could not tell for sure which one was being described as superio but I 
>>> think it was the cluster() alternative. I seem to remember there are 
>>> discussions in the archives.
>> 
>> David,
>> 
>> I think that you raise a good point. The example in the book (I had to wait 
>> to get home to read it) is potentially different however, in that the 
>> subject's eye's were randomized to treatment or control, which would seem to 
>> suggest comparable baseline characteristics for each pair of eyes, as well 
>> as an active intervention on one side where a difference in treatment effect 
>> between each eye is being analyzed.
>> 
>> It is not clear from John's description above if there is one hip that will 
>> be treated versus one as a control and whether the extent of disease at 
>> baseline is similar in each pair of hips. Presumably the timing of hip 
>> replacements will be staggered at some level, even if there is comparable 
>> disease, simply due to post-op recovery time and surgical risk. In cases 
>> where the disease between each hip is materially different, that would be 
>> another factor to consider, however I would defer to orthopaedic 
>> physicians/surgeons from a subject matter expertise consideration. It is 
>> possible that the bilateral hip replacement data might be more of a parallel 
>> to bilateral breast cancer data, if each breast were to be tracked 
>> separately.
>> 
>> I have cc'd Terry here, hoping that he might jump in and offer some insights 
>> into the pros/cons of using coxme versus coxph with either a cluster or 
>> strata based approach, or perhaps even a frailty based approach as in 9.4.1 
>> in the book.
>> 
>> Regards,
>> 
>> Marc
>> 
>> 
>>> -- 
>>> David.
 You also might find the following of interest:
 
 http://bjo.bmj.com/content/71/9/645.full.pdf
 
 http://www.ncbi.nlm.nih.gov/pubmed/6885
 
 http://www.ncbi.nlm.nih.go

[R] cannot install XML package - getting "cannot open URL 'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip' error

2013-07-26 Thread Grach, Alexander
Hello All, 

 

I am not able to install/update XML package, getting the following:

 

> install.packages("XML", lib = .libPaths()[3])

trying URL
'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip'

Error in download.file(url, destfile, method, mode = "wb", ...) : 

  cannot open URL
'http://cran.cnr.Berkeley.edu/bin/windows/contrib/3.0/XML_3.98-1.1.zip'

In addition: Warning message:

In download.file(url, destfile, method, mode = "wb", ...) :

  cannot open: HTTP status was '503 Service Unavailable'

Warning in download.packages(pkgs, destdir = tmpd, available =
available,  :

  download of package 'XML' failed

 

Also have tried to download  the XML_3.98-1.1.zip from
http://cran.r-project.org/web/packages/XML/index.html - getting an
error:

"Unable to download XML_3.98-1.1.zip from cran.r-project.org

Unable to open this internet site. The requested site is either
unavailable or cannot be found. Please try again later."

 

Using R 3.0.1 on windows and able to install/update other packages fine.

 

Please help,

 

Many thanks,

Alex

 

 


=
 
This material has been prepared by individual sales and/or trading personnel 
and does not 
constitute investment research.  Please follow the attached hyperlink to an 
important disclaimer: 
http://www.credit-suisse.com/americas/legal/salestrading 
=
 


=== 
Please access the attached hyperlink for an important el...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Holt-Winters problem

2013-07-26 Thread Przemek Gawin
This post has NOT been accepted by the mailing list yet.
This post was updated on Jul 26, 2013; 4:40pm.
Hello,

Two days ago I started my journey with R, and right now I'm struck on
Holtwinters, and have no idea, what I'm doing wrong. So now, I'll show
step by step what I've been doing. Actually it is a part of "Introducy
Time Series with R".

1) I download data from
http://www.massey.ac.nz/~pscowper/ts/motororg.dat due to port
blockage. Save it as motororg.txt

2) Inserting this code:
Motor.dat <- read.table("motororg.txt", header = T)
attach(Motor.dat)
Comp.hw1 <- HoltWinters(complaints, beta = 0, gamma = 0)

3) Baang! Error here.
"Error in decompose(ts(x[1L:wind], start = start(x), frequency = f),
seasonal) :
 time series has no or less than 2 periods"

4) Actually it has, and I turned it on with attach command?

Motor.dat
   complaints
1  27
2  34
3  31
4  24
5  18
6  19
7  17
8  12
9  26
10 14
11 18
12 33
13 31
14 31
15 19
16 17
17 10
18 25
19 26
20 18
21 18
22 10
23  4
24 20
25 28
26 21
27 18
28 23
29 19
30 16
31 10
32 24
33 11
34 11
35 13
36 27
37 18
38 20
39 21
40  6
41  9
42 29
43 12
44 19
45 14
46 19
47 16
48 23

Can you tell me what I am doing wrong?

PS

I tried to make complaints.ts as well but I didn't work either.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Differential problem

2013-07-26 Thread Raphaëlle Carraud
Hello,

I am trying to solve the following algebraic differential equation :

dy1 = h - dy3
dy2 = g - dy4
y3 = K1*y1/(y3+y4)
y4 = K2*y2/(y3+y4)

I tried using the function daspk, but it gives me the following error :

> out <- daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0)
daspk--  warning.. At T(=R1) and stepsize H (=R2) the
  nonlinear solver failed to converge
  repeatedly of with abs (H) = HMIN &g, 0
Warning messages:
1: In daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0) :
  repeated convergence test failures on a step - inaccurate Jacobian or 
preconditioner?
2: In daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0) :
  Returning early. Results are accurate, as far as they go
> head(out)
 time   12 3 4  5  6 7 8 9
[1,]0 0.9 1.33 0 0 10 20 1e-04 0.001 0.001
[2,]0 0.9 1.33 0 0 10 20 1e-04 0.001 0.001

I read the documentation but it only says that daspk can only solve DAE of 
index one maximum, which I do not understand how to determine. I was wondering 
if I had to use the function radau but I do not get how to do the M matrix? I 
did not managed to get it with the example.
Could it be an error in my program? Here it is :

liquide <- function(z, C, dC, parameters) {
  with(as.list(c(C,dC,parameters)),{

Ct = C[1] + C[2] + C[3] + C[4] + C[5] + C[6] + C[7] + C[8] + C[9] + Ceau
V <- 1

K32 <- 6.54*10^2  # m^3/mol
K33 <- 1.37*10^-2 # m^3/mol
K34 <- 0.330  # sans unité
K35 <- 5.81*10^1  # sans unité

kf2 <- 1.37   # m^3/mol
kf3 <- 1.37   # m^3/mol
kf4 <- 8.68*10^-1 # m^3
kf5 <- 157.2*10^-6# m^3

K2 <- 10^1.44*10^3# mol/m^3
K3 <- 10^(-3.224)*10^3# mol/m^3
Ke <- 10^-11  # mol/m^3

r1 <- kf4*C[4]/V - kf4/K34*C[5]^2/(Ct*V)
r2 <- kf3*C[1]*C[2] - kf3/K33*C[4]
r3 <- kf2*C[1]^2 - kf2/K32*C[3]
r4 <- kf5*C[3]/V - kf5/K35*C[5]*C[6]*C[8]/(Ct^2*V)

res1 <- dC[1] + r2 + r3 # dNO2/dt
res2 <- dC[2] + r2  # dNO/dt
res3 <- dC[3] - r3 + r4 # dN2O4/dt
res4 <- dC[4] - r2 + r1 # dN2O3/dt
res5 <- dC[5] -2*r1 - r4 + dC[7]# dHNO2/dt
res6 <- dC[6] - r4 + dC[8]  # dHNO3/dt
res7 <- C[7] - K3*C[5]/C[9] # dNO2-/dt
res8 <- C[8] - K2*C[6]/C[9] # dNO3-/dt
res9 <- dC[9] - dC[8] - dC[7]   # dCH/dt

list(c(res1, res2, res3, res4, res5, res6, res7, res8, res9))
  })
}

yini <- c(0.9, 1.33, 0, 0, 10,  20, 0.001, 22.9, 23)
dyini <- c(1,1,1,1,1,1,1,1,1)

Qm <- 4   # kg/h
Ceau <- (Qm - yini[1]*0.046 - yini[2]*0.03 + yini[3]*0.092 + yini[4]*0.076 +
   yini[5]*0.062 + yini[6]*0.063)/0.018# mol/m^3

parameters <- c(Qm = Qm,
Ceau = Ceau)

liquide(20,yini, dyini,parameters)

z <- seq(0, 120, by = 1)  # en seconde

library(deSolve)
out <- daspk(y = yini, dy = dyini, times = z, res = liquide, parms = 0)
head(out)
plot(out)

Thanks

Raphaëlle Carraud

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Saving multiple rda-files as one rda-file

2013-07-26 Thread David Winsemius

On Jul 25, 2013, at 7:17 AM, Dark wrote:

> Hi, 
> 
> Yes maybe I should have been more clear on my problem.
> I want to append the different data-frames back into one variable ( rbind )
> and save it as one R Data file.
> 

Indeed. That was the operation I had in mind when I made my suggestions. 
Perhaps you need to create a set of toy dataframes with similar structure and 
then the audience can propose solutions. That's the usual process around these 
parts.

-- 
David.


> Regards Derk
> 
> 
> 
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Saving-multiple-rda-files-as-one-rda-file-tp4672041p4672313.html
> Sent from the R help mailing list archive at Nabble.com.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] X matrix deemed to be singular and cbind

2013-07-26 Thread Soumitro Dey
Hi list,

While the "X matrix deemed to be singular" question has been answered in
the list for quite a few times, I have a twist to it.

I am using the coxph model for survival analysis on a dataset containing
over 160,000 instances and 46 independent variables and I have 2 scenarios:

1. If I use cbind on the 46 independent variables (many of which are
categorical), coxph runs without any frills. The problem however is that it
won't report which of the categorical variables (e.g. VERY HIGH, HIGH,
NEUTRAL, LOW or VERY LOW) are actually meaningful/significant(e.g. XHIGH
***, XLOW ., etc). Is there any way to check this?

2. If I don't use cbind, assuming it'll give me the details I am looking
for in the previous step, it throws me the "X matrix deemed to be
singular", more precisely: "X matrix deemed to be singular; variable 130
131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149
150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168
169 170 171 172 173 174 175 176 177 178 179 180 181"

Could anyone please elaborate on how to get around problem #1 or #2?

Thanks!
SD

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prediction survival curves for coxph-models; how to extract the right strata per individual

2013-07-26 Thread Terry Therneau
It would help me to give advice if I knew what you wanted to do with the new curves.  
Plot, print, extract?


A more direct solution to your question will appear in the next release of the 
code, btw.

Terry T.


On 07/25/2013 05:00 AM, r-help-requ...@r-project.org wrote:

My problem is:



I have a coxph.model with several strata and other covariables.

Now I want to fit the estimated survival-curves for new data, using
survfit.coxph.

But this returns a prediction for each stratum per individual. So if I
have 15 new individuals and 10 strata, I have 150 fitted surivival curves
(or if I don't use the subscripts I have 15 predictions with the curves
for all strata pasted together)



Is there any possibility to get only the survival curves for the stratum
the new individual belongs to?




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maintaining data order in factanal with missing data

2013-07-26 Thread PIKAL Petr
Hi

There are probably better options but

merge(data.frame(x=1:154),data.frame(x=names(ab.1.fa[[1]]), y=ab.1.fa[[1]]), 
all.x=T)

gives you data frame with NA when there was missing value in the first 
data.frame.

You probably can automate the process a bit with nrow function.

Regards
Petr



> -Original Message-
> From: Justin Delahunty [mailto:a...@genius.net.au]
> Sent: Friday, July 26, 2013 3:34 PM
> To: PIKAL Petr; 'Justin Delahunty'; 'Justin Delahunty'; r-help@r-
> project.org
> Subject: RE: [R] Maintaining data order in factanal with missing data
> 
> Hi Petr,
> 
> So sorry, I accidentally attached the complete data set rather than the
> one with missing values. I've attached the correct file to this email.
> RE:
> init.dfs() being local, I hadn't even thought of that. I've been away
> from OOP for close to 15 years now, so it might be time to revise!
> 
> The problem I have is that with missing values the list of factor
> scores returned (ab.w1.fa$factor.scores) does not map onto the
> originating data frame (ab.w1.df) as it no longer includes the cases
> which had missing values. So while the original data set for ab.w1.df
> contains 154 ordered cases, the factor analysis contains only 150.
> 
> I am seeking a way to map the values derived from the factor analysis
> (ab.w1.fa$factor.scores) back to their original ordered position, so
> that these factor score variables may be merged back into the master
> data frame (ab.df). A unique ID for each case is available ($dmid)
> which I had thought to use when merging the new variables, however I
> don't know how to implement this.
> 
> 
> Thanks for your help,
> 
> Justin
> 
> 
> -Original Message-
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Friday, 26 July 2013 10:59 PM
> To: Justin Delahunty; Justin Delahunty; r-help@r-project.org
> Subject: RE: [R] Maintaining data order in factanal with missing data
> 
> Hi
> 
> Well, the function init.dfs does nothing as all data frames created
> inside it does not propagate to global environment and there is nothing
> what the function returns.
> 
> Tha last line (when used outside a function) gives warnings but there
> is no sign of error.
> 
> When
> 
> > head(ab.1.df)
>   dmid   g5oab2  g53  g54  g55   g5ovb1
> 11 1.418932 1.805227 2.791152 3.624116 3.425586
> 22 2.293907 1.187830 1.611237 1.748526 3.816533
> 33 2.836536 2.679523 1.279639 2.674986 2.452395
> 44 1.872259 3.278359 1.785872 2.458315 1.146480
> 55 1.467195 1.180747 3.564127 3.007682 2.109506
> 66 3.098512 3.151974 3.969379 3.750571 1.497358
> > head(ab.2.df)
>   dmid   w2oab3  w22  w23  w24   w2ovb1
> 11 4.831362 5.522764 7.809366 6.969172 7.398385
> 22 6.706346 4.101742 1.434697 5.266775 5.357641
> 33 3.653806 2.666885 1.209326 5.125556 4.963374
> 44 7.221255 7.649152 6.540398 6.648506 2.576081
> 55 1.848023 5.044314 2.761881 3.307220 1.454234
> 66 7.606429 4.911766 2.034813 2.638573 2.818834
> > head(ab.3.df)
>   dmid   w3oab3   w3oab4   w3oab7   w3oab8   w3ovb1
> 11 5.835609 6.108220 6.587721 2.451461 2.785467
> 22 4.973198 1.196815 6.388056 1.110877 4.226463
> 33 3.800367 6.697287 5.235345 6.666829 6.319073
> 44 1.093141 1.43 2.269252 3.194978 4.916342
> 55 1.975060 7.204516 4.825435 1.775874 3.484027
> 66 3.273361 2.243805 5.326547 5.720892 6.118723
> >
> 
> > str(ab.1.fa)
> List of 2
>  $ rescaled.scores: Named num [1:154] 3.43 3.83 2.43 1.1 2.08 ...
>   ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
>  $ factor.loadings: Named num [1:5] -0.0106 -0.0227 -0.1093 -0.0912
> 0.9975
>   ..- attr(*, "names")= chr [1:5] "g5oab2" "g53" "g54" "g55" ...
> > str(ab.2.fa)
> List of 2
>  $ rescaled.scores: Named num [1:154] 6.34 5.24 5.3 1.91 2.16 ...
>   ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
>  $ factor.loadings: Named num [1:5] -0.2042 0.0063 -0.2287 -0.0119
> 0.7138
>   ..- attr(*, "names")= chr [1:5] "w2oab3" "w22" "w23" "w24" ...
> > str(ab.3.fa)
> List of 2
>  $ rescaled.scores: Named num [1:154] NaN NaN NaN NaN NaN NaN NaN NaN
> NaN NaN ...
>   ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
>  $ factor.loadings: Named num [1:5] -0.1172 0.0128 -0.0968 0.106 0.9975
>   ..- attr(*, "names")= chr [1:5] "w3oab3" "w3oab4" "w3oab7" "w3oab8"
> ...
> 
> Anyway I have no idea what you consider wrong?
> 
> Regards
> Petr
> 
> 
> 
> > -Original Message-
> > From: Justin Delahunty [mailto:a...@genius.net.au]
> > Sent: Friday, July 26, 2013 2:22 PM
> > To: PIKAL Petr; 'Justin Delahunty'; r-help@r-project.org
> > Subject: RE: [R] Maintaining data order in factanal with missing data
> >
> > Hi Petr,
> >
> > Thanks for the quick response. Unfortunately I cannot share the data
> I
> > am working with, however please find attached a suitable R workspace
> > with generated data. It has the appropriate variable names, only the
> > data has been changed.
> >
> > The last function in the list (init.df

Re: [R] Maintaining data order in factanal with missing data

2013-07-26 Thread Justin Delahunty
Hi Petr,

So sorry, I accidentally attached the complete data set rather than the one
with missing values. I've attached the correct file to this email. RE:
init.dfs() being local, I hadn't even thought of that. I've been away from
OOP for close to 15 years now, so it might be time to revise!

The problem I have is that with missing values the list of factor scores
returned (ab.w1.fa$factor.scores) does not map onto the originating data
frame (ab.w1.df) as it no longer includes the cases which had missing
values. So while the original data set for ab.w1.df contains 154 ordered
cases, the factor analysis contains only 150.

I am seeking a way to map the values derived from the factor analysis
(ab.w1.fa$factor.scores) back to their original ordered position, so that
these factor score variables may be merged back into the master data frame
(ab.df). A unique ID for each case is available ($dmid) which I had thought
to use when merging the new variables, however I don't know how to implement
this.


Thanks for your help,

Justin


-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz] 
Sent: Friday, 26 July 2013 10:59 PM
To: Justin Delahunty; Justin Delahunty; r-help@r-project.org
Subject: RE: [R] Maintaining data order in factanal with missing data

Hi

Well, the function init.dfs does nothing as all data frames created inside
it does not propagate to global environment and there is nothing what the
function returns.

Tha last line (when used outside a function) gives warnings but there is no
sign of error.

When 

> head(ab.1.df)
  dmid   g5oab2  g53  g54  g55   g5ovb1
11 1.418932 1.805227 2.791152 3.624116 3.425586
22 2.293907 1.187830 1.611237 1.748526 3.816533
33 2.836536 2.679523 1.279639 2.674986 2.452395
44 1.872259 3.278359 1.785872 2.458315 1.146480
55 1.467195 1.180747 3.564127 3.007682 2.109506
66 3.098512 3.151974 3.969379 3.750571 1.497358
> head(ab.2.df)
  dmid   w2oab3  w22  w23  w24   w2ovb1
11 4.831362 5.522764 7.809366 6.969172 7.398385
22 6.706346 4.101742 1.434697 5.266775 5.357641
33 3.653806 2.666885 1.209326 5.125556 4.963374
44 7.221255 7.649152 6.540398 6.648506 2.576081
55 1.848023 5.044314 2.761881 3.307220 1.454234
66 7.606429 4.911766 2.034813 2.638573 2.818834
> head(ab.3.df)
  dmid   w3oab3   w3oab4   w3oab7   w3oab8   w3ovb1
11 5.835609 6.108220 6.587721 2.451461 2.785467
22 4.973198 1.196815 6.388056 1.110877 4.226463
33 3.800367 6.697287 5.235345 6.666829 6.319073
44 1.093141 1.43 2.269252 3.194978 4.916342
55 1.975060 7.204516 4.825435 1.775874 3.484027
66 3.273361 2.243805 5.326547 5.720892 6.118723
>

> str(ab.1.fa)
List of 2
 $ rescaled.scores: Named num [1:154] 3.43 3.83 2.43 1.1 2.08 ...
  ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
 $ factor.loadings: Named num [1:5] -0.0106 -0.0227 -0.1093 -0.0912 0.9975
  ..- attr(*, "names")= chr [1:5] "g5oab2" "g53" "g54" "g55" ...
> str(ab.2.fa)
List of 2
 $ rescaled.scores: Named num [1:154] 6.34 5.24 5.3 1.91 2.16 ...
  ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
 $ factor.loadings: Named num [1:5] -0.2042 0.0063 -0.2287 -0.0119 0.7138
  ..- attr(*, "names")= chr [1:5] "w2oab3" "w22" "w23" "w24" ...
> str(ab.3.fa)
List of 2
 $ rescaled.scores: Named num [1:154] NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN ...
  ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
 $ factor.loadings: Named num [1:5] -0.1172 0.0128 -0.0968 0.106 0.9975
  ..- attr(*, "names")= chr [1:5] "w3oab3" "w3oab4" "w3oab7" "w3oab8" ...

Anyway I have no idea what you consider wrong?

Regards
Petr



> -Original Message-
> From: Justin Delahunty [mailto:a...@genius.net.au]
> Sent: Friday, July 26, 2013 2:22 PM
> To: PIKAL Petr; 'Justin Delahunty'; r-help@r-project.org
> Subject: RE: [R] Maintaining data order in factanal with missing data
> 
> Hi Petr,
> 
> Thanks for the quick response. Unfortunately I cannot share the data I 
> am working with, however please find attached a suitable R workspace 
> with generated data. It has the appropriate variable names, only the 
> data has been changed.
> 
> The last function in the list (init.dfs()) I call to subset the 
> overall data set into the three waves, then conduct the factor 
> analysis on each
> (1 factor CFA); it's just in a function to ease re-typing in a new 
> workspace.
> 
> 
> Thanks,
> 
> Justin
> 
> -Original Message-
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Friday, 26 July 2013 7:35 PM
> To: Justin Delahunty; r-help@r-project.org
> Subject: RE: [R] Maintaining data order in factanal with missing data
> 
> Hi
> 
> You provided functions, so far so good. But without data it would be 
> quite difficult to understand what the functions do and where could be 
> the issue.
> 
> I suspect combination of complete cases selection together with subset 
> and factor behaviour. But I can be completely out of target too.
> 
> Petr
> 
> > -

Re: [R] Maintaining data order in factanal with missing data

2013-07-26 Thread Justin Delahunty
Hi Petr,

Thanks for the quick response. Unfortunately I cannot share the data I am
working with, however please find attached a suitable R workspace with
generated data. It has the appropriate variable names, only the data has
been changed.

The last function in the list (init.dfs()) I call to subset the overall data
set into the three waves, then conduct the factor analysis on each (1 factor
CFA); it's just in a function to ease re-typing in a new workspace.


Thanks,

Justin

-Original Message-
From: PIKAL Petr [mailto:petr.pi...@precheza.cz] 
Sent: Friday, 26 July 2013 7:35 PM
To: Justin Delahunty; r-help@r-project.org
Subject: RE: [R] Maintaining data order in factanal with missing data

Hi

You provided functions, so far so good. But without data it would be quite
difficult to understand what the functions do and where could be the issue.

I suspect combination of complete cases selection together with subset and
factor behaviour. But I can be completely out of target too.

Petr

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- 
> project.org] On Behalf Of s00123...@myacu.edu.au
> Sent: Friday, July 26, 2013 9:35 AM
> To: r-help@r-project.org
> Subject: [R] Maintaining data order in factanal with missing data
> 
> Hi,
> 
> 
> 
> I'm new to R, so sorry if this is a simple answer. I'm currently 
> trying to collapse some ordinal variables into a composite; the 
> program ideally should take a data frame as input, perform a factor 
> analysis, compute factor scores, sds, etc., and return the rescaled 
> scores and loadings. The difficulty I'm having is that my data set 
> contains a number of NA, which I am excluding from the analysis using 
> complete.cases(), and thus the incomplete cases are "skipped". These 
> functions are for a longitudinal data set with repeated waves of data, 
> so the final rescaled scores from each wave need to be saved as 
> variables grouped by a unique ID (DMID). The functions I'm trying to 
> implement are as follows:
> 
> 
> 
> weighted.sd<-function(x,w){
> 
> sum.w<-sum(w)
> 
> sum.w2<-sum(w^2)
> 
> mean.w<-sum(x*w)/sum(w)
> 
> 
> x.sd.w<-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2))
> 
> return(x.sd.w)
> 
> }
> 
> 
> 
> re.scale<-function(f.scores, raw.data, loadings){
> 
> 
> fz.scores<-(f.scores+mean(f.scores))/(sd(f.scores))
> 
> 
> means<-apply(raw.data,1,weighted.mean,w=loadings)
> 
> 
> sds<-apply(raw.data,1,weighted.sd,w=loadings)
> 
> grand.mean<-mean(means)
> 
> grand.sd<-mean(sds)
> 
> 
> final.scores<-((fz.scores*grand.sd)+grand.mean)
> 
> return(final.scores)
> 
> }
> 
> 
> 
> get.scores<-function(data){
> 
> 
> fact<-
> factanal(data[complete.cases(data),],factors=1,scores="regression")
> 
> f.scores<-fact$scores[,1]
> 
> f.loads<-fact$loadings[,1]
> 
> rescaled.scores<-re.scale(f.scores,
> data[complete.cases(data),], f.loads)
> 
> output.list<-list(rescaled.scores,
> f.loads)
> 
> names(output.list)<- 
> c("rescaled.scores",
> "factor.loadings")
> 
> return(output.list)
> 
> }
> 
> 
> 
> init.dfs<-function(){
> 
> 
> ab.1.df<-subset(ab.df,,select=c(dmid,g5oab2:g5ovb1))
> 
> 
> ab.2.df<-subset(ab.df,,select=c(dmid,w2oab3:w2ovb1))
> 
> ab.3.df<-subset(ab.df,,select=c(dmid,
> w3oab3, w3oab4, w3oab7, w3oab8, w3ovb1))
> 
> 
> 
> ab.1.fa<-get.scores(ab.1.df[-1])
> 
> ab.2.fa<-get.scores(ab.2.df[-1])
> 
> ab.3.fa<-get.scores(ab.3.df[-1])
> 
> 
> }
> 
> 
> 
> Thanks for your help,
> 
> 
> 
> Justin
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html and provide commented, minimal, self-contained, 
> reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help on carrying forward several vectors of rownames

2013-07-26 Thread John Kane
Can you supply some code and data? At the moment the question in not clear, or 
not expressed in R terminology.  Maybe I'm just not awake yet but 
"I tried to insert these vectors as a list, e.g. row.names=c(1,4), from the 
excel file"
does not seem to make any sense at all probably because 'list' has a special 
meaning in R and I don't think that is what you mean. if it is, my appologies.


Perhaps have a look at
https://github.com/hadley/devtools/wiki/Reproducibility
 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

John Kane
Kingston ON Canada


> -Original Message-
> From: theobrom...@gmail.com
> Sent: Thu, 25 Jul 2013 22:00:32 -0700
> To: r-help@r-project.org
> Subject: [R] help on carrying forward several vectors of rownames
> 
> Dear reader,
> I'm trying to use multiple vectors as rownames for the read.table
> function.
> I tried to insert these vectors as a list, e.g. row.names=c(1,4), from
> the
> excel file. I tried other ways, as if the argument only took continuous
> tab
> separated
> columns, e.g. row.names=c(1:4). But, this also did not work.
> Is there a solution to this problem?
> Regards,
> --
> Franklin
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


FREE 3D EARTH SCREENSAVER - Watch the Earth right on your desktop!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to split two levels several times?

2013-07-26 Thread arun
It would be better to wrap it in a function.

fun1<- function(x,colName,N,value){
rl<- rle(as.character(x[,colName]))
dat<-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1<-if(rl$values[i]==value
 & (rl$lengths[i]%/%N>1)) rep(N,rl$lengths[i]%/%N) else 
rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])}))
lst1<-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1)
vec1<-sapply(lst1,max)
vec2<-c(1,vec1[-length(vec1)]+1)
res<-  lapply(seq_along(lst1),function(i) {x1<-lst1[[i]]; 
x[seq(vec2[i],max(x1)),]})
res
}

fun1(XXX,"electrode",6,"electrode4")
#Using previous dataset XXX, XXX1, XXX2
fun1(XXX,"electrode",3,"electrode1")

fun1(XXX1,"electrode",3,"electrode1")

fun1(XXX2,"electrode",3,"electrode1")
A.K.

- Original Message -
From: arun 
To: "dennis1...@gmx.net" 
Cc: R help ; Rui Barradas 
Sent: Friday, July 26, 2013 9:26 AM
Subject: Re: [R] How to split two levels several times?



Hi Dennis,
I guess in this case, instead of "Eletrode1" occuring 3 times, it is 
"Electrode4" exists only 6 times.  If that is the situation:
just change:
XXX: data
rl<-rle(as.character(XXX$electrode))
 
dat<-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1<-if(rl$values[i]=="electrode4"
 & (rl$lengths[i]%/%6>1)) rep(6,rl$lengths[i]%/%6) else 
rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])}))
 lst1<-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1)
vec1<-sapply(lst1,max)
vec2<-c(1,vec1[-length(vec1)]+1)
res<-  lapply(seq_along(lst1),function(i) {x1<-lst1[[i]]; 
XXX[seq(vec2[i],max(x1)),]})
res
[[1]]
   electrode length
1 electrode1    206
2 electrode1    194
3 electrode1    182
4 electrode1    172
5 electrode1    169
6 electrode2 82
7 electrode2 78
8 electrode2 70
9 electrode2 58

[[2]]
    electrode length
10 electrode1    206
11 electrode1    194
12 electrode1    182
13 electrode1    172
14 electrode1    169
15 electrode3    260
16 electrode3    176
17 electrode3    137

[[3]]
    electrode length
18 electrode1    206
19 electrode1    194
20 electrode1    182
21 electrode1    172
22 electrode1    169
23 electrode4 86
24 electrode4 66
25 electrode4 64
26 electrode4 52
27 electrode4 27
28 electrode4 26

[[4]]
    electrode length
29 electrode2 82
30 electrode2 78
31 electrode2 70
32 electrode2 58
33 electrode1    206
34 electrode1    194
35 electrode1    182
36 electrode1    172
37 electrode1    169

[[5]]
    electrode length
38 electrode2 82
39 electrode2 78
40 electrode2 70
41 electrode2 58
42 electrode3    260
43 electrode3    176
44 electrode3    137

[[6]]
    electrode length
45 electrode2 82
46 electrode2 78
47 electrode2 70
48 electrode2 58
49 electrode4 86
50 electrode4 66
51 electrode4 64
52 electrode4 52
53 electrode4 27
54 electrode4 26

[[7]]
    electrode length
55 electrode3    260
56 electrode3    176
57 electrode3    137
58 electrode1    206
59 electrode1    194
60 electrode1    182
61 electrode1    172
62 electrode1    169

[[8]]
    electrode length
63 electrode3    260
64 electrode3    176
65 electrode3    137
66 electrode2 82
67 electrode2 78
68 electrode2 70
69 electrode2 58

[[9]]
    electrode length
70 electrode3    260
71 electrode3    176
72 electrode3    137
73 electrode4 86
74 electrode4 66
75 electrode4 64
76 electrode4 52
77 electrode4 27
78 electrode4 26

[[10]]
    electrode length
79 electrode4 86
80 electrode4 66
81 electrode4 64
82 electrode4 52
83 electrode4 27
84 electrode4 26
85 electrode1    206
86 electrode1    194
87 electrode1    182
88 electrode1    172
89 electrode1    169

[[11]]
    electrode length
90 electrode4 86
91 electrode4 66
92 electrode4 64
93 electrode4 52
94 electrode4 27
95 electrode4 26
96 electrode2 82
97 electrode2 78
98 electrode2 70
99 electrode2 58

[[12]]
 electrode length
100 electrode4 86
101 electrode4 66
102 electrode4 64
103 electrode4 52
104 electrode4 27
105 electrode4 26
106 electrode3    260
107 electrode3    176
108 electrode3    137


A.K.




- Original Message -
From: "dennis1...@gmx.net" 
To: Rui Barradas ; r-help@r-project.org
Cc: 
Sent: Friday, July 26, 2013 6:07 AM
Subject: Re: [R] How to split two levels several times?

Hi Rui & Arun,
really thanks for investing so much time to deal with this problem! The code 
works now for this specific example. However it is not generally robust for 
slightly different situations. For instance it cannot correctly handle a slight 
variation of the table where I have again 4 types of electrodes of certain 
lengths. Electrode4 exists only 6 times. At the transition of the combinations 
3-4 and 4-1 there are 12 times electrode4 which stick together in the output 
$`9`. This leads to wrong splittings thereafter. Sorry for asking such tricky 
questions.

New table XXX

electrode    length
electrode1    206
electrode1    194
electrode1  

Re: [R] How to split two levels several times?

2013-07-26 Thread arun


Hi Dennis,
I guess in this case, instead of "Eletrode1" occuring 3 times, it is 
"Electrode4" exists only 6 times.  If that is the situation:
just change:
XXX: data
rl<-rle(as.character(XXX$electrode))
 
dat<-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1<-if(rl$values[i]=="electrode4"
 & (rl$lengths[i]%/%6>1)) rep(6,rl$lengths[i]%/%6) else 
rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])}))
 lst1<-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1)
vec1<-sapply(lst1,max)
vec2<-c(1,vec1[-length(vec1)]+1)
res<-  lapply(seq_along(lst1),function(i) {x1<-lst1[[i]]; 
XXX[seq(vec2[i],max(x1)),]})
res
[[1]]
   electrode length
1 electrode1    206
2 electrode1    194
3 electrode1    182
4 electrode1    172
5 electrode1    169
6 electrode2 82
7 electrode2 78
8 electrode2 70
9 electrode2 58

[[2]]
    electrode length
10 electrode1    206
11 electrode1    194
12 electrode1    182
13 electrode1    172
14 electrode1    169
15 electrode3    260
16 electrode3    176
17 electrode3    137

[[3]]
    electrode length
18 electrode1    206
19 electrode1    194
20 electrode1    182
21 electrode1    172
22 electrode1    169
23 electrode4 86
24 electrode4 66
25 electrode4 64
26 electrode4 52
27 electrode4 27
28 electrode4 26

[[4]]
    electrode length
29 electrode2 82
30 electrode2 78
31 electrode2 70
32 electrode2 58
33 electrode1    206
34 electrode1    194
35 electrode1    182
36 electrode1    172
37 electrode1    169

[[5]]
    electrode length
38 electrode2 82
39 electrode2 78
40 electrode2 70
41 electrode2 58
42 electrode3    260
43 electrode3    176
44 electrode3    137

[[6]]
    electrode length
45 electrode2 82
46 electrode2 78
47 electrode2 70
48 electrode2 58
49 electrode4 86
50 electrode4 66
51 electrode4 64
52 electrode4 52
53 electrode4 27
54 electrode4 26

[[7]]
    electrode length
55 electrode3    260
56 electrode3    176
57 electrode3    137
58 electrode1    206
59 electrode1    194
60 electrode1    182
61 electrode1    172
62 electrode1    169

[[8]]
    electrode length
63 electrode3    260
64 electrode3    176
65 electrode3    137
66 electrode2 82
67 electrode2 78
68 electrode2 70
69 electrode2 58

[[9]]
    electrode length
70 electrode3    260
71 electrode3    176
72 electrode3    137
73 electrode4 86
74 electrode4 66
75 electrode4 64
76 electrode4 52
77 electrode4 27
78 electrode4 26

[[10]]
    electrode length
79 electrode4 86
80 electrode4 66
81 electrode4 64
82 electrode4 52
83 electrode4 27
84 electrode4 26
85 electrode1    206
86 electrode1    194
87 electrode1    182
88 electrode1    172
89 electrode1    169

[[11]]
    electrode length
90 electrode4 86
91 electrode4 66
92 electrode4 64
93 electrode4 52
94 electrode4 27
95 electrode4 26
96 electrode2 82
97 electrode2 78
98 electrode2 70
99 electrode2 58

[[12]]
 electrode length
100 electrode4 86
101 electrode4 66
102 electrode4 64
103 electrode4 52
104 electrode4 27
105 electrode4 26
106 electrode3    260
107 electrode3    176
108 electrode3    137


A.K.




- Original Message -
From: "dennis1...@gmx.net" 
To: Rui Barradas ; r-help@r-project.org
Cc: 
Sent: Friday, July 26, 2013 6:07 AM
Subject: Re: [R] How to split two levels several times?

Hi Rui & Arun,
really thanks for investing so much time to deal with this problem! The code 
works now for this specific example. However it is not generally robust for 
slightly different situations. For instance it cannot correctly handle a slight 
variation of the table where I have again 4 types of electrodes of certain 
lengths. Electrode4 exists only 6 times. At the transition of the combinations 
3-4 and 4-1 there are 12 times electrode4 which stick together in the output 
$`9`. This leads to wrong splittings thereafter. Sorry for asking such tricky 
questions.

New table XXX

electrode    length
electrode1    206
electrode1    194
electrode1    182
electrode1    172
electrode1    169
electrode2    82
electrode2    78
electrode2    70
electrode2    58
electrode1    206
electrode1    194
electrode1    182
electrode1    172
electrode1    169
electrode3    260
electrode3    176
electrode3    137
electrode1    206
electrode1    194
electrode1    182
electrode1    172
electrode1    169
electrode4    86
electrode4    66
electrode4    64
electrode4    52
electrode4    27
electrode4    26
electrode2    82
electrode2    78
electrode2    70
electrode2    58
electrode1    206
electrode1    194
electrode1    182
electrode1    172
electrode1    169
electrode2    82
electrode2    78
electrode2    70
electrode2    58
electrode3    260
electrode3    176
electrode3    137
electrode2    82
electrode2    78
electrode2    70
electrode2    58
electrode4    86
electrode4    66
electrode4    64
electrode4    52
electrode4    27
electrode4    26
electrode3   

[R] modeest with non-numeric data?

2013-07-26 Thread Tom Hopper
Hello,

I have recently discovered the modeest library, and am trying to understand
how to use it with non-numeric data (e.g. determining the most common last
name, or analysing customer demographics ​by zip code).

I have the mlv() function working for numeric (double and integer) data,
but it throws either an error or a warning and produces unexpected output
with character data. Any help is appreciated.

A simple example:


> my.rand.letters <- sample(letters, size=100, replace=TRUE)

> mlv(my.rand.letters, mode=C("discrete"))Error in match.arg(x, .distribList) : 
> 'arg' must be of length 1In addition: There were 21 warnings (use warnings() 
> to see them)



> mlv(as.factor(my.rand.letters))Mode (most frequent value): NA NA
Bickel's modal skewness: -2
Call: mlv.factor(x = as.factor(my.rand.letters)) Warning message:In
discrete(x, ...) : NAs introduced by coercion



TIA,

Tom

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Maintaining data order in factanal with missing data

2013-07-26 Thread PIKAL Petr
Hi

Well, the function init.dfs does nothing as all data frames created inside it 
does not propagate to global environment and there is nothing what the function 
returns.

Tha last line (when used outside a function) gives warnings but there is no 
sign of error.

When 

> head(ab.1.df)
  dmid   g5oab2  g53  g54  g55   g5ovb1
11 1.418932 1.805227 2.791152 3.624116 3.425586
22 2.293907 1.187830 1.611237 1.748526 3.816533
33 2.836536 2.679523 1.279639 2.674986 2.452395
44 1.872259 3.278359 1.785872 2.458315 1.146480
55 1.467195 1.180747 3.564127 3.007682 2.109506
66 3.098512 3.151974 3.969379 3.750571 1.497358
> head(ab.2.df)
  dmid   w2oab3  w22  w23  w24   w2ovb1
11 4.831362 5.522764 7.809366 6.969172 7.398385
22 6.706346 4.101742 1.434697 5.266775 5.357641
33 3.653806 2.666885 1.209326 5.125556 4.963374
44 7.221255 7.649152 6.540398 6.648506 2.576081
55 1.848023 5.044314 2.761881 3.307220 1.454234
66 7.606429 4.911766 2.034813 2.638573 2.818834
> head(ab.3.df)
  dmid   w3oab3   w3oab4   w3oab7   w3oab8   w3ovb1
11 5.835609 6.108220 6.587721 2.451461 2.785467
22 4.973198 1.196815 6.388056 1.110877 4.226463
33 3.800367 6.697287 5.235345 6.666829 6.319073
44 1.093141 1.43 2.269252 3.194978 4.916342
55 1.975060 7.204516 4.825435 1.775874 3.484027
66 3.273361 2.243805 5.326547 5.720892 6.118723
>

> str(ab.1.fa)
List of 2
 $ rescaled.scores: Named num [1:154] 3.43 3.83 2.43 1.1 2.08 ...
  ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
 $ factor.loadings: Named num [1:5] -0.0106 -0.0227 -0.1093 -0.0912 0.9975
  ..- attr(*, "names")= chr [1:5] "g5oab2" "g53" "g54" "g55" ...
> str(ab.2.fa)
List of 2
 $ rescaled.scores: Named num [1:154] 6.34 5.24 5.3 1.91 2.16 ...
  ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
 $ factor.loadings: Named num [1:5] -0.2042 0.0063 -0.2287 -0.0119 0.7138
  ..- attr(*, "names")= chr [1:5] "w2oab3" "w22" "w23" "w24" ...
> str(ab.3.fa)
List of 2
 $ rescaled.scores: Named num [1:154] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 
...
  ..- attr(*, "names")= chr [1:154] "1" "2" "3" "4" ...
 $ factor.loadings: Named num [1:5] -0.1172 0.0128 -0.0968 0.106 0.9975
  ..- attr(*, "names")= chr [1:5] "w3oab3" "w3oab4" "w3oab7" "w3oab8" ...

Anyway I have no idea what you consider wrong?

Regards
Petr



> -Original Message-
> From: Justin Delahunty [mailto:a...@genius.net.au]
> Sent: Friday, July 26, 2013 2:22 PM
> To: PIKAL Petr; 'Justin Delahunty'; r-help@r-project.org
> Subject: RE: [R] Maintaining data order in factanal with missing data
> 
> Hi Petr,
> 
> Thanks for the quick response. Unfortunately I cannot share the data I
> am working with, however please find attached a suitable R workspace
> with generated data. It has the appropriate variable names, only the
> data has been changed.
> 
> The last function in the list (init.dfs()) I call to subset the overall
> data set into the three waves, then conduct the factor analysis on each
> (1 factor CFA); it's just in a function to ease re-typing in a new
> workspace.
> 
> 
> Thanks,
> 
> Justin
> 
> -Original Message-
> From: PIKAL Petr [mailto:petr.pi...@precheza.cz]
> Sent: Friday, 26 July 2013 7:35 PM
> To: Justin Delahunty; r-help@r-project.org
> Subject: RE: [R] Maintaining data order in factanal with missing data
> 
> Hi
> 
> You provided functions, so far so good. But without data it would be
> quite difficult to understand what the functions do and where could be
> the issue.
> 
> I suspect combination of complete cases selection together with subset
> and factor behaviour. But I can be completely out of target too.
> 
> Petr
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > project.org] On Behalf Of s00123...@myacu.edu.au
> > Sent: Friday, July 26, 2013 9:35 AM
> > To: r-help@r-project.org
> > Subject: [R] Maintaining data order in factanal with missing data
> >
> > Hi,
> >
> >
> >
> > I'm new to R, so sorry if this is a simple answer. I'm currently
> > trying to collapse some ordinal variables into a composite; the
> > program ideally should take a data frame as input, perform a factor
> > analysis, compute factor scores, sds, etc., and return the rescaled
> > scores and loadings. The difficulty I'm having is that my data set
> > contains a number of NA, which I am excluding from the analysis using
> > complete.cases(), and thus the incomplete cases are "skipped". These
> > functions are for a longitudinal data set with repeated waves of
> data,
> > so the final rescaled scores from each wave need to be saved as
> > variables grouped by a unique ID (DMID). The functions I'm trying to
> > implement are as follows:
> >
> >
> >
> > weighted.sd<-function(x,w){
> >
> > sum.w<-sum(w)
> >
> > sum.w2<-sum(w^2)
> >
> > mean.w<-sum(x*w)/sum(w)
> >
> >
> > x.sd.w<-sqrt((

Re: [R] How to split two levels several times?

2013-07-26 Thread arun
Hi Dennis,
No problem.

As I mentioned, I was under the understanding that "electrode1" occurs only in 
multiples of 3.  I think your earlier post sounded like that.  May be I was 
misunderstood.  So, my solution was based on that.  In the below table, I am 
not sure, how you wanted to split.  For ex. it is not clear, whether rows 1-5 
(electrode1) are only in the first list element or should it include  the next 
electrode "electrode2" also?  Could you also show the expected output?

dat$Len
# [1]  5  4  5  3  5  6  4  5  4  3  4  6  3  5  3  4  3 12  5  6  4  6  3
 as.character(dat$Val)
# [1] "electrode1" "electrode2" "electrode1" "electrode3" "electrode1"
 #[6] "electrode4" "electrode2" "electrode1" "electrode2" "electrode3"
#[11] "electrode2" "electrode4" "electrode3" "electrode1" "electrode3"
#[16] "electrode2" "electrode3" "electrode4" "electrode1" "electrode4"
#[21] "electrode2" "electrode4" "electrode3"
A.K.



Hi Rui & Arun, 
really thanks for investing so much time to deal with this problem! 
The code works now for this specific example. However it is not 
generally robust for slightly different situations. For instance it 
cannot correctly handle a slight variation of the table where I have 
again 4 types of electrodes of certain lengths. Electrode4 exists only 6
 times. At the transition of the combinations 3-4 and 4-1 there are 12 
times electrode4 which stick together in the output $`9`. This leads to 
wrong splittings thereafter. Sorry for asking such tricky questions. 

New table XXX 

electrode   length 
electrode1  206 
electrode1  194 
electrode1  182 
electrode1  172 
electrode1  169 
electrode2  82 
electrode2  78 
electrode2  70 
electrode2  58 
electrode1  206 
electrode1  194 
electrode1  182 
electrode1  172 
electrode1  169 
electrode3  260 
electrode3  176 
electrode3  137 
electrode1  206 
electrode1  194 
electrode1  182 
electrode1  172 
electrode1  169 
electrode4  86 
electrode4  66 
electrode4  64 
electrode4  52 
electrode4  27 
electrode4  26 
electrode2  82 
electrode2  78 
electrode2  70 
electrode2  58 
electrode1  206 
electrode1  194 
electrode1  182 
electrode1  172 
electrode1  169 
electrode2  82 
electrode2  78 
electrode2  70 
electrode2  58 
electrode3  260 
electrode3  176 
electrode3  137 
electrode2  82 
electrode2  78 
electrode2  70 
electrode2  58 
electrode4  86 
electrode4  66 
electrode4  64 
electrode4  52 
electrode4  27 
electrode4  26 
electrode3  260 
electrode3  176 
electrode3  137 
electrode1  206 
electrode1  194 
electrode1  182 
electrode1  172 
electrode1  169 
electrode3  260 
electrode3  176 
electrode3  137 
electrode2  82 
electrode2  78 
electrode2  70 
electrode2  58 
electrode3  260 
electrode3  176 
electrode3  137 
electrode4  86 
electrode4  66 
electrode4  64 
electrode4  52 
electrode4  27 
electrode4  26 
electrode4  86 
electrode4  66 
electrode4  64 
electrode4  52 
electrode4  27 
electrode4  26 
electrode1  206 
electrode1  194 
electrode1  182 
electrode1  172 
electrode1  169 
electrode4  86 
electrode4  66 
electrode4  64 
electrode4  52 
electrode4  27 
electrode4  26 
electrode2  82 
electrode2  78 
electrode2  70 
electrode2  58 
electrode4  86 
electrode4  66 
electrode4  64 
electrode4  52 
electrode4  27 
electrode4  26 
electrode3  260 
electrode3  176 
electrode3  137 



- Original Message -
From: arun 
To: "dennis1...@gmx.net" 
Cc: 
Sent: Friday, July 26, 2013 1:05 AM
Subject: Re: [R] How to split two levels several times?

Just to add:
I assumed that "electrode1" would be found in multiples of 3.  I could be 
wrong.  



- Original Message -
From: arun 
To: Rui Barradas 
Cc: "dennis1...@gmx.net" ; "r-help@r-project.org" 

Sent: Friday, July 26, 2013 12:53 AM
Subject: Re: [R] How to split two levels several times?



May be this also helps:
XXX: dataset
rl<-rle(as.character(XXX$electrode))
 
dat<-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1<-if(rl$values[i]=="electrode1"
 & (rl$lengths[i]%/%3>1)) rep(3,rl$lengths[i]%/%3) else 
rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])}))
 lst1<-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1)
vec1<-sapply(lst1,max)
vec2<-c(1,vec1[-length(vec1)]+1)
res<-  lapply(seq_along(lst1),function(i) {x1<-lst1[[i]]; 
XXX[seq(vec2[i],max(x1)),]})
 res
#[[1]]
 #  electrode length
#1 electrode1    5.7
#2 electrode1    6.3
#3 electrode1    6.2
#4 electrode2   11.4
#5 electrode2    9.7
#
#[[2]]
 #   electrode length
#6  electrode3   14.2
#7  electrode3   14.8
#8  electrode3   12.6
#9  electrode2   11.4
#10 electrode2    9.7
#

Re: [R] Repeated measures Cox regression ??coxph??

2013-07-26 Thread Terry Therneau

Two choices.
If this were a linear model, do you like the GEE approach or a mixed effects approach?  
Assume that "subject" is a variable containing a per-subject identifier.


GEE approach: add "+ cluster(subject)" to the model statement in coxph
Mixed models approach: Add " + (1|subject)" to the model statment in coxme.

When only a very few subjects have multiple events, the mixed model (random effect) 
approach may not be reliable, however.  Multiple events per group are the fuel for 
estimation of the variance of the random effect, and with few of these the profile 
likelihood of the random effect will be very flat.  You can get esssentially a random 
estimate of the variance of the "subject effect".  I'm still getting my arms around this 
issue, and it has taken me a long time.


"Frailty" is an alternate label for "random effects when all we have is a random 
intercept".  Multiple labels for the same idea adds confusion, but nothing else.


Terry Therneau

On 07/25/2013 08:14 PM, Marc Schwartz wrote:

On Jul 25, 2013, at 4:45 PM, David Winsemius  wrote:


On Jul 25, 2013, at 12:27 PM, Marc Schwartz wrote:


On Jul 25, 2013, at 2:11 PM, John Sorkin  wrote:


Colleagues,
Is there any R package that will allow one to perform a repeated measures Cox 
Proportional Hazards regression? I don't think coxph is set up to handle this 
type of problem, but I would be happy to know that I am not correct.
I am doing a study of time to hip joint replacement. As each person has two 
hips, a given person can appear in the dataset twice, once for the left hip and 
once for the right hip, and I need to account for the correlation of data from 
a single individual.
Thank you,
John



John,

See Terry's 'coxme' package:

http://cran.r-project.org/web/packages/coxme/index.html


When I looked over the description of coxme, I was concerned it was not really 
designed with this in mind. Looking at Therneau and Grambsch, I thought section 
8.4.2 in the 'Multiple Events per Subject' Chapter fit the analysis question 
well. There they compared the use of coxph( ...+cluster(ID),,...)  withcoxph( 
...+strata(ID),,...). Unfortunately I could not tell for sure which one was 
being described as superio but I think it was the cluster() alternative. I seem 
to remember there are discussions in the archives.


David,

I think that you raise a good point. The example in the book (I had to wait to 
get home to read it) is potentially different however, in that the subject's 
eye's were randomized to treatment or control, which would seem to suggest 
comparable baseline characteristics for each pair of eyes, as well as an active 
intervention on one side where a difference in treatment effect between each 
eye is being analyzed.

It is not clear from John's description above if there is one hip that will be 
treated versus one as a control and whether the extent of disease at baseline 
is similar in each pair of hips. Presumably the timing of hip replacements will 
be staggered at some level, even if there is comparable disease, simply due to 
post-op recovery time and surgical risk. In cases where the disease between 
each hip is materially different, that would be another factor to consider, 
however I would defer to orthopaedic physicians/surgeons from a subject matter 
expertise consideration. It is possible that the bilateral hip replacement data 
might be more of a parallel to bilateral breast cancer data, if each breast 
were to be tracked separately.

I have cc'd Terry here, hoping that he might jump in and offer some insights 
into the pros/cons of using coxme versus coxph with either a cluster or strata 
based approach, or perhaps even a frailty based approach as in 9.4.1 in the 
book.

Regards,

Marc



--
David.

You also might find the following of interest:

http://bjo.bmj.com/content/71/9/645.full.pdf

http://www.ncbi.nlm.nih.gov/pubmed/6885

http://www.ncbi.nlm.nih.gov/pubmed/22078901



Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Jul 26, 2013; 12:34am

2013-07-26 Thread Frank Harrell

Thanks Rich and Jim and apologies for omitting the line

x <- c(285, 43.75, 94, 150, 214, 375, 270, 350, 41.5, 210, 30, 37.6,
281, 101, 210)

But the fundamental problem remains that vertical spacing is not correct 
unless I waste a lot of image space at the top.


Frank


--
Frank E Harrell Jr Professor and Chairman  School of Medicine
   Department of Biostatistics Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] a very urgunt and important question about R

2013-07-26 Thread arun
Hi,
You may try:
library(psych)
?block.random()

Also, look this link
http://personality-project.org/revelle/syllabi/205/block.randomization.pdf
A.K.




- Original Message -
From: mina orang 
To: r-help@r-project.org
Cc: 
Sent: Friday, July 26, 2013 4:26 AM
Subject: [R] a very urgunt and important question about R

Dear all

I am doing a research in clinical psychology and I need to Generate a
block randomization for a clinical trial, with two treatments and
three therapists. I mean I must randomize the clients (all women) to
two kinds of treatment (called "treatment as usual" and "narrative
exposure therapy") and three different therapists.
I tried to do this randomization by using R software and by using
stratification package but I didn't (and still can not) understand how
I should use R for this job.

Could you please help me and show me how I should use R for this kind
of randomization in a really easy way?

I really appreciate your help in advance.

Best
Mina

P.S. I just know little about statistics and software. Please teach me
in a really easy way.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] a very urgunt and important question about R

2013-07-26 Thread PIKAL Petr
Hi

I recommend to start with

library(fortunes)
fortune("surgery")

Anyway, beside Introduction to R you could go through 

CRAN Task View: Design of Experiments (DoE) & Analysis of Experimental Data

and specifically through

Experimental designs for clinical trials 
This task view only covers specific design of experiments packages; there may 
be some grey areas. Please, also consult the ClinicalTrials task view. 

*experiment contains tools for clinical experiments, e.g., a randomization 
tool, and it provides a few special analysis options for clinical trials. 
*Package gsDesign implements group sequential designs, 
*Package gsbDesign evaluates operating characteristics for group sequential 
Bayesian designs, 
*package asd implements adaptive sequential designs. 
*Package TEQR provides toxicity equivalence range designs (Blanchard and 
Longmate 2010) for phase I clinical trials. 
*The DoseFinding package provides functions for the design and analysis of 
dose-finding experiments (for example pharmaceutical Phase II clinical trials); 
it combines the facilities of the "MCPMod" package (maintenance discontinued; 
described in Bornkamp, Pinheiro and Bretz 2009) with a special type of optimal 
designs for dose finding situations (MED-optimal designs, or D-optimal designs, 
or a mixture of both; cf., Dette et al. 2008)

Which shall provide you with some insight.

Regards
Petr


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of mina orang
> Sent: Friday, July 26, 2013 10:26 AM
> To: r-help@r-project.org
> Subject: [R] a very urgunt and important question about R
> 
> Dear all
> 
> I am doing a research in clinical psychology and I need to Generate a
> block randomization for a clinical trial, with two treatments and three
> therapists. I mean I must randomize the clients (all women) to two
> kinds of treatment (called "treatment as usual" and "narrative exposure
> therapy") and three different therapists.
> I tried to do this randomization by using R software and by using
> stratification package but I didn't (and still can not) understand how
> I should use R for this job.
> 
> Could you please help me and show me how I should use R for this kind
> of randomization in a really easy way?
> 
> I really appreciate your help in advance.
> 
> Best
> Mina
> 
> P.S. I just know little about statistics and software. Please teach me
> in a really easy way.
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] n-dash instead of hyphens in plots

2013-07-26 Thread Prof Brian Ripley

On 26/07/2013 09:36, Anders Sand wrote:

Dear community,

How do I change the hyphens that appear when using '-' to n-dash in, for
example, x-axes?

Example code:
axis(1, c(1:2), c("1-4", "23-26"), tck=.05)


On what device on what platform?   The commonest answer would be

'if you want en-dash not hyphen, ask for en-dash and not hyphen'

en-dash is "\u2013" where that is supported.




Thanks
Anders

[[alternative HTML version deleted]]

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


This did ask for 'at a minimum' information and no HTML.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] n-dash instead of hyphens in plots

2013-07-26 Thread Anders Sand
Dear community,

How do I change the hyphens that appear when using '-' to n-dash in, for
example, x-axes?

Example code:
axis(1, c(1:2), c("1-4", "23-26"), tck=.05)

Thanks
Anders

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help R

2013-07-26 Thread Mª Teresa Martinez Soriano
Hi to everyone, first of all thanks for this service, it is being very useful 
for me, thanks in 

advance.

I am new in R, so I suppose I could make really naive questions, I'm sorry.

I have to impute some missing values and I am trying to do it with VIM library 
trough Hot Deck 

imputation.

I writte:vmGUImenu(), and it opens a small window of: Visualization and 
Imputation of Missing 
Values 

and I select Imptation and Hot Deck and then one of the variables which I have 
to select is Select 

Variables to Build Domains.

I don't know which variables I have to select, I don't understand this. I have 
tried don't put 

anything and I get :

hotdeck(dataframe,variable=c("CRV.IE.2005","CRV.IE.2006","CRV.IE.2007","CRV.IE.2008","CRV.IE.2009","CR

V.IE.2010"),ord_var=c("CRV.IE.2001","CRV.IE.2002","CRV.IE.2003","CRV.IE.2004","CRV.IE.2005","CRV.IE.20

06","CRV.IE.2007","CRV.IE.2008","CRV.IE.2009","CRV.IE.2010"),domain_var=NULL,imp_suffix="_imp")

Mensajes de aviso perdido:

 In hotdeck(data, variable = vars, ord_var = sort, domain_var = domain,  

 Some NAs remained, maybe due to a too restrictive domain building!?

 In hotdeck(b, variable = c("CRV.IE.2005", "CRV.IE.2006", "CRV.IE.2007", Some 
NAs remained, maybe due 

to a too restrictive domain building!?


What should I  put in this variable??

Thanks in advance

Best regards

Teresa
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] a very urgunt and important question about R

2013-07-26 Thread mina orang
Dear all

I am doing a research in clinical psychology and I need to Generate a
block randomization for a clinical trial, with two treatments and
three therapists. I mean I must randomize the clients (all women) to
two kinds of treatment (called "treatment as usual" and "narrative
exposure therapy") and three different therapists.
I tried to do this randomization by using R software and by using
stratification package but I didn't (and still can not) understand how
I should use R for this job.

Could you please help me and show me how I should use R for this kind
of randomization in a really easy way?

I really appreciate your help in advance.

Best
Mina

P.S. I just know little about statistics and software. Please teach me
in a really easy way.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to split two levels several times?

2013-07-26 Thread dennis1991
Hi Rui & Arun,
really thanks for investing so much time to deal with this problem! The code 
works now for this specific example. However it is not generally robust for 
slightly different situations. For instance it cannot correctly handle a slight 
variation of the table where I have again 4 types of electrodes of certain 
lengths. Electrode4 exists only 6 times. At the transition of the combinations 
3-4 and 4-1 there are 12 times electrode4 which stick together in the output 
$`9`. This leads to wrong splittings thereafter. Sorry for asking such tricky 
questions.

New table XXX

electrode   length
electrode1  206
electrode1  194
electrode1  182
electrode1  172
electrode1  169
electrode2  82
electrode2  78
electrode2  70
electrode2  58
electrode1  206
electrode1  194
electrode1  182
electrode1  172
electrode1  169
electrode3  260
electrode3  176
electrode3  137
electrode1  206
electrode1  194
electrode1  182
electrode1  172
electrode1  169
electrode4  86
electrode4  66
electrode4  64
electrode4  52
electrode4  27
electrode4  26
electrode2  82
electrode2  78
electrode2  70
electrode2  58
electrode1  206
electrode1  194
electrode1  182
electrode1  172
electrode1  169
electrode2  82
electrode2  78
electrode2  70
electrode2  58
electrode3  260
electrode3  176
electrode3  137
electrode2  82
electrode2  78
electrode2  70
electrode2  58
electrode4  86
electrode4  66
electrode4  64
electrode4  52
electrode4  27
electrode4  26
electrode3  260
electrode3  176
electrode3  137
electrode1  206
electrode1  194
electrode1  182
electrode1  172
electrode1  169
electrode3  260
electrode3  176
electrode3  137
electrode2  82
electrode2  78
electrode2  70
electrode2  58
electrode3  260
electrode3  176
electrode3  137
electrode4  86
electrode4  66
electrode4  64
electrode4  52
electrode4  27
electrode4  26
electrode4  86
electrode4  66
electrode4  64
electrode4  52
electrode4  27
electrode4  26
electrode1  206
electrode1  194
electrode1  182
electrode1  172
electrode1  169
electrode4  86
electrode4  66
electrode4  64
electrode4  52
electrode4  27
electrode4  26
electrode2  82
electrode2  78
electrode2  70
electrode2  58
electrode4  86
electrode4  66
electrode4  64
electrode4  52
electrode4  27
electrode4  26
electrode3  260
electrode3  176
electrode3  137





> Gesendet: Donnerstag, 25. Juli 2013 um 20:53 Uhr
> Von: "Rui Barradas" 
> An: dennis1...@gmx.net
> Cc: r-help@r-project.org
> Betreff: Re: Aw: Re:  Re:  Re: [R] How to split two levels several times?
>
> Hello,
>
> I think the following does what you want. (I don't know if it makes much
> sense but it works.)
>
>
>
> lens <- rle(as.character(XXX$electrode))$lengths
> m <- length(lens) %/% 2
> idx <- rep(1:m, sapply(1:m, function(.m) sum(lens[(2*.m - 1):(2*.m)])))
> if(length(lens) %% 2 != 0){
>   idx <- c(idx, rep(m + 1, lens[length(lens)]))
>   sp_idx <- split(idx, idx)
>   n <- length(sp_idx[[m]])
>   if(n %/% 2 < length(sp_idx[[m + 1]]))
>   sp_idx[[m]][(n %/% 2 + 1):n] <- sp_idx[[m + 1]][1]
>   else
>   sp_idx[[m]][(n - length(sp_idx[[m + 1]]) + 1):n] <-  sp_idx[[m 
> + 1]][1]
>   idx <- unlist(sp_idx)
> }
>
> sp <- split(XXX, idx)
> sp
>
>
>
> Rui Barradas
>
> Em 25-07-2013 11:40, dennis1...@gmx.net escreveu:
> > Hi Rui
> > once more thank you for your help. But the code does so far not solve the 
> > problem because it still treats rows 17-22 (repeated appearance of 
> > electrode1) as one single level. However as can be seen by rows 1-3 (or 
> > rows 17-19 and rows 20-22) and the order of the length variable (row 1 = 
> > 5.7, row 2 = 6.3, row 3 = 6.2) electrode1 consists only of 3 rows. Maybe 
> > that was not made absolutely clear by me. As described in my mail before if 
> > by chance (or systematically) it happens to be that electrode1 appears 
> > right after each other in the table then the code should split it “half 
> > way”.
> >
> > So idx should not return
> >   [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4
> >
> > but instead 6 times number 4 at the end
> >   [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4
> >
> > Do you have any solution?
> >
> >
> >> Gesendet: Mittwoch, 24. Juli 2013 um 23:47 Uhr
> >> Von: "Rui Barradas" 
> >> An: dennis1...@gmx.net
> >> Cc: r-help@r-project.org
> >> Betreff: Re: Aw: Re:  Re: [R] How to split two levels several times?
> >>
> >> Hello,
> >>
> >> As for the first question, note that in the case you describe, the
> >> resulting list of df's will not be a split of the original, there will
> >> be a duplication in the final 4-1 a

Re: [R] Maintaining data order in factanal with missing data

2013-07-26 Thread PIKAL Petr
Hi

You provided functions, so far so good. But without data it would be quite 
difficult to understand what the functions do and where could be the issue.

I suspect combination of complete cases selection together with subset and 
factor behaviour. But I can be completely out of target too.

Petr

> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> project.org] On Behalf Of s00123...@myacu.edu.au
> Sent: Friday, July 26, 2013 9:35 AM
> To: r-help@r-project.org
> Subject: [R] Maintaining data order in factanal with missing data
> 
> Hi,
> 
> 
> 
> I'm new to R, so sorry if this is a simple answer. I'm currently trying
> to collapse some ordinal variables into a composite; the program
> ideally should take a data frame as input, perform a factor analysis,
> compute factor scores, sds, etc., and return the rescaled scores and
> loadings. The difficulty I'm having is that my data set contains a
> number of NA, which I am excluding from the analysis using
> complete.cases(), and thus the incomplete cases are "skipped". These
> functions are for a longitudinal data set with repeated waves of data,
> so the final rescaled scores from each wave need to be saved as
> variables grouped by a unique ID (DMID). The functions I'm trying to
> implement are as follows:
> 
> 
> 
> weighted.sd<-function(x,w){
> 
> sum.w<-sum(w)
> 
> sum.w2<-sum(w^2)
> 
> mean.w<-sum(x*w)/sum(w)
> 
> 
> x.sd.w<-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2))
> 
> return(x.sd.w)
> 
> }
> 
> 
> 
> re.scale<-function(f.scores, raw.data, loadings){
> 
> 
> fz.scores<-(f.scores+mean(f.scores))/(sd(f.scores))
> 
> 
> means<-apply(raw.data,1,weighted.mean,w=loadings)
> 
> 
> sds<-apply(raw.data,1,weighted.sd,w=loadings)
> 
> grand.mean<-mean(means)
> 
> grand.sd<-mean(sds)
> 
> 
> final.scores<-((fz.scores*grand.sd)+grand.mean)
> 
> return(final.scores)
> 
> }
> 
> 
> 
> get.scores<-function(data){
> 
> 
> fact<-
> factanal(data[complete.cases(data),],factors=1,scores="regression")
> 
> f.scores<-fact$scores[,1]
> 
> f.loads<-fact$loadings[,1]
> 
> rescaled.scores<-re.scale(f.scores,
> data[complete.cases(data),], f.loads)
> 
> output.list<-list(rescaled.scores,
> f.loads)
> 
> names(output.list)<-
> c("rescaled.scores",
> "factor.loadings")
> 
> return(output.list)
> 
> }
> 
> 
> 
> init.dfs<-function(){
> 
> 
> ab.1.df<-subset(ab.df,,select=c(dmid,g5oab2:g5ovb1))
> 
> 
> ab.2.df<-subset(ab.df,,select=c(dmid,w2oab3:w2ovb1))
> 
> ab.3.df<-subset(ab.df,,select=c(dmid,
> w3oab3, w3oab4, w3oab7, w3oab8, w3ovb1))
> 
> 
> 
> ab.1.fa<-get.scores(ab.1.df[-1])
> 
> ab.2.fa<-get.scores(ab.2.df[-1])
> 
> ab.3.fa<-get.scores(ab.3.df[-1])
> 
> 
> }
> 
> 
> 
> Thanks for your help,
> 
> 
> 
> Justin
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] number of items to replace is not a multiple of replacement length

2013-07-26 Thread Pascal Oettli

Hello,

Once again, the matrix EWMA has not the correct size.

Did you carefully read the answer by Thomas Stewart?
https://stat.ethz.ch/pipermail/r-help/attachments/20130724/c454b0f7/attachment.pl

Extract of his reply: "When you expand the example to 5 stocks,
there will be 15 elements (5 variances and 10 covariances)"

> EWMA<-matrix(nrow=T,ncol=15))

Regards,
Pascal

On 26/07/2013 14:37, G Girija wrote:

Hi All,

I have 5 stock values and i am calculating EWMA
followed the logic as given ind following link.[
http://www.orecastingfinancialrisk.com/3.html
]

library('tseries')
returns[,1]<-returns[,1]-mean(returns[,1])
returns[,2]<-returns[,2]-mean(returns[,2])
returns[,3]<-returns[,3]-mean(returns[,3])
returns[,4]<-returns[,4]-mean(returns[,4])
returns[,5]<-returns[,5]-mean(returns[,5])
T<-length(returns[,1])
T
EWMA<-matrix(nrow=T,ncol=5)
lambda=0.94
S<-cov(returns)
S
EWMA[1,] <- S[lower.tri(S,diag=TRUE)]


*Error in EWMA[1, ] <- S[lower.tri(S, diag = TRUE)] : *
*  number of items to replace is not a multiple of replacement length*
*
*
for(i in 2:T)
{
   S<- lambda*S +(1-lambda)*t(returns[i])%*% returns[i]
EWMA[i,] <- S[lower.tri(S,diag=TRUE)]
}
*
*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme (weights) and glht

2013-07-26 Thread ONKELINX, Thierry
Dear Sibylle,

Have you tried to create a new variable?

ME$fDiversity <- factor(ME$Diversity)

H08_lme <- lme(
log(Height2008_mean) ~ fDiversity,
data = ME,
random = ~ 1|Plot/SubPlot,
weights = varPower(form = ~Diversity),
na.action = na.omit,
subset = ME$Species == "Ace_pse",
method = "ML"
)
summary(H08_lme)
anova(H08_lme)
g_H08_lme <- glht(H08_lme, linfct = mcp(fDiversity = "Tukey"))

Please note that I’ve added (a lot of) whitespace to your code to make it 
easier to read.

Best regards,

Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
Sibylle Stöckli
Verzonden: donderdag 25 juli 2013 12:16
Aan: r-help@r-project.org
Onderwerp: [R] lme (weights) and glht

Dear R members,

I tried to fit an lme model and to use the glht function of multcomp.
However, the glht function gives me some errors when using
weights=varPower().
The glht error makes sense as glht needs factor levels and the model
works fine without weights=. Does anyone know a solution so I do not
have to change the lme model?

Thanks
Sibylle

--> works fine
ME$Diversity=factor(ME$Diversity)
H08_lme<-lme(log(Height2005_mean)~Diversity, data=ME, random=~1|Plot/
SubPlot, na.action=na.omit, subset=ME$Species=="Pse_men", method="ML")
summary(H08_lme)
anova(H08_lme)
g_H08_lme<-glht(H08_lme, linfct=mcp(Diversity="Tukey"))
print(summary(g_H08_lme))

--> using lme with weights I changed the order of factor() and
introduced as.factor in the model

H08_lme<-lme(log(Height2008_mean)~as.factor(Diversity), data=ME,
random=~1|Plot/SubPlot, weights=varPower(form=~Diversity),
na.action=na.omit, subset=ME$Species=="Ace_pse", method="ML")
summary(H08_lme)
anova(H08_lme)
ME$Diversity=factor(ME$Diversity)
g_H08_lme<-glht(H08_lme, linfct=mcp(Diversity="Tukey"))

Error in mcp2matrix(model, linfct = linfct) :
   Variable(s) �Diversity� have been specified in �linfct� but cannot
be found in �model�!
[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en 
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is 
door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the 
writer and may not be regarded as stating an official position of INBO, as long 
as the message is not confirmed by a duly signed document.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] GGplot 2 – cannot get histogram and box plot axis to match.

2013-07-26 Thread ONKELINX, Thierry
Dear John,

Use xlim() and ylim() instead of expand_limits()

library(ggplot2)

#sample data from ggplot2
data(Cars93, package = "MASS")
dataSet <- Cars93

#variables to calculate the range to extend the axis dataVector <- 
unlist(dataSet[,"MPG.city"])

dataRange <- diff(range(dataSet$MPG.city))

graphRange <- c(min(dataSet$MPG.city) - dataRange/5,
max(dataSet$MPG.city) + dataRange/5)

#making the box plot
theBoxPlot <- ggplot(dataSet,aes_string(x = "MPG.city", y = "MPG.city"))

theBoxPlot <-
  theBoxPlot  + geom_boxplot() + coord_flip() + ylim(limits = graphRange)
print(theBoxPlot)


#making the histogram
thePlot <- ggplot(dataSet,aes_string(x = "MPG.city"))
thePlot <- thePlot + geom_histogram()  + xlim(graphRange)
print(thePlot)



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
John W. Clow
Verzonden: donderdag 25 juli 2013 20:03
Aan: r-help@r-project.org
Onderwerp: [R] GGplot 2 – cannot get histogram and box plot axis to match.

Problem:
I am trying to get the histogram and box plot x axis to match. I’ve tried using 
the expand_limits function to make the axis match but that didn’t make the axis 
match. The histogram’s axis are still consistently larger than the ones for the 
box plot (though the function did help). Does anyone have a suggestion as to 
what I should do instead?


Background:
I am building a Shiny app that displays a histogram below a bar chart for a set 
of data that a user uploads to the app. If you want to see the app, go here 
http://spark.rstudio.com/jclow/Archive20130725HistogramApp/
To run the app, select “Use Sample Data” , then select  “MPG.city” under choose 
a column, then finally select box plot.


Sample code:
Below is a snippet of my code to demonstrate the problems I have.

library(ggplot2)

#sample data from ggplot2
data(Cars93, package = "MASS")
dataSet <- Cars93

#variables to calculate the range to extend the axis dataVector <- 
unlist(dataSet[,"MPG.city"])

dataRange <- max(dataVector) - min(dataVector)

graphRange <- c(min(dataVector) - dataRange/5,
max(dataVector) + dataRange/5)

#making the box plot
theBoxPlot <- ggplot(dataSet,aes_string(x = "MPG.city",y = "MPG.city"))

theBoxPlot = theBoxPlot  + geom_boxplot() + expand_limits(y= graphRange) + 
coord_flip()
print(theBoxPlot)


#making the histogram
thePlot <- ggplot(dataSet,aes_string(x = "MPG.city")) thePlot <-thePlot + 
geom_histogram()  + expand_limits(x= graphRange)

print(thePlot)


Thank you for taking the time to read this.

John Clow
UCSB Student

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en 
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is 
door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the 
writer and may not be regarded as stating an official position of INBO, as long 
as the message is not confirmed by a duly signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Maintaining data order in factanal with missing data

2013-07-26 Thread s00123776
Hi,

 

I'm new to R, so sorry if this is a simple answer. I'm currently trying to
collapse some ordinal variables into a composite; the program ideally should
take a data frame as input, perform a factor analysis, compute factor
scores, sds, etc., and return the rescaled scores and loadings. The
difficulty I'm having is that my data set contains a number of NA, which I
am excluding from the analysis using complete.cases(), and thus the
incomplete cases are "skipped". These functions are for a longitudinal data
set with repeated waves of data, so the final rescaled scores from each wave
need to be saved as variables grouped by a unique ID (DMID). The functions
I'm trying to implement are as follows:

 

weighted.sd<-function(x,w){

sum.w<-sum(w)

sum.w2<-sum(w^2)

mean.w<-sum(x*w)/sum(w)

 
x.sd.w<-sqrt((sum.w/(sum.w^2-sum.w2))*sum(w*(x-mean.w)^2))

return(x.sd.w)

}

 

re.scale<-function(f.scores, raw.data, loadings){

 
fz.scores<-(f.scores+mean(f.scores))/(sd(f.scores))

 
means<-apply(raw.data,1,weighted.mean,w=loadings)

 
sds<-apply(raw.data,1,weighted.sd,w=loadings)

grand.mean<-mean(means)

grand.sd<-mean(sds)

 
final.scores<-((fz.scores*grand.sd)+grand.mean)

return(final.scores)

}

 

get.scores<-function(data){

 
fact<-factanal(data[complete.cases(data),],factors=1,scores="regression")

f.scores<-fact$scores[,1]

f.loads<-fact$loadings[,1]

rescaled.scores<-re.scale(f.scores,
data[complete.cases(data),], f.loads)

output.list<-list(rescaled.scores, f.loads)

names(output.list)<-c("rescaled.scores",
"factor.loadings")

return(output.list)

}

 

init.dfs<-function(){

 
ab.1.df<-subset(ab.df,,select=c(dmid,g5oab2:g5ovb1))

 
ab.2.df<-subset(ab.df,,select=c(dmid,w2oab3:w2ovb1))

ab.3.df<-subset(ab.df,,select=c(dmid,
w3oab3, w3oab4, w3oab7, w3oab8, w3ovb1))



ab.1.fa<-get.scores(ab.1.df[-1])

ab.2.fa<-get.scores(ab.2.df[-1])

ab.3.fa<-get.scores(ab.3.df[-1])


}

 

Thanks for your help,

 

Justin


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help on carrying forward several vectors of rownames

2013-07-26 Thread franklin johnson
Dear reader,
I'm trying to use multiple vectors as rownames for the read.table function.
I tried to insert these vectors as a list, e.g. row.names=c(1,4), from the
excel file. I tried other ways, as if the argument only took continuous tab
separated
columns, e.g. row.names=c(1:4). But, this also did not work.
Is there a solution to this problem?
Regards,
-- 
Franklin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] duplicated() with conditional statement

2013-07-26 Thread arun
Hi,
Sorry,`indx` should be:
indx<-which(tt$response=="buy") #I changed indx but forgot about it
 tt$newcolumn<-0
  tt[unlist(lapply(seq_along(indx),function(i) {x1<-if(indx[i]==nrow(tt)) 
indx[i] else seq(indx[i]+1,indx[i+1]-1);x2<-rbind(tt[indx[1:i],],tt[x1,]); 
if(any(x2$response=="sample")) 
row.names(x2[duplicated(x2$product),])})),"newcolumn"]<-1
 tt
   subj response product newcolumn
1 1   sample   1 0
2 1   sample   2 0
3 1  buy   3 0
4 2   sample   2 0
5 2  buy   2 0
6 3   sample   3 1
7 3   sample   2 1
8 3  buy   1 0
9 4   sample   1 1
10    4  buy   4 0
A.K.








From: vanessa van der vaart 
To: arun  
Sent: Thursday, July 25, 2013 8:55 PM
Subject: Re: duplicated() with conditional statement



hii thanks for the code, I tried the code but i got the error message,
Error in from:to : NA/NaN argument


I dont know what doest it mean, and I dont know how to fix it..
could you help please..

thank you very much in advance





On Thu, Jul 25, 2013 at 10:52 PM, arun  wrote:

Hi,
>You may try this (didn't get time to test this extensively)
>indx<-which(tt$response!="buy")
>tt$newcolumn<-0
> tt[unlist(lapply(seq_along(indx),function(i) {x1<-if(indx[i]==nrow(tt)) 
>indx[i] else seq(indx[i]+1,indx[i+1]-1);x2<-rbind(tt[indx[1:i],],tt[x1,]); 
>if(any(x2$response=="sample")) 
>row.names(x2[duplicated(x2$product),])})),"newcolumn"]<-1
> tt
>#   subj response product newcolumn
>#1 1   sample   1 0
>#2 1   sample   2 0
>#3 1  buy   3 0
>#4 2   sample   2 0
>#5 2  buy   2 0
>#6 3   sample   3 1
>#7 3   sample   2 1
>#8 3  buy   1 0
>#9 4   sample   1 1
>#10    4  buy   4 0
>A.K.
>
>
>
>
>
>
>
>
>
>From: vanessa van der vaart 
>To: smartpink...@yahoo.com
>Sent: Thursday, July 25, 2013 3:49 PM
>Subject: Re: duplicated() with conditional statement
>
>
>
>
>thank you for the reply.
>It based on entire data set. 
>
>subj response product newcolumn
>1     1   sample       1          0         
>2     1   sample       2          0         
>3     1      buy          3           0       
>4     2   sample       2          0        . 
>5     2         buy       2           0
>6     3   sample       3          1
>7     3   sample       2           1
>8     3         buy       1           0
>9     4  sample       1            1
>10   4       buy       4             0
>
>I am sorry i didnt question it very clearly, let me change the conditional 
>statement, I hope you can understand. i will explain by example
>
>as you can see, almost every number is duplicated, but only in row 6th,7th,and 
>9th the value on column is 1.
>
>on row4th, the value is duplicated( 2 already occurred on 2nd row),but since 
>the value is considered as duplicated only if the value is duplicated where 
>the response is 'buy' than the value on column, on row4th still zero. 
>
>On row 6th, where the value product column is 3. 3 is already occurred in 3rd 
>row where the value on response is 'buy', so the value on column should be 1
>
>I hope it can understand the conditional statement. 
>
>
>
>
>
>On Thu, Jul 25, 2013 at 8:25 PM,  wrote:
>
>Hi,
>>May be I understand it incorrectly.
>>Your new column value doesn't correspond to your conditional statement.  
>>Also, is this duplication based on entire dataset or within "subj".
>>
>>Hi everybody,,
>>I have a question about R function duplicated(). I have spent days try to
>>figure this out,but I cant find any solution yet. I hope somebody can help
>>me..
>>this is my data:
>>
>>subj=c(1,1,1,2,2,3,3,3,4,4)
>>response=c('sample','sample','buy','sample','buy','sample','sample','buy','sample','buy')
>>product=c(1,2,3,2,2,3,2,1,1,4)
>>tt=data.frame(subj, response, product)
>>
>>the data look like this:
>>
>> subj response product
>>1     1   sample       1
>>2     1   sample       2
>>3     1      buy          3
>>4     2   sample       2
>>5     2         buy       2
>>6     3   sample       3
>>7     3   sample       2
>>8     3         buy       1
>>9     4  sample       1
>>10   4       buy        4
>>
>>
>>
>>I want to create new  column based on the value on response and product
>>column. if the value on product is duplicated, then  the value on new column
>>is 1, otherwise is 0.
>>but I want to add conditional statement that the value on product column
>>will only be considered as duplicated if the value on response column is
>>'buy'.
>>for illustration, the table should look like this:
>>
>>subj response product newcolumn
>>1     1   sample       1          0
>>2     1   sample       2          0
>>3     1      buy          3           0
>>4     2   sample       2          0
>>5     2         buy 

Re: [R] Pairwise comparison between columns, logic

2013-07-26 Thread arun
HI,
Not sure about what your expected output would be.  Also 'CEBPA' was not 
present in the Data.txt. 

gset<- read.table("Names.txt",header=TRUE,stringsAsFactors=FALSE)
 temp1<- read.table("Data.txt",header=TRUE,stringsAsFactors=FALSE)
lst1<-split(temp1,temp1$Names)
mat1<-combn(gset[-1,1],2) #removed CEBPA
library(plyr)

lst2<-lapply(split(mat1,col(mat1)),function(x) 
{x1<-join_all(lst1[x],by="patient_id",type="inner");x1["patient_id"] })
names(lst2)<-apply(mat1,2,paste,collapse="_")
do.call(rbind,lst2)
#   patient_id
#DNMT3A_FLT3.1 LAML-AB-2811-TB #common ids between DNMT3A and FLT3
#DNMT3A_FLT3.2 LAML-AB-2816-TB
#DNMT3A_FLT3.3 LAML-AB-2818-TB
#DNMT3A_IDH1.1 LAML-AB-2802-TB#common ids between DNMT3A and IDH1.  If you 
wanted it as separate dataframes, use `lst2`.
#DNMT3A_IDH1.2 LAML-AB-2822-TB
#DNMT3A_NPM1.1 LAML-AB-2802-TB
#DNMT3A_NPM1.2 LAML-AB-2809-TB
#DNMT3A_NPM1.3 LAML-AB-2811-TB
#DNMT3A_NPM1.4 LAML-AB-2816-TB
#DNMT3A_NRAS   LAML-AB-2816-TB
#FLT3_NPM1.1   LAML-AB-2811-TB
#FLT3_NPM1.2   LAML-AB-2812-TB
#FLT3_NPM1.3   LAML-AB-2816-TB
#FLT3_NRAS LAML-AB-2816-TB
#IDH1_NPM1 LAML-AB-2802-TB
#NPM1_NRAS LAML-AB-2816-TB
A.K.



Hello R experts, 

I am trying to solve the following logic. 
I have two input files. The first file (Names.txt) that has two columns: 
Column1 Column2 
CEBPA   CEBPA 
DNMT3A  DNMT3A 
FLT3FLT3 
IDH1IDH1 
NPM1NPM1 
NRASNRAS 
and the second input file Data.txt has two columns Names, patient_id. 
Namepatient_id 
DNMT3A  LAML-AB-2802-TB 
DNMT3A  LAML-AB-2809-TB 
DNMT3A  LAML-AB-2811-TB 
DNMT3A  LAML-AB-2816-TB 
DNMT3A  LAML-AB-2818-TB 
DNMT3A  LAML-AB-2822-TB 
DNMT3A  LAML-AB-2824-TB 
FLT3LAML-AB-2811-TB 
FLT3LAML-AB-2812-TB 
FLT3LAML-AB-2814-TB 
FLT3LAML-AB-2816-TB 
FLT3LAML-AB-2818-TB 
FLT3LAML-AB-2825-TB 
FLT3LAML-AB-2830-TB 
FLT3LAML-AB-2834-TB 
IDH1LAML-AB-2802-TB 
IDH1LAML-AB-2821-TB 

 What I am attempting to do is for each name in first column of 
names.txt, I do a pairwise comparison with the other names in the second
 column based on which patient ids are common. 
To explain in detail: 
As an example: I extract patient_ids for CEBPA and DNMT3A and see 
which are common, then I do the same for CEBPA and FLT3 and so on for 
CEBPA and the next name in column 2. 
So far the script I have written only does the comparison with the 
first name in the list. So essentially with itself. I am not sure why 
this logic is not working for all the names in column 2 for a single 
name in column 1. 

Below is my script: 

gset<-read.table("Names.txt",header=F,na.strings = ".", as.is=T) # reading in 
the genes 
temp<-read.table("Data.txt",header=T,sep="\t") 


# 
  
  all<-length(unique(temp$fpatient_id)) 
  final<-c() 
  
  both.ab <- list() 
  both <- list() 
  temp.b <- matrix() 
  
  for(i in 1:nrow(gset))  # Loop for genes in the first column 
  
  { 
    
    temp2<-temp[which(temp$Column1 %in% gset[i,]),] 
    num.mut<-length(unique(temp2$patient_id)) 
    
    temp.a <-temp[which(temp$Column1 == gset[i,1]),] 
  
    for(j in 1:(nrow(gset))  # Loop for genes in the second column 
            
    { 
      temp.b <-temp[which(temp$Column2 == gset[j,2]),] 
      # See which patient_ids of temp.a are in temp.b 
      both.ab[[i]]<-temp.a[which(temp.a$patient_id %in% temp.b$patient_id),] 
    } 

    both[[i]]<-both.ab[[i]] 
    
    num.both<-length(unique(both[[i]]$patient_id)) 
    
    line<-c(paste(gset[i, which(!(is.na(gset[i,]))) ],collapse="/"), num.mut, 
all, num.mut/all, num.both) 
    final<-rbind(final,line) 
  } 
Names.txtData.txtScript.txt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] duplicated() with conditional statement

2013-07-26 Thread arun
Hi,
You may try this (didn't get time to test this extensively)
indx<-which(tt$response!="buy")
tt$newcolumn<-0
 tt[unlist(lapply(seq_along(indx),function(i) {x1<-if(indx[i]==nrow(tt)) 
indx[i] else seq(indx[i]+1,indx[i+1]-1);x2<-rbind(tt[indx[1:i],],tt[x1,]); 
if(any(x2$response=="sample")) 
row.names(x2[duplicated(x2$product),])})),"newcolumn"]<-1
 tt
#   subj response product newcolumn
#1 1   sample   1 0
#2 1   sample   2 0
#3 1  buy   3 0
#4 2   sample   2 0
#5 2  buy   2 0
#6 3   sample   3 1
#7 3   sample   2 1
#8 3  buy   1 0
#9 4   sample   1 1
#10    4  buy   4 0
A.K.








From: vanessa van der vaart 
To: smartpink...@yahoo.com 
Sent: Thursday, July 25, 2013 3:49 PM
Subject: Re: duplicated() with conditional statement



thank you for the reply.
It based on entire data set. 

subj response product newcolumn
1     1   sample       1          0         
2     1   sample       2          0         
3     1      buy          3           0       
4     2   sample       2          0        . 
5     2         buy       2           0
6     3   sample       3          1
7     3   sample       2           1
8     3         buy       1           0
9     4  sample       1            1
10   4       buy       4             0

I am sorry i didnt question it very clearly, let me change the conditional 
statement, I hope you can understand. i will explain by example

as you can see, almost every number is duplicated, but only in row 6th,7th,and 
9th the value on column is 1.

on row4th, the value is duplicated( 2 already occurred on 2nd row),but since 
the value is considered as duplicated only if the value is duplicated where the 
response is 'buy' than the value on column, on row4th still zero. 

On row 6th, where the value product column is 3. 3 is already occurred in 3rd 
row where the value on response is 'buy', so the value on column should be 1

I hope it can understand the conditional statement. 





On Thu, Jul 25, 2013 at 8:25 PM,  wrote:

Hi,
>May be I understand it incorrectly.
>Your new column value doesn't correspond to your conditional statement.  Also, 
>is this duplication based on entire dataset or within "subj".
>
>Hi everybody,,
>I have a question about R function duplicated(). I have spent days try to
>figure this out,but I cant find any solution yet. I hope somebody can help
>me..
>this is my data:
>
>subj=c(1,1,1,2,2,3,3,3,4,4)
>response=c('sample','sample','buy','sample','buy','sample','sample','buy','sample','buy')
>product=c(1,2,3,2,2,3,2,1,1,4)
>tt=data.frame(subj, response, product)
>
>the data look like this:
>
> subj response product
>1     1   sample       1
>2     1   sample       2
>3     1      buy          3
>4     2   sample       2
>5     2         buy       2
>6     3   sample       3
>7     3   sample       2
>8     3         buy       1
>9     4  sample       1
>10   4       buy        4
>
>
>
>I want to create new  column based on the value on response and product
>column. if the value on product is duplicated, then  the value on new column
>is 1, otherwise is 0.
>but I want to add conditional statement that the value on product column
>will only be considered as duplicated if the value on response column is
>'buy'.
>for illustration, the table should look like this:
>
>subj response product newcolumn
>1     1   sample       1          0
>2     1   sample       2          0
>3     1      buy          3           0
>4     2   sample       2          0
>5     2         buy       2           0
>6     3   sample       3          1
>7     3   sample       2           1
>8     3         buy       1           0
>9     4  sample       1            1
>10   4       buy       4             0
>
>
>can somebody help me?
>any help will be appreciated.
>I am new in this mailing list, so forgive me in advance, If I did not  ask
>the question appropriately.
>
>
>
>
>
>
>Quoted from:
>http://r.789695.n4.nabble.com/duplicated-with-conditional-statement-tp4672342.html
>
>
>_
>Sent from http://r.789695.n4.nabble.com
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Duplicated function with conditional statement

2013-07-26 Thread vanessa van der vaart
Hi everybody,,
I have a question about R function duplicated(). I have spent days try to
figure this out,but I cant find any solution yet. I hope somebody can help
me..
this is my data:

subj=c(1,1,1,2,2,3,3,3,4,4)
response=c('sample','sample','buy','sample','buy','sample','
sample','buy','sample','buy')
product=c(1,2,3,2,2,3,2,1,1,4)
tt=data.frame(subj, response, product)

the data look like this:

 subj response product
1 1   sample   1
2 1   sample   2
3 1  buy  3
4 2   sample   2
5 2 buy   2
6 3   sample   3
7 3   sample   2
8 3 buy   1
9 4  sample   1
10   4   buy4

I want to create new  column based on the value on response and product
column. if the value on product is duplicated, then  the value on new column
is 1, otherwise is 0.
but I want to add conditional statement that the value on product column
will only be considered as duplicated if the value on response column is
'buy'.
for illustration, the table should look like this:

subj response product newcolumn
1 1   sample   1  0
2 1   sample   2  0
3 1  buy  3  0
4 2   sample   2  0
5 2 buy   2  0
6 3   sample   3  1
7 3   sample   2   1
8 3 buy   1   0
9 4  sample   11
10   4   buy   4 0


can somebody help me?
any help will be appreciated.
I am new in this mailing list, so forgive me in advance, If I did not  ask
the question appropriately.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do I get a a p-value for the output of an lme model with lme4?

2013-07-26 Thread Maria Sol Lago
Hi there,

I just started using lme4 and I have a question about obtaining p-values. I'm 
trying to get p-values for the output of a linear mixed-effects model. In my 
experiment  I have a 2 by 2 within subjects design, fully crossing two factors, 
"Gram" and "Number". This is the command I used to run the model:

>m <- lmer(RT ~ Gram*Number + (1|Subject) + (0+Gram+Number|Subject) + 
(1|Item_number),data= data)

If I understand this code, I am getting coefficients for the two fixed effects 
(Gram and Number) and their interaction, and I am fitting a model that has 
by-subject intercepts and slopes for the two fixed effects, and a by-item 
intercept for them. Following Barr et al. (2013), I thought that this code gets 
rid of the correlation parameters. I don't want estimate the correlations 
because I want to get the p-values using pvals.fnc (), and I read that this 
function doesn't work if there are correlations in the model.

The command seems to work:

>m
Linear mixed model fit by REML 
Formula: RT ~ Gram * Number + (1 | Subject) + (0 + Gram + Number | Subject) 
+ (1 |Item_number) 
   Data: mverb[mverb$Region == "06v1", ] 
   AIC   BIC logLik deviance REMLdev
 20134 20204 -1005420138   20108
Random effects:
 Groups  NameVariance  Std.Dev. Corr  
Item_number (Intercept)   273.508  16.5381   
 Subject Gramgram0.000   0.   
 Gramungram   3717.213  60.9689NaN
 Number159.361   7.7046NaN -1.000 
 Subject (Intercept) 14075.240 118.6391   
 Residual35758.311 189.0987   
Number of obs: 1502, groups: Item_number, 48; Subject, 32

Fixed effects:
 Estimate Std. Error t value
(Intercept)402.520 22.321  18.033
Gram1  -57.788 14.545  -3.973
Number1 -4.191  9.858  -0.425
Gram1:Number1   15.693 19.527   0.804

Correlation of Fixed Effects:
(Intr) Gram1  Numbr1
Gram1   -0.181  
Number1 -0.034  0.104   
Gram1:Nmbr1  0.000 -0.002 -0.011

However, when I try to calculate the p-values I still get an error message:

>pvals.fnc(m, withMCMC=T)$fixed
Error in pvals.fnc(m, withMCMC = T) : 
MCMC sampling is not implemented in recent versions of lme4
  for models with random correlation parameters

Am I making a mistake when I specify my model? Shouldn't pvals.fnc() work if I 
removed the correlations?

Thanks for your help!

--Sol
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to split two levels several times?

2013-07-26 Thread arun


May be this also helps:
XXX: dataset
rl<-rle(as.character(XXX$electrode))
 
dat<-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1<-if(rl$values[i]=="electrode1"
 & (rl$lengths[i]%/%3>1)) rep(3,rl$lengths[i]%/%3) else 
rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])}))
 lst1<-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1)
vec1<-sapply(lst1,max)
vec2<-c(1,vec1[-length(vec1)]+1)
res<-  lapply(seq_along(lst1),function(i) {x1<-lst1[[i]]; 
XXX[seq(vec2[i],max(x1)),]})
 res
#[[1]]
 #  electrode length
#1 electrode1    5.7
#2 electrode1    6.3
#3 electrode1    6.2
#4 electrode2   11.4
#5 electrode2    9.7
#
#[[2]]
 #   electrode length
#6  electrode3   14.2
#7  electrode3   14.8
#8  electrode3   12.6
#9  electrode2   11.4
#10 electrode2    9.7
#
#[[3]]
 #   electrode length
#11 electrode4   17.0
#12 electrode4   16.3
#13 electrode4   17.8
#14 electrode4   18.3
#15 electrode4   16.9
#16 electrode4   18.5
#17 electrode1    5.7
#18 electrode1    6.3
#19 electrode1    6.2

#[[4]]
 #   electrode length
#20 electrode1    5.7
#21 electrode1    6.3
#22 electrode1    6.2
#23 electrode3   14.2
#24 electrode3   14.8
#25 electrode3   12.6

Also, tested in cases like below:
XXX1<- structure(list(electrode = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 
3L, 3L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L), .Label = c("electrode1", 
"electrode2", "electrode3", "electrode4"), class = "factor"), 
    length = c(5.7, 6.3, 6.2, 11.4, 9.7, 14.2, 14.8, 12.6, 11.4, 
    9.7, 17, 16.3, 17.8, 18.3, 16.9, 18.5, 5.7, 6.3, 6.2, 7.7, 
    7.3, 6.2, 6.7, 6.8, 6.9, 5.7, 6.3, 6.2, 14.2, 14.8, 12.6)), .Names = 
c("electrode", 
"length"), class = "data.frame", row.names = c(NA, -31L))

XXX2<-structure(list(electrode = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 
3L, 3L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 3L, 3L, 3L), .Label = c("electrode1", "electrode2", 
"electrode3", "electrode4"), class = "factor"), length = c(5.7, 
6.3, 6.2, 11.4, 9.7, 14.2, 14.8, 12.6, 11.4, 9.7, 17, 16.3, 17.8, 
18.3, 16.9, 18.5, 5.7, 6.3, 6.2, 7.7, 7.3, 6.2, 6.7, 6.8, 6.9, 
14.2, 14.8, 12.6)), .Names = c("electrode", "length"), class = "data.frame", 
row.names = c(NA, 
-28L))

rl<-rle(as.character(XXX1$electrode))
 
dat<-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1<-if(rl$values[i]=="electrode1"
 & (rl$lengths[i]%/%3>1)) rep(3,rl$lengths[i]%/%3) else 
rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])}))
 lst1<-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1)
vec1<-sapply(lst1,max)
vec2<-c(1,vec1[-length(vec1)]+1)
res1<-  lapply(seq_along(lst1),function(i) {x1<-lst1[[i]]; 
XXX1[seq(vec2[i],max(x1)),]})


rl<-rle(as.character(XXX2$electrode))
 
dat<-do.call(rbind,lapply(seq_along(rl$lengths),function(i){x1<-if(rl$values[i]=="electrode1"
 & (rl$lengths[i]%/%3>1)) rep(3,rl$lengths[i]%/%3) else 
rl$lengths[i];data.frame(Len=x1,Val=rl$values[i])}))
 lst1<-split(cumsum(dat[,1]),((seq_along(dat[,1])-1)%/%2)+1)
vec1<-sapply(lst1,max)
vec2<-c(1,vec1[-length(vec1)]+1)
res2<-  lapply(seq_along(lst1),function(i) {x1<-lst1[[i]]; 
XXX2[seq(vec2[i],max(x1)),]})

Didn't test it extensively.  So, it may fail in other situations.
A.K.






- Original Message -
From: Rui Barradas 
To: dennis1...@gmx.net
Cc: r-help@r-project.org
Sent: Thursday, July 25, 2013 2:53 PM
Subject: Re: [R] How to split two levels several times?

Hello,

I think the following does what you want. (I don't know if it makes much 
sense but it works.)



lens <- rle(as.character(XXX$electrode))$lengths
m <- length(lens) %/% 2
idx <- rep(1:m, sapply(1:m, function(.m) sum(lens[(2*.m - 1):(2*.m)])))
if(length(lens) %% 2 != 0){
    idx <- c(idx, rep(m + 1, lens[length(lens)]))
    sp_idx <- split(idx, idx)
    n <- length(sp_idx[[m]])
    if(n %/% 2 < length(sp_idx[[m + 1]]))
        sp_idx[[m]][(n %/% 2 + 1):n] <- sp_idx[[m + 1]][1]
    else
        sp_idx[[m]][(n - length(sp_idx[[m + 1]]) + 1):n] <-  sp_idx[[m + 1]][1]
    idx <- unlist(sp_idx)
}

sp <- split(XXX, idx)
sp



Rui Barradas

Em 25-07-2013 11:40, dennis1...@gmx.net escreveu:
> Hi Rui
> once more thank you for your help. But the code does so far not solve the 
> problem because it still treats rows 17-22 (repeated appearance of 
> electrode1) as one single level. However as can be seen by rows 1-3 (or rows 
> 17-19 and rows 20-22) and the order of the length variable (row 1 = 5.7, row 
> 2 = 6.3, row 3 = 6.2) electrode1 consists only of 3 rows. Maybe that was not 
> made absolutely clear by me. As described in my mail before if by chance (or 
> systematically) it happens to be that electrode1 appears right after each 
> other in the table then the code should split it “half way”.
>
> So idx should not return
>   [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4
>
> but instead 6 times number 4 at the end
>   [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4
>
> Do you have any solution?
>
>
>> Gesendet: Mittwoch, 24. Juli 2013 um 23:47 Uhr
>>