Re: [R] merging data list in to single data frame

2011-04-04 Thread Umesh Rosyara
Thank you Hadley. With your solution, now it feels very easy ! 

From: h.wick...@gmail.com [mailto:h.wick...@gmail.com] On Behalf Of Hadley
Wickham
Sent: Monday, April 04, 2011 6:11 PM
To: Umesh Rosyara
Cc: Dennis Murphy; r-help@r-project.org; rosyar...@gmail.com
Subject: Re: [R] merging data list in to single data frame



> filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt to K200cd.txt

It's very easy:

library(plyr)
names(filelist) <- basename(filelist)
data_list <- ldply(filelist, read.table, header = TRUE, comment.char = ";", fill = TRUE)

Hadley


--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merging data list in to single data frame

2011-04-04 Thread Umesh Rosyara
Thank you, Dennis, for the solution. It is a step ahead. However, I need to
read all 200 files as data frames one by one. Can we automate this process?
I used the following steps to read all the files at once; however, data_list
ended up as a list.

 

filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt to K200cd.txt

data_list <-lapply(filelist, read.table, header=T, comment=";", fill=T)
names(filelist) <- 1:length(filelist)

library("plyr")

ldply(data_list, rbind)

 

I tried to use your approach on the list, but I did not succeed in getting the
.id variable (otherwise it binds the component data frames!). Probably this is
applicable to individual data frames, not to a list of many data frames.

Do you have any suggestion on using functions that can read the files (as I did
above) and save each as a new data frame (for example DF1, DF2), not as a list
of 200 data frames? If we can do that, then we will be able to use this approach.

 

Thank you so much,

Umesh R 



 

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Monday, April 04, 2011 3:25 PM
To: Umesh Rosyara
Cc: r-help@r-project.org; rosyar...@gmail.com
Subject: Re: [R] merging data list in to single data frame

 

Hi:

Here's an alternative using ldply() from the plyr package. The idea is to
read the data frames into a list, name them accordingly and then call
ldply().

# Read in the test data frames (you want to use list.files() instead to
# input the data per Uwe's guidelines)
df1 <- read.table(textConnection("
var1  var2  var3  var4
1     6     0.3   8
3     4     0.4   9
2     3     0.4   6
1     0.4   0.9   3"), header = TRUE)
df2 <- read.table(textConnection("
var1 var2 var3 var4
1 16 0.6 7
3 14 0.4 6
2 13 0.4 5
1 0.6 0.9 2"), header = TRUE)
closeAllConnections()
# generate the list
dl <- list(df1, df2)

# Name the list components by number and then call ldply():
names(dl) <- 1:2  # more generally, names(dl) <- 1:length(dl)
library("plyr")
ldply(dl, rbind)
  .id var1 var2 var3 var4
1   1    1  6.0  0.3    8
2   1    3  4.0  0.4    9
3   1    2  3.0  0.4    6
4   1    1  0.4  0.9    3
5   2    1 16.0  0.6    7
6   2    3 14.0  0.4    6
7   2    2 13.0  0.4    5
8   2    1  0.6  0.9    2

You can always change .id to fileno afterwards.
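
For readers without plyr, the same stacking and the rename can be sketched in base R. This is only an illustration: Map()/do.call(rbind, ...) stands in for ldply(dl, rbind), and the object name "combined" is made up here, not part of the original reply.

```r
# The two example data frames from the message above
df1 <- data.frame(var1 = c(1, 3, 2, 1), var2 = c(6, 4, 3, 0.4),
                  var3 = c(0.3, 0.4, 0.4, 0.9), var4 = c(8, 9, 6, 3))
df2 <- data.frame(var1 = c(1, 3, 2, 1), var2 = c(16, 14, 13, 0.6),
                  var3 = c(0.6, 0.4, 0.4, 0.9), var4 = c(7, 6, 5, 2))
dl <- list(df1, df2)
names(dl) <- seq_along(dl)

# Base-R stand-in for ldply(dl, rbind): tag each piece with its list name,
# then stack the pieces
combined <- do.call(rbind, Map(function(id, d) cbind(.id = id, d), names(dl), dl))
rownames(combined) <- NULL

# Rename the identifier column from .id to fileno
names(combined)[names(combined) == ".id"] <- "fileno"
print(combined)
```

The important detail in either version is that the names sit on the list that gets combined (dl here); an unnamed list gives nothing to build the identifier column from.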

HTH,
Dennis

On Mon, Apr 4, 2011 at 7:41 AM, Umesh Rosyara  wrote:

Dear R community members



I could not find a good way to merge my 200 text data files into a single data
file with one added column that indicates the source file.

filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt to K200cd.txt

data_list <-lapply(filelist, read.table, header=T, comment=";", fill=T)

This creates a list, but that is not what I want.

I want a single data frame (all the separate data frames have the same variable
headings) with an additional column, for example:

; just for example, two small datasets are created, but my component datasets
; are huge and need automation

;read from file K1cd.txt
var1  var2  var3  var4
1     6     0.3   8
3     4     0.4   9
2     3     0.4   6
1     0.4   0.9   3

;read from file K2cd.txt
var1  var2  var3  var4
1     16    0.6   7
3     14    0.4   6
2     13    0.4   5
1     0.6   0.9   2

The output data frame should look like:

Fileno  var1  var2  var3  var4
1       1     6     0.3   8
1       3     4     0.4   9
1       2     3     0.4   6
1       1     0.4   0.9   3
2       1     16    0.6   7
2       3     14    0.4   6
2       2     13    0.4   5
2       1     0.6   0.9   2

Please note that a new file-number column is added.



Thank you for the help.



Umesh R






[R] Questions remaining: define any character as na.string RE: merging data list in to single data frame

2011-04-04 Thread Umesh Rosyara
Dear Uwe and R community members 

Thank you Uwe for the help. 

I still have one question, which I have been trying to answer for a long
time.

My exported data have some stray characters mixed into them. I want to
define any such characters as na.strings. Is it possible to do so?

Thanks;

Umesh 
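
The thread does not answer this question. One possible approach, offered only as a sketch: na.strings takes a fixed vector of strings rather than a pattern, so a workaround is to read every column as character and let as.numeric() turn anything that is not a number into NA. The data and the names "raw" and "clean" below are made up for illustration.

```r
# Illustrative input: numeric columns with stray text mixed in
txt <- c("var1 var2", "1 0.3", "x 0.4", "3 missing", "4 0.9")

# Read everything as character so no value is converted behind our back
raw <- read.table(text = txt, header = TRUE, colClasses = "character")

# as.numeric() returns NA for any string that is not a number;
# suppressWarnings() hides the expected coercion warnings
clean <- as.data.frame(lapply(raw, function(col) suppressWarnings(as.numeric(col))))
print(clean)
```

This assumes every column is meant to be numeric; columns that legitimately hold text would need to be left out of the conversion.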



-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] 
Sent: Monday, April 04, 2011 12:22 PM
To: Umesh Rosyara
Cc: r-help@r-project.org; rosyar...@gmail.com
Subject: Re: [R] merging data list in to single data frame



On 04.04.2011 16:41, Umesh Rosyara wrote:
> Dear R community members
>
>
>
> I could not find a good way to merge my 200 text data files into a single data
> file with one added column that indicates the source file.
>
>
>
> filelist = list.files(pattern = "K*cd.txt")


I doubt you meant "K*cd.txt" but "^K[[:digit:]]*cd\\.txt$".
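
To illustrate the point: list.files() treats its pattern argument as a regular expression, not a shell-style wildcard, so "K*cd.txt" means "zero or more K, then cd, then any one character, then txt" and can match anywhere in a name. A small sketch with made-up file names (none of these come from the original post):

```r
# Hypothetical file names, used only to compare the two patterns
files <- c("K1cd.txt", "K200cd.txt", "cd_txt_notes", "backup_K5cd.txt.old")

loose  <- grepl("K*cd.txt", files)                 # wildcard written as a regex
strict <- grepl("^K[[:digit:]]*cd\\.txt$", files)  # the anchored pattern above

print(data.frame(files, loose, strict))

# utils::glob2rx() converts a shell-style wildcard into a regular expression
rx <- glob2rx("K*cd.txt")
print(rx)
print(grepl(rx, files))
```

The loose pattern accepts all four names, while the anchored pattern accepts only the two intended ones.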



> # the file names are K1cd.txt to K200cd.txt
>
> data_list<-lapply(filelist, read.table, header=T, comment=";", fill=T)


Replace by:

data_list <- lapply(filelist, function(x)
cbind(Filename = x, read.table(x, header=TRUE, comment.char=";", fill=TRUE)))


And then:

result <- do.call("rbind", data_list)
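
A self-contained sketch of these two steps, for trying them out without the real data. The temporary directory and the two tiny files are invented for the example; only the lapply()/cbind()/do.call() pattern comes from the reply.

```r
# Write two small example files into a temporary directory (illustrative data)
tmp <- file.path(tempdir(), "kfiles_demo")
dir.create(tmp, showWarnings = FALSE)
writeLines(c("; read from file K1cd.txt", "var1 var2 var3 var4",
             "1 6 0.3 8", "3 4 0.4 9"), file.path(tmp, "K1cd.txt"))
writeLines(c("; read from file K2cd.txt", "var1 var2 var3 var4",
             "1 16 0.6 7", "3 14 0.4 6"), file.path(tmp, "K2cd.txt"))

filelist <- list.files(tmp, pattern = "^K[[:digit:]]*cd\\.txt$", full.names = TRUE)

# Read each file and prepend a column identifying its source
data_list <- lapply(filelist, function(x)
  cbind(Filename = basename(x),
        read.table(x, header = TRUE, comment.char = ";", fill = TRUE)))

# Stack the per-file data frames into one
result <- do.call("rbind", data_list)
print(result)
```

basename() is used here only because the example reads from another directory with full.names = TRUE; with files in the working directory, the file name itself serves as the label.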

Uwe Ligges


>
>
>
> This will create list, but this is not what I want.
>
>
>
> I want a single data frame (all the separate data frames have the same variable
> headings) with an additional column, for example:
>
> ; just for example, two small datasets are created, but my component datasets
> ; are huge and need automation
>
> ;read from file K1cd.txt
> var1  var2  var3  var4
> 1     6     0.3   8
> 3     4     0.4   9
> 2     3     0.4   6
> 1     0.4   0.9   3
>
> ;read from file K2cd.txt
> var1  var2  var3  var4
> 1     16    0.6   7
> 3     14    0.4   6
> 2     13    0.4   5
> 1     0.6   0.9   2
>
> The output data frame should look like:
>
> Fileno  var1  var2  var3  var4
> 1       1     6     0.3   8
> 1       3     4     0.4   9
> 1       2     3     0.4   6
> 1       1     0.4   0.9   3
> 2       1     16    0.6   7
> 2       3     14    0.4   6
> 2       2     13    0.4   5
> 2       1     0.6   0.9   2
>
> Please note that a new file-number column is added.
>
>
>
> Thank you for the help.
>
>
>
> Umesh R
>
>
>
>


[R] merging data list in to single data frame

2011-04-04 Thread Umesh Rosyara
Dear R community members 

 

I could not find a good way to merge my 200 text data files into a single data
file with one added column that indicates the source file.

filelist = list.files(pattern = "K*cd.txt") # the file names are K1cd.txt to K200cd.txt

data_list <-lapply(filelist, read.table, header=T, comment=";", fill=T)

This creates a list, but that is not what I want.

I want a single data frame (all the separate data frames have the same variable
headings) with an additional column, for example:

; just for example, two small datasets are created, but my component datasets
; are huge and need automation

;read from file K1cd.txt
var1  var2  var3  var4
1     6     0.3   8
3     4     0.4   9
2     3     0.4   6
1     0.4   0.9   3

;read from file K2cd.txt
var1  var2  var3  var4
1     16    0.6   7
3     14    0.4   6
2     13    0.4   5
1     0.6   0.9   2

The output data frame should look like:

Fileno  var1  var2  var3  var4
1       1     6     0.3   8
1       3     4     0.4   9
1       2     3     0.4   6
1       1     0.4   0.9   3
2       1     16    0.6   7
2       3     14    0.4   6
2       2     13    0.4   5
2       1     0.6   0.9   2

Please note that a new file-number column is added.

 

Thank you for the help.

 

Umesh R 

 




Re: [R] help need on working in subset within a dataframe

2011-03-22 Thread Umesh Rosyara
Thank you, Ista. It helps. 
 
Best Regards
 
Umesh R 
 
 
 

From: istaz...@gmail.com [mailto:istaz...@gmail.com] On Behalf Of Ista Zahn
Sent: Tuesday, March 22, 2011 8:58 AM
To: Umesh Rosyara
Cc: R mailing list
Subject: Re: [R] help need on working in subset within a dataframe



Hi Umesh,
I use the plyr package for this sort of thing:

library(plyr)
daply(dataframe, .(ped), myfun)

Best,
Ista
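
The same per-group idea can be sketched in base R with split() and sapply(). The data set and the body of myfun below are simplified stand-ins, not the poster's real calculation; the only point is that the function receives one ped group at a time.

```r
# Small illustrative data set: two levels of the grouping factor 'ped'
dataframe <- data.frame(
  ped = rep(1:2, each = 5),
  y   = c(3, 4, 5, 6, 7, 2, 3, 3, 6, 7),
  x1  = c(1, 1, 1, 0, 0, 0, 1, 1, 0, 1)
)

# Hypothetical stand-in for the poster's myfun(): takes a data frame
# (one ped group) and returns a single number
myfun <- function(d) {
  k1 <- d$x1 * 0.23
  sum(d$y * k1, na.rm = TRUE) / nrow(d)
}

# split() cuts the data frame into one piece per level of ped;
# sapply() applies myfun to each piece and returns a named vector
by_ped <- sapply(split(dataframe, dataframe$ped), myfun)
print(by_ped)
```

The result has one element per ped level, named by that level, which corresponds to what daply(dataframe, .(ped), myfun) returns for a function that yields a single number.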
On Tue, Mar 22, 2011 at 3:48 AM, Umesh Rosyara  wrote:
> Dear R-experts
>
> Excuse me for an easy question, but I need help.
>
> For days I have been working with a large dataset, where operations are
> needed within a component of the dataset. Here is my question:
>
> I have a big dataset with variables x1 to x1000 or so. What I need to do is
> work on 4 consecutive variables at a time to calculate a statistic and output
> it. So far so good. There are more vector operations inside the function to do
> this. My question this time is that I want to do this separately for each level
> of a factor (in the following example it is ped; thus if there are 20 ped
> levels, I want an output with 20 statistics, so that I can work further on them).
>
> #data generation
> ped <- c(1,1,1,1,1, 1,1,1,1,1, 2,2,2,2,2, 2,2,2,2,2)# I have 20 ped
> fd <- c(1,1,1,1,1, 2,2,2,2,2,  3,3,3,3,3, 4,4,4,4,4) # I have ~100 fd
> iid <- c(1:20) # number can go up to 2000
> mid <- c(0,0,1,1,1, 0,0,6,6,6, 0,0, 11,11,11, 0,0,16,16,16)
> fid <- c(0,0,2,2,2, 0,0,7,7,7, 0,0, 12,12,12, 0,0,17, 17, 17)
> y <- c(3,4,5,6,7,  3,4,8,9, 8,  2,3,3,6,7,  9,12,10,8,12)
> x1 <- c(1,1,1,0,0, 1,0,1,1,0,   0, 1,1,0,1,1, 1,1,0,0)
> x2 <- c(1,1,1,0,0, 1,0,1,1,0,   0, 1,1, 1,0,   1,1,0,1,0)
> x3 <- c(1,0,0,1,1, 1,1,1,1,1,   1, 1,1, 1,0,   1,1,0,1,0)
> x4 <- c(1,1,1,1,0, 0,1,1, 0,0,  0, 1,0,0, 0,   0,0,1, 1,1)
> # I have more X variables, potentially >1000, but I need to work on four at a time
> dataframe <- data.frame(ped, fd, iid, mid, fid, y, x1, x2, x3, x4)
>
> myfun <- function(dataframe)  {
> namemat <- matrix(c(1:4), nrow = 1)
> smyfun <- function(x)  {
>  x <- as.vector(x)
>  K1 <- dataframe$x1 * 0.23
>  K2 <- dataframe$x2 * 0.98
>  # just example there is long vector calculations in read dataset
>  kt1 <- K1 * K2
>  kt2 <- K1 / K2
>  Qni <- (K1*(kt1-0.25)+ K2 *(kt2-0.25))
>  y <- dataframe$y
>  yg <- mean(y, na.rm= TRUE) # mean of trait Y # mean of trait Y
>  dvm <- (y-yg ) # deviation of phenotypic value from mean
>  sumdvm <-abs(sum(dvm, na.rm= TRUE))
>  yQni <- y* Qni
>  sumyQni <-abs(sum(yQni, na.rm= TRUE))
>  npt = ( sumdvm/ sumyQni)
>  return(npt)
>  }
>  npt1 <- apply(namemat,1, smyfun)
>  return(npt1)
> }
>
>  myfun (dataframe)
>
> My question is how can I automate the process so that the above function can
> calculate different values for n levels (>20 in my real data) of factor ped.
>
>
> Thanks in advance for the help. R-community is always helpful.
>
> Umesh R
>
>[[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org 



[R] help need on working in subset within a dataframe

2011-03-21 Thread Umesh Rosyara
Dear R-experts
 
Excuse me for an easy question, but I need help.

For days I have been working with a large dataset, where operations are
needed within a component of the dataset. Here is my question:

I have a big dataset with variables x1 to x1000 or so. What I need to do is
work on 4 consecutive variables at a time to calculate a statistic and output
it. So far so good. There are more vector operations inside the function to do
this. My question this time is that I want to do this separately for each level
of a factor (in the following example it is ped; thus if there are 20 ped
levels, I want an output with 20 statistics, so that I can work further on them).
 
# data generation
ped <- c(1,1,1,1,1, 1,1,1,1,1, 2,2,2,2,2, 2,2,2,2,2)  # I have 20 ped
fd  <- c(1,1,1,1,1, 2,2,2,2,2, 3,3,3,3,3, 4,4,4,4,4)  # I have ~100 fd
iid <- c(1:20)  # number can go up to 2000
mid <- c(0,0,1,1,1, 0,0,6,6,6, 0,0,11,11,11, 0,0,16,16,16)
fid <- c(0,0,2,2,2, 0,0,7,7,7, 0,0,12,12,12, 0,0,17,17,17)
y   <- c(3,4,5,6,7, 3,4,8,9,8, 2,3,3,6,7, 9,12,10,8,12)
x1  <- c(1,1,1,0,0, 1,0,1,1,0, 0,1,1,0,1, 1,1,1,0,0)
x2  <- c(1,1,1,0,0, 1,0,1,1,0, 0,1,1,1,0, 1,1,0,1,0)
x3  <- c(1,0,0,1,1, 1,1,1,1,1, 1,1,1,1,0, 1,1,0,1,0)
x4  <- c(1,1,1,1,0, 0,1,1,0,0, 0,1,0,0,0, 0,0,1,1,1)
# I have more X variables, potentially >1000, but I need to work on four at a time
dataframe <- data.frame(ped, fd, iid, mid, fid, y, x1, x2, x3, x4)

myfun <- function(dataframe) {
  namemat <- matrix(c(1:4), nrow = 1)
  smyfun <- function(x) {
    x <- as.vector(x)
    K1 <- dataframe$x1 * 0.23
    K2 <- dataframe$x2 * 0.98
    # just an example; there are long vector calculations in the real dataset
    kt1 <- K1 * K2
    kt2 <- K1 / K2
    Qni <- (K1 * (kt1 - 0.25) + K2 * (kt2 - 0.25))
    y <- dataframe$y
    yg <- mean(y, na.rm = TRUE)  # mean of trait Y
    dvm <- (y - yg)  # deviation of phenotypic value from mean
    sumdvm <- abs(sum(dvm, na.rm = TRUE))
    yQni <- y * Qni
    sumyQni <- abs(sum(yQni, na.rm = TRUE))
    npt <- (sumdvm / sumyQni)
    return(npt)
  }
  npt1 <- apply(namemat, 1, smyfun)
  return(npt1)
}

myfun(dataframe)
 
My question is how can I automate the process so that the above function can
calculate different values for n levels (>20 in my real data) of factor ped.

 
Thanks in advance for the help. R-community is always helpful. 
 
Umesh R



Re: [R] still a problem remainingRE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

2011-03-11 Thread Umesh Rosyara
Thank you for helping me and this solved the problem
 
Best Regards
 
Umesh R 
 
 
 
 

  _  

From: foolish.andr...@gmail.com [mailto:foolish.andr...@gmail.com] On Behalf
Of Felix Andrews
Sent: Friday, March 11, 2011 4:05 AM
To: Umesh Rosyara
Cc: R mailing list; deepayan.sar...@r-project.org
Subject: Re: [R] still a problem remainingRE: Data lebals xylattice plot:
RE: displaying label meeting condition (i.e. significant, i..e p value less
than 005) in plot function



Yes, it is intersect rather than intersection, sorry.

And in panel.text() the x and y were switched, so just reverse the
first two arguments.

That's what comes from posting from an iGizmo with no R to test my code.
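
Putting the two corrections together, the panel function might look like the sketch below. Two things here are assumptions rather than part of the thread: xvar is built with seq(1, 10000, by = 10) because seq(1, 1, 10) as posted yields a single value, and the reference line is drawn at 0.05 to match the labelling threshold.

```r
library(lattice)

# Example data modelled on the post; xvar is an assumed stand-in (see above)
set.seed(134)
dataf <- data.frame(
  name = paste0("M", 1:1000),
  xvar = seq(1, 10000, by = 10),
  chr  = factor(rep(1:5, each = 200)),
  p    = rnorm(1000, 0.15, 0.05)
)

plt <- xyplot(p ~ xvar | chr, data = dataf,
  panel = function(x, y, subscripts, ...) {
    panel.xyplot(x, y, ...)
    panel.abline(h = 0.05, col = "red")
    # rows of dataf that belong to this panel AND fall below the threshold
    ok <- intersect(subscripts, which(dataf$p < 0.05))
    if (length(ok) > 0) {
      # x coordinate first, then y, then the label text
      with(dataf[ok, ],
           panel.text(xvar, p, labels = as.character(name), col = "green4"))
    }
  },
  as.table = TRUE, subscripts = TRUE)

print(plt)  # draw the plot
```

Because subscripts holds the row numbers of the full data frame that fall in the current panel, intersecting it with the rows below the threshold keeps each label in its own panel.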


2011/3/11 Umesh Rosyara :
> Thank you so much for the advice. The R could not find function
> "intersection". Do I need additional package to have this function active.
I
> tried "intersect" instead has no effect.
>
> xyplot(p ~ xvar|chr, data=dataf,
> panel=function(x, y, subscripts){
> panel.xyplot(x, y)
> ok= intersection(subscripts, which(dataf$p < 0.05))
> with(dataf[ok,], panel.text(p, xvar, name))
>  }, as.table=T, subscripts=T)
>
>
> Best Regards
>
> Umesh R
>
>
>
>
> 
> From: foolish.andr...@gmail.com [mailto:foolish.andr...@gmail.com] On
Behalf
> Of Felix Andrews
> Sent: Thursday, March 10, 2011 7:01 AM
> To: Umesh Rosyara
> Cc: R mailing list; deepayan.sar...@r-project.org
> Subject: Re: [R] still a problem remainingRE: Data lebals xylattice plot:
> RE: displaying label meeting condition (i.e. significant, i..e p value
less
> than 005) in plot function
>
> Notice that pvals is a subset of dataf so 'subscripts' can not be
> applied directly to pvals. Instead you should do the subsetting inside
> the panel function. e.g.
> ok <- intersection(subscripts, which(dataf$p < 0.05))
> with(dataf[ok,], panel.text(p, xval, name))
>
>
> By the way you should include the dots (...) in your panel function
> arguments and pass them on to panel.xyplot.
>
>
> On Thursday, 10 March 2011, Umesh Rosyara  wrote:
>> Lattice-experts:
>> Thank you for those who have responded earlier. I have not got a perfect
>> solution yet but tried several ways, unless anybody really lattice killer
>> steps up, I will leave it and see alternatives. Sorry to send it again.
>>
>>
>>
>> #Data
>>
>> name <- c(paste ("M", 1:1000, sep = ""))
>> xvar <- seq(1, 1, 10)
>> chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
>> set.seed(134)
>> p <- rnorm(1000, 0.15,0.05)
>> dataf <- data.frame(name,xvar, chr, p)
>> dataf$chr <- as.factor(dataf$chr)
>>
>>
>>
>> #subset data
>>
>> pvals <- dataf[dataf$p < 0.05,]
>>
>>
>>
>> # unsuccessful commands
>>
>> xyplot(p ~ xvar|chr, data=dataf,
>>   panel=function(x, y, subscripts){
>>
>>   panel.xyplot(x, y)
>>
>>panel.xyplot(pvals$xvar[subscripts],pvals$p[subscripts], pch=6)
>>   panel.abline(h=0.01, col="red")
>>
>>
>>   panel.text(pvals$xvar[subscripts], pvals$p[subscripts],
>> pvals$name[subscripts], col="green2")
>>
>>
>>   }, as.table=T, subscripts=T)
>>
>>
>>
>>
>>
>> Best Regards
>>
>> Umesh R
>>
>>
>>
>>
>>   _
>>
>> From: Bert Gunter [mailto:gunter.ber...@gene.com]
>> Sent: Tuesday, March 08, 2011 12:00 AM
>> To: Umesh Rosyara
>> Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing
>> list
>> Subject: Re: still a problem remainingRE: [R] Data lebals xylattice plot:
>> RE: displaying label meeting condition (i.e. significant, i..e p value
>> less
>> than 005) in plot function
>>
>>
>>
>> As I believe I already told you in my original reply, you have to make
>> use of the subscripts argument in the panel function to subscript the
>> P values etc. vector to be plotted in each panel. Something like:
>> (untested)
>>
>>panel = function(x, y,subscripts,...) {
>> panel.xyplot(x, y,...)
>> panel.abline(h=0.01, col="red")
>> panel.text(xv1[subscripts], p1[subscripts],
>> n1[subscripts], col="green2")
>>}
>>
>>
>> Also,in future, please send plain text email, as requested in the
>> guide. Your message was in an annoying blue font in my gmail reader.
>>
>> Cheers,
>> Bert
>>
>>
>> On Mon, Mar 7, 2011 at 5:26 PM, Umesh Rosya

Re: [R] still a problem remainingRE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

2011-03-10 Thread Umesh Rosyara
Thank you so much for the advice. R could not find the function
"intersection". Do I need an additional package to make this function available?
I tried "intersect" instead, but it had no effect.
 
xyplot(p ~ xvar|chr, data=dataf,
panel=function(x, y, subscripts){
panel.xyplot(x, y)
ok= intersection(subscripts, which(dataf$p < 0.05))
with(dataf[ok,], panel.text(p, xvar, name))
 }, as.table=T, subscripts=T)
 
 
Best Regards
 
Umesh R  
 
 
 
 

  _  

From: foolish.andr...@gmail.com [mailto:foolish.andr...@gmail.com] On Behalf
Of Felix Andrews
Sent: Thursday, March 10, 2011 7:01 AM
To: Umesh Rosyara
Cc: R mailing list; deepayan.sar...@r-project.org
Subject: Re: [R] still a problem remainingRE: Data lebals xylattice plot:
RE: displaying label meeting condition (i.e. significant, i..e p value less
than 005) in plot function



Notice that pvals is a subset of dataf so 'subscripts' can not be
applied directly to pvals. Instead you should do the subsetting inside
the panel function. e.g.
ok <- intersection(subscripts, which(dataf$p < 0.05))
with(dataf[ok,], panel.text(p, xval, name))


By the way you should include the dots (...) in your panel function
arguments and pass them on to panel.xyplot.


On Thursday, 10 March 2011, Umesh Rosyara  wrote:
> Lattice-experts:
> Thank you for those who have responded earlier. I have not got a perfect
> solution yet but tried several ways, unless anybody really lattice killer
> steps up, I will leave it and see alternatives. Sorry to send it again.
>
>
>
> #Data
>
> name <- c(paste ("M", 1:1000, sep = ""))
> xvar <- seq(1, 1, 10)
> chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
> set.seed(134)
> p <- rnorm(1000, 0.15,0.05)
> dataf <- data.frame(name,xvar, chr, p)
> dataf$chr <- as.factor(dataf$chr)
>
>
>
> #subset data
>
> pvals <- dataf[dataf$p < 0.05,]
>
>
>
> # unsuccessful commands
>
> xyplot(p ~ xvar|chr, data=dataf,
>   panel=function(x, y, subscripts){
>
>   panel.xyplot(x, y)
>
>panel.xyplot(pvals$xvar[subscripts],pvals$p[subscripts], pch=6)
>   panel.abline(h=0.01, col="red")
>
>
>   panel.text(pvals$xvar[subscripts], pvals$p[subscripts],
> pvals$name[subscripts], col="green2")
>
>
>   }, as.table=T, subscripts=T)
>
>
>
>
>
> Best Regards
>
> Umesh R
>
>
>
>
>   _
>
> From: Bert Gunter [mailto:gunter.ber...@gene.com]
> Sent: Tuesday, March 08, 2011 12:00 AM
> To: Umesh Rosyara
> Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing
list
> Subject: Re: still a problem remainingRE: [R] Data lebals xylattice plot:
> RE: displaying label meeting condition (i.e. significant, i..e p value
less
> than 005) in plot function
>
>
>
> As I believe I already told you in my original reply, you have to make
> use of the subscripts argument in the panel function to subscript the
> P values etc. vector to be plotted in each panel. Something like:
> (untested)
>
>panel = function(x, y,subscripts,...) {
> panel.xyplot(x, y,...)
> panel.abline(h=0.01, col="red")
> panel.text(xv1[subscripts], p1[subscripts],
> n1[subscripts], col="green2")
>}
>
>
> Also,in future, please send plain text email, as requested in the
> guide. Your message was in an annoying blue font in my gmail reader.
>
> Cheers,
> Bert
>
>
> On Mon, Mar 7, 2011 at 5:26 PM, Umesh Rosyara  wrote:
>> Hi Lattice Users
>>
>> I have been working to fix this problem, still I am not able to solve
> fully.
>> I could label those names that have pvalue less than 0.01 but still the
>> label appears in all compoent plots eventhough those who do have the
> pvalue
>> ! How can I implement it successuflly to grouped data like mine. You help
> is
>> highly appreciated.
>>
>> #my data
>> name <- c(paste ("M", 1:1000, sep = ""))
>> xvar <- seq(1, 1, 10)
>> chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
>> set.seed(134)
>> p <- rnorm(1000, 0.15,0.05)
>> dataf <- data.frame(name,xvar, chr, p)
>> dataf$chr <- as.factor(dataf$chr)
>>
>> # lattice plot: As far as I can go now ! little progress but final push
>> required !
>> require(lattice)
>> pvals <- dataf[dataf$p < 0.01,]
>> p1 <- pvals$p
>> n1 <- pvals$name
>> xv1 <- pvals$xvar
>> xyplot(p ~ xvar|chr, data=dataf,
>>panel = function(x, y) {
>>panel.xyplot(x, y)
>>panel.abline(h=0.01, col="red")
>>panel

Re: [R] still a problem remainingRE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

2011-03-09 Thread Umesh Rosyara
Lattice-experts:
Thank you to those who have responded earlier. I have not found a perfect
solution yet, although I have tried several ways. Unless a real lattice expert
steps up, I will leave it and look for alternatives. Sorry to send this again.
 
  

#Data

name <- c(paste ("M", 1:1000, sep = ""))
xvar <- seq(1, 1, 10)
chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
set.seed(134)
p <- rnorm(1000, 0.15,0.05)
dataf <- data.frame(name,xvar, chr, p)
dataf$chr <- as.factor(dataf$chr)

 

#subset data 

pvals <- dataf[dataf$p < 0.05,]

 

# unsuccessful commands 

xyplot(p ~ xvar|chr, data=dataf,
  panel=function(x, y, subscripts){

  panel.xyplot(x, y)

   panel.xyplot(pvals$xvar[subscripts],pvals$p[subscripts], pch=6)
  panel.abline(h=0.01, col="red")


  panel.text(pvals$xvar[subscripts], pvals$p[subscripts],
pvals$name[subscripts], col="green2")


  }, as.table=T, subscripts=T)

 

 
 
Best Regards
 
Umesh R 
 
 
 

  _  

From: Bert Gunter [mailto:gunter.ber...@gene.com] 
Sent: Tuesday, March 08, 2011 12:00 AM
To: Umesh Rosyara
Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing list
Subject: Re: still a problem remainingRE: [R] Data lebals xylattice plot:
RE: displaying label meeting condition (i.e. significant, i..e p value less
than 005) in plot function



As I believe I already told you in my original reply, you have to make
use of the subscripts argument in the panel function to subscript the
P values etc. vector to be plotted in each panel. Something like:
(untested)

   panel = function(x, y,subscripts,...) {
panel.xyplot(x, y,...)
panel.abline(h=0.01, col="red")
panel.text(xv1[subscripts], p1[subscripts],
n1[subscripts], col="green2")
   }


Also,in future, please send plain text email, as requested in the
guide. Your message was in an annoying blue font in my gmail reader.

Cheers,
Bert


On Mon, Mar 7, 2011 at 5:26 PM, Umesh Rosyara  wrote:
> Hi Lattice Users
>
> I have been working to fix this problem, still I am not able to solve
fully.
> I could label those names that have pvalue less than 0.01 but still the
> label appears in all compoent plots eventhough those who do have the
pvalue
> ! How can I implement it successuflly to grouped data like mine. You help
is
> highly appreciated.
>
> #my data
> name <- c(paste ("M", 1:1000, sep = ""))
> xvar <- seq(1, 1, 10)
> chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
> set.seed(134)
> p <- rnorm(1000, 0.15,0.05)
> dataf <- data.frame(name,xvar, chr, p)
> dataf$chr <- as.factor(dataf$chr)
>
> # lattice plot: As far as I can go now ! little progress but final push
> required !
> require(lattice)
> pvals <- dataf[dataf$p < 0.01,]
> p1 <- pvals$p
> n1 <- pvals$name
> xv1 <- pvals$xvar
> xyplot(p ~ xvar|chr, data=dataf,
>panel = function(x, y) {
>panel.xyplot(x, y)
>panel.abline(h=0.01, col="red")
>panel.text(xv1, p1, n1, col="green2")
>})
>
> Thank you in advance.
>
> Best Regards
>
> Umesh R
>
>
>
> 
> From: Bert Gunter [mailto:gunter.ber...@gene.com]
> Sent: Sunday, March 06, 2011 10:50 AM
> To: Umesh Rosyara
> Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing
list
> Subject: Re: [R] Data lebals xylattice plot: RE: displaying label meeting
> condition (i.e. significant, i..e p value less than 005) in plot function
>
> This is easy to do by specifying  xyplot's panel function. Assuming
> only one panel -- otherwise you need to pass the subscripts arguments
> to choose the values belonging to the panel -- somethings like:
>
> xyplot(y~x, pvals = pvals,..., ## pvals is your vector of small p
> values with e.g. NA's elsewhere
> panel = function(x,y, pvals,...) {
> panel.xyplot(...)
> panel.text((x,y, pvals,...)
> }  )
>
> This is obviously just a sketch and will not work as written. So
> please read the Help page on xyplot carefully and perhaps also
> Deepayan's book on trellis graphics -- there are also undoubtedly
> online resources: search on "trellis graphics tutorial" or some such.
> This is not hard, but there are some details that you will need to
> master,especially regarding argument passing.
>
> Another alternative is to use the layer() function in the latticeExtra
> package instead. Consult the documentation there for details.
>
> Cheers,
> Bert
>
>
>
> On Sun, Mar 6, 2011 at 5:17 AM, Umesh Rosyara  wrote:
>> Dear Jorge, Dennis,  Sarah and R-experts.
>>
>> Thank  for helping me. As you mentioned it i

[R] still a problem remainingRE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

2011-03-07 Thread Umesh Rosyara
Hi Lattice Users 
 
I have been working to fix this problem, but I am still not able to solve it
fully. I could label those names that have a p-value less than 0.01, but the
labels still appear in all component plots, even those that do not contain that
p-value! How can I implement this successfully for grouped data like mine? Your
help is highly appreciated.
 
#my data
name <- c(paste ("M", 1:1000, sep = ""))
xvar <- seq(1, 1, 10)
chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
set.seed(134)
p <- rnorm(1000, 0.15,0.05)
dataf <- data.frame(name,xvar, chr, p)
dataf$chr <- as.factor(dataf$chr)
 
# lattice plot: As far as I can go now ! little progress but final push
required ! 
require(lattice)
pvals <- dataf[dataf$p < 0.01,]
p1 <- pvals$p
n1 <- pvals$name
xv1 <- pvals$xvar
xyplot(p ~ xvar|chr, data=dataf,
   panel = function(x, y) {
   panel.xyplot(x, y)
   panel.abline(h=0.01, col="red")
   panel.text(xv1, p1, n1, col="green2")
   })
 
Thank you in advance. 
 
Best Regards
 
Umesh R 
 
 
 

  _  

From: Bert Gunter [mailto:gunter.ber...@gene.com] 
Sent: Sunday, March 06, 2011 10:50 AM
To: Umesh Rosyara
Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing list
Subject: Re: [R] Data lebals xylattice plot: RE: displaying label meeting
condition (i.e. significant, i..e p value less than 005) in plot function



This is easy to do by specifying  xyplot's panel function. Assuming
only one panel -- otherwise you need to pass the subscripts arguments
to choose the values belonging to the panel -- somethings like:

xyplot(y~x, pvals = pvals,..., ## pvals is your vector of small p
                               ## values with e.g. NA's elsewhere
panel = function(x,y, pvals,...) {
panel.xyplot(...)
panel.text(x,y, pvals,...)
}  )

This is obviously just a sketch and will not work as written. So
please read the Help page on xyplot carefully and perhaps also
Deepayan's book on trellis graphics -- there are also undoubtedly
online resources: search on "trellis graphics tutorial" or some such.
This is not hard, but there are some details that you will need to
master,especially regarding argument passing.

Another alternative is to use the layer() function in the latticeExtra
package instead. Consult the documentation there for details.

Cheers,
Bert



On Sun, Mar 6, 2011 at 5:17 AM, Umesh Rosyara  wrote:
> Dear Jorge, Dennis,  Sarah and R-experts.
>
> Thank  for helping me. As you mentioned it is difficult apply in lattice
in
> this situation.
>
> Unless, there is a possibility, I would try to use lattice. The major
reason
> toward this is- my ultimate solution might be better of in lattice as I
have
> a classificatory variable to make similar graph for each caterogory in the
> lattice graph. Lattice cleates nice stacked xyplots.
>
> p ~ xvar | chr # require plots by the factor  variable "chr"
>
> # with a classificatory variable
> name <- c(paste ("M", 1:1000, sep = ""))
> xvar <- seq(1, 10000, 10)
> chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
> set.seed(134)
> p <- rnorm(1000, 0.15,0.05)
> dataf <- data.frame(name,xvar, chr, p)
> dataf$chr <- as.factor(dataf$chr)
>
> # lattice plot: As far as I can go now !
> require(lattice)
> xyplot(p ~ xvar | chr, dataf)
>
>
> Best Regards
>
> Umesh R
>
>
>
>
>  _
>
> From: Jorge Ivan Velez [mailto:jorgeivanve...@gmail.com]
> Sent: Sunday, March 06, 2011 12:22 AM
> To: Umesh Rosyara
> Cc: R mailing list
> Subject: Re: [R] displaying label meeting condition (i.e. significant,
i..e
> p value less than 005) in plot function
>
>
> Hi Umesh,
>
>
> You can try something along the lines of:
>
>
> d <- dataf[dataf$p < 0.05, ]   # p < 0.05
> with(d, plot(xvar, p, col = 'white'))
> with(d, text(xvar, p, name, cex = .7))
>
> HTH,
> Jorge
>
>
>
> On Sat, Mar 5, 2011 at 12:29 PM, Umesh Rosyara <> wrote:
>
>
> Dear R users,
>
> Here is my problem:
>
> # example data
> name <- c(paste ("M", 1:1000, sep = ""))
> xvar <- seq(1, 10000, 10)
> set.seed(134)
> p <- rnorm(1000, 0.15,0.05)
> dataf <- data.frame(name,xvar, p)
> plot (dataf$xvar,p)
> abline(h=0.05)
>
> # I can know which observation number is less than 0.05
> which (dataf$p < 0.05)
> [1]  12  20  80 269 272 338 366 368 397 403 432 453 494 543 592 691 723
789
> 811
> [20] 854 891 931 955
>
> I want to display (label) corresponding names on the plot above:
> means that 12th observation M12, 20th observation M20 and so on. Please
note
> that I have names not in numerical sequience (rather different names),
just
> provided

[R] Lattice experts: RE: Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

2011-03-06 Thread Umesh Rosyara
Hi Bert and Lattice experts
 
Thank you for the suggestions. I am still reading the material you suggested
(Deepayan's book and the help pages). So far there is no silver lining on the
horizon. Here is my outline, which keeps changing:
 
require(lattice)
pvals <- which (dataf$p < 0.05)
xyplot(p ~ xvar|chr, data=dataf, pvals = pvals, col="green",pch=3,
fill.color="green", cex=1,
panel = function(x,y, pvals) {
panel.xyplot(x, y, pch=3, fill=fill)
panel.text((x,y, pvals)
}  )
 
# for new lattice plot experts, this was my data:
name <- c(paste ("M", 1:1000, sep = ""))
xvar <- seq(1, 10000, 10)
chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
set.seed(134)
p <- rnorm(1000, 0.15,0.05)
dataf <- data.frame(name,xvar, chr, p)
dataf$chr <- as.factor(dataf$chr)
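
A possible way to get the labels into each panel, offered as a sketch rather than the definitive answer: pass a label vector to xyplot() and index it with the subscripts argument inside the panel function, as Bert hinted. The argument name labs is made up for this example, and xvar is assumed to run 1, 11, 21, ... as in the other posts of this thread.

```r
library(lattice)

# Example data from the post; xvar is assumed to run 1, 11, 21, ...
name <- paste("M", 1:1000, sep = "")
xvar <- seq(1, 10000, by = 10)
chr  <- factor(rep(1:5, each = 200))
set.seed(134)
p <- rnorm(1000, 0.15, 0.05)
dataf <- data.frame(name, xvar, chr, p, stringsAsFactors = FALSE)

# One label per row: the name where p < 0.05, an empty string otherwise
labs <- ifelse(dataf$p < 0.05, dataf$name, "")

plt <- xyplot(p ~ xvar | chr, data = dataf, labs = labs,
              pch = 3, col = "green",
              panel = function(x, y, subscripts, labs, ...) {
                panel.xyplot(x, y, ...)
                panel.abline(h = 0.05, col = "red")
                # subscripts holds the row numbers shown in this panel
                panel.text(x, y, labels = labs[subscripts],
                           cex = 0.7, pos = 3)
              })
print(plt)
```

Because the label vector has one entry per row of dataf, labs[subscripts] lines up with the x and y values that lattice hands to each panel.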
 
 
Maybe I need some rest. Thank you for your suggestions.
 
Thanks;
 
 
Best Regards
 
Umesh R  
 
 
 
 

  _  

From: Bert Gunter [mailto:gunter.ber...@gene.com] 
Sent: Sunday, March 06, 2011 10:50 AM
To: Umesh Rosyara
Cc: Jorge Ivan Velez; Dennis Murphy; sarah.gos...@gmail.com; R mailing list
Subject: Re: [R] Data lebals xylattice plot: RE: displaying label meeting
condition (i.e. significant, i..e p value less than 005) in plot function



This is easy to do by specifying xyplot's panel function. Assuming
only one panel -- otherwise you need to pass the subscripts argument
to choose the values belonging to the panel -- something like:

xyplot(y ~ x, pvals = pvals, ...,  ## pvals is your vector of small p
                                   ## values with e.g. NA's elsewhere
       panel = function(x, y, pvals, ...) {
           panel.xyplot(x, y, ...)
           panel.text(x, y, pvals, ...)
       })

This is obviously just a sketch and will not work as written. So
please read the Help page on xyplot carefully and perhaps also
Deepayan's book on trellis graphics -- there are also undoubtedly
online resources: search on "trellis graphics tutorial" or some such.
This is not hard, but there are some details that you will need to
master,especially regarding argument passing.

Another alternative is to use the layer() function in the latticeExtra
package instead. Consult the documentation there for details.

Cheers,
Bert



On Sun, Mar 6, 2011 at 5:17 AM, Umesh Rosyara  wrote:
> Dear Jorge, Dennis,  Sarah and R-experts.
>
> Thank  for helping me. As you mentioned it is difficult apply in lattice
in
> this situation.
>
> Unless, there is a possibility, I would try to use lattice. The major
reason
> toward this is- my ultimate solution might be better of in lattice as I
have
> a classificatory variable to make similar graph for each caterogory in the
> lattice graph. Lattice cleates nice stacked xyplots.
>
> p ~ xvar | chr # require plots by the factor  variable "chr"
>
> # with a classificatory variable
> name <- c(paste ("M", 1:1000, sep = ""))
> xvar <- seq(1, 10000, 10)
> chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
> set.seed(134)
> p <- rnorm(1000, 0.15,0.05)
> dataf <- data.frame(name,xvar, chr, p)
> dataf$chr <- as.factor(dataf$chr)
>
> # lattice plot: As far as I can go now !
> require(lattice)
> xyplot(p ~ xvar | chr, dataf)
>
>
> Best Regards
>
> Umesh R
>
>
>
>
>  _
>
> From: Jorge Ivan Velez [mailto:jorgeivanve...@gmail.com]
> Sent: Sunday, March 06, 2011 12:22 AM
> To: Umesh Rosyara
> Cc: R mailing list
> Subject: Re: [R] displaying label meeting condition (i.e. significant,
i..e
> p value less than 005) in plot function
>
>
> Hi Umesh,
>
>
> You can try something along the lines of:
>
>
> d <- dataf[dataf$p < 0.05, ]   # p < 0.05
> with(d, plot(xvar, p, col = 'white'))
> with(d, text(xvar, p, name, cex = .7))
>
> HTH,
> Jorge
>
>
>
> On Sat, Mar 5, 2011 at 12:29 PM, Umesh Rosyara <> wrote:
>
>
> Dear R users,
>
> Here is my problem:
>
> # example data
> name <- c(paste ("M", 1:1000, sep = ""))
> xvar <- seq(1, 10000, 10)
> set.seed(134)
> p <- rnorm(1000, 0.15,0.05)
> dataf <- data.frame(name,xvar, p)
> plot (dataf$xvar,p)
> abline(h=0.05)
>
> # I can know which observation number is less than 0.05
> which (dataf$p < 0.05)
> [1]  12  20  80 269 272 338 366 368 397 403 432 453 494 543 592 691 723
789
> 811
> [20] 854 891 931 955
>
> I want to display (label) corresponding names on the plot above:
> means that 12th observation M12, 20th observation M20 and so on. Please
note
> that I have names not in numerical sequience (rather different names),
just
> provided for this example to create dataset easily.
>
> Thanks in advance
>
> Umesh R
>
>

[R] Data lebals xylattice plot: RE: displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

2011-03-06 Thread Umesh Rosyara
Dear Jorge, Dennis, Sarah and R-experts.
 
Thank you for helping me. As you mentioned, this is difficult to apply in
lattice.
 
Still, if it is at all possible, I would like to use lattice. The major reason
is that my ultimate solution might be better off in lattice, as I have a
classification variable and want a similar graph for each category.
Lattice creates nice stacked xyplots.
 
p ~ xvar | chr # require plots by the factor  variable "chr" 
 
# with a classificatory variable
name <- c(paste ("M", 1:1000, sep = ""))
xvar <- seq(1, 10000, 10)
chr <- c(rep(1,200),rep(2,200), rep(3,200), rep(4,200), rep(5,200))
set.seed(134)
p <- rnorm(1000, 0.15,0.05)
dataf <- data.frame(name,xvar, chr, p)
dataf$chr <- as.factor(dataf$chr)

# lattice plot: As far as I can go now !
require(lattice)
xyplot(p ~ xvar | chr, dataf)
 
 
Best Regards
 
Umesh R 
 
 
 

  _  

From: Jorge Ivan Velez [mailto:jorgeivanve...@gmail.com] 
Sent: Sunday, March 06, 2011 12:22 AM
To: Umesh Rosyara
Cc: R mailing list
Subject: Re: [R] displaying label meeting condition (i.e. significant, i..e
p value less than 005) in plot function


Hi Umesh, 


You can try something along the lines of:


d <- dataf[dataf$p < 0.05, ]   # p < 0.05
with(d, plot(xvar, p, col = 'white'))
with(d, text(xvar, p, name, cex = .7))
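
The lines above plot only the significant subset. If all points should stay visible and only the significant ones get a label, a variation along these lines might work. It is a sketch under the assumption that xvar runs 1, 11, 21, ... and that name is kept as character.

```r
# Example data as in the original post (xvar assumed to be 1, 11, 21, ...)
name <- paste("M", 1:1000, sep = "")
xvar <- seq(1, 10000, by = 10)
set.seed(134)
p <- rnorm(1000, 0.15, 0.05)
dataf <- data.frame(name, xvar, p, stringsAsFactors = FALSE)

# Logical index of the rows below the threshold
sig <- dataf$p < 0.05

# Draw every point, then label only the significant ones
plot(dataf$xvar, dataf$p, xlab = "xvar", ylab = "p")
abline(h = 0.05)
text(dataf$xvar[sig], dataf$p[sig], labels = dataf$name[sig],
     cex = 0.7, pos = 3)
```

The same logical index works unchanged for 50,000 rows, since text() is vectorized over its x, y and labels arguments.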

HTH,
Jorge



On Sat, Mar 5, 2011 at 12:29 PM, Umesh Rosyara <> wrote:


Dear R users,

Here is my problem:

# example data
name <- c(paste ("M", 1:1000, sep = ""))
xvar <- seq(1, 10000, 10)
set.seed(134)
p <- rnorm(1000, 0.15,0.05)
dataf <- data.frame(name,xvar, p)
plot (dataf$xvar,p)
abline(h=0.05)

# I can know which observation number is less than 0.05
which (dataf$p < 0.05)
[1]  12  20  80 269 272 338 366 368 397 403 432 453 494 543 592 691 723 789 811
[20] 854 891 931 955

I want to display (label) the corresponding names on the plot above:
that is, M12 for the 12th observation, M20 for the 20th, and so on. Please note
that my real names are not in numerical sequence (they are arbitrary names); I
used these only to create the example dataset easily.

Thanks in advance

Umesh R




[R] displaying label meeting condition (i.e. significant, i..e p value less than 005) in plot function

2011-03-05 Thread Umesh Rosyara
Dear R users,
 
Here is my problem:
 
# example data
name <- c(paste ("M", 1:1000, sep = ""))
xvar <- seq(1, 10000, 10)
set.seed(134)
p <- rnorm(1000, 0.15,0.05)
dataf <- data.frame(name,xvar, p)
plot (dataf$xvar,p)
abline(h=0.05)
 
# I can know which observation number is less than 0.05 
which (dataf$p < 0.05) 
[1]  12  20  80 269 272 338 366 368 397 403 432 453 494 543 592 691 723 789 811
[20] 854 891 931 955
 
I want to display (label) the corresponding names on the plot above:
that is, M12 for the 12th observation, M20 for the 20th, and so on. Please note
that my real names are not in numerical sequence (they are arbitrary names); I
used these only to create the example dataset easily.
 
Thanks in advance
 
Umesh R
 



[R] please help ! label selected data points in huge number of data points potentially as high as 50, 000 !

2011-03-05 Thread Umesh Rosyara
Dear All 
 
I am reposting because my problem is a real issue that I have been working on.
I know this might be simple to those who know how! Anyway, I need help!
 
Let me clarify my point. I have a huge number of data points plotted using either
the base plot function or xyplot in lattice (I prefer lattice).
  name xvar           p
1   M1    1 0.107983837
2   M2   11 0.209125624
3   M3   21 0.163959428
4   M4   31 0.132469859
5   M5   41 0.086095130
6   M6   51 0.180822010
7   M7   61 0.246619925
8   M8   71 0.147363687
9   M9   81 0.162663127

5000 observations  
 
I need to plot xvar (x variable) against p (y variable) using either plot() or
xyplot(), and I want to print the name label on the graph for those rows that
have p < 0.01 (meaning they are significant). With my limited R knowledge I can
use text(x, y, labels) to add the text manually, but I have a huge number of
data points (I provide just 5000 here, but it can potentially go up to 50,000).
So I want to display the names only for the observations (rows) whose p value is
below the threshold.
 
Here is my example dataset and my status:
name <- c(paste ("M", 1:5000, sep = ""))
xvar <- seq(1, 50000, 10)
set.seed(134)
p <- rnorm(5000, 0.15,0.05)
dataf <- data.frame(name,xvar, p)
 
# using lattice (my first preference)
require(lattice) 
xyplot(p ~ xvar, dataf)
 
# I want to display names for the following observations, which meet the
# requirement of p < 0.01.
which (dataf$p < 0.01) 
[1]  811  854 1636 1704 2148 2161 2244 3205 3268 4177 4564 4614 4639 4706
 
Thus significant observations are:
name  xvar p
811   M811  8101  0.0050637068
854   M854  8531 -0.0433901783
1636 M1636 16351 -0.0279014039
1704 M1704 17031  0.0029878335
2148 M2148 21471  0.0048898232
2161 M2161 21601 -0.0354130557
2244 M2244 22431  0.0003255200
3205 M3205 32041  0.0079758430
3268 M3268 32671  0.0012797145
4177 M4177 41761  0.0015487439
4564 M4564 45631  0.0024867152
4614 M4614 46131  0.0078381964
4639 M4639 46381 -0.0063151605
4706 M4706 47051  0.0032200517

I want the data point (8101, 0.0050637068) labelled with M811 in the plot, and
similarly for all of the significant points above. I do not want to label the
rest of the 5000 points, which do not have p < 0.01. I know I can add labels
manually in base graphics, e.g. text(8101, 0.0050637068, "M811"):
 
plot (dataf$xvar,p)
text (8101, 0.0050637068, "M811")
text (8531, -0.0433901783, "M854")
 
I need more automation to deal with as many as 50,000 observations. In practice
I do not know in advance how many there will be.
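
One way this could be automated in lattice for a single panel, given as a sketch only: compute the significant subset inside the panel function and hand it to panel.text(). It assumes xvar runs 1, 11, 21, ..., that there is no conditioning variable (so the panel sees the rows in data order), and the name threshold is introduced just for this example.

```r
library(lattice)

# Example data as in the post (xvar assumed to be 1, 11, 21, ...)
name <- paste("M", 1:5000, sep = "")
xvar <- seq(1, 50000, by = 10)
set.seed(134)
p <- rnorm(5000, 0.15, 0.05)
dataf <- data.frame(name, xvar, p, stringsAsFactors = FALSE)

threshold <- 0.01   # hypothetical name for the cut-off used in the post

plt <- xyplot(p ~ xvar, data = dataf,
              panel = function(x, y, ...) {
                panel.xyplot(x, y, ...)
                # With one panel and no subsetting, x and y follow the
                # row order of dataf, so the same index selects the names
                sig <- y < threshold
                panel.text(x[sig], y[sig], labels = dataf$name[sig],
                           pos = 3, cex = 0.7)
              })
print(plt)
```

With a conditioning variable (p ~ xvar | chr) the row order inside each panel differs, and the labels would have to be indexed through the subscripts argument instead.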
 
Your help is highly appreciated. Thank you;
 
Best Regards
 
Umesh R 
 
 
 

  _  

From: Umesh Rosyara [mailto:rosyar...@gmail.com] 
Sent: Saturday, March 05, 2011 12:30 PM
To: 'r-help@r-project.org'
Subject: displaying label meeting condition (i.e. significant, i..e p value
less than 005) in plot function 


Dear R users,
 
Here is my problem:
 
# example data
name <- c(paste ("M", 1:1000, sep = ""))
xvar <- seq(1, 10000, 10)
set.seed(134)
p <- rnorm(1000, 0.15,0.05)
dataf <- data.frame(name,xvar, p)
plot (dataf$xvar,p)
abline(h=0.05)
 
# I can know which observation number is less than 0.05 
which (dataf$p < 0.05) 
[1]  12  20  80 269 272 338 366 368 397 403 432 453 494 543 592 691 723 789
811
[20] 854 891 931 955
 
I want to display (label) corresponding names on the plot above: 
means that 12th observation M12, 20th observation M20 and so on. Please note
that I have names not in numerical sequience (rather different names), just
provided for this example to create dataset easily.
 
Thanks in advance
 
Umesh R
 



[R] thank you

2011-03-02 Thread Umesh Rosyara
 

Hi Dennis
 
I was able to solve my problem. Thank you for your encouragement and time.
 
n <- 7
 
newvars <- c(paste('m', rep(1:n, each = 4), rep(c('a', 'b')), rep(c('p1',
'p2'), each = 2), sep = ''))
newvars
 [1] "m1ap1" "m1bp1" "m1ap2" "m1bp2" "m2ap1" "m2bp1" "m2ap2" "m2bp2" "m3ap1"
[10] "m3bp1" "m3ap2" "m3bp2" "m4ap1" "m4bp1" "m4ap2" "m4bp2" "m5ap1" "m5bp1"
[19] "m5ap2" "m5bp2" "m6ap1" "m6bp1" "m6ap2" "m6bp2" "m7ap1" "m7bp1" "m7ap2"
[28] "m7bp2"
 
Umesh R




Re: [R] stuk at another point: simple question

2011-02-28 Thread Umesh Rosyara
Dear All
 
I now realize that it is not simple to deal with real-world problems!
 
This is what I tried, without any success:
 
a <- seq(1, nvar, by = 2)
b <- seq(2, nvar, by = 2)
 
#df2 <- transform(df2, ima1p1 = df2$x1[df2$Parent1],  # Parent 1's allele 1
#                 ima2p1 = df2$x2[df2$Parent1],       # Parent 1's allele 2
#                 ima1p2 = df2$x1[df,                 # Parent 2's allele 1
#                 ima2p2 = df2$x2[df2$Parent2])       # Parent 2's allele 2

out <- lapply(1:nmark, function(ind){
  n <- nvar/2
  transform(df2, ima1p1 = df2[, a[ind]][df$Parent1],   # Parent 1's allele 1
            ima2p1 = df2[, b[ind]][df2$Parent1],        # Parent 1's allele 2
            ima1p2 = df2[, a[ind]][df2$Parent2],        # Parent 2's allele 1
            ima2p2 = df2[, a[ind]][df2$Parent2])}       # Parent 2's allele 2
 
I could not go any further because I already had an error! I am particularly
confused about how to apply the index in df2$Parent1 or df2$Parent2.
 
Please help.
 
Thank you;
 
Umesh R 
 
 
 
 

  _  

From: Umesh Rosyara [mailto:rosyar...@gmail.com] 
Sent: Monday, February 28, 2011 8:01 AM
To: 'Dennis Murphy'
Cc: 'r-help@r-project.org'
Subject: stuk at another point: simple question 


Dear R-community members. 
 
I really appreciate the R-help group. Dennis has been extremely helpful in
solving some of my questions. I am following Dennis's recommendation in the
email below, yet I am stuck at another point (I hope this will take me to
the end of this project).
 
Ind <- c(1:5)
Parent1 <- c(NA,NA,1,1,3)
Parent2 <- c(NA,NA,2,2,4)
y <- c(6,5,8,10,7)
M1a <- c(1,2,1,1,1)
M1b <- c(1,2,2,2,1)
M2a <- c(3,3,1,1,3)
M2b <- c(1,1,3,3,3)
M3a <- c(4,4,4,4,4)
M3b <- c(4,4,1,1,4)
M4a <- c(1,4,4,1,4)
M4b <- c(4,4,4,4,4)
 
dataf <- data.frame (Ind, Parent1, Parent2, y, M1a, M1b,M2a,M2b,
M3a,M3b,M4a, M4b) # I have more than >1000 variables pair 
 
# pair1 (M1a,M1b) pair2 (M2a, M2b), pair3 (M3a, M3b)... 
 
df2 <- transform(dataf,m1ap1 = dataf$M1a[dataf$Parent1],  
   m1bp1 = dataf$M1b[dataf$Parent1], 
   m1ap2 = dataf$M1a[dataf$Parent2],
   m1bp2 = dataf$M1b[dataf$Parent2])   

# downstream calculations  
 hP1 <- ifelse(df2$m1ap1==df2$m1bp1,0,1)
 hP2 <- ifelse(df2$m1bp2==df2$m1bp2,0,1)
 t1 <- ifelse(df2$M1a==df2$m1ap1,1,0) 
 t2 <- ifelse(df2$M1b==df2$m1ap2,1,0)
 C <- (hP1*(t1-0.25)+ hP2 *(t2-0.25))
 yv <- df2$y 
 Cy <- C*yv 
 avgCy <- mean(Cy, na.rm=T)
 avgCy # I want to store this value to new dataframe with first model i.e. 
 
 
How can I loop the process to output the second pair( here M2a, M2b), third
pair (here M3a, M3b) to all pairs (I have more than 1000)
 
Mode1  avgCy
1   1.75  # from pair M1a and M1b
2 # from pair M2a and M2b
3 # from pair M3a and M3b
4 # from pair M4a and M4b
 
to the end of the file
 
Thank you in advance 
 
Umesh R 
 
  _  

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Friday, February 18, 2011 12:28 AM
To: Umesh Rosyara
Cc: r-help@r-project.org
Subject: Re: [R] recoding a data in different way: please help


Hi:

This is as far as I could get:

df <- read.table(textConnection("
 Individual  Parent1  Parent2  mark1  mark2
 1           0        0        12     11
 2           0        0        11     22
 3           0        0        13     22
 4           0        0        13     11
 5           1        2        11     12
 6           1        2        12     12
 7           3        4        11     12
 8           3        4        13     12
 9           1        4        11     12
 10          1        4        11     12"), header = TRUE)
df2 <- transform(df, Parent1 = replace(Parent1, Parent1 == 0, NA),
                 Parent2 = replace(Parent2, Parent2 == 0, NA))
df2 <- transform(df2, imark1p1 = df2$mark1[df2$Parent1],   # Parent 1's mark1
                 imark1p2 = df2$mark1[df2$Parent2],        # Parent 2's mark1
                 imark2p1 = df2$mark2[df2$Parent1],        # Parent 1's mark2
                 imark2p2 = df2$mark2[df2$Parent2])        # Parent 2's mark2

I created df2 so as not to overwrite the original in case of a mistake. At
this point, you have several sets of vectors that you can compare; e.g.,
mark1 with imark1p1 and imark1p2. Like Josh, I couldn't make heads or tails
out of what these logical tests were meant to output, but perhaps this gives
you a broader template with which to work. At this point, you can probably
remove the rows corresponding to the parents. I believe ifelse() is your
friend here - it can perform logical tests in a vectorized fashion. As long
as the tests are consistent from one individual to the next, it's likely to
be an efficient route.

HTH,
Dennis


On Thu, Feb 17, 2011 at 6:21 PM, Umesh

[R] stuk at another point: simple question

2011-02-28 Thread Umesh Rosyara
Dear R-community members. 
 
I really appreciate the R-help group. Dennis has been extremely helpful in
solving some of my questions. I am following Dennis's recommendation in the
email below, yet I am stuck at another point (I hope this will take me to
the end of this project).
 
Ind <- c(1:5)
Parent1 <- c(NA,NA,1,1,3)
Parent2 <- c(NA,NA,2,2,4)
y <- c(6,5,8,10,7)
M1a <- c(1,2,1,1,1)
M1b <- c(1,2,2,2,1)
M2a <- c(3,3,1,1,3)
M2b <- c(1,1,3,3,3)
M3a <- c(4,4,4,4,4)
M3b <- c(4,4,1,1,4)
M4a <- c(1,4,4,1,4)
M4b <- c(4,4,4,4,4)
 
dataf <- data.frame (Ind, Parent1, Parent2, y, M1a, M1b,M2a,M2b,
M3a,M3b,M4a, M4b) # I have more than >1000 variables pair 
 
# pair1 (M1a,M1b) pair2 (M2a, M2b), pair3 (M3a, M3b)... 
 
df2 <- transform(dataf,m1ap1 = dataf$M1a[dataf$Parent1],  
   m1bp1 = dataf$M1b[dataf$Parent1], 
   m1ap2 = dataf$M1a[dataf$Parent2],
   m1bp2 = dataf$M1b[dataf$Parent2])   

# downstream calculations  
 hP1 <- ifelse(df2$m1ap1==df2$m1bp1,0,1)
 hP2 <- ifelse(df2$m1bp2==df2$m1bp2,0,1)
 t1 <- ifelse(df2$M1a==df2$m1ap1,1,0) 
 t2 <- ifelse(df2$M1b==df2$m1ap2,1,0)
 C <- (hP1*(t1-0.25)+ hP2 *(t2-0.25))
 yv <- df2$y 
 Cy <- C*yv 
 avgCy <- mean(Cy, na.rm=T)
 avgCy # I want to store this value to new dataframe with first model i.e. 
 
 
How can I loop the process to output the second pair( here M2a, M2b), third
pair (here M3a, M3b) to all pairs (I have more than 1000)
 
Model  avgCy
1   1.75  # from pair M1a and M1b
2 # from pair M2a and M2b
3 # from pair M3a and M3b
4 # from pair M4a and M4b
 
to the end of the file
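
A rough sketch of how the per-pair calculation might be looped: wrap the steps in a function that takes the two allele columns of one marker, then apply it over all (a, b) column pairs. The helper name avg_cy and the assumption that the marker columns start at column 5 and alternate a, b are made up for this example. Note also one deliberate difference from the posted code: the post compares m1bp2 with itself, which makes hP2 always 0 and gives the 1.75 shown above; the sketch assumes the comparison of parent 2's two alleles was intended, so its value for pair 1 differs.

```r
dataf <- data.frame(
  Ind = 1:5,
  Parent1 = c(NA, NA, 1, 1, 3),
  Parent2 = c(NA, NA, 2, 2, 4),
  y = c(6, 5, 8, 10, 7),
  M1a = c(1, 2, 1, 1, 1), M1b = c(1, 2, 2, 2, 1),
  M2a = c(3, 3, 1, 1, 3), M2b = c(1, 1, 3, 3, 3),
  M3a = c(4, 4, 4, 4, 4), M3b = c(4, 4, 1, 1, 4),
  M4a = c(1, 4, 4, 1, 4), M4b = c(4, 4, 4, 4, 4)
)

# Hypothetical helper: average of C * y for one marker pair (a, b)
avg_cy <- function(a, b, parent1, parent2, y) {
  ap1 <- a[parent1]; bp1 <- b[parent1]   # parent 1's two alleles
  ap2 <- a[parent2]; bp2 <- b[parent2]   # parent 2's two alleles
  hP1 <- ifelse(ap1 == bp1, 0, 1)        # 1 if parent 1 is heterozygous
  hP2 <- ifelse(ap2 == bp2, 0, 1)        # ASSUMED intent; the post has bp2 == bp2
  t1 <- ifelse(a == ap1, 1, 0)
  t2 <- ifelse(b == ap2, 1, 0)
  C <- hP1 * (t1 - 0.25) + hP2 * (t2 - 0.25)
  mean(C * y, na.rm = TRUE)
}

# Marker columns are assumed to follow the first four columns, as a/b pairs
markers <- names(dataf)[-(1:4)]
a_cols <- markers[seq(1, length(markers), by = 2)]
b_cols <- markers[seq(2, length(markers), by = 2)]

result <- data.frame(
  Model = seq_along(a_cols),
  avgCy = mapply(function(a, b) {
    avg_cy(dataf[[a]], dataf[[b]], dataf$Parent1, dataf$Parent2, dataf$y)
  }, a_cols, b_cols),
  row.names = NULL
)
result
```

Since the function only sees two columns at a time, the same loop should extend to a thousand pairs without building the intermediate m1ap1-style columns for each one.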
 
Thank you in advance 
 
Umesh R 
 
  _  

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Friday, February 18, 2011 12:28 AM
To: Umesh Rosyara
Cc: r-help@r-project.org
Subject: Re: [R] recoding a data in different way: please help


Hi:

This is as far as I could get:

df <- read.table(textConnection("
 Individual  Parent1  Parent2  mark1  mark2
 1           0        0        12     11
 2           0        0        11     22
 3           0        0        13     22
 4           0        0        13     11
 5           1        2        11     12
 6           1        2        12     12
 7           3        4        11     12
 8           3        4        13     12
 9           1        4        11     12
 10          1        4        11     12"), header = TRUE)
df2 <- transform(df, Parent1 = replace(Parent1, Parent1 == 0, NA),
                 Parent2 = replace(Parent2, Parent2 == 0, NA))
df2 <- transform(df2, imark1p1 = df2$mark1[df2$Parent1],   # Parent 1's mark1
                 imark1p2 = df2$mark1[df2$Parent2],        # Parent 2's mark1
                 imark2p1 = df2$mark2[df2$Parent1],        # Parent 1's mark2
                 imark2p2 = df2$mark2[df2$Parent2])        # Parent 2's mark2

I created df2 so as not to overwrite the original in case of a mistake. At
this point, you have several sets of vectors that you can compare; e.g.,
mark1 with imark1p1 and imark1p2. Like Josh, I couldn't make heads or tails
out of what these logical tests were meant to output, but perhaps this gives
you a broader template with which to work. At this point, you can probably
remove the rows corresponding to the parents. I believe ifelse() is your
friend here - it can perform logical tests in a vectorized fashion. As long
as the tests are consistent from one individual to the next, it's likely to
be an efficient route.

HTH,
Dennis


On Thu, Feb 17, 2011 at 6:21 PM, Umesh Rosyara  wrote:


Dear R users

The following question looks simple, but I have spent a lot of time trying to
solve it. I would highly appreciate your help.

I have following dataset from family dataset :

Here we have individuals and their two parents and their marker scores
(marker1, marker2,and so on). 0 means that their parent information not
available.


Individual  Parent1  Parent2  mark1  mark2
1           0        0        12     11
2           0        0        11     22
3           0        0        13     22
4           0        0        13     11
5           1        2        11     12
6           1        2        12     12
7           3        4        11     12
8           3        4        13     12
9           1        4        11     12
10          1        4        11     12

I want to recode mark1 and other mark2.and so on column by looking
indvidual parent (Parent1 and Parent2).

For example

Take case of Individual 5, who's Parent 1 is 1 (has mark1 score 12) and
Parent 2 is 2 (has mark1 score 11). Individual 5 has mark1 score 11. Suppose
I have following condition to recode Individual 5's mark1 score:

For mark1 variable, If Parent1 score "11" and Parent2 score "22" and recode
indvidual 5's score, "12"=1, else 0
   If Parent1 score "12" and Parent2 score
"

Re: [R] help please ..simple question regarding output the p-value inside a function and lm

2011-02-26 Thread Umesh Rosyara
Hi Jorge and R users 
 
Thank you so much for the responses. Your input helped me a lot and can
potentially help me solve one more problem, but I got an error message.
I am sorry to ask again, but it would be great if you could spot my problem at
a quick look. I hope this will not cost a lot of your time, as it is
based on your idea.
 
# Just data 
X1 <- c(1,3,4,2,2)
X2 <- c(2,1,3,1,2)
X3 <- c(4,3,2,1,1)
X4<- c(1,1,1,2,3)
X5 <- c(3,2,1,1,2)
X6 <- c(1,1,2,2,3)
odataframe <- data.frame(X1,X2,X3,X4,X5,X6)
 
My objective here is to sort the values within each pair of variables (X1 and X2,
X3 and X4, X5 and X6, and so on) in such a way that the second column of the
pair is always higher than the first one (X2 > X1, X4 > X3, X6 > X5, and so
on).
 
Here is my attempt: 
nmrk <- 3
nvar <- 2*nmrk 
lapply(1:nvar, function(ind){
# indices for the variables we need
 a <- seq(1, nvar, by = 2)
 b <- seq(2, nvar, by = 2)
# shorting column
tx[, a[ind]] = ifelse(odataframe[, a[ind]] < odataframe[,b[ind]],
odataframe[, a[ind]], odataframe[, b[ind]])
tx[, b[ind]] = ifelse(odataframe[, b[ind]] > dataframe[,a[ind]],
odataframe[,b[ind]], odataframe[,a[ind]])
df1 <- transform( odataframe, odataframe[, a[ind]]= tx[, a[ind]],
odataframe[, b[ind]]= tx[, b[ind]]))
}
 
I got the following error: 
Error:
Error: unexpected '=' in:
"tx[, b[ind]] = ifelse(odataframe[, b[ind]] > dataframe[,a[ind]],
odataframe[,b[ind]], odataframe[,a[ind]])
df1 <- transform( odataframe, odataframe[, a[ind]]="
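
For what it is worth, the pairwise sorting described above might be done without transform() at all, by taking the element-wise minimum and maximum of each pair. This is only a sketch: the name sorted is made up here, and ties (equal values in a pair) end up equal in both columns, so the result is X2 >= X1 rather than strictly greater.

```r
X1 <- c(1, 3, 4, 2, 2)
X2 <- c(2, 1, 3, 1, 2)
X3 <- c(4, 3, 2, 1, 1)
X4 <- c(1, 1, 1, 2, 3)
X5 <- c(3, 2, 1, 1, 2)
X6 <- c(1, 1, 2, 2, 3)
odataframe <- data.frame(X1, X2, X3, X4, X5, X6)

# Column indices of the first and second member of each pair
a <- seq(1, ncol(odataframe), by = 2)
b <- seq(2, ncol(odataframe), by = 2)

sorted <- odataframe
for (k in seq_along(a)) {
  lo <- pmin(odataframe[[a[k]]], odataframe[[b[k]]])  # smaller value per row
  hi <- pmax(odataframe[[a[k]]], odataframe[[b[k]]])  # larger value per row
  sorted[[a[k]]] <- lo
  sorted[[b[k]]] <- hi
}
sorted
```

Both pmin() and pmax() read from the untouched odataframe, so overwriting the first column of a pair cannot disturb the computation for the second.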
 
Thanks;
Umesh R 




[R] help please ..simple question regarding output the p-value inside a function and lm

2011-02-25 Thread Umesh Rosyara
Dear R community members and R experts 

 

I am stuck at a point; I tried with my colleagues and we could not figure it out.
Sorry, I need your help.

 

Here my data (just created to show the example):

 

# generating a dataset just to show what my dataset looks like; here I have x
# variables x1 to x1000 plus ind and y

ind <- c(1:100)

y <- rnorm(100, 10,2)

set.seed(201)

P <- vector()

dataf1 <- as.data.frame(matrix(rep(NA, 100000), nrow=100))

dataf <- data.frame (dataf1, ind,y)

names(dataf) <- (c(paste("x",1:1000, sep=""),"ind", "y"))

for(i in 1:1000) {

dataf[,i] <- rnorm(100)

}

 

# my intention was to fit models in the following fashion:

y ~ x1 + x2, y ~ x3 + x4, y ~ x5 + x6, ..., y ~ x999 + x1000 (to the end of the
dataframe)

 

# please note that I want to avoid fitting y ~ x2 + x3 or y ~ x4 + x5 (that is,
# I am selecting two x variables at a time, to the end)

# question: how can I do this inside a user function like the one I worked out
# below?

 

 

# defining function for lm model 

mylm <- function (mydata,nvar) {

y <- NULL

P1 <- vector (mode="numeric", length = nvar)

P2 <- vector (mode="numeric", length = nvar)

for(i in 1: nvar) {

print(P1[i] <- summary(lm(mydata$y ~   mydata[,i]) +
mydata[,i+1]$coefficients[2,4]))

print(P2[i] <- summary(lm(mydata$y ~   mydata[,i]) +
mydata[,i+1]$coefficients[2,5]))

print(plot(nvar, P1))

print(plot(nvar, P2))

}

} 

 

# applying the function to mydata 

mylm (dataf, 1000)

 

Does not work?? The following is the error message: 

Error in model.frame.default(formula = mydata$y ~ mydata[, i],
drop.unused.levels = TRUE) : 

  invalid type (NULL) for variable 'mydata$y'

 

Please help !

 

Thanks;

 

Umesh R




Re: [R] simple recoding problem, but a trouble !

2011-02-19 Thread Umesh Rosyara
Thank you David 
 
I was able to create dataframe and  restore names with the following:
 
dfr1 <- data.frame(t( apply(dfr, 1, func) ))

names(dfr1) <- c("marker1a","marker1b", "marker2a", "marker2b" ,"marker3a",
"marker3b")
Still, I wonder if there is an easier way to restore the names; in situations
where there are thousands of variables, typing out the list as above might be
tedious.
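
One possible shortcut, sketched under the assumption that every original column splits into exactly two allele columns in the order produced by David's function: build the new names from the old ones with rep() and paste() instead of typing them.

```r
marker1 <- c("AA", "AC", "CC", "CC", "AC", "AC")
marker2 <- c("AA", "AC", "CC", "CC", "AC", "AC")
dfr <- data.frame(marker1, marker2, stringsAsFactors = FALSE)

# David's function: split each genotype into letters and code A = 1, C = 3
func <- function(x) {
  sapply(strsplit(x, ""), match, table = c("A", NA, "C"))
}
dfr1 <- data.frame(t(apply(dfr, 1, func)))

# Each original name twice, with suffixes a and b recycled along the result
names(dfr1) <- paste(rep(names(dfr), each = 2), c("a", "b"), sep = "")
dfr1
```

This relies on the output columns coming out as marker 1 allele 1, marker 1 allele 2, marker 2 allele 1, and so on, which is how apply() flattens the small matrix returned for each row.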
Thank you for solving my problem. I appreciate it.
Umesh R 
  _  

From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Saturday, February 19, 2011 10:28 AM
To: Umesh Rosyara
Cc: 'Joshua Wiley'; r-help@r-project.org
Subject: Re: [R] simple recoding problem, but a trouble !




On Feb 19, 2011, at 8:40 AM, Umesh Rosyara wrote:

> Just a correction. My expected outdata frame was somehow distorted 
> to a
> single, one column. So correct one is:
>
> marker1a   markerb marker2amarker2b  
> 1  1   1   1 
> 1  3   1   3 
> 3  3   3   3 
> 3  3   3   3 
> 1  3   1   3 
> 1  3   1   3 


func <- function(x) {sapply( strsplit(x, ""),
 match, table= c("A", NA, "C"))}
t( apply(dfr, 1, func) )

     [,1] [,2] [,3] [,4]
[1,]    1    1    1    1
[2,]    1    3    1    3
[3,]    3    3    3    3
[4,]    3    3    3    3
[5,]    1    3    1    3
[6,]    1    3    1    3


It's a matrix rather than a dataframe and doesn't have colnames, but
that should be trivial to fix.

>
> Thanks;
>
> Umesh R
>
>  _
>
> From: Umesh Rosyara [mailto:rosyar...@gmail.com]
> Sent: Friday, February 18, 2011 10:09 PM
> To: 'Joshua Wiley'
> Cc: 'r-help@r-project.org'
> Subject: RE: [R] recoding a data in different way: please help
>
>
> Hi Josh and R community members
>
> Thank you for quick response. I am impressed with the help.
>
> To solve my problems, I tried recode options and I had the following 
> problem
> and which motivated me to leave it. Thank you for remind me the option
> again, might help to solve my problem in different way.
>
> marker1 <- c("AA", "AC", "CC", "CC", "AC", "AC")
>
> marker2 <- c("AA", "AC", "CC", "CC", "AC", "AC")
>
> dfr <- data.frame(cbind(marker1, marker2))
>
> Objective: replace A with 1, C with 3, and split AA into 1 1 (two 
> columns
> numeric). So the intended output for the above dataframe is:
>
>
>
> marker1a
> markerb
> marker2a
> marker2b
>
> 1
> 1
> 1
> 1
>
> 1
> 3
> 1
> 3
>
> 3
> 3
> 3
> 3
>
> 3
> 3
> 3
> 3
>
> 1
> 3
> 1
> 3
>
> 1
> 3
> 1
> 3
>
> I tried the following:
>
> for(i in 1:length(dfr))
>   {
> dfr[[i]]=recode (dfr[[i]],"c('AA')= '1,1'; c('AC')= '1,3'; 
> c('CA')=
> '1,3';  c('CC')= '3,3' ")
> }
>
> write.table(dfr,"dfr.out", sep=" ,", col.names = T)
> dfn=read.table("dfr.out",header=T, sep="," )
>
> # just trying to cheat R, unfortunately the marker1 and marker columns
> remained non-numeric, even when opened in excel !!
>
>
> Unfortunately I got the following result !
>
>   marker1 marker2
> 1 1,1  1,1
> 2 1,2  1,2
> 3 2,2  2,2
> 4 2,2  2,2
> 5 1,2  1,2
> 6 1,2  1,2
>
>
> Sorry to bother all of you, but simple things are being complicated 
> these
> days to me.
>
> Thank you so much
> Umesh R
>
>
>  _
>
> From: Joshua Wiley [mailto:jwiley.ps...@gmail.com]
> Sent: Friday, February 18, 2011 12:15 AM
> Cc: r-help@r-project.org
> Subject: Re: [R] recoding a data in different way: please help
>
>
>
> Dear Umesh,
>
> I could not figure out exactly what your recoding scheme was, so I do
> not have a specific solution for you.  That said, the following
> functions may help you get started.
>
> ?ifelse # vectorized and different from using if () statements
> ?if #
> ?Logic ## logical operators for your tests
> ## if you install and load the "car" package by John Fox
> ?recode # a function for recoding in package "car"
>
> I am sure it is possible to string together some massive series of if
> statements and then use a for loop, but that is probably the messiest
> and slowest possible way.  I suspect there will be faster, neater
> options, but I cannot say for certain wit

[R] simple recoding problem, but a trouble !

2011-02-19 Thread Umesh Rosyara
Just a correction. My expected outdata frame was somehow distorted to a
single, one column. So correct one is:
 
marker1a  marker1b  marker2a  marker2b
1         1         1         1
1         3         1         3
3         3         3         3
3         3         3         3
1         3         1         3
1         3         1         3
 
Thanks;
 
Umesh R 
 
Re: [R] recoding a data in different way: please help

2011-02-18 Thread Umesh Rosyara
Hi Josh and R community members 
 
Thank you for the quick response. I am impressed with the help. 
 
To solve my problem I had already tried the recode option, but I ran into the
following problem, which made me give up on it. Thank you for reminding me of
that option again; it might help me solve my problem in a different way. 
 
marker1 <- c("AA", "AC", "CC", "CC", "AC", "AC")

marker2 <- c("AA", "AC", "CC", "CC", "AC", "AC")

dfr <- data.frame(cbind(marker1, marker2))

Objective: replace A with 1 and C with 3, and split each genotype into two
numeric columns (for example, AA becomes 1 1). So the intended output for the
above data frame is:   



marker1a
markerb
marker2a
marker2b

1
1
1
1

1
3
1
3

3
3
3
3

3
3
3
3

1
3
1
3

1
3
1
3

I tried the following: 

library(car)   # provides recode()

for (i in 1:length(dfr))
   {
 dfr[[i]] = recode(dfr[[i]], "c('AA')='1,1'; c('AC')='1,3'; c('CA')='1,3'; c('CC')='3,3'")
}

write.table(dfr, "dfr.out", sep = " ,", col.names = TRUE)
dfn = read.table("dfr.out", header = TRUE, sep = ",")

# Just trying to cheat R; unfortunately the marker1 and marker2 columns
# remained non-numeric, even when opened in Excel!


Unfortunately I got the following result ! 

   marker1 marker2
1 1,1  1,1
2 1,2  1,2
3 2,2  2,2
4 2,2  2,2
5 1,2  1,2
6 1,2  1,2

 
Sorry to bother all of you, but simple things seem complicated to me these
days. 
 
Thank you so much
Umesh R 
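
A possible way to reach the intended four numeric columns without the round
trip through a file is sketched below. It is only an illustration, not the
approach anyone in this thread used. The round trip probably fails because
write.table() quotes character and factor values by default, so "1,1" is read
back as one field. The sketch instead takes each letter of the genotype with
substr() and looks it up in a small named vector. The names allele_code,
split_marker and dfn are assumptions made for this example; the coding
A = 1, C = 3 is taken from the objective stated above.

```r
marker1 <- c("AA", "AC", "CC", "CC", "AC", "AC")
marker2 <- c("AA", "AC", "CC", "CC", "AC", "AC")
dfr <- data.frame(marker1, marker2, stringsAsFactors = FALSE)

# Lookup from allele letter to numeric code (A = 1, C = 3, as in the objective)
allele_code <- c(A = 1, C = 3)

# Hypothetical helper: split one two-letter genotype column into two
# numeric columns named <name>a and <name>b
split_marker <- function(x, name) {
  x <- as.character(x)                       # also works if x is a factor
  out <- data.frame(unname(allele_code[substr(x, 1, 1)]),
                    unname(allele_code[substr(x, 2, 2)]))
  names(out) <- paste0(name, c("a", "b"))
  out
}

# Apply the helper to every column and bind the pieces side by side;
# unname() keeps cbind() from prefixing the column names
dfn <- do.call(cbind, unname(Map(split_marker, dfr, names(dfr))))
str(dfn)
```

A genotype letter that is missing from allele_code would come out as NA, so
any further alleles would need to be added to that vector first.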

 
  _  

From: Joshua Wiley [mailto:jwiley.ps...@gmail.com] 
Sent: Friday, February 18, 2011 12:15 AM
Cc: r-help@r-project.org
Subject: Re: [R] recoding a data in different way: please help



Dear Umesh,

I could not figure out exactly what your recoding scheme was, so I do
not have a specific solution for you.  That said, the following
functions may help you get started.

?ifelse # vectorized and different from using if () statements
?"if" # control flow; reserved words must be quoted in a help call
?Logic ## logical operators for your tests
## if you install and load the "car" package by John Fox
?recode # a function for recoding in package "car"

I am sure it is possible to string together some massive series of if
statements and then use a for loop, but that is probably the messiest
and slowest possible way.  I suspect there will be faster, neater
options, but I cannot say for certain without having a better feel for
how all the conditions work.

Best regards,

Josh




--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/ 



Re: [R] recoding a data in different way: please help

2011-02-18 Thread Umesh Rosyara
Hi Dennis 
 
Thank you so much; it helped me go a step ahead. Regarding the comparisons,
here is what I want to do. 
 
If imarkP1 = 22, imarkP2 = 11, and mark1 = 12, then mark1 should be coded as 1
   (all three conditions must be satisfied to get a code of "1").
If imarkP1 = 22, imarkP2 = 11, and mark1 = 22, then mark1 should be coded as 2.
If imarkP1 = 22, imarkP2 = 11, and mark1 = 11, then mark1 should be coded as 0
   (this case goes to the "else").
 
If imarkP1 = 33, imarkP2 = 14, and mark1 = 13, then mark1 should be coded as 0.
If imarkP1 = 33, imarkP2 = 14, and mark1 = 34, then mark1 should be coded as 1.
 
I have more such conditions. 
 
I tried the following for the first condition listed above, but could not get
the result I want. I do not know what is wrong. 
 
ifelse(imarkP1==22|imarkP2==11|mark1==12,1,0)
 
I could not go any further.
 
Thank you so much for the help.
 
Best regards;
 
Umesh R 
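
A likely reason the attempt above does not behave as intended is the operator:
"|" means OR, so the test is TRUE as soon as any one of the three comparisons
holds, whereas the rule requires all three, which is "&". The sketch below
shows the difference on toy vectors. The vector names follow the message
above, but the values are made up for illustration, and the nested call covers
only the rules listed in the message.

```r
# Toy vectors; the values are invented for this example
imarkP1 <- c(22, 22, 22, 33, 33)
imarkP2 <- c(11, 11, 11, 14, 14)
mark1   <- c(12, 22, 11, 13, 34)

# "|" is TRUE when ANY test holds, so every row with imarkP1 == 22 gets a 1
with_or <- ifelse(imarkP1 == 22 | imarkP2 == 11 | mark1 == 12, 1, 0)

# "&" is TRUE only when ALL tests hold, which is what the first rule asks for
with_and <- ifelse(imarkP1 == 22 & imarkP2 == 11 & mark1 == 12, 1, 0)

# Several rules can be chained by nesting ifelse(); the final 0 is the "else"
recoded <- ifelse(imarkP1 == 22 & imarkP2 == 11 & mark1 == 12, 1,
           ifelse(imarkP1 == 22 & imarkP2 == 11 & mark1 == 22, 2,
           ifelse(imarkP1 == 33 & imarkP2 == 14 & mark1 == 34, 1, 0)))

print(rbind(with_or, with_and, recoded))
```

With many rules the nesting becomes hard to read, so a lookup table of rules
may be easier to maintain than a long chain of ifelse() calls.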
 
 
 

  _  

From: Dennis Murphy [mailto:djmu...@gmail.com] 
Sent: Friday, February 18, 2011 12:28 AM
Cc: r-help@r-project.org
Subject: Re: [R] recoding a data in different way: please help


Hi:

This is as far as I could get:

df <- read.table(textConnection("
 Individual  Parent1  Parent2  mark1  mark2
          1        0        0     12     11
          2        0        0     11     22
          3        0        0     13     22
          4        0        0     13     11
          5        1        2     11     12
          6        1        2     12     12
          7        3        4     11     12
          8        3        4     13     12
          9        1        4     11     12
         10        1        4     11     12"), header = TRUE)
# A parent ID of 0 means "unknown", so turn it into NA
df2 <- transform(df, Parent1 = replace(Parent1, Parent1 == 0, NA),
                     Parent2 = replace(Parent2, Parent2 == 0, NA))
# Look up each parent's scores by row position (the IDs run 1, 2, ..., n)
df2 <- transform(df2, imark1p1 = df2$mark1[df2$Parent1],   # Parent 1's mark1
                      imark1p2 = df2$mark1[df2$Parent2],   # Parent 2's mark1
                      imark2p1 = df2$mark2[df2$Parent1],   # Parent 1's mark2
                      imark2p2 = df2$mark2[df2$Parent2])   # Parent 2's mark2

I created df2 so as not to overwrite the original in case of a mistake. At
this point, you have several sets of vectors that you can compare; e.g.,
mark1 with imark1p1 and imark1p2. Like Josh, I couldn't make heads or tails
out of what these logical tests were meant to output, but perhaps this gives
you a broader template with which to work. At this point, you can probably
remove the rows corresponding to the parents. I believe ifelse() is your
friend here - it can perform logical tests in a vectorized fashion. As long
as the tests are consistent from one individual to the next, it's likely to
be an efficient route.

HTH,
Dennis
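
The lookup above indexes by row position, which works here because the
Individual IDs are exactly 1 to n in row order. A variant that does not rely
on that is sketched below using match(); this is an assumption about how the
idea could be generalized, not part of the reply above. The names p1, p2 and
offspring are made up for this example, and the data are the ten rows from the
original question.

```r
df <- data.frame(
  Individual = 1:10,
  Parent1 = c(0, 0, 0, 0, 1, 1, 3, 3, 1, 1),
  Parent2 = c(0, 0, 0, 0, 2, 2, 4, 4, 4, 4),
  mark1   = c(12, 11, 13, 13, 11, 12, 11, 13, 11, 11),
  mark2   = c(11, 22, 22, 11, 12, 12, 12, 12, 12, 12)
)

# match() gives the row of each parent, or NA when the ID (here 0) is absent
p1 <- match(df$Parent1, df$Individual)
p2 <- match(df$Parent2, df$Individual)

# Indexing with NA yields NA, so founders get NA parental scores
df2 <- transform(df,
  imark1p1 = mark1[p1], imark1p2 = mark1[p2],
  imark2p1 = mark2[p1], imark2p2 = mark2[p2])

# Keep only offspring, i.e. rows where both parents are known
offspring <- df2[!is.na(p1) & !is.na(p2), ]
print(offspring)
```

For Individual 5 this gives imark1p1 = 12 and imark1p2 = 11, in line with the
example in the original question.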



[R] recoding a data in different way: please help

2011-02-17 Thread Umesh Rosyara
Dear R users 
 
The following question looks simple, but I have spent a lot of time trying to
solve it. I would highly appreciate your help.  
 
I have the following dataset from a family dataset: 
 
Here we have individuals, their two parents, and their marker scores
(mark1, mark2, and so on). A parent value of 0 means that the parent
information is not available. 
 
 
Individual  Parent1  Parent2  mark1  mark2
1           0        0        12     11
2           0        0        11     22
3           0        0        13     22
4           0        0        13     11
5           1        2        11     12
6           1        2        12     12
7           3        4        11     12
8           3        4        13     12
9           1        4        11     12
10          1        4        11     12
 
I want to recode mark1, mark2, and so on by looking at the scores of each
individual's parents (Parent1 and Parent2). 
 
For example: 
 
Take the case of Individual 5, whose Parent1 is 1 (mark1 score 12) and
Parent2 is 2 (mark1 score 11). Individual 5 has a mark1 score of 11. Suppose
I have the following conditions to recode Individual 5's mark1 score:
 
For the mark1 variable:
  If Parent1's score is "11" and Parent2's score is "22", recode the
  individual's score as: "12" = 1, else 0.
  If Parent1's score is "12" and Parent2's score is "22", recode the
  individual's score as: "22" = 1, "12" = 0.5, else 0.
  ... more conditions
 
Similarly, the pointer should move from Individual 5 through to the nth
individual at the end of the file. 
 
 Thank you in advance
 
Umesh R 
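
One way the two conditions above could be expressed, sketched here under
stated assumptions, is as a small rule table that is merged onto the data, so
that each (Parent1 score, Parent2 score, own score) combination picks up its
code and everything else falls back to 0. The data frame kids and its values
are made up for illustration, the column names p1, p2, own and code are
hypothetical, and only the two rules quoted above are included.

```r
# Rule table transcribed from the two conditions above
rules <- data.frame(
  p1   = c(11, 12, 12),     # Parent1's score
  p2   = c(22, 22, 22),     # Parent2's score
  own  = c(12, 22, 12),     # the individual's own score
  code = c(1, 1, 0.5)       # the recoded value
)

# Made-up offspring rows with the parental scores already looked up
kids <- data.frame(
  id  = 1:4,
  p1  = c(11, 11, 12, 12),
  p2  = c(22, 22, 22, 22),
  own = c(12, 11, 12, 22)
)

# Left join: rows without a matching rule get NA, which becomes the "else 0"
out <- merge(kids, rules, by = c("p1", "p2", "own"), all.x = TRUE)
out$code[is.na(out$code)] <- 0
out <- out[order(out$id), ]   # merge() reorders rows, so restore the id order
print(out)
```

Further conditions would then be extra rows in rules rather than extra
branches in the code.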
 
 
 
 
