[R] combine all data frame columns into a vector.

2013-08-12 Thread Khan, Sohail
Dear All,

Could anyone suggest a quick way to combine all columns in a data frame into a 
vector?
For example, I have a data frame of 205 columns with character data types, and 
many data values are repeated across the columns.  Actually, I would like to 
retrieve all the unique values from this data set.  My strategy is to put all 
column values into a vector and then select the unique values from that vector.

I would appreciate a more efficient method.
Thanks.
-Sohail


The information contained in this electronic e-mail transmission and any 
attachments are intended only for the use of the individual or entity to whom 
or to which it is addressed, and may contain information that is privileged, 
confidential and exempt from disclosure under applicable law. If the reader of 
this communication is not the intended recipient, or the employee or agent 
responsible for delivering this communication to the intended recipient, you 
are hereby notified that any dissemination, distribution, copying or disclosure 
of this communication and any attachment is strictly prohibited. If you have 
received this transmission in error, please notify the sender immediately by 
telephone and electronic mail, and delete the original communication and any 
attachment from any computer, server or other electronic recording or storage 
device or medium. Receipt by anyone other than the intended recipient is not a 
waiver of any attorney-client, physician-patient or other privilege.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] combine all data frame columns into a vector.

2013-08-12 Thread Khan, Sohail
Thanks Bert,
All are character values.
-Sohail

-Original Message-
From: Bert Gunter [mailto:gunter.ber...@gene.com] 
Sent: Monday, August 12, 2013 4:35 PM
To: Khan, Sohail
Cc: greatest.possible.newbie; r-help@r-project.org
Subject: Re: [R] combine all data frame columns into a vector.

Sohail:

1. Are they character or factor?

2. ?unlist
 unique(unlist(yourframe))

-- Bert
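
Bert's first question matters because unlist() on factor columns returns a factor rather than a character vector. A minimal sketch, using a made-up two-column data frame `df` (hypothetical, not from the thread), of one way to get a plain character vector of unique values in either case:

```r
# Hypothetical example data; stringsAsFactors = TRUE forces factor columns
df <- data.frame(a = c("x", "y"), b = c("y", "z"), stringsAsFactors = TRUE)

# With factor columns, unlist() merges the levels and returns a factor
as_factor <- unique(unlist(df))

# Converting each column to character first gives a character vector
# whether the columns started out as character or as factor
as_char <- unique(unlist(lapply(df, as.character), use.names = FALSE))
```

If the columns are already character, as Sohail reports, the plain unique(unlist(yourframe)) call should be enough; the conversion step is only a safeguard.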

On Mon, Aug 12, 2013 at 1:23 PM, Khan, Sohail skha...@nshs.edu wrote:
 Dear All,

 Could anyone suggest a quick way to combine all columns in a data frame into 
 a vector?
 For example, I have a data frame of 205 columns with character data types, 
 many data values are repeated in all the columns.  Actually, I would like to 
 retrieve all the unique values from this data set.  My strategy is to put all 
 column value into a vector and then select unique from that vector.

 I would appreciate a more efficient method.
 Thanks.
 -Sohail





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm




Re: [R] combine all data frame columns into a vector.

2013-08-12 Thread Khan, Sohail
Thanks Arun and Bert.
Both options work.
-Sohail

-Original Message-
From: arun [mailto:smartpink...@yahoo.com] 
Sent: Monday, August 12, 2013 4:51 PM
To: Khan, Sohail
Cc: R help
Subject: Re: [R] combine all data frame columns into a vector.



Hi,


Maybe this helps:
dat1 <- structure(list(V1 = c("h", "f", "s", "n", "r", "x", "h", "t", "u", "g"),
    V2 = c("p", "j", "r", "r", "i", "x", "f", "b", "n", "d"),
    V3 = c("c", "o", "s", "d", "f", "r", "b", "p", "q", "b"),
    V4 = c("i", "g", "j", "d", "y", "f", "s", "q", "s", "z"),
    V5 = c("m", "j", "h", "f", "b", "b", "k", "j", "g", "i"),
    V6 = c("m", "w", "m", "s", "o", "z", "l", "h", "e", "d"),
    V7 = c("m", "g", "h", "d", "s", "i", "y", "z", "t", "m"),
    V8 = c("d", "f", "a", "z", "q", "i", "o", "v", "a", "s"),
    V9 = c("n", "d", "n", "f", "j", "j", "g", "w", "k", "v"),
    V10 = c("i", "t", "y", "c", "m", "p", "q", "c", "k", "m")), .Names =
c("V1", "V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10"), row.names =
c(NA, -10L), class = "data.frame")

unique(unlist(dat1))
# [1] "h" "f" "s" "n" "r" "x" "t" "u" "g" "p" "j" "i" "b" "d" "c" "o" "q" "y" "z"
#[20] "m" "k" "w" "l" "e" "a" "v"

#or
unique(as.vector(as.matrix(dat1)))
# [1] "h" "f" "s" "n" "r" "x" "t" "u" "g" "p" "j" "i" "b" "d" "c" "o" "q" "y" "z"
#[20] "m" "k" "w" "l" "e" "a" "v"
A.K.


- Original Message -
From: Khan, Sohail skha...@nshs.edu
To: 'greatest.possible.newbie' daniel.h...@gmx.net; r-help@r-project.org 
r-help@r-project.org
Cc: 
Sent: Monday, August 12, 2013 4:23 PM
Subject: [R] combine all data frame columns into a vector.

Dear All,

Could anyone suggest a quick way to combine all columns in a data frame into a 
vector?
For example, I have a data frame of 205 columns with character data types, many 
data values are repeated in all the columns.  Actually, I would like to 
retrieve all the unique values from this data set.  My strategy is to put all 
column value into a vector and then select unique from that vector.

I would appreciate a more efficient method.
Thanks.
-Sohail







Re: [R] How to transpose it in a fast way?

2013-03-08 Thread Khan, Sohail
Perhaps you could process this with a Unix/Linux utility such as awk before 
reading the file into R.
-Sohail


From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
peter dalgaard [pda...@gmail.com]
Sent: Friday, March 08, 2013 5:08 AM
To: Yao He
Cc: R help
Subject: Re: [R] How to transpose it in a fast way?

On Mar 7, 2013, at 01:18 , Yao He wrote:

 Dear all:

 I have a big data file of 6 columns and 6 rows like that:

 AA AC AA AA ...AT
 CC CC CT CT...TC
 ..
 .

 I want to transpose it and the output is a new like that
 AA CC 
 AC CC
 AA CT.
 AA CT.
 
 
 AT TC.

 The keypoint is  I can't read it into R by read.table() because the
 data is too large,so I try that:
 c <- file("silygenotype.txt", "r")
 geno_t <- list()
 repeat {
   line <- readLines(c, n = 1)
   if (length(line) == 0) break  # end of file
   line <- unlist(strsplit(line, "\t"))
   geno_t <- cbind(geno_t, line)
 }
 write.table(geno_t, "xxx.txt")

 It works but it is too slow ,how to optimize it???


As others have pointed out, that's a lot of data!

You seem to have the right idea: If you read the columns line by line there is 
nothing to transpose. A couple of points, though:

- The cbind() is a potential performance hit since it copies the list every 
time around. geno_t <- vector("list", 6) and then
geno_t[[i]] <- etc

- You might use scan() instead of readLines, strsplit

- Perhaps consider the data type as you seem to be reading strings with 16 
possible values (I suspect that R already optimizes string storage to make this 
point moot, though.)
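
The first two points above can be sketched as follows. This is only an illustration under stated assumptions: the tiny tab-separated file, the names `infile`, `outfile`, `n_rows` and `transposed` are all made up here, and the number of rows is assumed to be known before reading.

```r
# Hypothetical small input standing in for the large genotype file
infile <- tempfile(fileext = ".txt")
writeLines(c("AA\tAC\tAA", "CC\tCC\tCT"), infile)

n_rows <- 2L                       # assumed known in advance
geno_t <- vector("list", n_rows)   # preallocate instead of growing with cbind()

con <- file(infile, "r")
for (i in seq_len(n_rows)) {
  # scan() reads and splits one line of the open connection in a single step
  geno_t[[i]] <- scan(con, what = "", sep = "\t", nlines = 1, quiet = TRUE)
}
close(con)

# Each input row becomes one column of the result, so no t() is needed
transposed <- do.call(cbind, geno_t)
outfile <- tempfile(fileext = ".txt")
write.table(transposed, outfile, quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

Binding the list once at the end avoids the repeated copying that cbind() inside the loop causes; whether the full data set then fits in memory is a separate question.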

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] Calculateing means

2012-11-16 Thread Khan, Sohail

Dear List,

I have a data matrix with 570 columns containing 95 (samples) with 6 replicates 
each.
How can I calculate the mean of the replicates for 95 samples?
Thank you.




Re: [R] Calculateing means

2012-11-16 Thread Khan, Sohail
Thanks. But aggregate will work on rows or columns.  I need to calculate the 
mean for subsets of the columns within each row of a matrix, i.e.

Indx  x1  x2  x3  x4  x5  x6  x7  x8  x9
1     25  30  15   8  12   9  18  21  89
2     52  35  42  74  65  20  28  32  12
3     12  35  33  88  12  52  32  32  18
4     25  25  16  23  89  21  21  21  42

...
I would like to calculate means for x1-x3, x4-x6, x7-x9
for each row.

-Sohail
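
One possible way to get the per-row means for consecutive blocks of columns, shown on the four example rows above. This is a sketch, not an answer given in the thread; the names `m`, `group` and `block_means` are hypothetical, and the block size of 3 matches the example rather than the 6 replicates of the real data.

```r
# The four example rows from the message, as a numeric matrix
m <- matrix(c(25, 30, 15,  8, 12,  9, 18, 21, 89,
              52, 35, 42, 74, 65, 20, 28, 32, 12,
              12, 35, 33, 88, 12, 52, 32, 32, 18,
              25, 25, 16, 23, 89, 21, 21, 21, 42),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, paste0("x", 1:9)))

# Group label for each column: x1-x3 -> 1, x4-x6 -> 2, x7-x9 -> 3
group <- rep(1:3, each = 3)

# For each group of columns, take the row means of that sub-matrix;
# the result has one row per original row and one column per group
block_means <- sapply(split(seq_len(ncol(m)), group),
                      function(cols) rowMeans(m[, cols, drop = FALSE]))
```

For 95 samples with 6 replicates each in adjacent columns, the same idea would use group <- rep(1:95, each = 6), assuming the replicates really are stored in consecutive columns.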


-Original Message-
From: John Kane [mailto:jrkrid...@inbox.com] 
Sent: Friday, November 16, 2012 4:35 PM
To: Khan, Sohail; 'r-help@r-project.org'
Subject: RE: [R] Calculateing means

?aggregate will do it.

x <- data.frame(height = c(50, 174, 145, 200, 210, 140, 175),
                age_group = c(1, 2, 2, 1, 1, 2, 1),
                ville = c(1, 2, 3, 1, 2, 3, 1))

aggregate(x$height, list(x$age_group, x$ville), mean)

or have a look at the plyr or data.table packages.

John Kane
Kingston ON Canada


 -Original Message-
 From: skha...@nshs.edu
 Sent: Fri, 16 Nov 2012 15:58:17 -0500
 To: r-help@r-project.org
 Subject: [R] Calculateing means
 
 
 Dear List,
 
 I have a data matrix with 570 columns containing 95 (samples) with 6 
 replicates each.
 How can I calculate the mean of the replicates for 95 samples?
 Thank you.
 
 

