Re: [R] finding big matrix size and SVD

2010-09-27 Thread Natasha Asar
I have 500,000 rows in my matrix and I was wondering whether there is any way to 
get its SVD without breaking it into parts.

Because if R can only read about 1000 columns, then to have a rectangular matrix 
(diagonal, I think they are called) I will need to have only 1000 rows.

I want to know how I can do this.

natasha
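
For what it is worth, svd() in base R does not need a square matrix; it accepts any rectangular numeric matrix and by default returns the thin decomposition. A minimal sketch, using a made-up 5000 x 40 matrix as a stand-in for the real data (the sizes and the name X are assumptions for illustration only):

```r
set.seed(1)

# Stand-in for the real data: many more rows than columns.
X <- matrix(rnorm(5000 * 40), nrow = 5000, ncol = 40)

# Thin SVD: X = U %*% diag(d) %*% t(V)
s <- svd(X)
dim(s$u)      # rows of X by min(nrow, ncol)
length(s$d)   # min(nrow, ncol) singular values
dim(s$v)      # columns of X by min(nrow, ncol)

# Check that the three factors reproduce X up to rounding error.
X_hat <- s$u %*% diag(s$d) %*% t(s$v)
max(abs(X - X_hat))
```

A dense 500,000 x 400 matrix of doubles needs about 1.6 GB (500,000 * 400 * 8 bytes) before any working copies, so whether this runs on the real data depends on the memory available.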




From: Steve Lianoglou mailinglist.honey...@gmail.com

Cc: R group r-help@r-project.org
Sent: Mon, 27 September, 2010 3:23:58
Subject: Re: [R] finding big matrix size and SVD

Hi,


Natasha Asar wrote:
 Dear R helpers

 I have a big data sheet (CSV) which I use “read.csv” to read it

 When im trying to get the Dim() it says 38 column which is not correct it
 should be something about 400.
 I am wondering whether there is any way I can read it right… I have used
 ncol() and it’s the same answer

It seems that perhaps the input file isn't well formed, and R can't
correctly identify that each row should have 400 columns?

Maybe you can try to read it in manually (using readLines, for
instance), and strsplit each line so that you can get finer control of
reading the file. Alternatively, you can try and edit the file by hand
to fix it.
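
A minimal sketch of that manual approach, assuming a comma-separated file; the temporary file and its contents are made up for illustration. count.fields() shows how many fields each line really has, which is a quick way to spot malformed rows. One known cause of too few columns is that read.table() and read.csv() decide the number of columns from the first five lines of input unless col.names is supplied.

```r
# Hypothetical malformed CSV: the second record has only two fields.
path <- tempfile(fileext = ".csv")
writeLines(c("a,b,c,d", "1,2", "3,4,5,6"), path)

# How many fields does each line have?
n_fields <- count.fields(path, sep = ",")
table(n_fields)

# Read the raw lines and split them by hand for finer control.
lines  <- readLines(path)
fields <- strsplit(lines, ",", fixed = TRUE)
width  <- max(lengths(fields))

# Pad short records with NA so every row has the same number of columns.
padded <- lapply(fields, function(x) c(x, rep(NA_character_, width - length(x))))
mat    <- do.call(rbind, padded)
dim(mat)
```

The result is a character matrix, so numeric columns would still need converting (for example with as.numeric) once the layout looks right.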

 On the other hand I use to get the SVD of it but I have about 500,000 rows so
 considering that it should be rectangular matrix…
 im wondering how this can be possible

I don't follow this part, sorry ... what's the question here?

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] finding big matrix size and SVD

2010-09-26 Thread Natasha Asar
Dear R helpers

I have a big data sheet (CSV) which I read with read.csv().

When I try to get its dim(), it reports 38 columns, which is not correct; it 
should be about 400. 
I am wondering whether there is any way I can read it correctly. I have also 
tried ncol() and it gives the same answer.

On the other hand, I want to get the SVD of it, but I have about 500,000 rows, so 
considering that it should be a rectangular matrix, 
I am wondering how this can be possible.

Natasha


  


Re: [R] reshape matrix entities to columns

2010-09-16 Thread Natasha Asar
Thanks for the help.
I have tried the approach that uses xtabs() and the answer is correct, but when I 
was using rbind() I was still getting the same error message…
 
Now I have some more questions…
I am trying to adapt this to any data of the same structure, 
so now I am thinking about a few points:

1. What if there are no row names (center1, center2), assuming the rows 
are simply taken in order?

2. What if I don't have the column names either (so there are no row 
names as well), 
assuming odd columns are ages and even ones are numbers of people?

3. And time? How can I match this automatically? 
I have tried with ncol(), but because I now have the row names in there it is 
giving me a number like 5.5, which is not helpful.

thanks again
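
A minimal sketch of one way to handle input with no row or column names, assuming the columns strictly alternate age, count, age, count. The two input rows are made up to match the earlier example. The 5.5 comes from halving an odd column count: with one label column plus ten data columns, ncol() is 11 and 11 / 2 = 5.5, so any label column has to be dropped (or subtracted) before pairing the columns.

```r
# Hypothetical headerless input: no row labels, columns alternate age, count.
txt <- c("5,2,8,7,,",
         "10,7,20,9,4,10")
raw <- read.csv(text = txt, header = FALSE)

age_cols <- seq(1, ncol(raw), by = 2)   # odd columns hold ages
n_cols   <- seq(2, ncol(raw), by = 2)   # even columns hold counts
stopifnot(length(age_cols) == length(n_cols))

# Build the long format directly; the row number stands in for the center,
# and "time" is just the position of the age/count pair within the row.
long <- data.frame(
  center = rep(seq_len(nrow(raw)), times = length(age_cols)),
  time   = rep(seq_along(age_cols), each = nrow(raw)),
  age    = unlist(raw[age_cols], use.names = FALSE),
  n      = unlist(raw[n_cols], use.names = FALSE)
)

# Drop the empty pairs.
long <- long[!is.na(long$age), ]
long
```

From here the same xtabs(n ~ center + age, data = long) call applies.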




From: David Winsemius dwinsem...@comcast.net

Cc: Dennis Murphy djmu...@gmail.com; r-help@r-project.org
Sent: Tue, 14 September, 2010 2:24:19
Subject: Re: [R] reshape matrix entities to columns



On Sep 13, 2010, at 8:51 PM, Natasha Asar wrote:

I am trying this as you mentioned and getting an error which I can't fix.
Do you know where the problem is? 


 df2[is.na(df2)] <- 0
 df2
          X age no. age.1 no..1 age.2 no..2 age.3 no..3 age.4 no..4
1   center1   3   9     6     4     9     1    10     1     0     0
2   center2   5   3     9     2     0     0     0     0     0     0
3   center3   2   2     5     8     7     3     0     0     0     0
4   center4   1  12     4     7     8     3     9     1     0     0
5   center5   6   9     8     5     0     0     0     0     0     0
6   center6   4   8     0     0     0     0     0     0     0     0
7   center7   9   5     0     0     0     0     0     0     0     0
8   center8   4   7     6     3     7     1     8     1     9     2
9   center9   7   3     9     1    10     2     0     0     0     0
10 center10  10   5     0     0     0     0     0     0     0     0
 df3 <- rbind(df2, data.frame(center=1,time=1, age=1:max(df2$age), n=0))
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match
I was doing the rbind on df2 after it was reshaped to the long structure and before 
it was xtabs()-ed back to the wide structure:

df2 <- reshape(df, idvar = 'center', varying =
  list(c(paste('age', 1:5, sep = '')), c(paste('n', 1:5, sep = ''))),
  v.names = c('age', 'n'), times = 1:5, direction = 'long')
df2
    center time age  n
1.1      1    1   6 10
2.1      2    1   7 12
3.1      3    1   5  6
1.2      1    2   8 13
2.2      2    2   8 14
3.2      3    2   8 NA
1.3      1    3  10  9
2.3      2    3  10 NA
3.3      3    3   9 10
1.4      1    4  12  7
2.4      2    4  11 16
3.4      3    4  11 12
1.5      1    5  14 10
2.5      2    5  14 13
3.5      3    5  13  9

Then do the rbind while the columns match up.

xtabs then puts zeros back in for each empty cell with a level.
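
Putting the steps in order, a minimal sketch using the made-up example data from earlier in the thread. The rbind error above arises because the wide data frame has 11 columns while the filler rows have 4; after reshaping to long format both have the same four columns. The name filler is introduced here only for readability.

```r
# Example data from earlier in the thread (made up).
df <- data.frame(
  center = 1:3,
  age1 = c(6L, 7L, 5L),    n1 = c(10L, 12L, 6L),
  age2 = c(8L, 8L, 8L),    n2 = c(13L, 14L, NA),
  age3 = c(10L, 10L, 9L),  n3 = c(9L, NA, 10L),
  age4 = c(12L, 11L, 11L), n4 = c(7L, 16L, 12L),
  age5 = c(14L, 14L, 13L), n5 = c(10L, 13L, 9L)
)

# 1. Wide -> long: one row per (center, time) with columns age and n.
df2 <- reshape(df, idvar = "center",
               varying = list(paste0("age", 1:5), paste0("n", 1:5)),
               v.names = c("age", "n"), times = 1:5, direction = "long")

# 2. Replace missing counts with zero.
df2[is.na(df2)] <- 0

# 3. Add zero-count filler rows so every age from 1 to max(age) appears.
#    The filler has the same four columns as df2, so rbind() succeeds.
filler <- data.frame(center = 1, time = 1, age = 1:max(df2$age), n = 0)
df3 <- rbind(df2, filler)

# 4. Long -> wide table: centers as rows, ages as columns.
tab <- xtabs(n ~ center + age, data = df3)
tab
```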

-- 
David







Natasha




From: Dennis Murphy djmu...@gmail.com
To: David Winsemius dwinsem...@comcast.net
Cc: r-help@r-project.org
Sent: Sun, 12 September, 2010 23:16:47
Subject: Re: [R] reshape matrix entities to columns

Thanks, David; I overlooked that part.

Dennis


Re: [R] reshape matrix entities to columns

2010-09-13 Thread Natasha Asar
thanks for your help

I am trying to work through this in R, but I have the feeling that this is going 
to put the center and age next to each other, which is not what I need...
It might be me not being able to get my head around this, but...

I need a table with age as columns and center as rows.

If deleting row and column names from the main matrix will help, I can do 
that... 

Can I get a bit of explanation about how things work (so I can learn the process)?

thanks again

Natasha 






From: David Winsemius dwinsem...@comcast.net
To: Dennis Murphy djmu...@gmail.com

Sent: Sun, 12 September, 2010 21:18:26
Subject: Re: [R] reshape matrix entities to columns


On Sep 12, 2010, at 3:34 PM, Dennis Murphy wrote:

 Hi:
 
 Natasha said:
 
 I changed it so i hope it will look better now
 the matrix is like this:
           Age  No.  Age  No.  Age  No.
 Center1     5    2    8    7
 Center2    10    7   20    9    4   10
 column names = sequence of age-no.
 
 But what I want the data to look like is this:
 Age       1  2  3  4  5  6  7  8  9 10 ... 20
 Center1               2        7
 Center2           10                 7 ...  9
 column names = age of people
 entries = number of people with that age in one center
 *
 
 It's a continuation of the reshape problem, but we have to
 change the NAs in the reshaped data frame to zeros first:
 
 df2[is.na(df2)] <- 0
 
 xtabs(n ~ center + age, data = df2)
  age
 center  5  6  7  8  9 10 11 12 13 14
 1  0 10  0 13  0  9  0  7  0 10
 2  0  0 12 14  0  0 16  0  0 13
 3  6  0  0  0 10  0 12  0  9  0
 
 How's that?
 

You've done all the hard work, but the OP wanted the full range of age values 
from 1:max, and that is pretty easy to do with one further step that adds entries 
for the missing age levels:

 df3 <- rbind(df2, data.frame(center=1,time=1, age=1:max(df2$age), n=0))

 xtabs(n ~ center + age, data = df3)
  age
center  1  2  3  4  5  6  7  8  9 10 11 12 13 14
 1  0  0  0  0  0 10  0 13  0  9  0  7  0 10
 2  0  0  0  0  0  0 12 14  0  0 16  0  0 13
 3  0  0  0  0  6  0  0  0 10  0 12  0  9  0
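
As an alternative to the filler rows (this is a different technique, not the one used in the message above), age can be turned into a factor whose levels run from 1 to the maximum. xtabs() keeps unused factor levels by default, so the empty ages show up as zero columns. A minimal sketch on a small made-up long-format data frame; the name long and its values are assumptions for illustration:

```r
# Hypothetical long-format data with the same columns as df2 in the thread.
long <- data.frame(center = c(1, 1, 2, 3),
                   age    = c(6, 8, 7, 5),
                   n      = c(10, 13, 12, 6))

# Declare every age from 1 to the maximum as a level, observed or not.
long$age <- factor(long$age, levels = 1:max(long$age))

# Unused levels are kept, so ages 1 to 4 appear as all-zero columns.
tab <- xtabs(n ~ center + age, data = long)
tab
```

This avoids having to invent center and time values for the filler rows.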

--David.
 Dennis
 
 On Sun, Sep 12, 2010 at 9:46 AM, Dennis Murphy djmu...@gmail.com wrote:
 
 Hi:
 
 Here's a made up example using the reshape function:
 
 Input data:
 df <- structure(list(center = 1:3, age1 = c(6L, 7L, 5L), n1 = c(10L,
 12L, 6L), age2 = c(8L, 8L, 8L), n2 = c(13L, 14L, NA), age3 = c(10L,
 10L, 9L), n3 = c(9L, NA, 10L), age4 = c(12L, 11L, 11L), n4 = c(7L,
 16L, 12L), age5 = c(14L, 14L, 13L), n5 = c(10L, 13L, 9L)), .Names =
 c("center", "age1", "n1", "age2", "n2", "age3", "n3", "age4", "n4", "age5",
 "n5"), class = "data.frame", row.names = c(NA, -3L))
 
 df
   center age1 n1 age2 n2 age3 n3 age4 n4 age5 n5
 1      1    6 10    8 13   10  9   12  7   14 10
 2      2    7 12    8 14   10 NA   11 16   14 13
 3      3    5  6    8 NA    9 10   11 12   13  9
 
 # To reshape more than one variable at a time, you need
 # to put the sets of variables into a list, as follows:
 
 df2 <- reshape(df, idvar = 'center', varying =
   list(c(paste('age', 1:5, sep = '')), c(paste('n', 1:5, sep = ''))),
   v.names = c('age', 'n'), times = 1:5, direction = 'long')
 df2
     center time age  n
 1.1      1    1   6 10
 2.1      2    1   7 12
 3.1      3    1   5  6
 1.2      1    2   8 13
 2.2      2    2   8 14
 3.2      3    2   8 NA
 1.3      1    3  10  9
 2.3      2    3  10 NA
 3.3      3    3   9 10
 1.4      1    4  12  7
 2.4      2    4  11 16
 3.4      3    4  11 12
 1.5      1    5  14 10
 2.5      2    5  14 13
 3.5      3    5  13  9
 
 HTH,
 Dennis
 

Re: [R] reshape matrix entities to columns

2010-09-13 Thread Natasha Asar
I am trying this as you suggested and getting an error which I can't fix.
Do you know where the problem is? 


 df2[is.na(df2)] <- 0
 df2
          X age no. age.1 no..1 age.2 no..2 age.3 no..3 age.4 no..4
1   center1   3   9     6     4     9     1    10     1     0     0
2   center2   5   3     9     2     0     0     0     0     0     0
3   center3   2   2     5     8     7     3     0     0     0     0
4   center4   1  12     4     7     8     3     9     1     0     0
5   center5   6   9     8     5     0     0     0     0     0     0
6   center6   4   8     0     0     0     0     0     0     0     0
7   center7   9   5     0     0     0     0     0     0     0     0
8   center8   4   7     6     3     7     1     8     1     9     2
9   center9   7   3     9     1    10     2     0     0     0     0
10 center10  10   5     0     0     0     0     0     0     0     0
 df3 <- rbind(df2, data.frame(center=1,time=1, age=1:max(df2$age), n=0))
Error in rbind(deparse.level, ...) : 
  numbers of columns of arguments do not match


Natasha





[R] reshape matrix entities to columns

2010-09-12 Thread Natasha Asar
Greetings R helpers :)
I am not familiar with R, but I have to use it to analyze a data set that I have 
(30,000 × 20,000).
I want to change the structure of the dataset and I am wondering how that might 
be possible in R.
My main data looks like this (some entries are empty):
          Age  No.  Age  No.  Age  No.
Center1     5    2    8    7
Center2    10    7   20    9    4   10
But what I want the data to look like is 
Age       1  2  3  4  5  6  7  8  9 10 ... 20
Center1               2        7
Center2           10                 7 ...  9
 
It should read the entries one by one:
when j is an Age column, take its value and use it as the column number in the 
new matrix;
then go to the next entry (the matching No. column) and put that entry under the 
column number identified in the previous step.
In other words,
it should take each element of the No. columns (one by one) and place it in a 
new matrix under the column number equal to the corresponding entry of the Age 
columns of the first matrix.
I have tried ncol, cbind and things like that, but I guess I am on the wrong 
path because it is not working. I am reading this fine with read.csv and 
writing back the same way.
Do you know how I can make this work? Is it even possible to do something like 
this?
Thank you in advance
Natasha
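
The rule described above (use each age as a column number and drop the matching count into that column) can be sketched directly with matrix indexing. This is only an illustration on made-up values matching the small example; the object names ages, counts and out are assumptions, and NA marks an empty entry.

```r
# Made-up data matching the post: (age, count) pairs per center, NA = empty.
ages   <- rbind(Center1 = c(5, 8, NA),
                Center2 = c(10, 20, 4))
counts <- rbind(Center1 = c(2, 7, NA),
                Center2 = c(7, 9, 10))

# Empty result: one row per center, one column per age from 1 to the maximum.
max_age <- max(ages, na.rm = TRUE)
out <- matrix(0, nrow = nrow(ages), ncol = max_age,
              dimnames = list(rownames(ages), as.character(seq_len(max_age))))

# A two-column matrix of (row, column) pairs indexes individual cells.
keep <- !is.na(ages)
idx  <- cbind(row(ages)[keep], ages[keep])
out[idx] <- counts[keep]
out
```

If a center lists the same age twice, this keeps only the last count, whereas the xtabs() approach discussed in the replies adds the counts together.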


  


Re: [R] reshape matrix entities to columns

2010-09-12 Thread Natasha Asar
I changed it, so I hope it will look better now.
The matrix is like this:
          Age  No.  Age  No.  Age  No.
Center1     5    2    8    7
Center2    10    7   20    9    4   10
column names = sequence of age-no.

But what I want the data to look like is this:
Age       1  2  3  4  5  6  7  8  9 10 ... 20
Center1               2        7
Center2           10                 7 ...  9
column names = age of people
entries = number of people with that age in one center

thanks again 
Natasha



  


[R] replacing matrix column entities as columns name

2010-09-12 Thread Natasha Asar
I am sending this again as I was told that the data was unreadable, so I really 
hope that this will work... so sorry, all.
Note: the matrix contains a lot of empty entries.

Greetings R helpers,
I am not familiar with R, but I have to use it to analyze a data set that I have 
(30,000 × 20,000).
I want to change the structure of the dataset and I am wondering how that might 
be possible in R.

          Age  No.  Age  No.  Age  No.
Center1     5    2    8    7
Center2    10    7   20    9    4   10

column names = sequence of age-no.

But what I want the data to look like is this:

Age       1  2  3  4  5  6  7  8  9 10 ... 20
Center1               2        7
Center2           10                 7 ...  9

The rest of the matrix is empty: only ages 5 and 8 for center1 and ages 4, 10 
and 20 for center2 are filled.
column names = age of people
entries = number of people with that age in one center

It should read the entries one by one:
when j is an Age column, take its value and use it as the column number in the 
new matrix;
then go to the next entry (the matching No. column) and put that entry under the 
column number identified in the previous step.
In other words,
it should take each element of the No. columns (one by one) and place it in a 
new matrix under the column number equal to the corresponding entry of the Age 
columns of the first matrix.
I have tried ncol, cbind and things like that, but I guess I am on the wrong 
path because it is not working. I am reading this fine with read.csv and 
writing back the same way.
Do you know how I can make this work? Is it even possible to do something like 
this?
Thank you in advance
Natasha


  