Re: [R] subset in a matrix
Try this:

    z[z[, 1] < 0, ]

On Tue, Feb 9, 2010 at 6:12 PM, DonDiego jorge.nie...@moorecap.com wrote:

> Hi,
>
> I have a matrix of data values like the example below. I would like to
> extract a subset of the matrix for the values where the first column is
> negative. I am using the subset function. However, I am getting an error
> message that the conditional variable does not exist. For some reason, the
> subset operation only works if I transform the matrix to a data set using
> as.data.set(). The help indicates that the subset function can be applied
> to matrices and data sets.
>
> I am wondering if anyone has seen a similar problem before. Am I using the
> correct syntax?
>
>     n = 15
>     m = 5
>     cnames = paste("x", 1:m, sep = "")
>     rnames = 1:n
>     z = matrix(rnorm(n*m), n, m, dimnames = list(rnames, cnames))
>     test = subset(z, x1 < 0, select = c(cnames))
>
> Thanks,
> Jorge
>
> --
> View this message in context:
> http://n4.nabble.com/subset-in-a-matrix-tp1474958p1474958.html
> Sent from the R help mailing list archive at Nabble.com.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
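A minimal sketch of the logical-index approach suggested above, on a small fixed matrix. The values are made up for illustration and are not from the thread, and the drop = FALSE argument is an addition here rather than part of the original suggestion: it keeps the result a matrix even when only one row matches.

```r
# Small fixed matrix so the result is reproducible (illustrative values only).
z <- matrix(c(-1.2,  0.5, -0.3,
               2.0,  1.1,  0.7),
            nrow = 3, ncol = 2,
            dimnames = list(1:3, c("x1", "x2")))

# z[, 1] < 0 is a logical vector with one entry per row; using it as the
# row index keeps only the rows where it is TRUE.
neg <- z[z[, 1] < 0, , drop = FALSE]
print(neg)

# With a single matching row, drop = FALSE still returns a 1 x 2 matrix
# instead of collapsing the result to a plain vector.
one <- z[z[, 1] < -1, , drop = FALSE]
print(dim(one))
```

Without drop = FALSE the one-row case would come back as a named vector, which can surprise code that expects a matrix further down.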
Re: [R] subset in a matrix
Hi:

    z = matrix(rnorm(n*m),n,m,dimnames =list(rnames,cnames))
    z
               x1          x2           x3         x4          x5
    1  -0.3942900 -0.61202639 -1.804958629 -0.1351786 -1.27659221
    2  -0.0593134  0.34111969  1.465554862  1.1780870 -0.57326541
    3   1.1000254 -1.12936310  0.153253338 -1.5235668 -1.22461261
    4   0.7631757  1.43302370  2.172611670  0.5939462 -0.47340064
    5  -0.1645236  1.98039990  0.475509529  0.3329504 -0.62036668
    6  -0.2533617 -0.36722148 -0.709946431  1.0630998  0.04211587
    7   0.6969634 -1.04413463  0.610726353 -0.3041839 -0.91092165
    8   0.5566632  0.56971963 -0.934097632  0.3700188  0.15802877
    9  -0.6887557 -0.13505460 -1.253633400  0.2670988 -0.65458464
    10 -0.7074952  2.40161776  0.291446236 -0.5425200  1.76728727
    11  0.3645820 -0.03924000 -0.443291873  1.2078678  0.71670748
    12  0.7685329  0.68973936  0.001105352  1.1604026  0.91017423
    13 -0.1123462  0.02800216  0.074341324  0.7002136  0.38418536
    14  0.8811077 -0.74327321 -0.589520946  1.5868335  1.68217608
    15  0.3981059  0.18879230 -0.568668733  0.5584864 -0.63573645

    z[z[, 1] < 0, ]
               x1          x2          x3         x4          x5
    1  -0.3942900 -0.61202639 -1.80495863 -0.1351786 -1.27659221
    2  -0.0593134  0.34111969  1.46555486  1.1780870 -0.57326541
    5  -0.1645236  1.98039990  0.47550953  0.3329504 -0.62036668
    6  -0.2533617 -0.36722148 -0.70994643  1.0630998  0.04211587
    9  -0.6887557 -0.13505460 -1.25363340  0.2670988 -0.65458464
    10 -0.7074952  2.40161776  0.29144624 -0.5425200  1.76728727
    13 -0.1123462  0.02800216  0.07434132  0.7002136  0.38418536

Is this what you were looking for?

HTH,
Dennis
Re: [R] subset in a matrix
As others have said,

    z[z[, 1] < 0, ]

does it. Just in case you're wondering why your subset command won't work, str() is your friend (as is so often the case):

    str(z)
    str(as.data.frame(z))  ## (I don't think that R has 'as.data.set')

So z is a matrix with column *names* x1, etc.; as.data.frame(z) is a data.frame with *variables* named x1, etc.

If you really want to use subset(), then

    subset(z, z[, "x1"] < 0, select = ...)

will work, but I wouldn't use it.

 -Peter Ehlers

--
Peter Ehlers
University of Calgary
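The matrix versus data frame distinction described above can be checked with a short sketch. The seed is an arbitrary choice made here so the run is reproducible, and the names zdf, by_subset and by_index are illustrative, not from the thread.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible
n <- 15
m <- 5
cnames <- paste("x", 1:m, sep = "")
z <- matrix(rnorm(n * m), n, m, dimnames = list(1:n, cnames))

# A matrix carries column names, but they are not variables that an
# expression such as x1 < 0 can refer to.
str(z)

# In a data frame each column is a variable, so subset() can find x1.
zdf <- as.data.frame(z)
str(zdf)
by_subset <- subset(zdf, x1 < 0)

# Plain indexing on the matrix selects the same rows without converting.
by_index <- z[z[, "x1"] < 0, , drop = FALSE]
```

Both routes pick the same rows; the difference is only the class of the result (data frame versus matrix).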
Re: [R] subset of a matrix
Hi Carlos,

how about this step first:

    rownames(mydata) <- gsub("361a", "00361a", rownames(mydata))
    rownames(mydata) <- gsub("456a", "00456a", rownames(mydata))

good luck

milton

On Thu, Aug 27, 2009 at 12:27 PM, Carlos Gonzalo Merino Mendez carlosgmer...@yahoo.com wrote:

> Hello everyone,
>
> I would appreciate any help with the following.
>
> My dataset is a list containing matrices. So if you type e.g. data[[1]]
> you get something like:
>
>            [,1] [,2]
>     361a   A    T
>     456b   A    G
>     72145a T    G
>
> As you can see, my rows have names which are character strings containing
> numbers and letters. I want something similar to a histogram, per column,
> i.e. I want to know how many times a character occurs exactly once in a
> column, how many times a character occurs exactly twice, and so on.
>
> Maybe there is an easy way to do this, but I wrote my own code which works
> perfectly, so don't bother to correct it unless extremely necessary. I
> write down the code so you know exactly what I'm trying to do:
>
>     table <- vector()
>     for (i in (1:length(data))){
>       for (j in (1:length(data[[i]][1,]))){
>         t <- table(data[[i]][,j])
>         table <- c(table, t)
>     }}
>     ncount <- table[names(table) != "-"]  #this line is necessary to eliminate "-" characters which should not be included in the analysis
>     sfs <- table (ncount)
>
> And with this code I get something like:
>
>       1   2   3   4   5   6   7   8   9  10
>     542 125  98  49  47  41  26  31  22  18
>
> which is what I'm looking for.
>
> Now comes THE problem: As I said before, my rows have names. Each name is
> unique. I want to apply my analysis to a subset of rows in each matrix,
> namely all rows whose names start with 3, all that start with 4, and all
> that start with 721. In most cases only the first character is important,
> but since I have names of different lengths, in some cases I need the
> first three characters to differentiate the groups. I want to integrate
> this into the loop so that I get a vector (such as the one called table in
> my code) for each subset analyzed.
>
> I tried using the subset function, but I couldn't figure out how to use
> it, because it's intended to use row values to define the subset, not row
> names.
>
> I hope someone can help me out, but please bear in mind I am really new at
> R and most commands and parameters are really unfamiliar to me.
>
> Thanks.
Re: [R] subset of a matrix
Hi Carlos,

I think I made a wrong suggestion. Sorry about that.

I was thinking that having row names of the same length would make the data handling easier. Is that right? If so, I can try to suggest another, automatic way for you to get there.

bests

milton
Re: [R] subset of a matrix
Try this:

    lapply(data, function(r)
      lapply(split(r, substr(sprintf("%05d", as.numeric(gsub("[a-z]", "", row.names(r)))), 1, 3)), table))

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
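To make the one-liner above easier to follow, here is a hedged step-by-step sketch of how its grouping key is built. The row names and the small matrix mirror the example in the original question; the variable names digits, padded, key and groups are introduced here for illustration only.

```r
# Row names and values in the style of the original post.
rn <- c("361a", "456b", "72145a")
m <- matrix(c("A", "A", "T",
              "T", "G", "G"),
            nrow = 3, dimnames = list(rn, NULL))

digits <- gsub("[a-z]", "", rn)                # strip the letters
padded <- sprintf("%05d", as.numeric(digits))  # left-pad with zeros to width 5
key    <- substr(padded, 1, 3)                 # keep the first three characters
print(key)

# split() treats the matrix as one long vector and recycles the key down
# each column, so every group pools the characters from all columns.
groups <- split(m, key)
print(lapply(groups, table))
```

Because split() flattens the matrix, the counts in each group are totals over all columns rather than per-column counts.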
Re: [R] subset of a matrix
Hi Milton,

Thanks for trying to help anyway.
Re: [R] subset of a matrix
Hi Henrique,

I tried your code. I simply copied and pasted it because I have no idea how it works. What I get is the total number of A's and T's and all other characters, which was not my intention. Maybe I need to make some modifications to your script before I can apply it within my script? Can you explain what you are using those commands for?

Thanks for the help anyway.

Cheers,

Carlos
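One possible way to get the per-group result described in the question, offered as a sketch under assumptions rather than a definitive answer: rows are selected by row-name prefix with startsWith() (base R 3.3.0 or later), and the per-column counting from the original loop is then applied to those rows only. The sample data, the prefixes vector and the function name sfs_for_prefix are all hypothetical.

```r
# Hypothetical data in the shape described in the question: a list of
# character matrices whose row names begin with a group prefix.
data <- list(
  matrix(c("A", "A", "T", "T",
           "T", "G", "G", "-"),
         nrow = 4,
         dimnames = list(c("361a", "362b", "456b", "72145a"), NULL))
)
prefixes <- c("3", "4", "721")  # assumed group prefixes

# For the rows whose names start with `prefix`, count each character per
# column, pool the counts over all columns and matrices, drop "-", and
# tabulate how often each count occurs.
sfs_for_prefix <- function(mats, prefix) {
  counts <- integer(0)
  for (m in mats) {
    rows <- m[startsWith(rownames(m), prefix), , drop = FALSE]
    for (j in seq_len(ncol(rows))) {
      counts <- c(counts, table(rows[, j]))
    }
  }
  counts <- counts[names(counts) != "-"]  # drop the "-" placeholder
  table(counts)
}

sfs_by_group <- lapply(setNames(prefixes, prefixes),
                       function(p) sfs_for_prefix(data, p))
print(sfs_by_group)
```

The result is a named list with one table per prefix. A prefix such as "3" matches every row name that begins with that character regardless of length, so the prefixes need to be chosen so that the groups do not overlap.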