Re: [R] Summary statistics for matrix columns
Hi A.K.,

I have one more question, if you can answer it please:

M <- matrix(sample(1:8000), nrow=100)
colnames(M) <- paste("Col", 1:ncol(M), sep="")
apply(M, 2, function(x) c(Min=min(x),
    "1st Qu"=quantile(x, 0.25, names=FALSE),
    Range=range(x),
    Median=quantile(x, 0.5, names=FALSE),
    Mean=mean(x),
    Std=sd(x),
    "3rd Qu"=quantile(x, 0.75, names=FALSE),
    IQR=IQR(x),
    Max=max(x)))

Why do I get two ranges? Doesn't range mean the difference between the max and min?

Thanks
Re: [R] Summary statistics for matrix columns
On Saturday, November 24, 2012 4:58 AM, frespider wrote:
Doesn't range mean the difference between the max and min?

That is one meaning of range. There are many. To see what R's definition is, type
?range
or
help(range)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
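As a small illustration of the definition documented in ?range (a sketch with a made-up vector v, not data from this thread): range() returns the minimum and the maximum as a length-two vector, and c() expands a named length-two element into two names, which appears to be where the two rows come from.

```r
# Made-up values, for illustration only
v <- c(4, 9, 1, 7)

# range() returns a length-two vector: minimum first, then maximum
r <- range(v)

# c() appends 1 and 2 to the name of a length-two element,
# so "Range" becomes "Range1" and "Range2"
out <- c(Min = min(v), Range = range(v), Max = max(v))
names(out)
```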
Re: [R] Summary statistics for matrix columns
Hi,

You are right. Range is supposed to be one value (i.e., the difference between the largest and smallest). For some reason, the function range(x) gives both values. The description in ?range is:

Description:
'range' returns a vector containing the minimum and maximum of all the given arguments.

I looked for a similar function in library(matrixStats). There are colRanges() and rowRanges().

set.seed(125)
x <- matrix(sample(1:80), nrow=8)
colnames(x) <- paste("Col", 1:ncol(x), sep="")
apply(x, 2, function(x) range(x))
#     Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10
#[1,]   10    1   17    3   18   11   13   15    2     6
#[2,]   74   77   76   70   65   63   79   80   71    72

library(matrixStats)
colRanges(x)
#     [,1] [,2]
#[1,]   10   74
#[2,]    1   77
#[3,]   17   76

You could do this to get the range:

apply(x, 2, function(x) diff(range(x)))
#Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10
#  64   76   59   67   47   52   66   65   69    66
#or
diff(t(colRanges(x)))
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#[1,]   64   76   59   67   47   52   66   65   69    66
#or
rowDiffs(colRanges(x))

A.K.
Re: [R] Summary statistics for matrix columns
On Nov 24, 2012, at 4:58 AM, frespider wrote:
Why do I get two ranges? Doesn't range mean the difference between the max and min?

If you want the span (what you are calling the range) of the range (min and max), you can do this:

myRange = diff(range(x))

--
David

David Winsemius, MD
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
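Applied to the apply() call from the question, the diff(range(x)) suggestion might look like the sketch below. The seed is an arbitrary choice added only for reproducibility; everything else follows the code posted in the thread.

```r
set.seed(1)  # arbitrary seed, not from the thread
M <- matrix(sample(1:8000), nrow = 100)
colnames(M) <- paste("Col", 1:ncol(M), sep = "")

res <- apply(M, 2, function(x) c(
  Min      = min(x),
  "1st Qu" = quantile(x, 0.25, names = FALSE),
  Range    = diff(range(x)),   # one number per column: max minus min
  Median   = quantile(x, 0.5, names = FALSE),
  Mean     = mean(x),
  Std      = sd(x),
  "3rd Qu" = quantile(x, 0.75, names = FALSE),
  IQR      = IQR(x),
  Max      = max(x)))

dim(res)       # nine statistics in the rows, one column per column of M
rownames(res)
```

With this change the result has a single Range row, and it equals the Max row minus the Min row.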
Re: [R] Summary statistics for matrix columns
frespider wrote:
Hi, it is possible. But don't you think it will slow the code if you convert to data.frame? Thanks

Then maybe

x <- matrix(sample(1:8000), nrow=100)
colnames(x) <- paste("Col", 1:ncol(x), sep="")
apply(x, 2, function(x) c(Min=min(x),
    "1st Qu"=quantile(x, 0.25, names=FALSE),
    Median=quantile(x, 0.5, names=FALSE),
    Mean=mean(x),
    Sd=sd(x),
    "3rd Qu"=quantile(x, 0.75, names=FALSE),
    IQR=IQR(x),
    Max=max(x)))

HTH

Pete
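One way to package this approach for reuse is sketched below. The helper name col_summary and the seed are hypothetical choices for illustration, and vapply() is swapped in for apply() so that the shape of the result is checked; this is not code from the thread.

```r
# col_summary() is a hypothetical helper name, not a function from any package
col_summary <- function(m) {
  stats <- function(x) c(
    "Min."    = min(x),
    "1st Qu." = quantile(x, 0.25, names = FALSE),
    "Median"  = quantile(x, 0.5, names = FALSE),
    "Mean"    = mean(x),
    "Sd"      = sd(x),
    "3rd Qu." = quantile(x, 0.75, names = FALSE),
    "IQR"     = IQR(x),
    "Max."    = max(x))
  # vapply() stops with an error unless every column yields 8 numeric values
  out <- vapply(seq_len(ncol(m)), function(j) stats(m[, j]), numeric(8))
  colnames(out) <- colnames(m)
  out
}

set.seed(1)  # arbitrary seed
x <- matrix(sample(1:8000), nrow = 100)
colnames(x) <- paste("Col", 1:ncol(x), sep = "")
res <- col_summary(x)
res[, 1:3]
```

The row order is fixed by the order of the elements inside c(), so no reordering step is needed afterwards.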
Re: [R] Summary statistics for matrix columns
Thank you all

Sent from my iPhone
Re: [R] Summary statistics for matrix columns
Hi,

You are right. It is slower when compared to Pete's solution:

set.seed(125)
x <- matrix(sample(1:80), nrow=1000)
colnames(x) <- paste("Col", 1:ncol(x), sep="")

system.time({
res <- sapply(data.frame(x), function(x) c(summary(x), sd=sd(x), IQR=IQR(x)))
res1 <- as.matrix(res)
res2 <- res1[c(1:4,7,5,8,6),]
})
#   user  system elapsed
#  0.596   0.000   0.597

system.time({
res <- apply(x, 2, function(x) c(Min=min(x),
    "1st Qu"=quantile(x, 0.25, names=FALSE),
    Median=quantile(x, 0.5, names=FALSE),
    Mean=mean(x),
    Sd=sd(x),
    "3rd Qu"=quantile(x, 0.75, names=FALSE),
    IQR=IQR(x),
    Max=max(x)))
})
#   user  system elapsed
#  0.384   0.000   0.384

A.K.
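The comparison above can be repeated with a sketch like the following. The matrix size and the wrapper names via_df and via_apply are assumptions made for this example; timings depend on the machine and R version, so no expected numbers are shown.

```r
set.seed(125)
# Matrix size is an arbitrary choice for this sketch
x <- matrix(sample(1:200000), nrow = 1000)
colnames(x) <- paste("Col", 1:ncol(x), sep = "")

# Hypothetical wrappers around the two approaches from the thread
via_df <- function(x) {
  res <- sapply(data.frame(x), function(v) c(summary(v), sd = sd(v), IQR = IQR(v)))
  res[c(1:4, 7, 5, 8, 6), ]
}
via_apply <- function(x) {
  apply(x, 2, function(v) c(
    Min      = min(v),
    "1st Qu" = quantile(v, 0.25, names = FALSE),
    Median   = quantile(v, 0.5, names = FALSE),
    Mean     = mean(v),
    Sd       = sd(v),
    "3rd Qu" = quantile(v, 0.75, names = FALSE),
    IQR      = IQR(v),
    Max      = max(v)))
}

t_df    <- system.time(r_df <- via_df(x))
t_apply <- system.time(r_apply <- via_apply(x))

# Elapsed seconds for each approach
c(data.frame = t_df[["elapsed"]], apply = t_apply[["elapsed"]])
```

A single system.time() call is a rough measure; repeating each call several times gives a steadier picture.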
Re: [R] Summary statistics for matrix columns
Hi,

No problem. There are a couple of other libraries which deal with summary statistics:

library(pastecs)
?stat.desc()
#
library(matrixStats)
#Using the functions from package: matrixStats
fun1 <- function(x){
res <- rbind(colMins(x), colQuantiles(x)[,2], colMedians(x), colMeans(x), colSds(x), colQuantiles(x)[,4], colIQRs(x), colMaxs(x))
row.names(res) <- c("Min.","1st Qu.","Median","Mean","sd","3rd Qu.","IQR","Max.")
res}

set.seed(125)
x <- matrix(sample(1:80), nrow=8)
colnames(x) <- paste("Col", 1:ncol(x), sep="")
fun1(x)
#            Col1     Col2     Col3     Col4     Col5     Col6     Col7     Col8
#Min.        10.0      1.0     17.0      3.0     18.0     11.0     13.0     15.0
#1st Qu. 24.75000     29.5     26.0  7.75000     40.0 17.25000     27.5 34.75000
#Median      34.0     46.0     42.5     35.5     49.5     23.5     51.5     51.5
#Mean        42.5 42.75000 41.75000 35.75000 44.87500 26.87500 44.75000 50.12500
#sd      25.05993 27.77846 19.57221 28.40397 16.39196 16.60841 21.97239 25.51995
#3rd Qu. 67.75000     58.5     50.0 63.25000 54.25000 30.25000 56.25000     70.5
#IQR         43.0     29.0     24.0     55.5 14.25000     13.0 28.75000 35.75000
#Max.        74.0     77.0     76.0     70.0     65.0     63.0     79.0     80.0
#            Col9    Col10
#Min.         2.0      6.0
#1st Qu.     24.5     12.5
#Median      33.5     48.0
#Mean    34.87500 40.75000
#sd      24.39811 28.21727
#3rd Qu. 45.25000     63.0
#IQR     20.75000     50.5
#Max.        71.0     72.0

I thought this could be faster than the previous methods. But it was the slowest.

set.seed(125)
x1 <- matrix(sample(1:80), nrow=1000)
colnames(x1) <- paste("Col", 1:ncol(x1), sep="")
system.time(fun1(x1))
#   user  system elapsed
#  0.968   0.000   0.956

A.K.
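fun1() calls colQuantiles() twice and colIQRs() once, each of which computes quantiles. A variation that asks for the three quartiles in one call is sketched below; fun2 is a hypothetical name, the block only runs if matrixStats is installed, and whether it changes the timing was not measured here.

```r
set.seed(125)
x <- matrix(sample(1:80), nrow = 8)
colnames(x) <- paste("Col", 1:ncol(x), sep = "")

if (requireNamespace("matrixStats", quietly = TRUE)) {
  # fun2() is a hypothetical variation on fun1() from the thread
  fun2 <- function(x) {
    # One call returns a matrix with one row per column of x
    # and one column per requested probability
    q <- matrixStats::colQuantiles(x, probs = c(0.25, 0.5, 0.75))
    res <- rbind(matrixStats::colMins(x), q[, 1], q[, 2], colMeans(x),
                 matrixStats::colSds(x), q[, 3],
                 q[, 3] - q[, 1],          # IQR as 3rd quartile minus 1st
                 matrixStats::colMaxs(x))
    dimnames(res) <- list(
      c("Min.", "1st Qu.", "Median", "Mean", "sd", "3rd Qu.", "IQR", "Max."),
      colnames(x))
    res
  }
  fun2(x)
}
```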
Re: [R] Summary statistics for matrix columns
frespider wrote:
Hi, is there a way I can calculate summary statistics for the columns of a matrix? Let's say we have this matrix:

x <- matrix(sample(1:8000), nrow=100)
colnames(x) <- paste("Col", 1:ncol(x), sep="")

If I use summary(x), I get the output for each column, but I need the output to be a matrix with row names and all the columns beside them. This is how I want it:

         Col76  Col77
Min.   :  739
1st Qu.: 1846   1630
Median : 3631   3376
Mean   : 3804   3617
Sd     :
3rd Qu.: 5772   5544
IQR    :
Max.   : 7952   7779

Is there an easy way? Thanks

How about ...

x <- matrix(sample(1:8000), nrow=100)
colnames(x) <- paste("Col", 1:ncol(x), sep="")
apply(x, 2, function(x) c(summary(x), sd=sd(x), IQR=IQR(x)))

HTH

Pete
Re: [R] Summary statistics for matrix columns
I also don't like to use the split function because I have around 800 columns.
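For comparison, the split() step can be replaced by apply() over the columns. The sketch below uses the small example matrix from elsewhere in the thread, and the helper name stats is an assumption; it suggests that both routes give the same matrix.

```r
set.seed(125)
x <- matrix(sample(1:80), nrow = 8)
colnames(x) <- paste("Col", 1:ncol(x), sep = "")

# Hypothetical helper holding the statistics used in the thread
stats <- function(v) c(summary(v), sd = sd(v), IQR = IQR(v))

# Route 1: split the matrix into a list of columns, then bind the results
via_split <- do.call(cbind, lapply(split(x, col(x)), stats))
colnames(via_split) <- colnames(x)

# Route 2: let apply() walk over the columns directly
via_apply <- apply(x, 2, stats)

all.equal(via_split, via_apply)
```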
Re: [R] Summary statistics for matrix columns
Hi Peter, but this doesn't give me them in the order I want. Is there a better approach? Thanks
Re: [R] Summary statistics for matrix columns
Some statistics are still missing, like sd and IQR, and I prefer the output to be a matrix. Thanks

Date: Thu, 22 Nov 2012 18:00:20 -0800
Subject: Re: Summary statistics for matrix columns

Hi,

You could try this:

set.seed(125)
x <- matrix(sample(1:80), nrow=8)
colnames(x) <- paste("Col", 1:ncol(x), sep="")
sapply(data.frame(x), function(x) summary(x))
#         Col1  Col2  Col3  Col4  Col5  Col6  Col7  Col8  Col9 Col10
#Min.    10.00  1.00 17.00  3.00 18.00 11.00 13.00 15.00  2.00  6.00
#1st Qu. 24.75 29.50 26.00  7.75 40.00 17.25 27.50 34.75 24.50 12.50
#Median  34.00 46.00 42.50 35.50 49.50 23.50 51.50 51.50 33.50 48.00
#Mean    42.50 42.75 41.75 35.75 44.88 26.88 44.75 50.12 34.88 40.75
#3rd Qu. 67.75 58.50 50.00 63.25 54.25 30.25 56.25 70.50 45.25 63.00
#Max.    74.00 77.00 76.00 70.00 65.00 63.00 79.00 80.00 71.00 72.00

A.K.
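The result of the sapply() call above is already a matrix, so the missing statistics can be bound on as extra rows. The following is a sketch under that assumption, using the same example data; the object names s and s2 are made up for the example.

```r
set.seed(125)
x <- matrix(sample(1:80), nrow = 8)
colnames(x) <- paste("Col", 1:ncol(x), sep = "")

s <- sapply(data.frame(x), function(v) summary(v))
is.matrix(s)
dim(s)

# Extra statistics can be added as rows without leaving matrix form
s2 <- rbind(s, sd = apply(x, 2, sd), IQR = apply(x, 2, IQR))
rownames(s2)
```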
Re: [R] Summary statistics for matrix columns
Hi, but Sd and IQR are not in the order I want. Thanks

Date: Thu, 22 Nov 2012 18:08:57 -0800
Subject: RE: Summary statistics for matrix columns

Hi,

How about this:

res <- do.call(cbind, lapply(split(x, col(x)), function(x) c(summary(x), sd=sd(x), IQR=IQR(x))))
colnames(res) <- colnames(x)
is.matrix(res)
#[1] TRUE
res
#            Col1     Col2     Col3     Col4     Col5     Col6     Col7     Col8
#Min.        10.0      1.0     17.0      3.0     18.0     11.0     13.0     15.0
#1st Qu. 24.75000     29.5     26.0  7.75000     40.0 17.25000     27.5 34.75000
#Median      34.0     46.0     42.5     35.5     49.5     23.5     51.5     51.5
#Mean        42.5 42.75000 41.75000 35.75000 44.88000 26.88000 44.75000 50.12000
#3rd Qu. 67.75000     58.5     50.0 63.25000 54.25000 30.25000 56.25000     70.5
#Max.        74.0     77.0     76.0     70.0     65.0     63.0     79.0     80.0
#sd      25.05993 27.77846 19.57221 28.40397 16.39196 16.60841 21.97239 25.51995
#IQR         43.0     29.0     24.0     55.5 14.25000     13.0 28.75000 35.75000
#            Col9    Col10
#Min.         2.0      6.0
#1st Qu.     24.5     12.5
#Median      33.5     48.0
#Mean    34.88000 40.75000
#3rd Qu. 45.25000     63.0
#Max.        71.0     72.0
#sd      24.39811 28.21727
#IQR     20.75000     50.5

A.K.
Re: [R] Summary statistics for matrix columns
Hi, it is possible. But don't you think it will slow the code if you convert to data.frame? Thanks

Date: Thu, 22 Nov 2012 18:31:35 -0800
Subject: RE: Summary statistics for matrix columns

Hi,

Is it possible to use as.matrix()?

res <- sapply(data.frame(x), function(x) c(summary(x), sd=sd(x), IQR=IQR(x)))
res1 <- as.matrix(res)
is.matrix(res1)
#[1] TRUE
res1[c(1:4,7,5,8,6),]
#            Col1     Col2     Col3     Col4     Col5     Col6     Col7     Col8
#Min.        10.0      1.0     17.0      3.0     18.0     11.0     13.0     15.0
#1st Qu. 24.75000     29.5     26.0  7.75000     40.0 17.25000     27.5 34.75000
#Median      34.0     46.0     42.5     35.5     49.5     23.5     51.5     51.5
#Mean        42.5 42.75000 41.75000 35.75000 44.88000 26.88000 44.75000 50.12000
#sd      25.05993 27.77846 19.57221 28.40397 16.39196 16.60841 21.97239 25.51995
#3rd Qu. 67.75000     58.5     50.0 63.25000 54.25000 30.25000 56.25000     70.5
#IQR         43.0     29.0     24.0     55.5 14.25000     13.0 28.75000 35.75000
#Max.        74.0     77.0     76.0     70.0     65.0     63.0     79.0     80.0
#            Col9    Col10
#Min.         2.0      6.0
#1st Qu.     24.5     12.5
#Median      33.5     48.0
#Mean    34.88000 40.75000
#sd      24.39811 28.21727
#3rd Qu. 45.25000     63.0
#IQR     20.75000     50.5
#Max.        71.0     72.0

Solves the order and the matrix output!

A.K.
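The numeric index c(1:4,7,5,8,6) depends on the position of each statistic in the result. Indexing the rows by name is an alternative that states the wanted order directly; the sketch below uses the same example data, and the vector name wanted is an assumption.

```r
set.seed(125)
x <- matrix(sample(1:80), nrow = 8)
colnames(x) <- paste("Col", 1:ncol(x), sep = "")

res <- apply(x, 2, function(v) c(summary(v), sd = sd(v), IQR = IQR(v)))

# Row names in the order requested in the thread
wanted <- c("Min.", "1st Qu.", "Median", "Mean", "sd", "3rd Qu.", "IQR", "Max.")
ordered <- res[wanted, ]
rownames(ordered)
```

A misspelled name fails with a "subscript out of bounds" error instead of silently picking the wrong row.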