[R] which data structure to choose to keep multiple objects?
I have a function called nnmf which takes in one matrix and returns two matrices. For example:

X
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12

z = nnmf(X,2)

z$W
          [,1]      [,2]
[1,] 0.8645422 0.6643681
[2,] 1.7411863 0.5377504
[3,] 2.6179287 0.4111063

z$H
           [,1]     [,2]     [,3]     [,4]
[1,] 1.14299486 1.692260 2.241279  2.79030
[2,] 0.01838514 3.818559 7.619719 11.42087

Now I would like to run it many times:

z2 = nnmf(X,2)
z3 = nnmf(X,3)
z4 = nnmf(X,4)
z5 = nnmf(X,5)
...

But I would like to do it automatically, something like:

xprocess <- function(max_val) {
  for (iter in 2:max_val) {
    zz = sprintf("z%s", iter)
    zz <- nnmf(X, iter)   # this just overwrites zz rather than creating z2, z3, ...
  }
}
xprocess(10)

But how could I keep a collection of my results from each run? Should I have a data structure to keep appending results to, something like theta = {}? Which data structure should I choose to keep multiple objects?

Thanks!
Tim

--
View this message in context: http://old.nabble.com/which-data-structure-to-choose-to-keep-multile-objects--tp26231601p26231601.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] another question: how to delete one of two columns with high correlation (0.95)
Thank you. I need to learn it; after that, maybe I can understand it well. Thanks, Nikhil.

Nikhil Kaza-2 wrote:

You need dim(cor.matrix)[1].

The following might be better than a loop. To get the row ids of a matrix:

(which(cor.matrix >= 0.95) %/% dim(cor.matrix)[1]) + 1

For column ids use the modulus instead of integer division:

which(cor.matrix >= 0.95) %% dim(cor.matrix)[1]

There are probably better ways than this.

Nikhil

On 6 Nov 2009, at 3:16AM, bbslover wrote:

for (i in 1:(cor.matrix[1]-1)) {
  for (j in (i+1):(cor.matrix[2])) {
    if (cor.matrix[i,j] >= 0.95) {
      data.f <- data.f[,-i]
      i <- i + 1
    }
  }
}
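A minimal sketch of the whole task, using hypothetical data (findCorrelation() in the caret package does something similar, but base R suffices). which(..., arr.ind = TRUE) returns row/column indices directly, avoiding the %/% and %% arithmetic:

```r
set.seed(1)
# hypothetical data: column "c" is nearly a copy of column "a"
data.f <- data.frame(a = rnorm(50), b = rnorm(50))
data.f$c <- data.f$a + rnorm(50, sd = 0.01)

cor.matrix <- abs(cor(data.f))
# zero the lower triangle and diagonal so each pair is seen once
cor.matrix[lower.tri(cor.matrix, diag = TRUE)] <- 0

# for every pair with |r| >= 0.95, drop the later column of the pair
drop <- unique(which(cor.matrix >= 0.95, arr.ind = TRUE)[, "col"])
data.reduced <- if (length(drop)) data.f[, -drop, drop = FALSE] else data.f
names(data.reduced)
```

Note that the nested-loop version quoted above also modifies data.f while iterating over indices computed from the original matrix, which is why an index-first approach like this is safer.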
[R] a bug with Student t-value??
Hello, everyone,

Using the qt() function in R, I got a Student t-value different from the one in the Student t table. E.g., qt(0.95, 9) returns 1.833, while in the table it is 2.262. I do not know why the difference exists. I guess it may be a bug in R.

--
Huaru Wang
PhD Candidate, CLESS, College of Life Science, Beijing Normal University
Tel: 010 - 5880 5812  Cell phone: 189 0137 6067
No. 19 Xinjiekouwai Street, Beijing 100875

[[alternative HTML version deleted]]
Re: [R] a bug with Student t-value??
Hi Huaru,

Are you sure you are looking at 0.95 and NOT 0.975? qt(0.975, 9) = 2.262; generally tables are given for alpha/2 (alpha = 0.05 in your case) significance. Please check carefully.

Thx,
S.

On Sat, Nov 7, 2009 at 4:02 AM, huaru wang huaru.w...@gmail.com wrote:
Using the qt() function in R, I got a Student t-value different from the one in the Student t table. E.g., qt(0.95, 9) returns 1.833, while in the table it is 2.262. I guess it may be a bug in R.
Re: [R] a bug with Student t-value??
Hi Sunny,

Sorry for my mistake. Thank you for your correction.

Thx,
Huaru

2009/11/7 Sunny Srivastava research.b...@gmail.com:
Are you sure you are looking at 0.95 and NOT 0.975? qt(0.975, 9) = 2.262; generally tables are given for alpha/2 (alpha = 0.05 in your case) significance.

--
Huaru Wang
PhD Candidate, CLESS, College of Life Science, Beijing Normal University
Tel: 010 - 5880 5812  Cell phone: 189 0137 6067
No. 19 Xinjiekouwai Street, Beijing 100875
Re: [R] a bug with Student t-value??
On 07-Nov-09 09:02:38, huaru wang wrote:
Using the qt() function in R, I got a Student t-value different from the one in the Student t table. E.g., qt(0.95, 9) returns 1.833, while in the table it is 2.262. I guess it may be a bug in R.

Not at all. The reason is that your table gives the t-value for a two-sided test (with equal tails), i.e. 0.025 in each tail. The function qt() in R returns the t-value for a one-sided test, i.e. 0.05 in the (by default) upper tail. Compare:

qt(0.95,9)   # [1] 1.833113
qt(0.975,9)  # [1] 2.262157

The second value agrees with your table. Many printed tables give the two-sided values, rather than the one-sided.

Hoping this helps,
Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 07-Nov-09  Time: 09:55:50
-- XFMail --
Re: [R] a bug with Student t-value??
Sunny Srivastava wrote:
Are you sure you are looking at 0.95 and NOT 0.975? qt(0.975, 9) = 2.262; generally tables are given for alpha/2 (alpha = 0.05 in your case) significance. Please check carefully.

Or, put differently, tables are often given for two-sided p-values (especially in the back pages of textbooks), whereas qt() is for one-sided. I.e. the t distribution with 9 DF has a probability of 0.95 of falling BELOW 1.833, but also of falling BETWEEN -2.262 and +2.262.

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Øster Farimagsgade 5, Entr. B, PO Box 2099, 1014 Cph. K, Denmark
Ph: (+45) 35327918  (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907
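The one-sided/two-sided relationship described in this thread can be checked directly in R; a minimal sketch:

```r
# one-sided 5% critical value: P(T <= q1) = 0.95
q1 <- qt(0.95, df = 9)    # 1.833113

# two-sided 5% critical value, as printed in most textbook tables:
# 0.025 in each tail, so use the 0.975 quantile
q2 <- qt(0.975, df = 9)   # 2.262157

# the central interval (-q2, q2) indeed carries 95% of the mass
pt(q2, df = 9) - pt(-q2, df = 9)   # 0.95
```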
Re: [R] map of a country and its different geographical levels
Hi Greg,

I downloaded the file france.shapefiles.zip and unzipped it. There were 4 files of interest to me:

- france_administrative.dbf
- france_administrative.prj
- france_administrative.shp
- france_administrative.shx

How can I read the map france_administrative with R? I tried this script that I found on a French web site:

library(maptools)
library(rgdal)
library(foreign)
proj.string <- "+init=epsg:27572 +proj=lcc +lat_1=45.90287723937 +lat_2=47.69712276063 +lat_0=46.8 +lon_0=2.337229104484 +x_0=60 +y_0=220 +units=m +pm=greenwich"
proj.string.geo <- "+proj=longlat +datum=WGS84"
france <- readShapePoly("france_administrative.shp", proj4string=CRS(proj.string))
france <- spTransform(france, CRS(proj.string.geo))
save(france, file="france.rda")

But there was an error message:

Error in .asSpatialPolygonsShapes(Map$Shapes, IDs, proj4string = proj4string, :
  Not polygon shapes

Best regards
Re: [R] Density estimate with bounds
Hi Greg,

Thank you very much for your help. The logspline package seems to solve the problem.

Best regards,
Justine

-------- Original Message --------
Date: Thu, 5 Nov 2009 11:11:31 -0700
From: Greg Snow greg.s...@imail.org
To: Justine Rochon roc...@gmx.de, r-help@r-project.org
Subject: RE: [R] Density estimate with bounds

You may be interested in the logspline package. It uses a different method from kernel density estimation, but it estimates densities from data and allows the specification of lower and/or upper bounds.

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center, Intermountain Healthcare
greg.s...@imail.org  801.408.8111

-----Original Message-----
From: Justine Rochon
Sent: Thursday, November 05, 2009 2:38 AM
To: r-help@r-project.org
Subject: [R] Density estimate with bounds

Dear R users,

I would like to show the estimated density of a (0, 1) uniformly distributed random variable. The density curve, however, goes beyond 0 and 1 because of the kernel smoothing. Example:

x = runif(1)
plot(density(x))

Is there a way to estimate the density curve strictly within (0, 1) and still use some sort of smoothing? Any help would be greatly appreciated.

Best regards,
Justine Rochon
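If logspline is not at hand, a boundary-reflection trick with base R's density() gives a similar bounded estimate; a minimal sketch (this is a standard workaround, not the method Greg proposed):

```r
set.seed(42)
x <- runif(1000)  # data known to live in (0, 1)

# reflect the sample at both bounds so the kernel mass that would
# spill past 0 or 1 is folded back inside the interval
xr <- c(x, -x, 2 - x)
d  <- density(xr)

# keep only the (0, 1) part and rescale: the reflected sample holds
# three copies of every point, so the height must be tripled
keep <- d$x >= 0 & d$x <= 1
gx   <- d$x[keep]
fx   <- 3 * d$y[keep]

# fx now integrates to roughly 1 over (0, 1); trapezoid check:
sum(diff(gx) * (head(fx, -1) + tail(fx, -1)) / 2)

# plot(gx, fx, type = "l") to inspect the bounded estimate
```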
[R] Rpad and R 2.10.0
I have problems with Rpad and R 2.10.0 (Windows XP and Windows 7; the browser is Firefox). Just starting Rpad with

library(Rpad)
Rpad()

opens the browser and displays the .html files and the .Rpad files in my home directory, but these files do not have links and are not clickable. Doing the same in R 2.9.2 gives clickable links in the browser. Furthermore, in both cases an empty graphics window opens. Has anybody else had similar experiences? Can anybody offer advice?

--
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464  Fax: +43-1-4277-39459
Re: [R] Error: cannot allocate vector of size 3.4 Gb
On Fri, Nov 6, 2009 at 8:19 PM, Benilton Carvalho bcarv...@jhsph.edu wrote:
this is converging to bioc. let me know what your sessionInfo() is and what type of CEL files you're trying to read; additionally, provide exactly how you reproduce the problem.

Here is my sessionInfo(). pname is 'moex10stv1cdf'.

for (f in list.celfiles('.', full.names=T, recursive=T)) {
+   print(f)
+   pname = cleancdfname(whatcdf(f))
+   print(pname)
+ }

sessionInfo()
R version 2.9.2 (2009-08-24)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] pd.moex.1.0.st.v1_2.4.1 RSQLite_0.7-2 DBI_0.2-4
[4] oligo_1.8.3 preprocessCore_1.6.0 oligoClasses_1.6.0
[7] Biobase_2.4.1

loaded via a namespace (and not attached):
[1] affxparser_1.16.0 affyio_1.12.0 Biostrings_2.12.9 IRanges_1.2.3
[5] splines_2.9.2

It appears to me (I'm not sure) that you start a fresh session of R and then try to read in the data. How much resource do you have available when you try reading in the data? Having 8GB RAM does not mean that you have 8GB when you tried the task.

b

On Nov 7, 2009, at 12:08 AM, Peng Yu wrote:
On Fri, Nov 6, 2009 at 5:00 PM, Marc Schwartz marc_schwa...@me.com wrote:
On Nov 6, 2009, at 4:19 PM, Peng Yu wrote:
On Fri, Nov 6, 2009 at 3:39 PM, Charlie Sharpsteen ch...@sharpsteen.net wrote:
On Fri, Nov 6, 2009 at 1:30 PM, Peng Yu pengyu...@gmail.com wrote:
I run R on a linux machine that has 8GB memory. But R gives me the error "Error: cannot allocate vector of size 3.4 Gb". I'm wondering why it cannot allocate 3.4 Gb on an 8GB memory machine. How can I fix the problem?

Is it 32-bit R or 64-bit R? Are you running any other programs besides R? How far into your data processing does the error occur? The more statements you execute, the more fragmented R's available memory pool becomes. A 3.4 Gb chunk may no longer be available.

I'm pretty sure it is 64-bit R, but I need to double check. What command should I use to check? It seems that it didn't do anything but read a lot of files before it showed the above error.

Check the output of:

.Machine$sizeof.pointer

If it is 4, R was built as 32 bit; if it is 8, R was built as 64 bit. See ?.Machine for more information.

It is 8. The code that gives the error is listed below. There are 70 CEL files. I'm wondering how to investigate what causes the problem and fix it.

library(oligo)
cel_files = list.celfiles('.', full.names=T, recursive=T)
data = read.celfiles(cel_files)

You can also check:

R.version$arch

and

.Platform$r_arch

which for 64 bit should show x86_64.

HTH,
Marc Schwartz
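Marc's checks can be combined into one quick diagnostic (a sketch; .Machine, R.version, and .Platform are all base R):

```r
# 8-byte pointers mean a 64-bit build of R
is_64bit <- .Machine$sizeof.pointer == 8

# architecture string; typically "x86_64" on a 64-bit Intel/AMD build
arch <- R.version$arch

list(is_64bit = is_64bit, arch = arch, r_arch = .Platform$r_arch)
```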
Re: [R] map of a country and its different geographical levels
Perhaps asking on R-sig-geo might help (as would reading the function help files; scripts found lying around the web may be stale). If readShapePoly() (deprecated; use readShapeSpatial() instead) says that the data are not polygons, then they are not. If you want to fill administrative boundary polygons, you need polygons, not lines. The source you are using is based on OpenStreetMap, so it is more likely to be lines, and, as the website says, not authorised. You need to locate an appropriate source of boundary data for your geographical features of interest first. There are very few national mapping agencies that make these data available free (the US led on this; some others are understanding too, slowly).

Having done that, use readOGR() in rgdal to read the polygon shapefile into a SpatialPolygonsDataFrame (maybe use ogrInfo() in rgdal to explore the shapefile). Choose readOGR() rather than readShapeSpatial() if the shapefile has a *.prj file specifying its coordinate reference system. If you need NUTS3 entities, try the france map data in the maps package.

In addition, note that your shapefile is very detailed and suitable for poster-size output with lots of boundary detail. You may prefer less boundary detail (another search criterion in looking for a source). Eurostat/GISCO provides some boundaries at various spatial scales, giving faster but coarser plotting when the 1:10M or 1:20M variants are chosen.

Hope this helps,
Roger

CE.KA wrote:
Hi Greg, I downloaded the file france.shapefiles.zip and unzipped it. [...] But there was an error message: "Not polygon shapes".
Re: [R] map of a country and its different geographical levels
Roger Bivand wrote:
You need to locate an appropriate source of boundary data first for your geographical features of interest. There are very few national mapping agencies that make these data available free (the US led on this, some others are understanding too, slowly).

http://gadm.org/country seems to provide this data for very many countries - and in sp objects too!

Hadley

--
http://had.co.nz/
Re: [R] Error: cannot allocate vector of size 3.4 Gb
You haven't answered how much resource you have available when you try reading in the data. With the mouse exon chip, the math is the same I mentioned before: having 8 GB, you should be able to read in 70 samples of this chip. If you can't, that's because you don't have enough resources when trying to read.

best,
b

On Nov 7, 2009, at 10:12 AM, Peng Yu wrote:
Here is my sessionInfo(). pname is 'moex10stv1cdf'. [...]
Re: [R] which data structure to choose to keep multiple objects?
You could try something like this:

xprocess <- function(max_val) {
  result <- list()
  for (iter in 2:max_val) {
    # use 'character' for indexing the list
    result[[as.character(iter)]] <- nnmf(X, iter)
  }
  result
}
xprocess(10)

On Fri, Nov 6, 2009 at 6:58 PM, clue_less suhai_tim_...@yahoo.com wrote:
But how could I keep a collection of my results from each run? Shall I have a data structure to keep appending results? Something like theta = {}?

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
Re: [R] which data structure to choose to keep multiple objects?
On Fri, Nov 6, 2009 at 11:58 PM, clue_less suhai_tim_...@yahoo.com wrote:
But how could I keep a collection of my results from each run? Shall I have a data structure to keep appending results? Something like theta = {}?

You're already using one! It's called a list:

zz <- list()
for (i in 1:10) {
  zz[[i]] <- nnmf(X, i)
}

then you can do:

zz[[1]]$W

and

zz[[1]]$H

Note the BIG difference between zz[1] and zz[[1]] though.

Barry
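A compact variant of the list approach, using lapply() with names. nnmf() is the poster's own function, so a stub stands in for it here to keep the sketch self-contained; assume it returns list(W=..., H=...) as in the original post:

```r
# stand-in for the poster's nnmf(): returns W (n x k) and H (k x m)
nnmf <- function(X, k) {
  list(W = matrix(0, nrow(X), k), H = matrix(0, k, ncol(X)))
}

X <- matrix(1:12, nrow = 3)

ks <- 2:10
zz <- lapply(ks, function(k) nnmf(X, k))
names(zz) <- paste("z", ks, sep = "")

# access the results as zz$z2$W, zz$z5$H, or zz[["z10"]]
```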
Re: [R] map of a country and its different geographical levels
On Sat, 7 Nov 2009, hadley wickham wrote:
http://gadm.org/country seems to provide this data for very many countries - and in sp objects too!

Right, but ... the first one I tried (Irish Republic) had very detailed coastlines, very coarse county boundaries, and (worse) county boundaries that did not match paper maps, atlases, or Google Earth: counties which do not share boundaries in other sources do in GADM, and vice versa. There is an authority/metadata problem: the maps look plausible and things may be more or less where they should be, but that's all you get, I'm afraid. I've tried to report the discrepancy, without success so far. I really hope that the case I hit is an exception; displaying the boundaries on Google Earth should help clear things up for users.

Roger

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: roger.biv...@nhh.no
Re: [R] The equivalence of t.test and the hypothesis testing of one way ANOVA
Hi,

Student's t-test can be used to test ONE SINGLE linear restriction on a linear model AT THE SAME TIME. Fisher's F test is an extension of the t test: it can be used to test ONE OR MORE linear restriction(s) on a linear model AT THE SAME TIME. So, to test a single restriction on a linear model one can use either the F test or the t test; when multiple restrictions are tested at the same time one needs to apply the F test.

Both the F and the t test actually require equal variances. However, using a transformation matrix one can transform a model assuming unequal variances into an equivalent model assuming equal variances, and the F test or t test can then be applied to the transformed model. The untransformed models are usually called general linear models; in R they can be handled using the glm() function. (See ?glm.)

A (one-way) ANOVA model is a specific type of general linear model, so hypotheses on an ANOVA model are tested in exactly the same way as any other restrictions on such a model.

Best wishes,
Guido

--
Message: 11
Date: Fri, 6 Nov 2009 09:48:18 -0500
From: jlu...@ria.buffalo.edu
Subject: Re: [R] The equivalence of t.test and the hypothesis testing of one way ANOVA

There are extensions to aov that do not assume equal variances:

Reed, James F. & Stark, D. B. (1988), 'Robust alternatives to traditional analyses of variance: Welch W*, James J_I*, James J_II*, and Brown-Forsythe BF*', Computer Methods and Programs in Biomedicine 26, 233-238.

I don't know whether they are implemented in R.
Peng Yu pengyu...@gmail.com wrote:

Is it possible to use aov() to compute the same p-value that is generated by t.test() with var.equal=FALSE? An assumption of ANOVA is equal variance; I'm wondering how to relax that assumption to allow non-equal variance.

On Thu, Nov 5, 2009 at 8:31 AM, Benilton Carvalho bcarv...@jhsph.edu wrote:
compare t.test(x, y, var.equal=T) with summary(afit)
b

On Nov 5, 2009, at 12:21 PM, Peng Yu wrote:
I read somewhere that t.test is equivalent to a hypothesis test for one-way ANOVA. But I'm wondering how they are equivalent. In the following code, the p-value from t.test() is not the same as the value from the last command. Could somebody let me know where I am wrong?

set.seed(0)
N1 = 10
N2 = 10
x = rnorm(N1)
y = rnorm(N2)
t.test(x, y)

        Welch Two Sample t-test
data:  x and y
t = 1.6491, df = 14.188, p-value = 0.1211
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval: -0.2156863 1.6584968
sample estimates: mean of x 0.3589240, mean of y -0.3624813

A = c(rep('x', N1), rep('y', N2))
Y = c(x, y)
fr = data.frame(Y = Y, A = as.factor(A))
afit = aov(Y ~ A, fr)
X = model.matrix(afit)
B = afit$coefficients
V = solve(t(X) %*% X)
mse = tail(summary(afit)[[1]]$'Mean Sq', 1)
df = tail(summary(afit)[[1]]$'Df', 1)
t_statistic = (B / sqrt(mse * diag(V)))[[2]]
2 * (1 - pt(abs(t_statistic), df))  # the p-value from aov
[1] 0.1090802
Re: [R] Error: cannot allocate vector of size 3.4 Gb
Most of the 8GB was available, when I run the code, because R was the only computation session running. On Sat, Nov 7, 2009 at 7:51 AM, Benilton Carvalho bcarv...@jhsph.edu wrote: you haven't answered how much resource you have available when you try reading in the data. with the mouse exon chip, the math is the same i mentioned before. having 8 GB, you should be able to read in 70 samples of this chip. if you can't, that's because you don't have enough resources when trying to read. best, b On Nov 7, 2009, at 10:12 AM, Peng Yu wrote: On Fri, Nov 6, 2009 at 8:19 PM, Benilton Carvalho bcarv...@jhsph.edu wrote: this is converging to bioc. let me know what your sessionInfo() is and what type of CEL files you're trying to read, additionally provide exactly how you reproduce the problem. Here is my sessionInfo(). pname is 'moex10stv1cdf'. for (f in list.celfiles('.',full.names=T,recursive=T)) { + print(f) + pname=cleancdfname(whatcdf(f)) + print(pname) + } sessionInfo() R version 2.9.2 (2009-08-24) x86_64-unknown-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] pd.moex.1.0.st.v1_2.4.1 RSQLite_0.7-2 DBI_0.2-4 [4] oligo_1.8.3 preprocessCore_1.6.0 oligoClasses_1.6.0 [7] Biobase_2.4.1 loaded via a namespace (and not attached): [1] affxparser_1.16.0 affyio_1.12.0 Biostrings_2.12.9 IRanges_1.2.3 [5] splines_2.9.2 it appears to me, i'm not sure, that you start a fresh session of R and then tries to read in the data - how much resource do you have available when you try reading in the data? having 8GB RAM does not mean that you have 8GB when you tried the task. 
b On Nov 7, 2009, at 12:08 AM, Peng Yu wrote: On Fri, Nov 6, 2009 at 5:00 PM, Marc Schwartz marc_schwa...@me.com wrote: On Nov 6, 2009, at 4:19 PM, Peng Yu wrote: On Fri, Nov 6, 2009 at 3:39 PM, Charlie Sharpsteen ch...@sharpsteen.net wrote: On Fri, Nov 6, 2009 at 1:30 PM, Peng Yu pengyu...@gmail.com wrote: I run R on a linux machine that has 8GB memory. But R gives me an error Error: cannot allocate vector of size 3.4 Gb. I'm wondering why it can not allocate 3.4 Gb on a 8GB memory machine. How to fix the problem? Is it 32-bit R or 64-bit R? Are you running any other programs besides R? How far into your data processing does the error occur? The more statements you execute, the more fragmented R's available memory pool becomes. A 3.4 Gb chunk may no longer be available. I'm pretty sure it is 64-bit R. But I need to double check. What command I should use to check? It seems that it didn't do anything but just read a lot of files before it showed up the above errors. Check the output of: .Machine$sizeof.pointer If it is 4, R was built as 32 bit, if it is 8, R was built as 64 bit. See ?.Machine for more information. It is 8. The code that give the error is listed below. There are 70 celfiles. I'm wondering how to investigate what cause the problem and fix it.

library(oligo)
cel_files = list.celfiles('.', full.names=T, recursive=T)
data = read.celfiles(cel_files)

You can also check: R.version$arch and .Platform$r_arch which for 64 bit should show x86_64. HTH, Marc Schwartz
[R] Scheffé's method?
Evidently my RSeek capabilities have once again failed me... By any chance is Scheffe's method (http://www.itl.nist.gov/div898/handbook/prc/section4/prc472.htm) implemented independently in any R base packages? If not, is it implemented independently in any of the add-on packages? Thanks again for any insights.
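For reference, Scheffé's simultaneous procedure is short enough to compute by hand from an aov fit. The sketch below is an editorial illustration of the NIST formulation (not a particular package's API; the data are simulated): a contrast c'mu gets the critical value sqrt((k-1) * F(alpha; k-1, N-k)). The agricolae package also provides scheffe.test(), if an add-on is preferred.

```r
set.seed(1)
g <- gl(3, 10, labels = c("A", "B", "C"))
y <- rnorm(30, mean = c(5, 6, 8)[g])
fit <- aov(y ~ g)

means <- tapply(y, g, mean)
n     <- tapply(y, g, length)
mse   <- deviance(fit) / df.residual(fit)   # residual mean square
cc    <- c(1, -1, 0)                        # contrast: mean(A) - mean(B)

est  <- sum(cc * means)
se   <- sqrt(mse * sum(cc^2 / n))
k    <- nlevels(g)
N    <- length(y)
crit <- sqrt((k - 1) * qf(0.95, k - 1, N - k))  # Scheffe critical value

c(estimate = est, lower = est - crit * se, upper = est + crit * se)
```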
[R] after PCA, the pc values are so large, wrong?
rm(list=ls())
yx.df <- read.csv("c:/MK-2-72.csv", sep=',', header=TRUE, dec='.')
dim(yx.df)
# get X matrix
y <- yx.df[,1]
x <- yx.df[,2:643]
# convert to matrix
mat <- as.matrix(x)
# get row number
rownum <- nrow(mat)
# remove the constant parameters
mat1 <- mat[, apply(mat, 2, function(.col) !(all(.col[1] == .col[2:rownum])))]
dim(yx.df)
dim(mat1)
# remove columns whose proportion of zeros is > 0.95
mat2 <- mat1[, apply(mat1, 2, function(.col) !(sum(.col == 0)/rownum > 0.95))]
dim(yx.df)
dim(mat2)
# remove columns with sd < 0.5
mat3 <- mat2[, apply(mat2, 2, function(.col) !all(sd(.col) < 0.5))]
dim(yx.df)
dim(mat3)
# PCA analysis
mat3.pr <- prcomp(mat3, cor=T)
summary(mat3.pr, loading=T)
pre.cmp <- predict(mat3.pr)
cmp <- pre.cmp[,1:3]
cmp
DF <- cbind(Y, cmp)
DF <- as.data.frame(DF)
names(DF) <- c('y','p1','p2','p3')
DF
summary(lm(y ~ p1+p2+p3, data=DF))
mat3.pr <- prcomp(DF, cor=T)
summary(mat3.pr)
pre <- predict(mat3.pr)
pre1 <- pre[,1:3]
pre1
colnames(pre1) <- c("x1","x2","x3")
pre1
pc <- cbind(y, pre1)
pc <- as.data.frame(pc)
lm.pc <- lm(y ~ x1+x2+x3, data=pc)
summary(lm.pc)

Above is my PCA code. After running it, the first three PCs are very large -- why? And the fit's R^2 is bad. Below are my values for the first 3 PCs.
> pre1
              PC1          PC2          PC3
 [1,] -15181.5190  1944.392700 -1074.326182
 [2,] -32152.4533  1007.113729  3201.361408
 [3,] -15836.5362  2117.988273  -555.799383
 [4,]  -1618.5561  1481.020337   255.530132
 [5,]  -5407.5030  1975.779398   -84.646283
  ...          (rows 6-71 omitted)
[72,] -24575.2244  2655.599986 -1965.946921

the fit result below:

Call:
lm(formula = y ~ x1 + x2 + x3, data = pc)

Residuals:
     Min       1Q   Median       3Q      Max
-1.29638 -0.47622  0.01059  0.49268  1.69335

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.613e+00  8.143e-02  68.932  < 2e-16 ***
x1          -3.089e-05  5.150e-06  -5.998 8.58e-08 ***
x2          -4.095e-05  3.448e-05  -1.188    0.239
x3          -8.106e-05  6.412e-05  -1.264    0.210
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.691 on 68 degrees of freedom
Multiple R-squared: 0.3644, Adjusted R-squared:
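An editorial aside on the magnitude question: prcomp() has no cor argument (that belongs to princomp()), so the cor=T above is silently ignored and the PCA runs on raw covariances; with 600+ columns on very different scales, scores in the tens of thousands are expected. A small sketch of the difference, on simulated data rather than the poster's:

```r
set.seed(1)
X <- cbind(big = rnorm(72, sd = 10000), s1 = rnorm(72), s2 = rnorm(72))

p_cov <- prcomp(X)                 # covariance PCA: dominated by the large-scale column
p_cor <- prcomp(X, scale. = TRUE)  # correlation PCA: columns standardized first

round(p_cov$sdev, 2)  # first sd is on the order of 10000
round(p_cor$sdev, 2)  # all sds near 1
```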
[R] cross-entity equation
vector1          vector2          vector3          vector4
h  = 1.2345      h  = 1.2212      h  = 1.2065      h  = 1.3423
x1 = 0.8654      y1 = 0.7525      z1 = 1.3254      x3 = 0.5746
x2 = 0.8642      y2 = 0.2458      z2 = 0.9543      y1 = 0.3584
x3 = 0.5548      y3 = 0.4685      z3 = 0.4685      y4 = 0.7566
x4 = 0.3546      y4 = 0.8346      z4 = 0.6752      z3 = 0.9546
...              ...              ...              ...

Above are 3 independent vectors; vector4 then links the previous 3 vectors into vector5, as shown below. May I know whether there is any package that may cope with my problem?

vector5
h  = ???
x1 = ???
x2 = ???
x3 = ???
x4 = ???
...
y1 = ???
y2 = ???
y3 = ???
y4 = ???
...
z1 = ???
z2 = ???
z3 = ???
z4 = ???
...
Re: [R] How to create multiple command windows of R on Windows?
You can start as many instances of R as you like (except for memory restrictions, perhaps), hence I do not get your point. Uwe Ligges

Ivan wrote: Thanks for that. I seem to be able to only get one R console here though. Actually that is my question: how to get a different R console? On Fri, Nov 6, 2009 at 4:10 PM, guohao.hu...@gmail.com wrote: It's easy. You can execute R commands under different ``command consoles'' in Windows. regards Huang, Guo-Hao -- From: Ivan skylark2...@gmail.com Sent: Saturday, November 07, 2009 8:01 AM To: r-help@r-project.org Subject: [R] How to create multiple command windows of R on Windows? Hi there, I am using R on Windows here. And I got a rather stupid question: how can I create another R console? It would be nice to have multiple R consoles so that I can separate different types of commands I use. Thank you, Ivan
Re: [R] Qtl - package - Question
Can you please ask the package maintainer, who will certainly be happy to know if his function is broken and may know how to fix it. Best, Uwe Ligges

Sunny Srivastava wrote: Dear R-Helpers, I am using the qtl package to analyze QTL data from QTL Cartographer. I have the map file and cro file from QTL Cartographer and was trying to import these two files into R using the qtl package.

data = read.cross("qtlcart", ".", "crofile.txt", "mapfile.txt")

### I have matched the file structure with the one on the website of the qtl package - it matches perfectly. It gives me the following error:

data.qtl <- read.cross("qtlcart", "F:/data",
+   file="crofile.txt", mapfile="mapfile.txt")
Error in names(x) <- value : 'names' attribute [9] must be the same length as the vector [0]

I checked the code for the read.cross function in the package. read.map.qtlcart("mapfile.txt") works fine... The problem was in the following function: read.cro.qtlcart("crofile.txt"). I tried to debug and find the problem:

debug: traits <- t(f[-(1:(2 + nmarkers)), ])
debug: traits = as.data.frame(traits)
debug: if (nrow(traits) == 1) traits <- as.data.frame(t(traits))
** data frame with 0 columns and 102 rows -- this is the problem portion: it should read my 9 trait values for all 102 individuals, i.e. have 9 columns and 102 rows, but it is unable to do so.
debug: colnames(traits) <- trait.names
Browse[1] n
Error in names(x) <- value : 'names' attribute [9] must be the same length as the vector [0]

It cannot read my trait values - I have 102 individuals, each with 9 trait values. Can anyone tell me how to solve this problem? Any help would be great. Thanks and Best Regards, S.
- R version 2.9.2 (2009-08-24) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] qtl_1.14-2 plyr_0.1.9
Re: [R] Rpad and R 2.10.0
Yes I noticed the same problem since R 2.10.0, and I don't know why either. Let's forward the email to the maintainer. Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-6609 Web: http://yihui.name Department of Statistics, Iowa State University 3211 Snedecor Hall, Ames, IA

2009/11/7 Erich Neuwirth erich.neuwi...@univie.ac.at: I have problems with Rpad and R 2.10.0 (Windows XP and Windows 7, browser is Firefox). Just starting Rpad by library(Rpad) Rpad() opens the browser and displays the .html files and the .Rpad files in my home directory, but these files do not have links and are not clickable. Doing the same in R 2.9.2 gives clickable links in the browser. Furthermore, in both cases an empty graphics window opens. Has anybody else similar experiences? Can anybody offer advice? -- Erich Neuwirth, University of Vienna Faculty of Computer Science Computer Supported Didactics Working Group Visit our SunSITE at http://sunsite.univie.ac.at Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
[R] solution design for a large scale ( 50G) R computing problem
Hi, I am tackling a computing problem in R that involves large data. Both time and memory issues need to be considered seriously. Below is the problem description and my tentative approach; I would appreciate it if anyone can share thoughts on how to solve this problem more efficiently.

I have 1001 multidimensional arrays -- A, B1, ..., B1000. A takes about 500MB in memory and each B_i takes 100MB. I need to run an experiment that evaluates a function f(A, B_i) for all B_i. f(A, B_i) doesn't change A or B_i during its evaluation, and these evaluations are independent for all i. I also need to design various evaluation functions, so experiments of this kind need to be performed often. My computing environment is a 64-bit Linux PC with 64GB memory and 8 cores. My goal is to do multiple experiments quickly given the existing equipment.

One possible approach is to run an R process that loads A and use a parallel library like foreach and mc to load each B_i and compute f(A, B_i). The problems with this approach are that each time foreach spawns a new process it has to 1) copy the whole A array and 2) load B_i from disk to memory using I/O. Since f(A, B_i) doesn't change A or B_i, would it be possible in R to 1) share A across different processes and 2) use memory-mapped files to load B_i (or even A at the beginning)? Any suggestions would be appreciated. Jeff
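One hedged sketch of the fork-based approach (my own illustration; the function f and the B_i loading step are placeholders, not the poster's code): on Linux, workers forked by mclapply() -- from the multicore package in 2009, now part of base R's parallel package -- inherit A copy-on-write, so A is not physically duplicated as long as no worker modifies it. Memory-mapped access to the B_i files could then come from packages such as bigmemory or ff.

```r
library(parallel)  # fork-based mclapply; in 2009 this lived in the multicore package

A <- matrix(rnorm(1e6), nrow = 1000)  # built once in the parent process

f <- function(A, B) sum(A %*% B)      # placeholder for the real evaluation function

cores <- if (.Platform$OS.type == "windows") 1L else 2L  # forking needs a Unix-alike
results <- mclapply(1:8, function(i) {
  B <- matrix(rnorm(1e4), nrow = 1000)  # stands in for memory-mapping B_i from disk
  f(A, B)                               # A is shared copy-on-write, not re-copied
}, mc.cores = cores)

length(results)  # 8
```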
Re: [R] Error: cannot allocate vector of size 3.4 Gb
ok, i'll take a look at this and get back to you during the week. b On Nov 7, 2009, at 1:19 PM, Peng Yu wrote: Most of the 8GB was available, when I run the code, because R was the only computation session running. On Sat, Nov 7, 2009 at 7:51 AM, Benilton Carvalho bcarv...@jhsph.edu wrote: you haven't answered how much resource you have available when you try reading in the data. with the mouse exon chip, the math is the same i mentioned before. having 8 GB, you should be able to read in 70 samples of this chip. if you can't, that's because you don't have enough resources when trying to read. best, b On Nov 7, 2009, at 10:12 AM, Peng Yu wrote: On Fri, Nov 6, 2009 at 8:19 PM, Benilton Carvalho bcarv...@jhsph.edu wrote: this is converging to bioc. let me know what your sessionInfo() is and what type of CEL files you're trying to read, additionally provide exactly how you reproduce the problem. Here is my sessionInfo(). pname is 'moex10stv1cdf'.

for (f in list.celfiles('.',full.names=T,recursive=T)) {
+ print(f)
+ pname=cleancdfname(whatcdf(f))
+ print(pname)
+ }

sessionInfo() R version 2.9.2 (2009-08-24) x86_64-unknown-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] pd.moex.1.0.st.v1_2.4.1 RSQLite_0.7-2 DBI_0.2-4 [4] oligo_1.8.3 preprocessCore_1.6.0 oligoClasses_1.6.0 [7] Biobase_2.4.1 loaded via a namespace (and not attached): [1] affxparser_1.16.0 affyio_1.12.0 Biostrings_2.12.9 IRanges_1.2.3 [5] splines_2.9.2

it appears to me, i'm not sure, that you start a fresh session of R and then tries to read in the data - how much resource do you have available when you try reading in the data?
having 8GB RAM does not mean that you have 8GB when you tried the task. b On Nov 7, 2009, at 12:08 AM, Peng Yu wrote: On Fri, Nov 6, 2009 at 5:00 PM, Marc Schwartz marc_schwa...@me.com wrote: On Nov 6, 2009, at 4:19 PM, Peng Yu wrote: On Fri, Nov 6, 2009 at 3:39 PM, Charlie Sharpsteen ch...@sharpsteen.net wrote: On Fri, Nov 6, 2009 at 1:30 PM, Peng Yu pengyu...@gmail.com wrote: I run R on a linux machine that has 8GB memory. But R gives me an error Error: cannot allocate vector of size 3.4 Gb. I'm wondering why it can not allocate 3.4 Gb on a 8GB memory machine. How to fix the problem? Is it 32-bit R or 64-bit R? Are you running any other programs besides R? How far into your data processing does the error occur? The more statements you execute, the more fragmented R's available memory pool becomes. A 3.4 Gb chunk may no longer be available. I'm pretty sure it is 64-bit R. But I need to double check. What command I should use to check? It seems that it didn't do anything but just read a lot of files before it showed up the above errors. Check the output of: .Machine$sizeof.pointer If it is 4, R was built as 32 bit, if it is 8, R was built as 64 bit. See ?.Machine for more information. It is 8. The code that give the error is listed below. There are 70 celfiles. I'm wondering how to investigate what cause the problem and fix it.

library(oligo)
cel_files = list.celfiles('.', full.names=T, recursive=T)
data = read.celfiles(cel_files)

You can also check: R.version$arch and .Platform$r_arch which for 64 bit should show x86_64. HTH, Marc Schwartz
[R] Conditional probability in Copula package?...
Hello All, How would one go about getting Pr(Y ≤ y|X = x) for a Frank copula via the copula or another package? Best Regards, David
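One route (an editorial sketch, not from a reply): with U = F_X(x) and V = F_Y(y), the quantity Pr(Y ≤ y | X = x) is the partial derivative dC(u,v)/du, the so-called h-function, which has a closed form for the Frank copula with parameter theta. Recent versions of the copula package also expose conditional distributions as cCopula(); the standalone formula is:

```r
# Conditional CDF h(v|u) = dC(u,v)/du for the Frank copula (theta != 0)
frank_h <- function(v, u, theta) {
  num <- exp(-theta * u) * (exp(-theta * v) - 1)
  den <- (exp(-theta) - 1) + (exp(-theta * u) - 1) * (exp(-theta * v) - 1)
  num / den
}

frank_h(c(0, 0.5, 1), u = 0.3, theta = 5)  # increases from 0 to 1 in v
```

Plugging in u = F_X(x) and v = F_Y(y) then gives the requested conditional probability.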
[R] Exhaustive search in leaps - Help Please
Dear R users, I'm new in R and couldn't find the solution to this in the postings. I want to be able to use the leaps package to perform an exhaustive regression. Most of my variables are categorical with many levels. I'd like to restrict the candidate subsets to either all levels included or all excluded in the list. Here's the current and desired output from a dummy example:

# Generate random data
y <- rnorm(n=1000, mean=100, sd=10)  # normal distribution
a <- gl(n=4, k=1, length=1000, label=c("F1", "F2", "F3", "F4"))
b <- gl(n=4, k=3, length=1000, label=c("G1", "G2", "G3", "G4"))
mydata <- data.frame(y, a, b)
require(leaps)
subsets <- regsubsets(y ~ . , data=mydata, method="exhaustive")
subsets
subset.models <- summary(subsets)$which
subset.models

  (Intercept)   aF2   aF3   aF4   bG2   bG3   bG4
1        TRUE FALSE FALSE FALSE  TRUE FALSE FALSE
2        TRUE FALSE FALSE FALSE  TRUE FALSE  TRUE
3        TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE
4        TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE
5        TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE
6        TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

My desired output is:

  (Intercept)   aF2   aF3   aF4   bG2   bG3   bG4
3        TRUE FALSE FALSE FALSE  TRUE  TRUE  TRUE
3        TRUE  TRUE  TRUE  TRUE FALSE FALSE FALSE
6        TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Thanks in advance for your help. Axel.
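Since regsubsets() selects individual dummy columns, one workaround (my own sketch, not a leaps feature; variable names follow the dummy example above) is to enumerate subsets of whole factors and score each candidate model directly:

```r
set.seed(1)
y <- rnorm(n = 1000, mean = 100, sd = 10)
a <- gl(n = 4, k = 1, length = 1000, labels = c("F1", "F2", "F3", "F4"))
b <- gl(n = 4, k = 3, length = 1000, labels = c("G1", "G2", "G3", "G4"))
mydata <- data.frame(y, a, b)

vars <- c("a", "b")
# all non-empty subsets of whole factors: {a}, {b}, {a, b}
groups <- unlist(lapply(seq_along(vars),
                        function(k) combn(vars, k, simplify = FALSE)),
                 recursive = FALSE)

scores <- sapply(groups, function(s)
  BIC(lm(reformulate(s, response = "y"), data = mydata)))
names(scores) <- sapply(groups, paste, collapse = " + ")
scores  # one BIC per all-in/all-out factor combination
```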
[R] [R-pkgs] new dse package
With the release of R-2.10.0 I have divided the dse bundle into packages. The bundled package dse1 and part of dse2 are now in the new package dse. The remaining part of dse2 is in a new package EvalEst. The package dse does multivariate ARMA and state space time series modelling and forecasting, while package EvalEst is now focused on the evaluation of estimation methods. To aid transition, there are packages dse1 and dse2 on CRAN. These do nothing except require the new packages dse and EvalEst respectively, but should help prevent breaking of existing code that requires() dse1 or dse2. I have also taken this opportunity to do a long overdue cleanup of the Users' Guides distributed as vignettes in the packages. Paul Gilbert ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
[R] lme4 and incomplete block design
Dear list members, I try to simulate an incomplete block design in which every participant receives 3 out of 4 possible treatments. The outcome is binary. Assigning a binary outcome to the BIB or PBIB dataset of the SASmixed package gives the appropriate output. With the code below, fixed treatment estimates are not given for each of the 4 possible treatments; instead a kind of summary measure(?) for 'treatment' is given.

block <- rep(1:24, each=3)
treatment <- c(1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3, 1, 2, 3, 2, 1, 4, 3, 4, 1, 4, 3, 2, 1, 3, 4, 2, 4, 3, 3, 1, 2, 4, 2, 1, 1, 4, 2, 2, 3, 1, 3, 2, 4, 4, 1, 3)
outcome <- c(0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0)
data <- data.frame(block, treatment, outcome)
lmer(outcome ~ treatment + (1|block), family=binomial, data=data)

Is this a problem with the recovery of interblock estimates? Are there special rules for how incomplete block designs should look to enable this recovery? Any help is appreciated! Regards, Christian
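An editorial aside on the "summary measure" symptom: treatment is stored as numeric above, so the model fits a single linear slope in treatment rather than one effect per level; coding it as a factor restores per-treatment estimates. A minimal illustration with plain glm() (simulated data, block effect ignored for brevity):

```r
set.seed(42)
treatment <- rep(1:4, each = 6)
outcome   <- rbinom(24, 1, 0.3)
d <- data.frame(treatment, outcome)

f_num <- glm(outcome ~ treatment, family = binomial, data = d)          # numeric coding
f_fac <- glm(outcome ~ factor(treatment), family = binomial, data = d)  # factor coding

length(coef(f_num))  # 2: intercept + one slope
length(coef(f_fac))  # 4: intercept + three level contrasts
```

The same change applies inside the lmer() call, e.g. `outcome ~ factor(treatment) + (1|block)`.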
Re: [R] How to see any R package code?
You have a couple of options: 1) Typing the name of the function will list its source within your R console. For example, ls will list the source code for the ls() function. So once you load your package, you can simply enter the name of the function whose source you wish to see. 2) The above will not work if the function has not been exported via the NAMESPACE file. In this case you have to download the .tar.gz file from CRAN to view the source. After you unzip and untar the downloaded file, look in the R/ or src/ directories for the function you are interested in. As the names suggest, R/ contains R code, while src/ contains C or Fortran code used in the package. Cheers, Vik --- Grad Student Department of Statistics University of Florida

Zhijiang Wang wrote: Dear All, I want to see an R package's code. How can I do that? What file in the package? -- Best wishes, Zhijiang Wang PhD Student Room 212, Science Building, The International WIC Institute, College of Computer Science and Technology, Beijing University of Technology, Beijing, China.
Re: [R] How to create multiple command windows of R on Windows?
Well, the problem is that I want those console to be from the same session (i.e., they share same objects etc.). I do not think multiple instances of R could do it ...

2009/11/7 Uwe Ligges lig...@statistik.tu-dortmund.de You can start as many instances of R as you like (except for memory restrictions, perhaps), hence I do not get your point. Uwe Ligges

Ivan wrote: Thanks for that. I seem to be able to only get one R console here though. Actually that is my question: how to get a different R console? On Fri, Nov 6, 2009 at 4:10 PM, guohao.hu...@gmail.com wrote: It's easy. You can execute R commands under different ``command consoles'' in Windows. regards Huang, Guo-Hao -- From: Ivan skylark2...@gmail.com Sent: Saturday, November 07, 2009 8:01 AM To: r-help@r-project.org Subject: [R] How to create multiple command windows of R on Windows? Hi there, I am using R on Windows here. And I got a rather stupid question: how can I create another R console? It would be nice to have multiple R consoles so that I can separate different types of commands I use. Thank you, Ivan
[R] EM algorithm to fit circular mix of uniform+Von Mises
Hi all, I'm curious if anyone has coded an Expectation-Maximization algorithm that could help me model some circular data I have. I'd like to model it as a mixture of a uniform and a von Mises centered on 0, so the only free parameters are the mixing proportion and the kappa of the von Mises. I couldn't find anything in the contributed packages that seemed to suit this purpose. Any pointers would be greatly appreciated! Cheers, Mike -- Mike Lawrence Graduate Student Department of Psychology Dalhousie University Looking to arrange a meeting? Check my public calendar: http://tr.im/mikes_public_calendar ~ Certainty is folly... I think. ~
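Failing an existing implementation, the EM updates for this two-component mixture are simple enough to code directly. Below is a minimal hand-rolled sketch (my own, not a published package): the von Mises mean is fixed at 0, the circular-uniform density is 1/(2*pi), and the M-step updates kappa by inverting A(kappa) = I1(kappa)/I0(kappa) with uniroot; the circular package's A1inv() could replace that inversion.

```r
dvm0 <- function(x, kappa)  # von Mises density with mean 0
  exp(kappa * cos(x)) / (2 * pi * besselI(kappa, 0))

em_unif_vm <- function(x, p = 0.5, kappa = 1, iters = 100) {
  for (i in seq_len(iters)) {
    d_vm   <- p * dvm0(x, kappa)        # E-step: component responsibilities
    d_unif <- (1 - p) / (2 * pi)
    w      <- d_vm / (d_vm + d_unif)
    p      <- mean(w)                   # M-step: mixing proportion
    R      <- max(sum(w * cos(x)) / sum(w), 1e-6)  # weighted resultant length
    kappa  <- uniroot(function(k) besselI(k, 1) / besselI(k, 0) - R,
                      c(1e-6, 100))$root            # invert A(kappa) = R
  }
  list(p = p, kappa = kappa)
}

set.seed(1)
x   <- c(runif(300, -pi, pi), rnorm(300, 0, 0.5))  # uniform half + concentrated half
fit <- em_unif_vm(x)
fit  # p near 0.5, kappa roughly 1/0.5^2 = 4
```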
Re: [R] plot.window arguments being ignored?
Hi Dan, I think it's because lines() is meant for you to add lines to a plot. Hence it does not make R redraw the plot. If you wish to set the limits of the y-axis (or x-axis for that matter), you should do it in the plot function call, and then call lines() without the ylim=ry part. So your final two lines could look like this:

plot(time_for_graph[,1], preference_for_graph[,1], type="b", xlim=rx, ylim=ry, xlab="Time", ylab="Preference")
lines(time_for_graph[,1], preference_for_graph[,2], type="b", col="red")

HTH, Vik --- Grad Student Department of Statistics University of Florida

AR_user wrote: Hi, Why, when I run the script below, is my y-axis range command being ignored? I.e., the y axis graphed consistently fits the line specified in the plot command, but never fits the line I'm trying to add in the subsequent line command. This is my first R script, so if I'm doing anything else wacky that jumps out, please let me know. I'm an advanced Stata user, but R is new to me. The script is designed to simulate a binary-choice decision field theory example with internally-controlled stop time, as per Neural Networks 19 (2006) 1047-1058. Thanks in advance!!
-Dan

#------------------------------------------------------------
# Starting preference state; two choices
preference <- c(0, 0)
# Preference threshold; when one of the preference states exceeds this
# threshold, deliberation stops
theta <- 2
# Contrast matrix
contrast <- array(c(1, -1, -1, 1), dim=c(2,2))
# Value matrix; three possible states of the world
value <- array(c(1, 0, 0, 1, .5, .5), dim=c(2, 3))
# Feedback matrix
feedback <- array(c(1, .5, .5, 1), dim=c(2,2))
# Time
t <- array(0, dim=c(1,1))
# Arrays to keep track of history for graphing
time_for_graph <- array(t, dim=c(1,1))
preference_for_graph <- array(preference, dim=c(1,2))
# Moving through time until preference crosses the threshold theta
while (preference[1] < theta && preference[2] < theta) {
  # stochastic weights for this time period
  weight <- rnorm(3, 0, 1)
  # updating preference
  preference <- feedback %*% preference + contrast %*% value %*% weight
  # updating history
  t <- t + 1
  time_for_graph <- rbind(time_for_graph, t)
  preference_for_graph <- rbind(preference_for_graph,
                                array(preference, dim=c(1,2)))
  # updating graph ranges
  ry <- range(preference_for_graph)
  rx <- range(time_for_graph)
  # updating graph
  plot(time_for_graph[,1], preference_for_graph[,1], type="b",
       plot.window(rx, ry), xlab="Time", ylab="Preference")
  lines(time_for_graph[,1], preference_for_graph[,2], type="b",
        col="red", ylim=ry)
}
#------------------------------------------------------------
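Vik's point can be seen in a minimal self-contained example (toy data, not the simulation above; the null pdf device just lets it run non-interactively). Setting ylim once in plot() is enough, because lines() only draws into the region that plot() already established:

```r
pdf(NULL)  # null graphics device so the example runs non-interactively

x  <- 1:10
y1 <- x          # series drawn by plot()
y2 <- 3 * x      # series added afterwards by lines()
ry <- range(y1, y2)

# Set ylim in the plot() call; lines() merely draws into the
# already-established plot region and cannot change its limits.
plot(x, y1, type = "b", ylim = ry, xlab = "Time", ylab = "Preference")
lines(x, y2, type = "b", col = "red")

usr <- par("usr")  # current plot region: c(x1, x2, y1, y2)
dev.off()
```

With ylim = ry the second series stays inside the plot region instead of being clipped.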
Re: [R] How to create multiple command windows of R on Windows?
On 07/11/2009 6:31 PM, Ivan wrote: Well, the problem is that I want those consoles to be from the same session (i.e., they share the same objects etc.). I do not think multiple instances of R could do it ... No, there's only one console in each instance. That's pretty fundamental: there is one master loop reading, evaluating and printing, and the console shows you what's going on there. You can write your own front end to send things to the console (and somehow decide what to do when it's busy), but don't expect R to ever have more than one console. Duncan Murdoch 2009/11/7 Uwe Ligges lig...@statistik.tu-dortmund.de You can start as many instances of R as you like (except for memory restrictions, perhaps), hence I do not get your point. Uwe Ligges Ivan wrote: Thanks for that. I seem to be able to only get one R console here though. Actually that is my question: how to get a different R console? On Fri, Nov 6, 2009 at 4:10 PM, guohao.hu...@gmail.com wrote: It's easy. You can execute R commands under different ``command consoles'' in Windows. regards Huang, Guo-Hao -- From: Ivan skylark2...@gmail.com Sent: Saturday, November 07, 2009 8:01 AM To: r-help@r-project.org Subject: [R] How to create multiple command windows of R on Windows? Hi there, I am using R on Windows here. And I got a rather stupid question: how can I create another R console? It would be nice to have multiple R consoles so that I can separate different types of commands I use. Thank you, Ivan
[R] trend thresholds on time series
Dear all, I have several response variables estimated from some simulations, and I would like to identify the thresholds for trend changes. For example, below I forced two different response behaviours, with time units on x.

x <- 1:1500
y <- x/exp(x^0.2)
smaller15 <- y[y < 15]
y <- ifelse(y < 15, y + rnorm(length(smaller15)),
            15 + rnorm(1000 - length(smaller15), 0, 0.9))
myDF1 <- data.frame(cbind(x, y))
plot(y ~ x, data=myDF1)

k1 <- 1:10
l1 <- -65*k1*k1 + 750*k1 - 500
k2 <- 12:25
l2 <- 2.6299*k2*k2 - 104.39*k2 + 1000
myDF2 <- data.frame(cbind(k=c(k1, k2), l=c(l1, l2)))
plot(l ~ k, data=myDF2)

As one can see, in the first simulation we have a non-linear ascending y-response, and after ~500 time steps the simulation changes behaviour to almost stable results. On the other hand, in the second example I get a fast increase, a subsequent decrease, and after ~11 time steps a different trend. For the first case I think the segmented package could do the job; for the second case I think it will not work properly. But as my simulations are time series, I was wondering whether there is a ts-like way of identifying trend changes in the outcome results. Cheers miltinho University of Sao Paulo, br
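As a base-R sketch of the idea behind such breakpoint methods (not the segmented package itself; all names here are illustrative), a single breakpoint can be located by profiling the residual sum of squares of a broken-stick regression over candidate break positions:

```r
set.seed(42)
# toy series: slope 0.5 up to x = 60, then flat (plus noise)
x <- 1:100
y <- pmin(0.5 * x, 30) + rnorm(100, sd = 0.5)

# RSS of a two-segment (broken-stick) linear fit with a break at b
rss_at <- function(b) {
  fit <- lm(y ~ x + pmax(x - b, 0))
  sum(resid(fit)^2)
}

candidates <- 10:90
best_break <- candidates[which.min(sapply(candidates, rss_at))]
best_break  # lands close to the true change point at 60
```

The segmented package refines essentially this profile search iteratively and adds inference; for the second, non-smooth example a plain grid search like this may be more robust.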
Re: [R] EM algorithm to fit circular mix of uniform+Von Mises
Hi Mike, I have an EM algorithm code for a binary von Mises mixture with 5 parameters: mixing proportion (p), 2 locations (m1, m2), and 2 dispersion parameters (k1, k2). Of course, your model is nested within this one, where k1=Inf, m1 = arbitrary, m2=0. You should be able to modify my code easily to fit this reduced model. This will also allow you to perform a likelihood ratio test for whether the 5-parameter model is better than the 2-parameter model (approximately chi-squared with 3 d.o.f.). Alternatively, you can directly estimate the 2-parameter model by maximizing the log-likelihood subject to constraints on the 2 parameters. This is quite easy to do, but will not allow a comparison with a more general model. Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu - Original Message - From: Mike Lawrence mike.lawre...@dal.ca Date: Saturday, November 7, 2009 6:38 pm Subject: [R] EM algorithm to fit circular mix of uniform+Von Mises To: r-h...@stat.math.ethz.ch
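A minimal base-R sketch of the reduced 2-parameter EM that Ravi describes, for the mixture p*VonMises(0, kappa) + (1-p)*Uniform(-pi, pi). All function names are illustrative; the von Mises density is written directly with besselI(), and moderate kappa is assumed so exp() does not overflow:

```r
# von Mises density with mean direction 0 (moderate kappa assumed)
dvm0 <- function(x, kappa) exp(kappa * cos(x)) / (2 * pi * besselI(kappa, 0))

# A(k) = I1(k)/I0(k); exponential scaling cancels and avoids overflow
A <- function(k) besselI(k, 1, expon.scaled = TRUE) /
                 besselI(k, 0, expon.scaled = TRUE)

em_vm_unif <- function(x, p = 0.5, kappa = 1, tol = 1e-8, maxit = 1000) {
  for (it in seq_len(maxit)) {
    # E-step: responsibility of the von Mises component for each point
    num <- p * dvm0(x, kappa)
    w   <- num / (num + (1 - p) / (2 * pi))
    # M-step: mixing proportion, then kappa from A(kappa) = Rbar
    p_new <- mean(w)
    Rbar  <- sum(w * cos(x)) / sum(w)  # weighted mean resultant length
    kappa_new <- if (Rbar <= 0) 1e-8 else
      uniroot(function(k) A(k) - Rbar, c(1e-8, 500))$root
    if (abs(p_new - p) < tol && abs(kappa_new - kappa) < tol) break
    p <- p_new; kappa <- kappa_new
  }
  list(p = p, kappa = kappa, iterations = it)
}
```

Because the mean is fixed at 0, the kappa update only needs the (weighted) mean resultant length of cos(x), which makes the M-step a one-dimensional root find.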
[R] Repeated measures on a factorial unbalanced in a blocks with split-plot design
Dear all, I am trying to analyze data from an experiment like this: Factors: Hormone - Levels: SH, CH (S = without; C = with; H = Hormone) Time - Levels: 19/08/09, 04/09/09, 18/09/09, 08/10/09, 20/10/09 (DD/MM/YY) Nutrition - Levels: Completa, Sem (without) Macronutrition - Levels: Ca, K, Mg, P, Sem (without) Time is the measurement day; it reflects the days after germination. Blocks: 4; plants per sub-plot: 16. Each plot was divided into two equal parts. In each part, there were 6 sub-plots with 16 plants (2x8 plants). The first part of the plot was treated with CH and the other with SH. No randomization here. The factors Nutrition and Macronutrition were combined together: Treatment 1 - Completa, Sem Treatment 2 - Completa, Ca Treatment 3 - Completa, Mg Treatment 4 - Completa, P Treatment 5 - Completa, K Treatment 6 - Sem, Sem (control: without Hormone, without Nutrition, and without Macronutrition) These six treatments were randomized on each sub-plot in CH and SH. Randomization was different for each Block. However, treatment 6 is not present in CH; it is only present in the SH range. Here is the experimental design: http://www.divshare.com/download/9241232-392 At each Time, we measured Diametro (centimeters), Altura (centimeters), and N.Folhas (count). We are interested in treatment effects on diameter (Diametro), height (Altura), and leaf number (N.Folhas). Are there effects? Are there time effects? And interactions? Which is the best time, the best nutrition, and the best macronutrition combination? What is the influence of hormone? And interactions? I tried the following approach, but I don't know how to handle the repeated measures here, nor whether this approach is correct for me.
Here is a subset of my data: http://www.divshare.com/download/9241231-428

dados <- read.table("marcelo_laia.txt", sep="\t", dec=",", header=TRUE)
summary(dados)
dados.model <- aov(Diametro ~ Block + Hormone + Error(Block/Hormone) +
                     Treatment + Treatment:Hormone +
                     Hormone/Block/Treatment,
                   data=dados)
summary(dados.model)

Is this model correct? (T6 is present only in the SH range.) How could I include the repeated measures here? Thank you very much! -- Marcelo Luiz de Laia Universidade do Estado de Santa Catarina UDESC - www.cav.udesc.br Lages - SC - Brazil Linux user number 487797
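A hedged sketch of one possible specification (toy balanced data invented here, not the poster's file; the real data are unbalanced because Treatment 6 is absent under CH, which this sketch ignores): treat Hormone as the whole-plot factor, Treatment as the sub-plot factor, and Time as the repeated measure inside each sub-plot.

```r
set.seed(1)
# toy balanced design standing in for the real data
d <- expand.grid(Block     = factor(1:4),
                 Hormone   = factor(c("CH", "SH")),
                 Treatment = factor(1:5),
                 Time      = factor(1:5))
d$Diametro <- rnorm(nrow(d))

# Block/Hormone/Treatment error strata; Time effects are tested
# against the lowest (within-plot) stratum
m <- aov(Diametro ~ Hormone * Treatment * Time +
           Error(Block / Hormone / Treatment), data = d)
summary(m)
```

For the real, unbalanced data a mixed model (e.g. nlme::lme with random = ~1 | Block/Hormone/Treatment) is likely a better fit than aov, since aov's Error() strata assume balance.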
Re: [R] Conditional probability in Copula package?...
On Nov 7, 2009, at 3:50 PM, Tidal Wave wrote: Hello All, How would one go about getting Pr(Y ≤ y|X = x) for a Frank copula via the Copula or other package? Change the limits of integration? -- David Winsemius, MD Heritage Laboratories West Hartford, CT
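Concretely, on the copula scale Pr(Y ≤ y | X = x) is the partial derivative ∂C(u, v)/∂u (the h-function) evaluated at u = F_X(x), v = F_Y(y), and for the Frank copula it has a closed form, so no numerical integration is needed. A base-R sketch (function name is illustrative):

```r
# h(v | u) = dC(u, v)/du for the Frank copula with parameter theta != 0,
# where C(u,v) = -log(1 + (e^{-theta*u}-1)(e^{-theta*v}-1)/(e^{-theta}-1))/theta
h_frank <- function(u, v, theta) {
  eu  <- exp(-theta * u)
  num <- eu * (exp(-theta * v) - 1)
  den <- (exp(-theta) - 1) + (eu - 1) * (exp(-theta * v) - 1)
  num / den
}

h_frank(0.3, 0.5, theta = 2)  # conditional CDF value in (0, 1)
```

Sanity checks: h(·|u) runs from 0 at v = 0 to 1 at v = 1, and as theta → 0 it approaches v (independence).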
Re: [R] which data structure to choose to keep multiple objects?
If you use the replicate or lapply functions then this will be taken care of for you (a list will be created containing your function output). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of clue_less Sent: Friday, November 06, 2009 4:58 PM To: r-help@r-project.org Subject: [R] which data structure to choose to keep multiple objects?

I have a function called nnmf which takes in one matrix and returns two matrices. For example:

> X
     [,1] [,2] [,3] [,4]
[1,]    1    4    7   10
[2,]    2    5    8   11
[3,]    3    6    9   12
> z = nnmf(X,2)
> z$W
          [,1]      [,2]
[1,] 0.8645422 0.6643681
[2,] 1.7411863 0.5377504
[3,] 2.6179287 0.4111063
> z$H
           [,1]     [,2]     [,3]     [,4]
[1,] 1.14299486 1.692260 2.241279  2.79030
[2,] 0.01838514 3.818559 7.619719 11.42087

Now I would like to run it many times:

z2 = nnmf(X,2)
z3 = nnmf(X,3)
z4 = nnmf(X,4)
z5 = nnmf(X,5)
...

But I would like to do it automatically, something like:

xprocess <- function(max_val) {
  for (iter in 2:max_val) {
    zz = sprintf("z%s", iter)
    zz <- nnmf(X, iter)
  }
}
xprocess(10)

But how could I keep a collection of my results from each run? Shall I have a data structure to keep appending results, something like theta = {} ? Which data structure should I choose to keep multiple objects? Thanks! Tim
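For example, collecting all the fits in one named list (nnmf here is a stand-in with the same shape of output, since the real function isn't shown in this thread):

```r
# stand-in for the poster's nnmf(); returns two matrices like the real one
nnmf <- function(X, k) list(W = matrix(runif(nrow(X) * k), nrow(X), k),
                            H = matrix(runif(k * ncol(X)), k, ncol(X)))

X <- matrix(1:12, nrow = 3)

# one list element per rank: results[[1]] is rank 2, results[[2]] rank 3, ...
results <- lapply(2:10, function(k) nnmf(X, k))

# name the elements so results[["z4"]]$W works like the hand-made z4
names(results) <- paste0("z", 2:10)
dim(results[["z4"]]$W)  # 3 x 4
```

A list is the right container here because each run returns a differently-sized pair of matrices; no appending or sprintf-generated variable names are needed.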
Re: [R] using xyplot to plot frequencies
histogram(~ standard | Year, data = all2)

2009/11/7 Lanna Jin lanna...@gmail.com: Hi all, First off, thank you for the overwhelming response last time. I'm still trying to figure out the syntax of R to plot some distributions of some frequencies. I've managed to plot histograms from the data, but I would like to clean it up using xyplot from library(lattice). Unfortunately I cannot find a solution to my problem. Given a dataframe all2, with numerical data in cols Year and standard (standard being standardized data):

> all2
    Year    standard
1   2001 0.034246575
2   2001 0.0
...
141 2008 0.012820513
142 2008 0.230769231

I have plotted separate histograms for each year using

hist(all2[Year==2001,]$standard, breaks=seq(0,.7,.005), ylim=c(0,10),
     main="2001", xlab="cooccured/total sites",
     ylab="frequency of cooccurance")
hist(all2[Year==2002,]$standard, ...
hist(all2[Year==2003,]$standard, ...

etc. I would like to clean it up a bit by plotting the standardized data using xyplot from library(lattice). My questions are: a. Is there a function I could use inside xyplot to graph the frequency/histogram of a given year? I tried something along the lines of

xyplot(freq(all2[Year==2001,]$standard) ~ (all2[Year==2001,]$standard) | all2[Year==2001])

which obviously did not work. b. How would I plot the frequencies of standard for each year in xyplot? I tried converting all2 into an ftable, and from there converting it into a dataframe (freqframe) and then plotting:

xyplot(freqframe$Freq ~ freqframe$standard | freqframe$Year)

but the plot is wrong, probably because the dataframe has unnecessary repeats of values. I also tried converting all2 into an ftable (also tried a matrix), naming the columns and rows, and invoking them in the xyplot. But this has also turned disastrous. Thanks in advance!
-- Lanna Jin lanna...@gmail.com 510-898-8525

-- Felix Andrews / 安福立 Postdoctoral Fellow Integrated Catchment Assessment and Management (iCAM) Centre Fenner School of Environment and Society [Bldg 48a] The Australian National University Canberra ACT 0200 Australia M: +61 410 400 963 T: + 61 2 6125 4670 E: felix.andr...@anu.edu.au CRICOS Provider No. 00120C -- http://www.neurofractal.org/felix/
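Felix's one-liner can be checked on toy data shaped like all2 (names and values invented here; the breaks mirror the original hist() calls, and the null pdf device lets the example run non-interactively):

```r
library(lattice)

# toy stand-in for the all2 data frame
set.seed(1)
all2 <- data.frame(Year     = rep(2001:2004, each = 50),
                   standard = runif(200, 0, 0.7))

# one histogram panel per year; factor() makes Year a conditioning factor
p <- histogram(~ standard | factor(Year), data = all2,
               breaks = seq(0, 0.7, 0.005), type = "count",
               xlab = "cooccured/total sites",
               ylab = "frequency of cooccurance")

pdf(NULL)
print(p)   # lattice objects need an explicit print() inside scripts
dev.off()
```

Unlike the per-year hist() calls, the conditioning formula handles the split by year automatically, so no subsetting or ftable conversion is needed.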
[R] influence.measures(stats): hatvalues(model, ...)
Hello: I am trying to understand the method 'hatvalues(...)', which returns something similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but not quite. A Fortran programmer I am not, but tracing through the code it looks like perhaps some sort of correction based on the notion of 'leave-one-out' variance is being applied. Whatever the difference, in simulations 'hatvalues' appears to perform much better in the context of identifying outliers using Cook's Distance than the diagonals of the plain vanilla hat matrix. (As in http://en.wikipedia.org/wiki/Cook's_distance). I would prefer to understand a little more when using this method. I have downloaded the freely available references cited in the help and am in the process of digesting them. If someone with knowledge could offer a pointer on the most efficient way to get at why 'hatvalues' does what it does, that would be great. Thanks, Jean Yarrington Independent consultant.
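For what it's worth, a quick check suggests that for a plain unweighted lm fit, hatvalues() does agree with the diagonal of X(X'X)^(-1)X' up to numerical error; any discrepancy observed in practice is more likely due to weights, a different model matrix, or the separate leave-one-out quantities used elsewhere in influence.measures():

```r
set.seed(1)
x <- rnorm(30)
y <- 2 + 3 * x + rnorm(30)
fit <- lm(y ~ x)

X <- model.matrix(fit)
H <- X %*% solve(crossprod(X)) %*% t(X)  # plain-vanilla hat matrix

max(abs(diag(H) - hatvalues(fit)))       # numerical noise only
```

Internally hatvalues() goes through the QR decomposition of X (via lm.influence) rather than forming (X'X)^(-1) explicitly, which is numerically more stable but mathematically the same diagonal.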