Re: [R] Equality of multiple vectors
or

    identical(vec1, vec2) && identical(vec2, vec3)

Jan

Petr Savicky <savi...@cs.cas.cz> wrote:
> On Fri, May 04, 2012 at 12:53:12AM -0700, aaurouss wrote:
>> Hello,
>>
>> I'm writing a piece of code where I need to compare multiple vectors of the same length. I've gone through the basic functions like identical() or all(), but they only work for comparing 2 vectors. From 3 vectors on, it doesn't work.
>>
>> Example: assuming
>>
>>     vec1 <- c(1,2,3,4,5)
>>     vec2 <- c(1,2,3,4,5)
>>     vec3 <- c(1,2,3,4,4)
>>
>> identical(vec1, vec2, vec3) returns TRUE, since the first 2 vectors are equal.
>>
>> I need a function that returns FALSE if one of the vectors is different.
>
> Hi.
>
> Try the following.
>
>     length(unique(list(vec1, vec2, vec3))) == 1
>     [1] FALSE
>     length(unique(list(vec1, vec2, vec1))) == 1
>     [1] TRUE
>
> Hope this helps.
>
> Petr Savicky.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
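A minimal sketch that generalises the pairwise identical() suggestion to any number of vectors. The helper name all_identical is an assumption made for illustration; it is not a base R function. It compares every argument against the first one with identical().

```r
# Hypothetical helper (not part of base R): TRUE only if every
# supplied object is identical() to the first one.
all_identical <- function(...) {
  objs <- list(...)
  if (length(objs) < 2) return(TRUE)
  all(vapply(objs[-1], function(x) identical(objs[[1]], x), logical(1)))
}

vec1 <- c(1, 2, 3, 4, 5)
vec2 <- c(1, 2, 3, 4, 5)
vec3 <- c(1, 2, 3, 4, 4)

all_identical(vec1, vec2, vec3)   # FALSE, vec3 differs in its last element
all_identical(vec1, vec2, vec1)   # TRUE
```

The unique(list(...)) one-liner gives the same answer for these inputs; the helper only makes the intent explicit.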
Re: [R] Can't import this 4GB DATASET
OK, not all, but most lines have the same length. Perhaps you could write the lines with a different line size to a separate file to have a closer look at those lines. Modifying the previous code (again not tested):

    con <- file("dataset.txt", "rt")
    out <- file("strangelines.txt", "wt")
    # skip first 5 lines
    lines <- readLines(con, n=5)
    # read the rest in blocks of 100,000 lines
    while (TRUE) {
      lines <- readLines(con, n=1E5)
      if (length(lines) == 0) break
      strangelines <- lines[nchar(lines) != 97]
      writeLines(strangelines, con=out)
    }
    close(con)
    close(out)

Jan

Quoting iliketurtles <isaacm...@gmail.com>:
> Jan, thank you.
>
>     table(line_sizes)
>     line_sizes
>            0        1       97      256
>         1430     2860 46869069     1430
>
> -----
> Isaac
> Research Assistant
> Quantitative Finance Faculty, UTS
> --
> View this message in context: http://r.789695.n4.nabble.com/Can-t-import-this-4GB-DATASET-tp4607862p4608172.html
> Sent from the R help mailing list archive at Nabble.com.
Re: [R] Handling 8GB .txt file in R?
What you could try to do is skip the first 5 lines. After that the file seems to be 'normal'. With read.table.ffdf you could try something like

    # open a connection to the file
    con <- file('yourfile', 'rt')
    # skip first 5 lines
    tmp <- readLines(con, n=5)
    # read the remainder using read.table.ffdf
    ffdf <- read.table.ffdf(file=con)
    # close connection
    close(con)

HTH

Jan

On 03/25/2012 06:20 AM, iliketurtles wrote:
> Thanks for all the suggestions. To the first individual that replied, I can't do any stuff with unix or perl. All I know is R.
>
> @KEN: I'm using Windows 7, 64 bit.
>
> @Steve: Here's the readLines output. As we can see, lines 1-3 are empty and line 5 is empty, and there are also empty elements after line 5!
>
>     [1]
>     [2]
>     [3]
>     [4] PERMNO DATE TICKER PERMCO PRC VOL NUMTRD vwretd ewretd
>     [5]
>     [6]106/01/19867952 . . . -0.000138 0.001926
>     [7]107/01/1986OMFGA 7952-2.56250 1000 . 0.013809 0.011061
>     [8]108/01/1986OMFGA 7952-2.5 12800 . -0.020744 -0.005117
>     [9]109/01/1986OMFGA 7952-2.5 1400 . -0.011219 -0.011588
>     [10]110/01/1986OMFGA 7952-2.5 8500 . 0.83 0.003651
>     [11]113/01/1986OMFGA 7952-2.62500 5450 . 0.002749 0.002433
>
> -----
> Isaac
> Research Assistant
> Quantitative Finance Faculty, UTS
> --
> View this message in context: http://r.789695.n4.nabble.com/Handling-8GB-txt-file-in-R-tp4500971p4502706.html
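The same skip-then-read idea can be tried with base R alone, as a rough sketch. It assumes a whitespace-delimited file and uses a small made-up temporary file with invented column names in place of the real data; read.table continues from wherever the open connection currently stands.

```r
# Build a small stand-in file: five junk/header lines, then the data.
path <- tempfile(fileext = ".txt")
writeLines(c("", "", "", "PERMNO DATE PRC", "",
             "1 06/01/1986 2.5",
             "1 07/01/1986 2.6"), path)

con <- file(path, "rt")           # open the connection explicitly
skipped <- readLines(con, n = 5)  # consume the first 5 lines
dat <- read.table(con, header = FALSE,
                  col.names = c("PERMNO", "DATE", "PRC"),
                  stringsAsFactors = FALSE)
close(con)
unlink(path)

dat
```

For an ordinary file, the skip argument of read.table (here skip = 5) is another way to get the same effect without managing the connection by hand.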
Re: [R] Reading big files in chunks-ff package
Your question is not completely clear. read.csv.ffdf automatically reads in the data in chunks. You don't have to do anything for that. You can specify the size of the chunks using the next.rows option.

Jan

On 03/24/2012 09:29 PM, Mav wrote:
> Hello!
> A question about reading large CSV files.
>
> I need to analyse several files with sizes larger than 3 GB. Those files have more than 10 million rows (and up to 25 million) and 9 columns. Since I don't have much RAM, I think that the ff package can really help me. I am trying to use read.csv.ffdf but I have some questions:
>
> How can I read the files in several chunks, with an automatic way of calculating the number of rows to include in each chunk? (My problem is that the files have different numbers of rows.)
>
> For instance, I have used
>
>     read.csv.ffdf(NULL, "file.csv", sep="|", dec=".", header = T, row.names = NULL, colClasses = c(rep("integer", 3), rep("integer", 10), rep("integer", 6)))
>
> But this way I am reading the whole file. I would prefer to read it in chunks, but I don't know how to read it in chunks.
>
> I have read the ff documentation but I am not good with R!
>
> Thanks in advance!
> --
> View this message in context: http://r.789695.n4.nabble.com/Reading-big-files-in-chunks-ff-package-tp4502070p4502070.html
Re: [R] Reading big files in chunks-ff package
The 'normal' way of doing that with ff is to first convert your csv file completely to an ffdf object (which stores its data on disk, so it shouldn't give any memory problems). You can then use the chunk routine (see ?chunk) to divide your data into the required chunks. Untested, so it may contain errors:

    ffdf <- read.table.ffdf(...)

    chnks <- chunk(from=1, to=nrow(ffdf), by=5E6, method='seq')
    for (chnk in chnks) {
      # read data
      data <- ffdf[chnk, ]
      # do your thing with the data
      # clean up
      rm(data)
      gc()
    }

If you want to process your csv file directly in chunks, you could also have a look at the LaF package, especially the process_blocks routine, which does exactly that. The manual vignette (http://cran.r-project.org/web/packages/LaF/vignettes/LaF-manual.pdf) contains some examples of how to do that.

Jan

Quoting Mav <mastorvar...@gmail.com>:
> Thank you Jan
>
> My problem is the following: for instance, I have 2 files with different numbers of rows (15 million and 8 million rows each).
>
> I would like to read the first one in chunks of 5 million each. However, between the first and second chunk, I would like to analyze those first 5 million rows, write the analysis to a new csv and then proceed to read and analyze the second chunk, and so on until the third chunk. With the second file, I would like to do the same: read the first chunk, analyze it, and continue to read the second and analyze it.
>
> Basically my problem is that I manage to read the files, but with so many rows I cannot do any analyses (even filtering the rows) because of the RAM restrictions.
>
> Sorry if it is still not clear.
>
> Thank you
> --
> View this message in context: http://r.789695.n4.nabble.com/Reading-big-files-in-chunks-ff-package-tp4502070p4503642.html
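The read-a-chunk, analyse, move-on workflow described in this thread can also be sketched in base R, without ff or LaF. This is only an illustration under stated assumptions: the function name process_csv_in_chunks is made up, the file is a plain comma-separated file with a one-line header and no quoted fields, and the per-chunk analysis is passed in as a function.

```r
# Hypothetical helper: apply `fun` to successive blocks of `chunk_rows`
# data rows, holding only one block in memory at a time.
process_csv_in_chunks <- function(path, chunk_rows, fun) {
  con <- file(path, "rt")
  on.exit(close(con))
  header <- readLines(con, n = 1)
  col_names <- strsplit(header, ",", fixed = TRUE)[[1]]
  results <- list()
  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0) break          # end of file reached
    chunk <- read.csv(text = lines, header = FALSE, col.names = col_names)
    results[[length(results) + 1]] <- fun(chunk)
    rm(chunk)                              # drop the block before the next read
  }
  results
}

# Usage on a small temporary file: 10 rows read in blocks of 4.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:10, y = (1:10) * 2), csv_path,
          row.names = FALSE, quote = FALSE)
chunk_sums <- process_csv_in_chunks(csv_path, 4, function(d) sum(d$y))
unlist(chunk_sums)
```

Inside fun one could just as well write each block's result to its own csv file instead of returning it.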
Re: [R] Singleton pattern
Using the singleton pattern in R has never occurred to me so far, as I think it applies to languages that support multiple references to one instance. R doesn't do that, at least not in ways that would be required for applying the singleton pattern as described in the GoF book, anyway. One would have to use closures and / or environments to approximate references, I suppose.

When passed around as parameters, R objects don't get copied unless the called function starts modifying them, so if the primary concern is to prevent unnecessary / costly copying of bulky objects, creating the thing once and then passing it around as necessary, taking care that called functions don't change it, is perhaps good enough.

Best regards, Jan

On Fri, Mar 16, 2012 at 12:15:27PM -0400, Bryan Hanson wrote:
> Since no one else has bit, I'll take a stab. I'm an experienced R person, but I've recently been teaching myself Objective-C and I've been using singletons quite a bit (and mis-using them quite a bit!). Not a computer scientist at all. You've been warned.
>
> I don't think there is a comparable concept in R. You do have a choice of S3 or S4 classes for your object orientation in R. S3 is very loose in that you can add to S3 objects readily and abuse them a lot. There really is no checking of them unless you implement it manually. S4 objects are much tighter and they are less readily modified and are self-checking (I know some will complain about this characterization but it's approximately correct). So perhaps you want an S4 object so it's less likely to get mangled, but I doubt there is a way to prevent users from copying it, which would be more along the lines of a singleton.
>
> You can google the archives for some great discussions of S3 vs S4 if that sounds interesting.
>
> Bryan
> ***
> Bryan Hanson
> Professor of Chemistry & Biochemistry
> DePauw University
>
> On Mar 16, 2012, at 7:47 AM, David Cassany wrote:
>> Hi all,
>>
>> I know it may not make much sense to think about a Singleton Pattern in an R application which doesn't use any OOP facilities; however, I'm curious to know if anybody has faced the same issue. I've been googling, but using "singleton pattern" as a keyword leads to typical OOP languages like Java or C++, among others.
>>
>> So my problem is that I'd like to ensure some very big objects aren't copied again and again into other variables. In the worst case I'll check all the code myself to ensure it, but in that case the application won't force programmers to take it into consideration, which is what I am really looking for.
>>
>> Any advice will be highly appreciated :P
>>
>> Thanks!
>>
>> --
>> David Cassany Viladomat
>> Software Developer
>> Transmural Biotech S.L

--
+- Jan T. Kim ---+
| email: jtt...@gmail.com |
| WWW: http://www.jtkim.dreamhosters.com/ |
*-= hierarchical systems are for files, not for humans =-*
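The closure-plus-environment approximation mentioned in the reply can be sketched roughly as follows. The names make_singleton and get_big_object are assumptions made for illustration. The point is that environments are passed by reference, so every caller sees the same object and no copy of the bulky data is made.

```r
# Hypothetical factory: the expensive object is built once and cached
# inside the closure; every later call returns the same environment.
make_singleton <- function(builder) {
  instance <- NULL
  function() {
    if (is.null(instance)) {
      instance <<- new.env()
      instance$data <- builder()   # built only on the first call
    }
    instance
  }
}

get_big_object <- make_singleton(function() seq_len(1e6))

a <- get_big_object()
b <- get_big_object()
identical(a, b)          # TRUE: both names refer to one environment
a$flag <- "set via a"
b$flag                   # "set via a", because no copy was made
```

Note that this does not stop anyone from copying the contents out with something like x <- a$data; it only guarantees a single shared container.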
[R] JSS support
We received a generous gift to support the Journal of Statistical Software (www.jstatsoft.org) from the DC Area R Users Group.

If you think the Journal is a worthy cause, then support it through the Statistics Computing Support Fund at https://giving.ucla.edu/Standard/NetDonate.aspx?SiteNum=107

===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical Software;
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550; fax (310)-206-5658; email: dele...@stat.ucla.edu
.mac: jdeleeuw ++ aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
-
No matter where you go, there you are. --- Buckaroo Banzai
http://gifi.stat.ucla.edu/sounds/nomatter.au
Re: [R] check for data in a data.frame and return correspondent number
Marianna,

You can use merge for that (or match). Using merge:

    MyData <- data.frame(
      V1 = c("red-j", "red-j", "red-j", "red-j", "red-j", "red-j"),
      V4 = c(10.5032, 9.3749, 10.2167, 10.8200, 9.2831, 8.2838),
      redNew = c("appearance blood-n", "appearance ground-n", "appearance sea-n",
                 "appearance sky-n", "area chicken-n", "area color-n")
    )
    MyVector <- data.frame(
      V1 = c("appearance blood-n", "appearance ground-n", "appearance sea-n",
             "as_adj_as fire-n", "as_adj_as carrot-n", "appearance sky-n",
             "area chicken-n", "area color-n")
    )

    merge(MyVector, MyData[, c("V4", "redNew")], by.x="V1", by.y="redNew", all.x=TRUE)

Btw, I saw some extra spaces in some of your strings (I have removed these in the example above). Be aware that a character string with extra whitespace in it is not equal to "appearance ground-n", so such entries will not match.

HTH

Jan

On 03/14/2012 06:49 PM, mari681 wrote:
> Dear R-ers,
>
> still the newbie. With a question about coordinates of a vector appearing or not in a data.frame.
>
> I have a data.frame (MyData) with 3 columns which looks like this:
>
>     V1     V4       redNew
>     red-j  10.5032  appearance blood-n
>     red-j   9.3749  appearance ground-n
>     red-j  10.2167  appearance sea-n
>     red-j  10.8200  appearance sky-n
>     red-j   9.2831  area chicken-n
>     red-j   8.2838  area color-n
>
> and a MyVector which includes also (but not only) the data in the 3rd column:
>
>     appearance blood-n
>     appearance ground-n
>     appearance sea-n
>     as_adj_as fire-n
>     as_adj_as carrot-n
>     appearance sky-n
>     area chicken-n
>     area color-n
>
> I would like to get a data.frame of 2 columns where the first column contains all of MyVector, and the second column contains either the corresponding number found in MyData (shown in column 2) or a 0 if the entry is not found.
>
> I've tried some options, among which a loop:
>
>     out<-for(x in MyVector) if (x %in% MyData) print (MyData[,2])
>
> but obviously it doesn't work. How can I select the corresponding element in column 2 for each x found in column 3?
>
> Suggestions in general?
>
> Thank you for your consideration!!!
>
> Have a nice day,
> Marianna
> --
> View this message in context: http://r.789695.n4.nabble.com/check-for-data-in-a-data-frame-and-return-correspondent-number-tp4472634p4472634.html
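The match alternative mentioned in the reply could look roughly like the sketch below. It uses a shortened version of the example data and, as the original question asked, fills in 0 where an entry of MyVector has no counterpart in MyData (merge with all.x=TRUE would leave NA there instead).

```r
MyData <- data.frame(
  V4 = c(10.5032, 9.3749, 10.2167),
  redNew = c("appearance blood-n", "appearance ground-n", "appearance sea-n"),
  stringsAsFactors = FALSE
)
MyVector <- c("appearance blood-n", "as_adj_as fire-n", "appearance sea-n")

# match() gives, for each element of MyVector, its row in MyData (NA if absent)
idx <- match(MyVector, MyData$redNew)
result <- data.frame(V1 = MyVector, V4 = MyData$V4[idx],
                     stringsAsFactors = FALSE)
result$V4[is.na(result$V4)] <- 0   # unmatched entries become 0

result
```

Unlike merge, this keeps the rows in the order of MyVector.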
Re: [R] Reading in 9.6GB .DAT File - OK with 64-bit R?
You could also have a look at the LaF package, which is written to handle large text files:

http://cran.r-project.org/web/packages/LaF/index.html

Under the vignettes you'll find a manual.

Note: LaF does not help you to fit 9GB of data in 4GB of memory, but it could help you to read your file block by block and filter it.

Jan

RHelpPlease <rrum...@trghcsolutions.com> wrote:
> Hi Barry,
>
>> You could do a similar thing in R by opening a text connection to your file and reading one line at a time, writing the modified or selected lines to a new file.
>
> Great! I'm aware of this existing, but don't know the commands for R. I have a variable [560,1] to use to pare down the incoming large data set (I'm sure of millions of rows). With other data sets, they've been small enough that I've been able to use the merge function after the data has been read in. Obviously I'm having trouble reading in this large data set in the first place.
>
> Any additional help would be great!
> --
> View this message in context: http://r.789695.n4.nabble.com/Reading-in-9-6GB-DAT-File-OK-with-64-bit-R-tp4457220p4458074.html
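A rough base R sketch of the block-by-block filtering idea discussed here, for readers who do not have LaF at hand. It assumes a delimited text file whose first field is the key to filter on; the function name filter_by_key and the comma separator are assumptions for illustration, not something taken from the thread.

```r
# Hypothetical helper: copy only those lines of `infile` whose first
# field is in `keys` to `outfile`, reading `block_size` lines at a time.
filter_by_key <- function(infile, outfile, keys, block_size = 1e5, sep = ",") {
  con <- file(infile, "rt")
  out <- file(outfile, "wt")
  on.exit({ close(con); close(out) })
  kept <- 0L
  repeat {
    lines <- readLines(con, n = block_size)
    if (length(lines) == 0) break
    # first field = everything before the first separator
    pos <- regexpr(sep, lines, fixed = TRUE)
    key <- ifelse(pos > 0, substr(lines, 1, pos - 1), lines)
    keep <- key %in% keys
    writeLines(lines[keep], out)
    kept <- kept + sum(keep)
  }
  kept
}

# Usage on a tiny temporary file, with a block size of 2 lines.
big <- tempfile(fileext = ".dat")
small <- tempfile(fileext = ".dat")
writeLines(c("a,1", "b,2", "c,3", "a,4"), big)
n_kept <- filter_by_key(big, small, keys = c("a", "c"), block_size = 2)
readLines(small)
```

The reduced file can then be read with read.table or merged as usual, since only the rows matching the key vector remain.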
Re: [R] Novice Alert!: odfWeave help!
Step by step:

1. Create a new document in Open/LibreOffice.
2. Copy/paste the following text into the document (as an example):

       <<helloworld>>=
       cat("Hello, world")
       @

3. Save the file (e.g. hello.odt).
4. Start R (if not already running); it shouldn't matter whether it's plain R or RStudio.
5. Change the working directory to the folder in which your odt document resides:

       setwd("/path/to/your/file")

6. Load odfWeave:

       library(odfWeave)

7. odfWeave your document. All code chunks are taken from your document, executed in R, and the output of the R commands is inserted into the resulting odt document:

       odfWeave("hello.odt", "hello_out.odt")

You can now open hello_out.odt (or whatever you named it) and see the resulting output.

HTH,
Jan

metatarsals <sjcast...@gmail.com> wrote:
> Hello world,
>
> I'm pretty new to computer code: for example, I consider it a small victory that I (all by myself!) managed to ssh into the server at my lab from home and copy a file onto my desktop. Be gentle.
>
> I have primarily used R for running some pretty mid-level statistics (creating distance matrices, manipulating graphs for pretty figures, etc). I'm working through Bolker's Ecological Models and Data in R (which is a great book for ecologists/life sciences types who want to learn how to just barely get by in R, with no previous knowledge of R code presupposed).
>
> My advisor wants me to explore odfWeave to streamline my notes. This is important because I will inevitably be his TA in his R stat course, and I will need to be proficient with the software. So far I have been copy-pasting my code into a word processor (both Open Office and Word) and inserting my plots after saving them.
>
> I do not understand how to use odfWeave. The way it was explained to me initially sounded like it was some kind of Open Office add-on I could install and my chunks of code would be automatically translated. Six hours of research later, I realize this is not the case, and that I need outside help.
>
> I'm on Mac OS X 10.7.3 Lion, I normally use RStudio, but I have R and R64 and I operate at about, oh, let's say the level at which a 2- or 3-year-old does with language and walking.
>
> So, what exactly does odfWeave do? Do I stick my chunks of code (I know I need to use <<>>= to start and @ to end to bracket off the sections of code) in the .odf document, then do the file.in/file.out commands, which then reads the code and pops out a pretty little graph to my specified parameters? Or do I use the file.in/file.out commands to paste code I've created in R into an existing .odf doc?
>
> Any baby steps or example code you could give me would warm my little heart. If the first scenario (write the code into an .odf document, set off as mentioned above, and then tell R to do stuff to it) is the scenario, I'd be happy to send an example.
>
> Thanks! I can offer a cute picture of a cat as payment, if desired!
> --
> View this message in context: http://r.789695.n4.nabble.com/Novice-Alert-odfWeave-help-tp4455481p4455481.html
Re: [R] Week number from a date
The suggestion below gives you week numbers with week 1 being the week containing the first Monday of the year and weeks going from Monday to Sunday. There are other conventions. The ISO convention is that week 1 is the first week containing at least 4 days in the new year (week 1 of 2012 starts on January 2nd 2012; week 1 of 2009 starts on December 29th 2008).

http://www.r-bloggers.com/iso-week/ gives a function for that type of week numbers (not tested by me).

Jan

Patrick Breheny <patrick.breh...@uky.edu> wrote:
> To give a little more detail, you can convert your character strings into POSIX objects, then extract from them virtually anything you would want using strftime. In particular, %W is how you get the week number:
>
>     > dateRange <- c("2008-10-01","2008-12-01")
>     > x <- as.POSIXlt(dateRange)
>     > strftime(x,format="%W")
>     [1] "39" "48"
>
> --Patrick
>
> On 02/22/2012 08:37 AM, Ingmar Visser wrote:
>> ?strptime is a good place to start
>> hth, Ingmar
>>
>> On Wed, Feb 22, 2012 at 2:09 PM, arunkumar <akpbond...@gmail.com> wrote:
>>> Hi
>>>
>>> My data looks like this
>>>
>>>     startDate = "2008-06-01"
>>>     dateRange = c("2008-10-01", "2008-12-01")
>>>
>>> Is there any method to find the week number from the startDate range
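The two conventions contrasted in the reply can be compared directly with format(), as a small sketch. It assumes a platform where R's date formatting supports the ISO 8601 specifiers %V (week) and %G (week-based year), which ?strptime documents for output.

```r
d <- as.Date(c("2008-12-29", "2012-01-01", "2012-01-02"))

weeks <- data.frame(
  date     = d,
  week_W   = format(d, "%W"),  # Monday-based; days before the first Monday are week 00
  iso_week = format(d, "%V"),  # ISO 8601 week number
  iso_year = format(d, "%G"),  # year that the ISO week belongs to
  stringsAsFactors = FALSE
)

weeks
```

Monday 2008-12-29 is already ISO week 01 of 2009, while Sunday 2012-01-01 still belongs to ISO week 52 of 2011, which is why the ISO week should be reported together with %G rather than with the calendar year.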
[R] Triangular Test
Hello,

I would like to perform a triangular test for a clinical trial with R. Can you help me, please?

Jan
Re: [R] Problems reading tab-delim files using read.table and read.delim
I don't know if this completely solves your problem, but here are some arguments to read.table/read.delim you might try:

    row.names=NULL
    fill=TRUE

The details section of ?read.table also suggests specifying the col.names argument, as the number of columns is determined from the first 5 rows, which may not be correct.

HTH

Jan

mails <mails00...@gmail.com> wrote:
> Hello,
>
> I used read.xlsx to read in Excel files, but for large files it turned out to be not very efficient. For that reason I use a programme which writes each sheet in an Excel file into tab-delimited txt files. After that I tried using read.table and read.delim to read in those txt files. Unfortunately, the results are not as expected. To show you what I mean, I created a tiny Excel sheet with some rows and columns and read it in using read.xlsx. I also used my script to write that sheet to a tab-delimited txt file and read that one in with read.table and read.delim. Here is the R output:
>
>     > (test <- read.table("Sheet1.txt", header=TRUE, sep="\t"))
>     Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>       line 1 did not have 5 elements
>     > (test <- read.delim("Sheet1.txt", header=TRUE, sep="\t"))
>     c1 c2 c3 X
>     123 213 NA NA NA
>     234 asd NA NA NA
>     > (test <- read.xlsx(file.path(data), "Sheet1"))
>     c1 c2 c3 NA. NA..1 NA..2
>     1 123 NA 213NA NA
>     2 234 asd NA NA
>
> The last output is what I would expect the file to be read in as. Columns 4 to 6 do not have any header rows. In R1C4 I added some white spaces, as well as in R2C5 and R2C6, which are read in correctly by the read.xlsx function. read.table and read.delim seem not to be able to handle such files. Is there any workaround for that?
>
> Cheers
> --
> View this message in context: http://r.789695.n4.nabble.com/Problems-reading-tab-delim-files-using-read-table-and-read-delim-tp4369195p4369195.html
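One way to try the fill and column-name suggestions on a file whose header is shorter than its data rows is sketched below. The helper name read_ragged_tsv, the padding names X1, X2, ... and the tiny test file are all assumptions for illustration; the real sheet may need different options.

```r
# Hypothetical helper: read a tab-delimited file whose header names only
# the first few columns, padding the names up to `n_cols`.
read_ragged_tsv <- function(path, n_cols) {
  header <- strsplit(readLines(path, n = 1), "\t", fixed = TRUE)[[1]]
  extra <- n_cols - length(header)
  col_names <- c(header, if (extra > 0) paste0("X", seq_len(extra)))
  read.table(path, sep = "\t", skip = 1, header = FALSE,
             col.names = col_names, fill = TRUE,
             strip.white = TRUE, na.strings = c("NA", ""),
             stringsAsFactors = FALSE)
}

# A stand-in for Sheet1.txt: 3 header names, 6 tab-separated fields per row.
tsv <- tempfile(fileext = ".txt")
writeLines(c("c1\tc2\tc3",
             "123\t\t213\t \t\t",
             "234\tasd\t\t\t \t "), tsv)
sheet <- read_ragged_tsv(tsv, n_cols = 6)
sheet
```

Passing col.names explicitly fixes the number of columns up front, so read.table no longer has to guess it from the first lines of the file.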
[R] 2011 Journal of Statistical Software
The Journal of Statistical Software published eight volumes in 2011, five of them as special volumes.

V38: Special Volume: Competing Risks and Multi-State Models
V39: Regular Volume
V40: Regular Volume
V41: Special Volume: Statistical Software for State Space Methods
V42: Special Volume: Political Methodology
V43: Regular Volume
V44: Special Volume: Magnetic Resonance Imaging in R
V45: Special Volume: Multiple Imputation
Re: [R] Not generating line chart
Devarayalu,

This is FAQ 7.22:

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f

Use print(qplot()).

Regards,
Jan

Sri krishna Devarayalu Balanagu <balanagudevaray...@gvkbio.com> wrote:
> Hi All,
>
> Can you please help me: why is this code not generating a line chart?
>
>     library(ggplot2)
>     par(mfrow=c(1,3))
>
>     #qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"), colour= ACTTRT)
>     unique(Orange1$REFID) -> refid
>     for (i in refid)
>     {
>       Orange2 <- Orange1[i == Orange1$REFID, ]
>       pdf('PGA.pdf')
>       qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT)
>       dev.off()
>     }
>
> Regards,
> Devarayalu
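The mechanism behind that FAQ entry can be seen without any graphics package: values are auto-printed only at top level, not inside a for loop. A small base R illustration (the variable names are arbitrary):

```r
# At top level the value of an expression is auto-printed; inside a
# for loop it is not, so nothing is shown unless print() is called.
silent  <- capture.output(for (i in 1:2) i)
printed <- capture.output(for (i in 1:2) print(i))

length(silent)    # 0 lines of output
printed           # "[1] 1" "[1] 2"
```

A lattice or ggplot2 object is only drawn when it is printed, which is why the plot call inside the loop needs the explicit print().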
Re: [R] Not generating line chart
Devarayalu,

Please reply to the list.

And it would have been easier if you had output your data using dput (in your case dput(Orange1)) so that I and other r-help members can just copy the data into R. Not everybody has Excel available (I, for example, haven't). The easier you make it for people to look into your problem, the higher the probability that you will get a useful answer. In your case your data is quite small, so using dput is no problem.

To answer your question: except for the probable error

    refid <- unique(Orange2$REFID)

which should probably be

    refid <- unique(Orange1$REFID)

and the fact that you overwrite your file in the loop, I have no problem generating the graphs. On my system the following code runs and generates two graphs:

    library(ggplot2)

    Orange1 <- structure(list(REFID = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L,
      9L, 9L, 9L, 9L), ARM = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L,
      2L, 2L), SUBARM = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
      0L), ACTTRT = structure(c(3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L,
      1L, 2L, 2L), .Label = c("ABC", "DEF", "LCD", "Vehicle"), class = "factor"),
      TIME1 = c(0L, 2L, 6L, 12L, 0L, 2L, 6L, 12L, 0L, 12L, 0L,
      12L), ENDPOINT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L, 1L), .Label = "PGA", class = "factor"), BASCHGA = c(0L,
      -39L, -47L, -31L, 0L, -34L, -25L, -12L, 0L, -30L, 0L, -40L
      ), STATANAL = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
      1L, 1L, 1L, 1L), .Label = "UNK", class = "factor"), X = structure(c(1L,
      1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L), .Label = c("",
      "Dansinger_2010_20687812"), class = "factor")), .Names = c("REFID",
      "ARM", "SUBARM", "ACTTRT", "TIME1", "ENDPOINT", "BASCHGA", "STATANAL",
      "X"), class = "data.frame", row.names = c(NA, -12L))

    refid <- unique(Orange1$REFID)
    for (i in refid) {
      Orange2 <- Orange1[i == Orange1$REFID, ]
      pdf(paste('PGA', i, '.pdf', sep=''))
      print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT))
      dev.off()
    }

Regards,
Jan

Sri krishna Devarayalu Balanagu <balanagudevaray...@gvkbio.com> wrote:
> Jan,
>
> Thank you for your valuable reply. But, sorry, I am still not getting it by using print() with the following modified code. I am also attaching the raw data file.
>
>     par(mfrow=c(1,3))
>
>     #qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"), colour= ACTTRT)
>     unique(Orange1$REFID) -> refid
>     for (i in refid)
>     {
>       Orange2 <- Orange1[i == Orange1$REFID, ]
>       pdf('PGA.pdf')
>       print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT))
>       dev.off()
>     }
>
> Regards
> Devarayalu
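For readers unfamiliar with dput, a short sketch of the round trip that makes it useful on a mailing list: the printed text can be pasted into another R session and evaluated to rebuild the object. The small data frame here is made up for illustration.

```r
df <- data.frame(REFID = c(7L, 9L), ACTTRT = factor(c("LCD", "ABC")))

# dput() writes an R expression that recreates the object; here the
# text is captured instead of being printed to the console.
txt <- paste(capture.output(dput(df)), collapse = "\n")

# A reader of the email would paste that text into R; eval(parse())
# stands in for the copy/paste step.
restored <- eval(parse(text = txt))

all.equal(df, restored)
```

Because the column types and factor levels are part of the dput output, the recipient gets exactly the structure the sender had, which a pasted printout or spreadsheet attachment does not guarantee.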
Re: [R] Not generating line chart
As I mentioned in my previous reply: do not only email to me personally but also include the mailinglist. This gives other members also the opportunity to answer your question and lets other members, who might have a similar question, also see the answer. As for your first question: put the pdf(...) and dev.off() outside of the loop. I am not an ggplot2 expert, but you could also have a look at the facets option of qplot. As for your second question: have a look at levels(Orange1$ACTTRT) and ?factor Regards, Jan Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com schreef: Jan, Thank you very much for the solution given. Still I am having one more question. I want both the graphs in single pdf and the legend should contain ACTTRT of individual REFID (Only two lines in legend) Can you solve it? Devarayalu -Original Message- From: Jan van der Laan [mailto:rh...@eoos.dds.nl] Sent: Thursday, January 19, 2012 5:09 PM To: Sri krishna Devarayalu Balanagu Cc: r-help@r-project.org Subject: Re: [R] Not generating line chart Devarayalu, Please reply to the list. And it would have easier if you would have outputted your data using dput (in your case dput(Orange1)) so that I and other r-help members can just copy the data into R. Not everybody had Excell available (I for example haven't). The easier you make it for people to look into your problem, the higher the probability that you will get a usefull answer. In your case your data is quite small, so using dput is no problem. To answer your question. Except for the probable error refid - unique(Orange2$REFID) which should probably be refid - unique(Orange1$REFID) and the fact that overwrite your files in the loop, I have no problem generating the graphs. 
On my system the following code runs and generates two graphs:

    library(ggplot2)
    Orange1 <- structure(list(
      REFID = c(7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 9L, 9L, 9L, 9L),
      ARM = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 2L, 2L),
      SUBARM = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L),
      ACTTRT = structure(c(3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 2L, 2L),
        .Label = c("ABC", "DEF", "LCD", "Vehicle"), class = "factor"),
      TIME1 = c(0L, 2L, 6L, 12L, 0L, 2L, 6L, 12L, 0L, 12L, 0L, 12L),
      ENDPOINT = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
        .Label = "PGA", class = "factor"),
      BASCHGA = c(0L, -39L, -47L, -31L, 0L, -34L, -25L, -12L, 0L, -30L, 0L, -40L),
      STATANAL = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
        .Label = "UNK", class = "factor"),
      X = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
        .Label = c("", "Dansinger_2010_20687812"), class = "factor")),
      .Names = c("REFID", "ARM", "SUBARM", "ACTTRT", "TIME1", "ENDPOINT",
        "BASCHGA", "STATANAL", "X"),
      class = "data.frame", row.names = c(NA, -12L))
    refid <- unique(Orange1$REFID)
    for (i in refid) {
      Orange2 <- Orange1[i == Orange1$REFID, ]
      pdf(paste('PGA', i, '.pdf', sep=''))
      print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT))
      dev.off()
    }

Regards,
Jan

Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com wrote:

Jan,
Thank you for your valuable reply. But I am sorry, I still do not get a chart when using print() with the following modified code. I am also attaching the raw data file.
    par(mfrow=c(1,3))
    #qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"), colour= ACTTRT)
    unique(Orange1$REFID) -> refid
    for (i in refid) {
      Orange2 <- Orange1[i == Orange1$REFID, ]
      pdf('PGA.pdf')
      print(qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT))
      dev.off()
    }

Regards
Devarayalu

-----Original Message-----
From: Jan van der Laan [mailto:rh...@eoos.dds.nl]
Sent: Thursday, January 19, 2012 4:25 PM
To: Sri krishna Devarayalu Balanagu
Cc: r-help@r-project.org
Subject: Re: [R] Not generating line chart

Devarayalu,

This is FAQ 7.22: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-do-lattice_002ftrellis-graphics-not-work_003f
Use print(qplot()).

Regards,
Jan

Sri krishna Devarayalu Balanagu balanagudevaray...@gvkbio.com wrote:

Hi All,
Can you please help me: why is this code not generating a line chart?

    library(ggplot2)
    par(mfrow=c(1,3))
    #qplot(TIME1, BASCHGA, data=Orange1, geom= c("point", "line"), colour= ACTTRT)
    unique(Orange1$REFID) -> refid
    for (i in refid) {
      Orange2 <- Orange1[i == Orange1$REFID, ]
      pdf('PGA.pdf')
      qplot(TIME1, BASCHGA, data=Orange2, geom= c("line"), colour= ACTTRT)
      dev.off()
    }

Regards,
Devarayalu
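The two suggestions in this thread (open the pdf device once, outside the loop, and print() each plot explicitly) can be sketched as follows. This is only an illustration, not the code anybody in the thread ran: the data frame is a small subset of the posted values, the output path in tempdir() is an assumption, and droplevels() is one possible way to make each subset carry only its own ACTTRT levels.

```r
library(ggplot2)

# Small subset of the values posted in the thread (first and last time point only)
Orange1 <- data.frame(
  REFID   = rep(c(7L, 9L), each = 4),
  TIME1   = rep(c(0L, 12L), times = 4),
  BASCHGA = c(0, -31, 0, -12, 0, -30, 0, -40),
  ACTTRT  = factor(c("LCD", "LCD", "Vehicle", "Vehicle", "ABC", "ABC", "DEF", "DEF"))
)

out_file <- file.path(tempdir(), "PGA.pdf")  # hypothetical output location
pdf(out_file)                                # open the device once, before the loop
for (i in unique(Orange1$REFID)) {
  # keep only the rows of this REFID and drop the factor levels it does not use
  Orange2 <- droplevels(Orange1[Orange1$REFID == i, ])
  p <- ggplot(Orange2, aes(TIME1, BASCHGA, colour = ACTTRT)) +
    geom_line() +
    ggtitle(paste("REFID", i))
  print(p)                                   # explicit print() is needed inside a loop
}
invisible(dev.off())                         # close the device once, after the loop
```

Each print() call starts a new page, so the single pdf ends up with one page per REFID.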
[R] compare means
Dear all,

I would like to compare two means between cases and controls, taking into account that I have matched one case to two controls. How can I do this with R?

Thanks in advance,
Jan
[R] Web analytics / Customer Analytics book recommendation
Hi,

I am curious whether you know of any book that deals with web analytics / customer analytics and uses R as the main statistical tool. I am particularly interested in using R in a real production environment and not only as an analytical tool.

Thank you,
Jan
[R] simple ggplot2 question
Hello,

I am trying to make a plot using the code below. The plot is divided into two parts, using facet_grid. I would like the vertical axis (labelled 'place') to be different for each location (=part). So in the upper part, only places 'n' through 'z' are shown, while in the lower part, only places 'a' through 'm' are shown. I thought 'free_y' would do the trick. I also tried converting variable place into class 'factor'.

    require(ggplot2)
    DF <- data.frame(place=letters, value=runif(26), location=c(rep(1, 13), rep(0, 13)))
    qplot(data=DF, x=place, y=value, geom="bar", stat="identity") +
      coord_flip() +
      geom_abline(intercept=35, slope=0, colour="red") +
      facet_grid(location ~ ., scales="free_y")
    R.version.string
    # R version 2.10.1 (2009-12-14)

Thank you in advance, and merry xmas!

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~
[R] Journal of Statistical Software 2011
This year JSS, at www.jstatsoft.org, published eight volumes, V38-V45. Five of them were special volumes:

V38 - Competing Risks and Multi-State Models (Guest editor Putter)
V41 - Statistical Software for State Space Methods (Guest editors Commandeur, Koopman, Ooms)
V42 - Political Methodology (Guest editors Altman, Fox, Jackman, Zeileis)
V44 - Magnetic Resonance Imaging in R (Guest editors Tabelow, Whitcher)
V45 - Multiple Imputation (Guest editor Yucel)

The Thomson/Reuters Impact Factors for the last three years for computational statistics journals are:

Comp Stat   0.500 - 0.731 - 0.628
CSDA        0.226 - 1.281 - 1.089
JCGS        1.505 - 1.258 - 1.206
JSS         1.033 - 2.320 - 2.647

You may be interested in our success in Computer Science:
http://www.sciencewatch.com/inter/jou/2011/11decJofStatSoft/

You can follow and befriend us at http://www.facebook.com/jstatsoft

===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical Software;
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550; fax (310)-206-5658; email: dele...@stat.ucla.edu
.mac: jdeleeuw ++ aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
-
No matter where you go, there you are. --- Buckaroo Banzai
http://gifi.stat.ucla.edu/sounds/nomatter.au
[R] maptools/spatial analysis question
Hi,

I am using maptools to plot air quality data on a map. Each measurement point is mapped to a postal code area. This yields pictures with discrete borders, like so:
http://dl.dropbox.com/u/27415200/baincome.png

The problem is that the size of a postal code area doesn't mean much in this context. Moreover, only a small minority of all the postal code areas has a measurement station. Are there any ways/tools to interpolate between the various (strategically chosen) measurement stations? I am looking for sensible ways to create plots like this:
http://matplotlib.github.com/basemap/_images/etopo5.png

    sessionInfo()
    R version 2.11.1 (2010-05-31)
    i686-pc-linux-gnu
    ...

Thank you in advance!

Cheers!!
Albert-Jan
[R] slight documentation error in stats package arima
The documentation for the arima function in the package stats has a slight error. It references:

Ripley, B. D. (2002) Time series in R 1.5.0. R News, 2/1, 2–7.
http://www.r-project.org/doc/Rnews/Rnews_2002-1.pdf

This should be:

Ripley, B. D. (2002) Time series in R 1.5.0. R News, 2/2, 2–7.
http://www.r-project.org/doc/Rnews/Rnews_2002-2.pdf

Does anyone know who I should tell about this?

Thanks!
- Jan

--
Jan Theodore Galkowski
Senior Systems Software Engineer
Akamai Technologies
Cambridge, MA 02142
jgalk...@akamai.com
bayesianlo...@acm.org
607.239.1834 (m)
607.239.1834 (h)
617.444.4995 (w)
Re: [R] Generating input population for microsimulation
Emma,

If, as you say, each unit is the same, you can just repeat the units to obtain the required number of units. For example:

    unit_size <- 10
    n_units <- 10
    unit_id <- rep(1:n_units, each=unit_size)
    pid <- rep(1:unit_size, n_units)
    senior <- ifelse(pid <= 2, 1, 0)
    pop <- data.frame(unit_id, pid, senior)

If you want more flexibility in generating the units, I would first generate the units (without the persons) and then generate the persons for each unit. In the example below I use the plyr package; you could probably also use lapply/sapply, or simply a loop over the units.

    library(plyr)
    generate_unit <- function(unit) {
      pid <- 1:unit$size
      senior <- rep(0, unit$size)
      senior[sample(unit$size, 2)] <- 1
      return(data.frame(unit_id=unit$id, pid=pid, senior=senior))
    }
    units <- data.frame(id=1:n_units, size=unit_size)
    ddply(units, .(id), generate_unit)

HTH,
Jan

Emma Thomas thomas...@yahoo.com wrote:

Hi all,

I've been struggling with some code and was wondering if you all could help. I am trying to generate a theoretical population of P people who are housed within X different units. Each unit follows the same structure: 10 people per unit, 8 of whom are junior and two of whom are senior. I'd like to create a unit ID and a unique identifier for each person (person ID, PID) in the population so that I have a matrix that looks like:

          unit_id pid senior
     [1,]       1   1      0
     [2,]       1   2      0
     [3,]       1   3      0
     [4,]       1   4      0
     [5,]       1   5      0
     [6,]       1   6      0
     [7,]       1   7      0
     [8,]       1   8      0
     [9,]       1   9      1
    [10,]       1  10      1
    ...

I came up with the following code, but am having some trouble getting it to populate my matrix the way I'd like.

    world <- function(units, pop_size, unit_size){
      pid <- rep(0,pop_size)      #person ID
      senior <- rep(0,pop_size)   #senior in charge
      unit_id <- rep(0,pop_size)  #unit ID
      for (i in 1:pop_size){
        for (f in 1:units) {
          senior[i] = sample(c(1,1,0,0,0,0,0,0,0,0), 1, replace = FALSE)
          pid[i] = sample(c(1:10), 1, replace = FALSE)
          unit_id[i] <- f
        }}
      data <- cbind(unit_id, pid, senior)
      return(data)
    }
    world(units = 10,pop_size = 100, unit_size = 10) #call the function

The output looks like:

         unit_id pid senior
    [1,]      10   7      0
    [2,]      10   4      0
    [3,]      10  10      0
    [4,]      10   9      1
    [5,]      10  10      0
    [6,]      10   1      1
    ...

but what I really want to generate is 10 different units with two seniors per unit, and with each person in the population having a unique identifier. I thought a nested for loop was one way to go about creating my data set of people and families, but obviously I'm doing something (or many things) wrong. Any suggestions on how to fix this? I had been focusing on creating a person and assigning them to a unit, but perhaps I should create the units and then populate the units with people?

Thanks so much in advance.
Emma
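The reply mentions that lapply could be used instead of plyr. A minimal base-R sketch of that idea follows; the function signature (id, size, n_senior) and the extra person_id column are assumptions made for this illustration and are not part of the posted code.

```r
# Build one unit: `size` persons, of whom `n_senior` randomly chosen ones are senior.
# The argument names are hypothetical; the thread's version takes a one-row data.frame.
generate_unit <- function(id, size, n_senior = 2) {
  senior <- rep(0, size)
  senior[sample(size, n_senior)] <- 1
  data.frame(unit_id = id, pid = seq_len(size), senior = senior)
}

set.seed(42)
n_units   <- 10
unit_size <- 10

# One data.frame per unit, stacked into a single population
pop <- do.call(rbind, lapply(seq_len(n_units), generate_unit, size = unit_size))

# Identifier that is unique across the whole population, not only within a unit
pop$person_id <- seq_len(nrow(pop))
```

Because each unit is generated by its own function call, unit sizes or the number of seniors could later be varied per unit by passing different arguments.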
Re: [R] Generating input population for microsimulation
Emma,

That is because generate_unit expects a data.frame with one row and columns id and size:

    generate_unit(data.frame(id=1, size=10))

Jan

Emma Thomas thomas...@yahoo.com wrote:

Dear Jan,

Thanks for your reply. The first solution works well for my needs for now, but I have a question about the second. If I run your code and then call the function:

    generate_unit(10)

I get the error:

    Error in unit$size : $ operator is invalid for atomic vectors

Did you experience the same thing? In any case, I will definitely take a look at the plyr package, which I'm sure will be useful in the future.

Thanks again!
Emma

----- Original Message -----
From: Jan van der Laan rh...@eoos.dds.nl
To: r-help@r-project.org r-help@r-project.org
Cc: Emma Thomas thomas...@yahoo.com
Sent: Wednesday, December 14, 2011 6:18 AM
Subject: Re: [R] Generating input population for microsimulation
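The difference between the two calls can be reproduced with the function from the thread. The sketch below captures the error message with tryCatch and then feeds the function one row at a time without plyr; the row-wise lapply is an assumed base-R stand-in for ddply, not what the reply used.

```r
# Function as posted in the thread: `unit` must be a one-row data.frame
generate_unit <- function(unit) {
  pid <- 1:unit$size
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

# Passing a bare number fails, because `$` cannot be applied to an atomic vector
msg <- tryCatch(generate_unit(10), error = function(e) conditionMessage(e))

# Passing a one-row data.frame works
one_unit <- generate_unit(data.frame(id = 1, size = 10))

# Base-R stand-in for ddply(units, .(id), generate_unit): apply row by row
units <- data.frame(id = 1:3, size = 10)
pop <- do.call(rbind, lapply(seq_len(nrow(units)),
                             function(k) generate_unit(units[k, ])))
```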
Re: [R] R - Linux_SSH
What I did in the past (not with R scripts) is to start my jobs using at (start the job at a specified time, e.g. now) or batch (start the job when the CPU load drops below ?%):

    at now
    R CMD BATCH yourscript.R

or

    batch
    R CMD BATCH yourscript.R

Something like that; you'll have to look at the man pages for at and/or batch. You probably need something like atd running. I do not know whether current Linux distributions have that running by default. You'll get an email when the job is finished.

HTH
Jan

    R CMD BATCH [options] my_script.R [outfile]

Chris Mcowen chrismco...@gmail.com wrote:

Dear List,

I am unsure if this is the correct list to post to; if it isn't, I apologise. I am using SSH to access a Linux version of R on a remote computer, as it offers more memory and processing power. The model will take 1-2 days to run. I am accessing R through PuTTY, and when I close the connection and open R again, I am faced with a new session. As a Linux newbie, I was wondering if anybody here knew how to keep R running and interactive and return to it at a later date?

Thanks
Chris
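With the batch approach there is no interactive session to return to, so the script itself has to save whatever should be inspected later. A minimal sketch of what such a script might look like is given below; the linear model is only a stand-in for the real long-running model, and the file name and location are assumptions.

```r
# --- contents of a hypothetical yourscript.R, run via R CMD BATCH ---
set.seed(1)
dat <- data.frame(x = rnorm(100))
dat$y <- 2 * dat$x + rnorm(100)

fit <- lm(y ~ x, data = dat)                  # stand-in for the 1-2 day model run

out_file <- file.path(tempdir(), "fit.rds")   # hypothetical location for the result
saveRDS(fit, out_file)                        # store the fitted object on disk

# --- later, in a fresh interactive session ---
fit2 <- readRDS(out_file)                     # reload the object and continue working
```

In practice out_file would point to a permanent directory rather than tempdir(), since a temporary directory is removed when the R process ends.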
[R] CART with rpart
Dear all,

I want to keep the terminal-node (group) assignments from a CART analysis in my data file, so that I can perform other statistical analyses by these groups. Can you help me please?

Thanks.
Jan
Re: [R] Read TXT file with variable separation
Raphael,

This looks like fixed width format, which you can read with read.fwf. In fixed width format the columns are not separated by white space (or other characters), but are identified by their position in the file. So in your file, for example, the first field appears to be contained in the first 2 columns of your file (the first 2 characters of every line), the second field in the next five columns, etc.

Regards,
Jan

Quoting Raphael Saldanha saldanha.plan...@gmail.com:

Hi!

I have to import some TXT files into R. The columns are separated by varying numbers of blank spaces, but each file uses the same separation. Example:

    31 104 5 0 11RUA SAO SEBASTIAO 25 BAIRRO FILETO 01 0020033854

The pattern is the same in each file. There are two sample files attached to this message. I would like to figure out how to import a single file, and then use some code to import several files (like this: http://www.ats.ucla.edu/stat/r/code/read_multiple.htm).

When I try read.table, I receive this:

    cnefe <- read.table("sample1.txt", header=FALSE)
    Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
      line 1 did not have 17 elements

Information about my session:

    sessionInfo()
    R version 2.12.1 (2010-12-16)
    Platform: i386-pc-mingw32/i386 (32-bit)
    locale:
    [1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252
    [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
    [5] LC_TIME=Portuguese_Brazil.1252
    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

--
Regards,
Raphael Saldanha
saldanha.plan...@gmail.com
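A small sketch of the read.fwf approach follows. The field widths, column names and the two sample files are invented for this illustration, since the real layout of the attached files is not known here; only the 2-character and 5-character widths of the first two fields are taken from the reply.

```r
# Made-up layout: 2 + 5 + 20 + 4 characters per record (an assumption)
widths    <- c(2, 5, 20, 4)
col_names <- c("uf", "code", "street", "number")

# Helper that writes one fixed-width record per row to a temporary file
write_sample <- function(street) {
  f <- tempfile(fileext = ".txt")
  writeLines(sprintf("%2s%5s%-20s%4s", "31", "00104", street, "0025"), f)
  f
}
files <- c(write_sample(c("RUA SAO SEBASTIAO", "RUA A")),
           write_sample("RUA B"))

# Read one file: fields are cut by position, not by separator
read_one <- function(file) {
  read.fwf(file, widths = widths, col.names = col_names,
           colClasses = "character", strip.white = TRUE)
}

single <- read_one(files[1])

# Read several files with the same layout and stack them
all_files <- do.call(rbind, lapply(files, read_one))
```

Reading everything as character keeps leading zeros in the code fields; columns can be converted with as.integer() afterwards where that is appropriate.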
[R] Survival curves for case control and control
Hi,

I want to plot survival curves for case and control subjects in a propensity score-matched cohort, accounting for the clustering of matched pairs. How can I do this with R?

Thanks for your help,
Jan
Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
I assume you use a command window to build your packages. One possible solution might be to leave the path entries set by Rtools out of your global PATH and to create a separate shortcut to cmd for building R packages, in which you set the path as needed by R CMD build/check. Something like:

    cmd /K PATH c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin

(I haven't tried this, so it might need some tinkering to get it to actually work.)

HTH
Jan

On 17-11-2011 9:54, Rubén Roa wrote:

From: Rubén Roa
Sent: Thursday, 17 November 2011 9:53
To: 'us...@admb-project.org'
Subject: Reporting a conflict between ADMB and Rtools on Windows systems

Hi,

I have to work under Windows; it's a company policy. I've just found that there is a conflict between the tools used to build R packages (Rtools) and ADMB, due to the need to put the location of the Rtools compiler in the PATH environment variable to make Rtools work.

On a Windows 7 64-bit machine with Rtools installed, I installed the latest version of ADMB-IDE, and although I could translate ADMB code to cpp code, I could not build the cpp code into an executable via ADMB-IDE's compiler.

On another Windows machine, a Windows Vista 32-bit with Rtools installed, I also installed the latest ADMB-IDE, and this time it was not possible to create the .obj file on the way to building the executable with ADMB-IDE. On this machine I also have a previous ADMB version (6.0.1) that I used to run from the DOS shell. This ADMB also failed to build the .obj file.

Now, going to PATH, the location info needed to make Rtools work is:

    c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin;

If I remove from this list the reference to the compiler, c:\Rtools\MinGW\bin, then ADMB works again. So beware of this conflict. Suggestions for a solution will be appreciated. Meanwhile, I run ADMB code on one computer and build R packages with Rtools on another computer.

Best
Ruben

--
Dr. Ruben H. Roa-Ureta
Senior Researcher, AZTI Tecnalia, Marine Research Division,
Txatxarramendi Ugartea z/g, 48395, Sukarrieta, Bizkaia, Spain
Re: [R] hierachical code system
Hi,

Thanks for your reply. Based on your suggestions, I managed to simplify the code, but only a little. I don't see how I could do without a loop, given the nestedness of the hierarchy. See the code below, which is working, but I'd like to simplify it.

    # sample data
    theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02',
      'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06',
      'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04',
      'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01',
      'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02',
      'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
    theValues <- as.numeric(c(NA, NA, 15074.23366, 4882.942034, 1619.59628,
      1801.722877, 1019.973666, NA, 503.9239317, 917.2189347, 6018.830465,
      1944.11311, 1427.575402, 1965.725428, NA, 5857.293612, 5933.770263, NA,
      6077.089518, 1427.180073, 455.9387993, 859.766603, 1002.983331, 2225.328211))
    df <- as.data.frame(cbind(code=theCodes, value=theValues))
    df$value <- as.numeric(df$value)

    # actual code
    # depth ('diepte') of a code = number of dots in it
    getDepth <- function(df) {
      df$diepte <- do.call(rbind, lapply(strsplit(df$code, "\\."), length)) - 1
      return(df)
    }
    # parent code = the code with its last '.NN' part cut off
    getParents <- function(df) {
      df$parent <- substr(df$code, 1, 4 + (df$diepte - 1) * 3)
      return(df)
    }
    # sum the values at the given depth per parent and write them into the parent rows
    getTotals <- function(df, depth) {
      s <- subset(df, diepte==depth)
      if(!"parent" %in% names(df)) s <- getParents(s)
      agg <- aggregate(s["value"], s["parent"], FUN=sum, na.rm=TRUE)
      merged <- merge(df, agg, by.x="code", by.y="parent", all=TRUE, suffixes=c("", "_summed"))
      isSum <- !is.na(merged$value_summed)
      merged[isSum, "value"] <- merged[isSum, "value_summed"]
      merged$value_summed <- merged$parent <- NULL
      return(merged)
    }
    #library(debug)
    #mtrace(getTotals)
    df <- getDepth(df)
    for( depth in max(df$diepte):2 ) {
      if (depth == max(df$diepte)) {
        x <- getTotals(df, depth)
      } else {
        x <- getTotals(x, depth)
      }
    }

Cheers!!
Albert-Jan

From: ONKELINX, Thierry thierry.onkel...@inbo.be
To: Albert-Jan Roskam fo...@yahoo.com; R Mailing List r-help@r-project.org
Sent: Wednesday, November 16, 2011 2:34 PM
Subject: RE: [R] hierachical code system

Dear Albert-Jan,

The easiest way is to create extra variables with the corresponding aggregation level. substr() and strsplit() can be your friends. Once you have those variables you can use aggregate() or any other aggregating function. You don't need loops.

Best regards,
Thierry

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Albert-Jan Roskam
Sent: Wednesday, 16 November 2011 14:28
To: R Mailing List
Subject: [R] hierachical code system
[R] hierachical code system
Hi,

I have a hierarchical code system such as the example below (the printed data are easiest to read). I would like to write a function that returns an 'imputed' data frame, i.e. one where the parent values are calculated as the sum of the child values. So, for instance, STAT.01.01.06 is the sum of STAT.01.01.06.01 through STAT.01.01.06.06. The code I have written uses two for loops and, moreover, does not work as intended. My starting point was to determine the code depth by counting the dots in the variable 'code' (using strsplit), then iterate over the tree from deep to shallow. Does anybody have a good idea as to how to approach this in R?

    theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02',
      'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06',
      'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04',
      'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01',
      'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02',
      'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
    theValues <- c('NA', 'NA', '15074.23366', '4882.942034', '1619.59628',
      '1801.722877', '1019.973666', 'NA', '503.9239317', '917.2189347', '6018.830465',
      '1944.11311', '1427.575402', '1965.725428', 'NA', '5857.293612', '5933.770263',
      '6077.089518', 'NA', '1427.180073', '455.9387993', '859.766603', '1002.983331',
      '2225.328211')
    df <- as.data.frame(cbind(code=theCodes, value=theValues))
    print(df)

                   code       value
    1           STAT.01          NA
    2        STAT.01.01          NA
    3     STAT.01.01.01 15074.23366
    4     STAT.01.01.02 4882.942034
    5     STAT.01.01.03  1619.59628
    6     STAT.01.01.04 1801.722877
    7     STAT.01.01.05 1019.973666
    8     STAT.01.01.06          NA
    9  STAT.01.01.06.01 503.9239317
    10 STAT.01.01.06.02 917.2189347
    11 STAT.01.01.06.03 6018.830465
    12 STAT.01.01.06.04  1944.11311
    13 STAT.01.01.06.05 1427.575402
    14 STAT.01.01.06.06 1965.725428
    15       STAT.01.02          NA
    16    STAT.01.02.01 5857.293612
    17    STAT.01.02.02 5933.770263
    18    STAT.01.02.03 6077.089518
    19 STAT.01.02.03.01          NA
    20 STAT.01.02.03.02 1427.180073
    21 STAT.01.02.03.03 455.9387993
    22 STAT.01.02.03.04  859.766603
    23 STAT.01.02.03.05 1002.983331
    24       STAT.01.03 2225.328211

Thank you in advance!

Cheers!!
Albert-Jan
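One possible way to compute the parent totals without walking the tree level by level is to treat every parent as the sum of all leaf codes that start with its own code. The sketch below is an illustration only, on a shortened, made-up set of codes and values; it assumes that only leaf codes carry real data, so any value already stored on a parent would be recomputed.

```r
# Shortened, made-up example in the same code format as the thread
codes  <- c("STAT.01", "STAT.01.01", "STAT.01.01.01", "STAT.01.01.02",
            "STAT.01.02", "STAT.01.02.01", "STAT.01.02.01.01", "STAT.01.02.01.02")
values <- c(NA, NA, 10, 5, NA, NA, 1, 2)

# A code is a leaf when no other code starts with it followed by a dot
is_leaf <- !vapply(codes,
                   function(cd) any(startsWith(codes, paste0(cd, "."))),
                   logical(1))

# Leaves keep their value; every parent becomes the sum of the leaves below it
imputed <- vapply(seq_along(codes), function(k) {
  if (is_leaf[k]) return(values[k])
  below <- is_leaf & startsWith(codes, paste0(codes[k], "."))
  sum(values[below], na.rm = TRUE)
}, numeric(1))

result <- data.frame(code = codes, value = imputed)
```

Summing over leaves rather than over direct children avoids the need to process the deepest level first, at the cost of comparing every code with every other code.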
Re: [R] Reading a specific column of a csv file in a loop
Yet another solution. This time using the LaF package: library(LaF) d-c(1,4,7,8) P1 - laf_open_csv(M1.csv, column_types=rep(double, 10), skip=1) P2 - laf_open_csv(M2.csv, column_types=rep(double, 10), skip=1) for (i in d) { M-data.frame(P1[, i],P2[, i]) } (The skip=1 is needed as laf_open_csv doesn't read headers) Jan On 11/08/2011 11:04 AM, Sergio René Araujo Enciso wrote: Dear all: I have two larges files with 2000 columns. For each file I am performing a loop to extract the ith element of each file and create a data frame with both ith elements in order to perform further analysis. I am not extracting all the ith elements but only certain which I am indicating on a vector called d. See an example of my code below ### generate an example for the CSV files, the original files contain more than 2000 columns, here for the sake of simplicity they have only 10 columns M1-matrix(rnorm(1000), nrow=100, ncol=10, dimnames=list(seq(1:100),letters[1:10])) M2-matrix(rnorm(1000), nrow=100, ncol=10, dimnames=list(seq(1:100),letters[1:10])) write.table(M1, file=M1.csv, sep=,) write.table(M2, file=M2.csv, sep=,) ### the vector containing the i elements to be read d-c(1,4,7,8) P1-read.table(M1.csv, header=TRUE) P2-read.table(M1.csv, header=TRUE) for (i in d) { M-data.frame(P1[i],P2[i]) rm(list=setdiff(ls(),d)) } As the files are quite large, I want to include read.table within the loop so as it only read the ith element. I know that there is the option colClasses for which I have to create a vector with zeros for all the columns I do not want to load. Nonetheless I have no idea how to make this vector to change in the loop, so as the only element with no zeros is the ith element following the vector d. Any ideas how to do this? Or is there anz other approach to load only an specific element? 
best regards, Sergio René
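The colClasses idea from the question can be sketched in base R as follows. This is only an illustration, not code from the thread: the helper read_column is a made-up name, the files are written to temporary paths, and they are written with row.names=FALSE (unlike the original example) so that each file has exactly 10 columns. Entries of "NULL" in colClasses tell read.table to skip that column.

```r
# Assumption: small stand-in files with exactly 10 numeric columns and a header.
set.seed(1)
M1 <- matrix(rnorm(1000), nrow = 100, ncol = 10, dimnames = list(NULL, letters[1:10]))
M2 <- matrix(rnorm(1000), nrow = 100, ncol = 10, dimnames = list(NULL, letters[1:10]))
f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
write.table(M1, file = f1, sep = ",", row.names = FALSE)
write.table(M2, file = f2, sep = ",", row.names = FALSE)

d <- c(1, 4, 7, 8)
ncols <- 10

# Hypothetical helper: read only column i by marking every other column "NULL".
read_column <- function(path, i, ncols) {
  classes <- rep("NULL", ncols)
  classes[i] <- "numeric"
  read.table(path, header = TRUE, sep = ",", colClasses = classes)
}

results <- list()
for (i in d) {
  # One column from each file, combined into a two-column data frame.
  M <- data.frame(read_column(f1, i, ncols), read_column(f2, i, ncols))
  results[[as.character(i)]] <- M
}
```

Note that read.table still has to scan every line of the file on each call, so for very wide files an indexed reader such as the LaF approach above may well be faster.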
[R] [R-pkgs] LaF 0.3: fast access to large ASCII files
The LaF package provides methods for fast access to large ASCII files. Currently the following file formats are supported:

* comma separated format (csv) and other separated formats and
* fixed width format.

It is assumed that the files are too large to fit into memory, although the package can also be used to efficiently access files that do fit into memory. In order to process files that are too large to fit into memory, methods are provided to access and process files blockwise. Furthermore, an opened file can be indexed as one would a data.frame. In this way subsets or specific columns can be read into memory. For example, assuming that an object laf has been created using one of the functions laf_open_csv or laf_open_fwf, the third column from the file can be read into memory using:

    col <- laf[, 3]

The LaF-manual vignette contains a description of all functionality provided: http://laf-r.googlecode.com/files/LaF-manual_0.3.pdf

The LaF-benchmark vignette compares the performance of LaF to the standard R routines read.table and read.fwf: http://laf-r.googlecode.com/files/LaF-benchmark_0.3.pdf

___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
[R] JSS Special Volumes for 2011
So far in 2011 JSS has published 4 (four!) special volumes. If you have additional suggestions for special volumes, let us know. Also, submit your JSS-adapted package vignettes. If you like what you see, friend us at http://www.facebook.com/jstatsoft

Tabelow and Whitcher, Guest Editors
Volume 44: Magnetic Resonance Imaging in R
http://www.jstatsoft.org/v44

Altman, Fox, Jackman and Zeileis, Guest Editors
Volume 42: Political Methodology
http://www.jstatsoft.org/v42

Commandeur, Koopman, and Ooms, Guest Editors
Volume 41: Statistical Software for State Space Methods
http://www.jstatsoft.org/v41

Putter, Guest Editor
Volume 38: Competing Risks and Multi-State Models
http://www.jstatsoft.org/v38

Additional regular volumes, of course, at http://www.jstatsoft.org.

===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of Statistics; Editor: Journal of Multivariate Analysis, Journal of Statistical Software; US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554 phone (310)-825-9550; fax (310)-206-5658; email: dele...@stat.ucla.edu .mac: jdeleeuw ++ aim: deleeuwjan ++ skype: j_deleeuw homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
No matter where you go, there you are. --- Buckaroo Banzai http://gifi.stat.ucla.edu/sounds/nomatter.au
[R] overloading + operator for chars
Hello,

I would like to overload the + operator so that it can be used to concatenate two strings, e.g. "John" + "Doe" = "JohnDoe". How can I 'unseal' the + method?

    setMethod("+", signature(e1="character", e2="character"),
              function(e1, e2) paste(e1, e2, sep=""))

    Error in setMethod("+", signature(e1 = "character", e2 = "character"), :
      the method for function "+" and signature e1="character", e2="character" is sealed and cannot be re-defined

Cheers!! Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~
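No reply to this question appears in this part of the archive, so what follows is only a sketch of two common workarounds rather than the list's answer. Both avoid touching the sealed method: a user-defined infix operator, and an S3 class with its own "+" method. The operator name %+% and the class name "catstr" are arbitrary choices made for this illustration.

```r
# Workaround 1: a custom infix operator instead of redefining "+".
`%+%` <- function(e1, e2) paste0(e1, e2)
full1 <- "John" %+% "Doe"

# Workaround 2: an S3 class (hypothetical name "catstr") with its own "+" method.
# "+" on plain character vectors stays untouched.
catstr <- function(x) structure(as.character(x), class = "catstr")
`+.catstr` <- function(e1, e2) {
  catstr(paste0(unclass(e1), unclass(e2)))
}
full2 <- catstr("John") + "Doe"

print(full1)
print(unclass(full2))
```

The first form is the lighter of the two; the second only pays off if the strings are already wrapped in a class of your own.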
Re: [R] Chi-Square test and survey results
George,

Perhaps the site of the RISQ project (Representativity indicators for Survey Quality) might be of use: http://www.risq-project.eu/ . They also provide R code to calculate their indicators.

HTH, Jan

Quoting ghe...@mathnmaps.com:

An organization has asked me to comment on the validity of their recent all-employee survey. Survey responses, by geographic region, compared with the total number of employees in each region, were as follows:

    > ByRegion
              All.Employees Survey.Respondents
    Region_1            735                142
    Region_2            500                 83
    Region_3            897                 78
    Region_4            717                133
    Region_5            167                 48
    Region_6            309                  0
    Region_7            806                125
    Region_8            627                122
    Region_9            858                177
    Region_10           851                160
    Region_11           336                 52
    Region_12          1823                312
    Region_13            80                  9
    Region_14           774                121
    Region_15           561                 24
    Region_16           834                134

How well does the survey represent the employee population? The chi-square test says, not very well:

    > chisq.test(ByRegion)
    Pearson's Chi-squared test
    data: ByRegion
    X-squared = 163.6869, df = 15, p-value < 2.2e-16

By striking three under-represented regions (3, 6, and 15), we get a more reasonable, although still not convincing, result:

    > chisq.test(ByRegion[setdiff(1:16, c(3,6,15)),])
    Pearson's Chi-squared test
    data: ByRegion[setdiff(1:16, c(3, 6, 15)), ]
    X-squared = 22.5643, df = 12, p-value = 0.03166

This poses several questions:

1) Looking at a side-by-side barchart (proportion of responses vs. proportion of employees, per region), the pattern of survey responses appears, visually, to match fairly well the pattern of employees. Is this a case where we trust the numbers and not the picture?

2) Part of the problem, ironically, is that there were too many responses to the survey. If we had only one-tenth the responses, but in the same proportions by region, the chi-square statistic would look much better (though with a warning about possible inaccuracy):

    data: data.frame(ByRegion$All.Employees, 0.1 * (ByRegion$Survey.Respondents))
    X-squared = 17.5912, df = 15, p-value = 0.2848

Is there a way of reconciling a large response rate with an unrepresentative response profile? Or is the bad news that the survey will give very precise results about a very ill-specified sub-population? (Of course, I would put it in softer terms, like "you need to assess the degree of homogeneity across different regions".)

3) Is chi-squared really the right measure of how representative the survey is?

Thanks for any help you can give - hope these questions make sense -

George H.
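For readers who want to reproduce the numbers, the table above can be rebuilt and tested as sketched below. The first test mirrors the call in the post. The second is an added variant, not something proposed in the thread: because respondents are a subset of employees, it cross-tabulates respondents against non-respondents instead. The variable names are illustrative.

```r
# Counts as given in the post.
ByRegion <- data.frame(
  All.Employees = c(735, 500, 897, 717, 167, 309, 806, 627,
                    858, 851, 336, 1823, 80, 774, 561, 834),
  Survey.Respondents = c(142, 83, 78, 133, 48, 0, 125, 122,
                         177, 160, 52, 312, 9, 121, 24, 134),
  row.names = paste0("Region_", 1:16)
)

# Response rate per region, in percent, to see which regions stand out.
response_rate <- with(ByRegion, 100 * Survey.Respondents / All.Employees)
print(round(response_rate, 1))

# The test as run in the post: a 16 x 2 table of the two columns.
posted <- chisq.test(as.matrix(ByRegion))
print(posted)

# Variant (an assumption of this sketch): respondents vs. non-respondents.
counts <- with(ByRegion, cbind(Responded = Survey.Respondents,
                               Did.Not.Respond = All.Employees - Survey.Respondents))
rownames(counts) <- rownames(ByRegion)
homogeneity <- chisq.test(counts)
print(homogeneity)
```

Either way, with more than 1700 respondents even modest differences in regional response rates produce a very small p-value, which is the point raised in question 2.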
Re: [R] Applying function to only numeric variable (plyr package?)
plyr isn't necessary in this case. You can use the following:

    cols <- sapply(df, is.numeric)
    df[, cols] <- pct(df[, cols])

round (and therefore pct) accepts a data.frame and returns a data.frame with the same dimensions. If that hadn't been the case, colwise might have been of help:

    library(plyr)
    pct.colwise <- colwise(pct)
    df[, cols] <- pct.colwise(df[, cols])

HTH, Jan

Quoting michael.laviole...@dhhs.state.nh.us:

My data frame consists of character variables, factors, and proportions, something like

    c1 <- c("A", "B", "C", "C")
    c2 <- factor(c(1, 1, 2, 2), labels = c("Y","N"))
    x <- c(0.5234, 0.6919, 0.2307, 0.1160)
    y <- c(0.9251, 0.7616, 0.3624, 0.4462)
    df <- data.frame(c1, c2, x, y)
    pct <- function(x) round(100*x, 1)

I want to apply the pct function to only the numeric variables so that the proportions are converted to percentages, and retain all the columns:

      c1 c2   x1   x2
    1  A  Y 52.3 92.5
    2  B  Y 69.2 76.2
    3  C  N 23.1 36.2
    4  C  N 11.6 44.6

I've been approaching it with the ddply and colwise functions from the plyr package, but in that case I need each row to be its own group and to retain all columns. Am I on the right track? If not, what's the best way to do this?

Thanks in advance, M. L.
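Put together, the base-R approach from the reply runs as follows on the example data from the question. This is a self-contained restatement of the code above; only the lapply variant at the end is an addition, included as an alternative that does not rely on round accepting a data frame.

```r
c1 <- c("A", "B", "C", "C")
c2 <- factor(c(1, 1, 2, 2), labels = c("Y", "N"))
x <- c(0.5234, 0.6919, 0.2307, 0.1160)
y <- c(0.9251, 0.7616, 0.3624, 0.4462)
df <- data.frame(c1, c2, x, y)
pct <- function(x) round(100 * x, 1)

# Logical index of the numeric columns; factors and characters are left alone.
cols <- sapply(df, is.numeric)

# Approach from the reply: round() works on a whole data frame.
df1 <- df
df1[, cols] <- pct(df1[, cols])

# Added alternative: apply pct column by column.
df2 <- df
df2[cols] <- lapply(df2[cols], pct)

print(df1)
```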
Re: [R] Problem with .C
An obvious reason might be that your second argument should be a pointer to int.

As others have mentioned, you might want to have a look at Rcpp and/or inline. The documentation is good and I find it much easier to work with. For example, your example could be written as:

    library(Rcpp)
    library(inline)
    test <- cxxfunction(signature(x = "numeric"), '
      Rcpp::NumericVector v(x);
      Rcpp::NumericVector result(v.length());
      for (int i = 0; i < v.length(); ++i) {
        result[i] = v[i] + i;
      }
      return(result);
    ', plugin = "Rcpp")

HTH, Jan

Quoting Grigory Alexandrovich alexandrov...@mathematik.uni-marburg.de:

Hello, first thank you for your answers. I did not read the whole pdf "Writing R Extensions", but I read this strongly shortened introduction to the subject: http://www.math.kit.edu/stoch/~lindner/media/.c.call%20extensions.pdf

I get the same error with this C function:

    void test(double * b, int l) {
      int i;
      for(i=0; i < l ; i++) b[i] +=i;
    }

I call it from R like this:

    parameter = c(0,0,1,1,1,0,1.5,0.7,0,1.2,0.3);
    .C("test", as.double(parameter), as.integer(11))

The program crashes even in this simple case. Where can the error be?

Thanks, Grigory Alexandrovich

Answer 1

Without knowing that C code, we cannot know. Have you read Writing R Extensions carefully? I.e. take care with memory allocation and printing as mentioned in the manual.

Uwe Ligges

Answer 2

This looks like a classic case of not reading the manual, and then compounding it by not reading the posting guide. The manual would be the Writing R Extensions pdf that comes with R, or you can google it. The posting guide is referenced at the bottom of this and every other posting on this mailing list. There are nearly an infinite variety of errors that can lead to a crash, so it is really unreasonable of you to pose this question this way and expect constructive assistance.

Jeff Newmiller DCN:jdnew...@dcn.davis.ca.us Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.

Answer 3

It's impossible to say, with such minimal information, but a reasonable guess is that there is a problem with the declaration of x and y in foo.c. These would (I think) need to be declared as double *, not double, when foo is called from .C().

cheers, Rolf Turner

Answer 4

Hi, As others have said, it's very difficult to help you without an example + code to know what you are talking about. That having been said, it seems as if you are just getting your feet wet in this R -- C bridge, and I'd recommend you check out the Rcpp and inline packages to help make your life a lot easier ...

-steve

On 04.10.2011 14:04, Grigory Alexandrovich wrote:

Hello, I wrote a function in C, which works fine if called from the main function in C. But as soon as I try to call this function from R like .C('foo', as.double(x), as.integer(y)), the program crashes. I created a dll with the cmd command

    R --arch x64 CMD SHLIB foo.c

and loaded it into R with dyn.load(). What can be the cause of such behaviour? Again, the C function itself works, but not if called from R.

Thanks, Grigory Alexandrovich
Re: [R] Problem with .C
Quoting Uwe Ligges lig...@statistik.tu-dortmund.de:

I don't agree that it's overkill -- you get to sidestep the whole `R CMD SHLIB ...` and `dyn.load` dance this way while you experiment with C(++) code 'live' using the inline package.

You need two additional packages now where you have to rely on the fact those are available. Moreover, you have to get used to that syntax, and part of it seems to be C++ now? At least I do not know why the above should work at all, while I know the simple C function does.

OK, I agree that switching to Rcpp/C++ might be a bit of overkill in this example, although in a lot of other examples I find the Rcpp syntax much more readable than the C code when dealing with .Call. The example could also have been written in C using inline, removing the need for Rcpp and looking more like the original example:

    library(inline)
    test <- cfunction(signature(b = "numeric", l = "integer"), '
      for(int i=0; i < *l; i++) b[i] += i;
    ', convention=".C")

I find that the advantage of using inline (especially in case of simple functions like this) is that

1. I no longer need to compile and load the shared library manually, which can sometimes be frustrating when Windows locks the dll.
2. inline performs type checking and casts variables to the right type. You can now type test(1:10,10) without needing as.numeric or as.integer. This reduces the amount of R code and the probability of screwing things up by passing the wrong type.

Jan

Uwe

It's really handy.

Just make the original source

    void test(double *b, int *l) {
      int i;
      for(i=0; i < *l ; i++) b[i] += i;
    }

which you would have known after reading the Writing R Extensions manual.

I agree that this step is unavoidable no matter which avenue (Rcpp or otherwise) one decides to take.

-steve
Re: [R] help with regexp
Hello!

    library(gsubfn)
    test <- c('filename_1_def.pdf', 'filename_2_abc.pdf')
    gsubfn("(.+_)([a-z]+)(\\.pdf)", "\\2", test)

Cheers!! Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~

From: Jannis bt_jan...@yahoo.de
To: r-h...@stat.math.ethz.ch
Sent: Wednesday, October 5, 2011 1:56 PM
Subject: [R] help with regexp

Dear list members,

I am stuck with using regular expressions. Imagine I have a vector of character strings like:

    test <- c('filename_1_def.pdf', 'filename_2_abc.pdf')

How could I use regular expressions to extract only the 'def'/'abc' parts of these strings? An attempt from my side yielded no results:

    testresults <- grep('(?<=filename_[[:digit:]]_).{1,3}(?=.pdf)', perl = TRUE, value = TRUE)

Somehow I seem to miss some important concept here. Until now I always used nested sub expressions like:

    testresults <- sub('.pdf$', '', sub('^filename_[[:digit:]]_', '' , test))

but this tends to become cumbersome and I was wondering whether there is a more elegant way to do this?

Thanks for any help, Jannis
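The same extraction also works in base R without gsubfn. The sketch below shows two variants: a single sub() with a capture group, and regmatches() with Perl lookarounds, which is close to what the question attempted (grep only selects matching elements, it does not extract the matched part). The patterns assume the part of interest is lower-case letters directly before ".pdf".

```r
test <- c('filename_1_def.pdf', 'filename_2_abc.pdf')

# Variant 1: capture the letters between the last underscore and ".pdf".
res_sub <- sub("^.+_([a-z]+)\\.pdf$", "\\1", test)

# Variant 2: extract the match itself, using lookbehind/lookahead (perl = TRUE).
m <- regexpr("(?<=_)[a-z]+(?=\\.pdf)", test, perl = TRUE)
res_match <- regmatches(test, m)

print(res_sub)    # "def" "abc"
print(res_match)  # "def" "abc"
```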
Re: [R] optimize R code: replace for loop
Hello,

I'd do:

    ave(testvec, FUN=cumsum)+1

But in R everything can be done in a trillion different ways. ;-)

Cheers!! Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~

From: ONKELINX, Thierry thierry.onkel...@inbo.be
To: Chris82 rubenba...@gmx.de; r-help@r-project.org
Sent: Wednesday, October 5, 2011 11:54 AM
Subject: Re: [R] optimize R code: replace for loop

You can vectorize it using cumsum.

    cumsum(c(1, testvec))
    all.equal(final.sum, cumsum(c(1, testvec)))

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Chris82
Sent: Wednesday 5 October 2011 11:50
To: r-help@r-project.org
Subject: [R] optimize R code: replace for loop

Dear R Users,

at the moment I am trying to optimize an R script.

    testvec <- c(0,1,0,1,1,1,1,0,0,1,0,1,0)
    sum.testvec <- vector()
    tempsum <- 1
    for (e in 1:length(testvec)){
      sum.testvec[e] <- tempsum+testvec[e]
      tempsum <- sum.testvec[e]
    }
    final.sum <- c(1,sum.testvec)

Is there an option to do something with apply? Unfortunately I am not so familiar with the apply functions.

Thanks.

-- View this message in context: http://r.789695.n4.nabble.com/optimize-R-code-replace-for-loop-tp3873945p3873945.html Sent from the R help mailing list archive at Nabble.com.
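To see how the two suggestions relate to the original loop, the sketch below runs all three on the example vector. It only restates code from the thread; the comparison at the end is the added part. Note that the ave()/cumsum variant returns the running sums without the leading 1 that the loop result carries.

```r
testvec <- c(0,1,0,1,1,1,1,0,0,1,0,1,0)

# Original loop from the question.
sum.testvec <- vector()
tempsum <- 1
for (e in seq_along(testvec)) {
  sum.testvec[e] <- tempsum + testvec[e]
  tempsum <- sum.testvec[e]
}
final.sum <- c(1, sum.testvec)

# Vectorized version: prepend the starting value, then take running sums.
vectorized <- cumsum(c(1, testvec))

# The ave() suggestion equals the loop result minus its first element.
via_ave <- ave(testvec, FUN = cumsum) + 1

print(all.equal(final.sum, vectorized))
print(all.equal(final.sum[-1], via_ave))
```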
[R] plot: how to fix the ratio of the plot box?
Dear all,

this should be trivial, but I couldn't figure out how to solve it... I would like to have a plot with a fixed aspect ratio of 1. Whenever I resize the Quartz window, the axes are extended so that the plot fills the whole window. However, if you have different extensions for the different axes, the plot does not look like a square anymore (i.e., aspect ratio 1). The same of course happens if you print it to .pdf (ultimate goal). How can I fix the ratio of the plot box (formed by the axes) to be 1, meaning that the plot box is a square no matter how I resize the Quartz window?

I searched for this and found: http://tolstoy.newcastle.edu.au/R/help/05/04/2888.html It is more or less recommended to use lattice's xyplot for that. Is there no solution for base graphics? [I know that the extension is by default 4% and that's great, but the size of the Quartz window should not change this (which it does if you resize the window accordingly)].

Cheers, Marius

Minimal example:

    u <- runif(10)
    pdf(width=5, height=5)
    plot(u, u, asp=1, xlim=c(0,1), ylim=c(0,1), main="My title")
    dev.off()
Re: [R] plot: how to fix the ratio of the plot box?
ahh, perfect, thanks.

Cheers, Marius

On 2011-10-02, at 13:08, Jim Lemon wrote:

On 10/02/2011 07:20 PM, Hofert Jan Marius wrote:

Dear all,

this should be trivial, but I couldn't figure out how to solve it... I would like to have a plot with a fixed aspect ratio of 1. Whenever I resize the Quartz window, the axes are extended so that the plot fills the whole window. However, if you have different extensions for the different axes, the plot does not look like a square anymore (i.e., aspect ratio 1). The same of course happens if you print it to .pdf (ultimate goal). How can I fix the ratio of the plot box (formed by the axes) to be 1, meaning that the plot box is a square no matter how I resize the Quartz window?

I searched for this and found: http://tolstoy.newcastle.edu.au/R/help/05/04/2888.html It is more or less recommended to use lattice's xyplot for that. Is there no solution for base graphics? [I know that the extension is by default 4% and that's great, but the size of the Quartz window should not change this (which it does if you resize the window accordingly)].

Cheers, Marius

Minimal example:

    u <- runif(10)
    pdf(width=5, height=5)
    plot(u, u, asp=1, xlim=c(0,1), ylim=c(0,1), main="My title")
    dev.off()

Hi Marius,

Have you tried:

    par(pty="s")

after you open the device and before plotting?

Jim
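Applied to the minimal example, Jim's suggestion looks roughly like this. The output path and the deliberately non-square device size are choices made for this sketch; par(pty = "s") requests a square plotting region regardless of the device dimensions, and it has to be set after the device is opened because par settings are per device.

```r
u <- runif(10)

# Assumption: write to a temporary file; the device is wider than it is tall on purpose.
out <- file.path(tempdir(), "square_plot.pdf")
pdf(out, width = 7, height = 4)

# Square plotting region: set after opening the device, before plotting.
par(pty = "s")
plot(u, u, xlim = c(0, 1), ylim = c(0, 1), main = "My title")

dev.off()
```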
[R] last observation carried forward +1
Hi R-helpers,

I'm looking for a vectorised function which does missing value replacement as in last observation carried forward (locf) in the zoo package, but instead of a plain locf, I would like it to add +1 each time a missing value occurs. See below for an example.

    require(zoo)
    x <- 5:15
    x[4:7] <- NA
    coredata(na.locf(zoo(x)))
    [1] 5 6 7 7 7 7 7 12 13 14 15

But what I need is

    5 6 7 7+1 7+1+1 7+1+1+1 7+1+1+1+1 12 13 14 15

to obtain

    [1] 5 6 7 8 9 10 11 12 13 14 15

I could program this in C, but if anyone has already done this I would be interested in seeing their vectorized solution.

thanks, Jan

--
groeten/kind regards, Jan
Jan Wijffels Statistical Data Miner www.bnosac.be | +32 486 611708
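No answer appears in this part of the archive. One possible vectorised approach in base R is sketched below: carry the last observed value forward, then add the number of positions since that observation. The function name locf_plus_one is made up for this sketch, and it assumes the first element of x is not NA.

```r
# Hypothetical helper: last observation carried forward, plus 1 per missing step.
# Assumption: x[1] is not NA.
locf_plus_one <- function(x) {
  observed <- !is.na(x)
  idx <- cumsum(observed)            # which observed value each position belongs to
  last_value <- x[observed][idx]     # plain locf
  last_pos <- which(observed)[idx]   # position of that last observed value
  last_value + (seq_along(x) - last_pos)
}

x <- 5:15
x[4:7] <- NA
filled <- locf_plus_one(x)
print(filled)  # 5 6 7 8 9 10 11 12 13 14 15
```

Observed positions get an offset of zero, so they are returned unchanged.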
Re: [R] Data import
You can, with the routines in the memisc package. You can open a file using spss.system.file and then import a subset using subset. Look in the help pages of spss.system.file for examples.

HTH, Jan

On 09/25/2011 11:56 PM, sassorauk wrote:

Is it possible to import only certain variables from an SPSS file? I know that read.spss in the foreign package will bring the data into R, but can I choose to import only chosen variables from the SPSS dataset into R?

Thanks for your help. R

-- View this message in context: http://r.789695.n4.nabble.com/Data-import-tp3842196p3842196.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] R help on write.csv
Rowwise is easy. The example code I gave does this: it appends the new data /below/ the old. I'll repeat the example below:

    con <- file("d:\\test2.csv", "wt")
    write.table(data, file=con, sep=";", dec=",", row.names=FALSE, col.names=TRUE)
    write.table(data, file=con, sep=";", dec=",", row.names=FALSE, col.names=FALSE, append=TRUE)
    close(con)

Or do you mean columnwise, where you append columns? This would be very difficult in CSV. If you would like to do this you might have a look at the various options for exporting to Excel directly. See for example http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows . I have no experience in this.

Regards, Jan

PS I am sorry for my previous triple post. I had a little fight with my webmail client.

On 09/22/2011 06:14 AM, Ashish Kumar wrote:

Is there a way we can append row wise, so that it all stacks up horizontally, the way you do it in xlswrite in matlab, where you can even specify the cell number from where you want to write.

-Ashish

From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com]
Sent: Thursday, September 22, 2011 12:03 AM
To: Jan van der Laan
Cc: r-help@r-project.org; ashish.ku...@esteeadvisors.com
Subject: Re: [R] R help on write.csv

Oh darn, I had that line and then when I copied it to gmail I thought I'd be all slick and clean up my code: oh well...just not my day/thread...

It's possible to work around the repeated headers business (change to something like Call$col.names <- !append) but yeah, at this point I'm thinking it's perhaps better practice to direct the OP to the various connection methods: sink() is nice, but he'll probably have to do something to convert his object to a CSV-like string before printing:

    apply(OBJ, 1, paste, sep=",")

Michael Weylandt

On Wed, Sep 21, 2011 at 11:20 AM, Jan van der Laan e...@dds.nl wrote:

Michael,

Your example doesn't seem to work. Append isn't passed on to the write.table call. You will need to add a

    Call$append <- append

to the function. And even then there will be a problem with the headers that are repeated when appending. An easier solution is to use write.table directly (I am using Dutch/European csv format):

    data <- data.frame(a=1:10, b=1, c=letters[1:10])
    write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE, col.names=TRUE)
    write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE, col.names=FALSE, append=TRUE)

When first opening a file connection and passing that to write.csv or write.table, data is also appended. The problem with write.csv is that writing the column names cannot be suppressed, which will result in repeated column names:

    con <- file("d:\\test2.csv", "wt")
    write.csv2(data, file=con, row.names=FALSE)
    write.csv2(data, file=con, row.names=FALSE)
    close(con)

So one will still have to use write.table to avoid this:

    con <- file("d:\\test2.csv", "wt")
    write.table(data, file=con, sep=";", dec=",", row.names=FALSE, col.names=TRUE)
    write.table(data, file=con, sep=";", dec=",", row.names=FALSE, col.names=FALSE, append=TRUE)
    close(con)

Using a file connection is probably also more efficient when doing a large number of appends.

Jan

Quoting R. Michael Weylandt michael.weyla...@gmail.com:

Touche -- perhaps we could make one though?

    write.csv.append <- function(..., append = TRUE) {
      Call <- match.call(expand.dots = TRUE)
      for (argname in c("col.names", "sep", "dec", "qmethod"))
        if (!is.null(Call[[argname]]))
          warning(gettextf("attempt to set '%s' ignored", argname), domain = NA)
      rn <- eval.parent(Call$row.names)
      Call$col.names <- if (is.logical(rn) && !rn) TRUE else NA
      Call$sep <- ","
      Call$dec <- "."
      Call$qmethod <- "double"
      Call[[1L]] <- as.name("write.table")
      eval.parent(Call)
    }
    write.csv.append(1:5, "test.csv", append = FALSE)
    write.csv.append(1:15, "test.csv")

Output seems a little sloppy, but might work for the OP.

Michael Weylandt

On Wed, Sep 21, 2011 at 9:03 AM, Ivan Calandra ivan.calan...@uni-hamburg.de wrote:

I don't think there is an append argument to write.csv() (well, actually there is one, but set to FALSE). There is however one to write.table()

Ivan

On 9/21/2011 14:54, R. Michael Weylandt michael.weyla...@gmail.com wrote:

The append argument of write.csv()?

Michael

On Sep 21, 2011, at 8:01 AM, Ashish Kumar ashish.ku...@esteeadvisors.com wrote:

Hi, I wanted to write the data created using R on existing csv file. However everytime I use write.csv, it overwrites the values
Re: [R] R help on write.csv
Michael,

Your example doesn't seem to work: append isn't passed on to the write.table call. You will need to add Call$append <- append to the function. Even then, the headers are repeated when appending.

An easier solution is to use write.table directly (I am using the Dutch/European csv format):

    data <- data.frame(a=1:10, b=1, c=letters[1:10])
    write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE, col.names=TRUE)
    write.table(data, file="test.csv", sep=";", dec=",", row.names=FALSE, col.names=FALSE, append=TRUE)

Data is also appended when you first open a file connection and pass that to write.csv or write.table. The problem with write.csv is that writing the column names cannot be suppressed, which results in repeated column names:

    con <- file("d:\\test2.csv", "wt")
    write.csv2(data, file=con, row.names=FALSE)
    write.csv2(data, file=con, row.names=FALSE)
    close(con)

So one still has to use write.table to avoid this:

    con <- file("d:\\test2.csv", "wt")
    write.table(data, file=con, sep=";", dec=",", row.names=FALSE, col.names=TRUE)
    write.table(data, file=con, sep=";", dec=",", row.names=FALSE, col.names=FALSE, append=TRUE)
    close(con)

Using a file connection is probably also more efficient when doing a large number of appends.

Jan

Quoting R. Michael Weylandt michael.weyla...@gmail.com:

Touche -- perhaps we could make one though?

    write.csv.append <- function(..., append = TRUE) {
        Call <- match.call(expand.dots = TRUE)
        for (argname in c("col.names", "sep", "dec", "qmethod"))
            if (!is.null(Call[[argname]]))
                warning(gettextf("attempt to set '%s' ignored", argname), domain = NA)
        rn <- eval.parent(Call$row.names)
        Call$col.names <- if (is.logical(rn) && !rn) TRUE else NA
        Call$sep <- ","
        Call$dec <- "."
        Call$qmethod <- "double"
        Call[[1L]] <- as.name("write.table")
        eval.parent(Call)
    }

    write.csv.append(1:5, "test.csv", append = FALSE)
    write.csv.append(1:15, "test.csv")

Output seems a little sloppy, but might work for the OP.

Michael Weylandt

On Wed, Sep 21, 2011 at 9:03 AM, Ivan Calandra ivan.calan...@uni-hamburg.de wrote:

I don't think there is an append argument to write.csv() (well, actually there is one, but set to FALSE). There is however one to write.table().

Ivan

On 9/21/2011 at 14:54, R. Michael Weylandt michael.weyla...@gmail.com wrote:

The append argument of write.csv()?

Michael

On Sep 21, 2011, at 8:01 AM, Ashish Kumar ashish.ku...@esteeadvisors.com wrote:

Hi, I wanted to write the data created using R to an existing csv file. However, every time I use write.csv, it overwrites the values already there in the existing csv file. Is there any workaround for this?

Thanks for your help
Ashish Kumar
Estee Advisors Pvt. Ltd.

--
Ivan CALANDRA
PhD Student, University of Hamburg
Biozentrum Grindel und Zoologisches Museum, Dept. Mammalogy
Martin-Luther-King-Platz 3, D-20146 Hamburg, GERMANY
ivan.calan...@uni-hamburg.de
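The append pattern described in this thread can be wrapped in a small helper. This is only a sketch under stated assumptions: the name append_csv2 and the temporary file are made up for illustration and do not come from the thread. It writes the header only when the file does not exist yet, which avoids the repeated column names mentioned above.

```r
# Hypothetical helper (not from the thread): append a data frame to a
# semicolon-separated file, writing the column names only once.
append_csv2 <- function(x, file) {
  new_file <- !file.exists(file)
  write.table(x, file = file, sep = ";", dec = ",",
              row.names = FALSE,
              col.names = new_file,   # header only for a new file
              append = !new_file)     # append once the file exists
  invisible(file)
}

data <- data.frame(a = 1:10, b = 1, c = letters[1:10])
out <- tempfile(fileext = ".csv")
append_csv2(data, out)   # creates the file with a header
append_csv2(data, out)   # appends rows without a second header

back <- read.csv2(out)   # read.csv2 expects sep = ";" and dec = ","
nrow(back)
```

Checking file.exists() on every call is simple but costs a file open per append; for many appends, the connection-based variant from the message is probably the better fit.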
[R] Possible or not possible: serif axis labels with plotmath [but everything else sans serif]?
Dear expeRts,

Is it possible to have serif labels in the following plot?

    x <- 1:10
    y <- x
    plot(x, y, type="b", xlab=expression(x[1]), ylab=expression(x[2]))

I know that one can use pdf(, family="serif"), but then the axis tick labels are also printed in a serif font. Apart from the fact that it may not look nice, I'm just interested in whether one can have serif axis labels but everything else in sans serif (the default).

Cheers,
Marius
Re: [R] Possible or not possible: serif axis labels with plotmath [but everything else sans serif]?
Dear Eik,

Although possible in this case, tikzDevice is certainly not a general solution to all kinds of problems :-) I used it for quite some time before I gave up: I had a simple bar plot, the bars being black. This already caused errors like "TeX capacity exceeded ...", and I obtained these a lot. In fact, enlarging the TeX capacity (not trivial but possible) did not solve these issues. That's why I gave up on this package [although, clearly, the idea of full TeX support is totally appealing -- that's why I looked at the package in the first place].

Cheers,
Marius

On 2011-09-19, at 14:38, Eik Vettorazzi wrote:

Hi Jan Marius, using the tikzDevice package, nearly everything is possible (at least, everything that can be done in LaTeX).

cheers

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52, 20246 Hamburg

--
Dr. Marius Hofert
ETH Zurich, RiskLab, Department of Mathematics
HG E 65.2, Rämistrasse 101, 8092 Zurich, Switzerland
marius.hof...@math.ethz.ch
http://www.math.ethz.ch/~hofertj
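For the original question about serif axis labels, one approach that stays within base graphics is to suppress the labels in plot() and add them afterwards with mtext(), which accepts a family argument. This is a sketch that nobody in the thread proposed; it assumes the device maps the "serif" family (pdf() and the other standard devices should), and the wrapper name draw_plot is made up.

```r
# Sketch: draw the plot without axis labels, then add the labels
# separately with a serif font family.
draw_plot <- function() {
  x <- 1:10
  y <- x
  plot(x, y, type = "b", xlab = "", ylab = "")   # tick labels keep the default family
  mtext(expression(x[1]), side = 1, line = 3, family = "serif")  # x-axis label
  mtext(expression(x[2]), side = 2, line = 3, family = "serif")  # y-axis label
  invisible(TRUE)
}

pdf(NULL)          # PDF device without an output file
ok <- draw_plot()
dev.off()
```

Replace pdf(NULL) with a real file name to inspect the result; only the text parts of a plotmath expression follow the family, while symbols use the symbol font.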
Re: [R] Where to put tryCatch or similar in a very big for loop
Laura,

Perhaps the following example helps:

    nbstr <- 100
    result <- numeric(nbstr)
    for (i in seq_len(nbstr)) {
        # set the default value for when the current bootstrap fails
        result[i] <- NA
        try({
            # estimate your cox model here
            if (runif(1) < 0.1) stop("ERROR")
            result[i] <- i
        }, silent=TRUE)
    }

Regards,
Jan

Quoting Bonnett, Laura l.j.bonn...@liverpool.ac.uk:

Hi,

The simulation occasionally generates either a rare event, meaning that the Cox model is not appropriate, or a covariate with most responses being the same, which means that the Cox model cannot be fit. At bootstrap sample number 10, the variable c11 is considered singular by model cox1.

Thanks,
Laura

-----Original Message-----
From: Ken [mailto:vicvoncas...@gmail.com]
Sent: 15 September 2011 21:43
To: Bonnett, Laura
Cc: Steve Lianoglou; r-help@r-project.org
Subject: Re: [R] Where to put tryCatch or similar in a very big for loop

What type of singularity exactly? If you're working with counts, is it a special case? If using a Monte Carlo generation scheme, there are various workarounds, such as while(sum(vec)!=0) {sample}, for example. More info on the error circumstances would help. Good luck!

Ken Hutchison

On Sep 15, 2554 BE, at 11:41 AM, Bonnett, Laura l.j.bonn...@liverpool.ac.uk wrote:

Hi Steve,

Thanks for your response. The slight issue is that I need to use a different starting seed for each simulation. If I use 'lapply' then I end up using the same seed each time. (By contrast, I need to be able to specify which starting seed I am using.)

Thanks,
Laura

-----Original Message-----
From: Steve Lianoglou [mailto:mailinglist.honey...@gmail.com]
Sent: 15 September 2011 16:17
To: Bonnett, Laura
Cc: r-help@r-project.org
Subject: Re: [R] Where to put tryCatch or similar in a very big for loop

Hi Laura,

On Thu, Sep 15, 2011 at 10:53 AM, Bonnett, Laura l.j.bonn...@liverpool.ac.uk wrote:

Dear all,

I am running a simulation study to test variable imputation methods for Cox models using R 2.9.0 and Windows XP. The code I have written (which is rather long) works (if I set nsim = 9) with the following starting values.

    bootrs(nsim=9,lendevdat=1500,lenvaldat=855,ac1=-0.19122,bc1=-0.18355,cc1=-0.51982,cc2=-0.49628,eprop1=0.98,eprop2=0.28,lda=0.003)

I need to run the code 1400 times in total (bootstrap resampling). However, occasionally the random numbers generated lead to a singularity, and hence the code crashes because one of the Cox models cannot be fitted (the 10th iteration is the first time this happens).

I've been trawling the internet for ideas and it seems that there are several options in the form of try() or tryCatch() or next. I'm not sure, however, how to include them in my code (attached). Ideally I'd like it to run every simulation from 1 to 1400 and, if there is an error at some point, get an error message returned (I need to count how many there are) but move on to the next number in the loop.

I've tried putting try(,silent=TRUE) around each Cox model (cph statement) but that hasn't worked, and I've also tried putting try around the whole for loop without any success.

[Steve's reply:]

Let's imagine you are using an `lapply` instead of `for`, only because I guess you want to store the results of `bootrs` somewhere; you can adapt this to your `for` solution. I typically return NULL when an error is caught, then filter those out from my results, or whatever you like:

    results <- lapply(1:1400, function(i) {
        tryCatch(bootrs(...whatever...), error=function(e) NULL)
    })
    went.south <- sapply(results, is.null)

The `went.south` vector will be TRUE where an error occurred in your bootrs call.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
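The two suggestions in this thread can be combined into one for loop that uses an explicit seed per simulation, continues after an error, and counts the failures. This is a minimal sketch: one_sim is a made-up stand-in for the real bootrs() call, and the seed scheme 1000 + i is an arbitrary assumption.

```r
# Sketch with a stand-in for the real simulation: `one_sim` is a
# hypothetical function that fails for some seeds.
one_sim <- function(seed) {
  set.seed(seed)                       # explicit, reproducible seed per run
  x <- rnorm(5)
  if (x[1] > 1) stop("singular fit")   # simulated failure
  mean(x)
}

nsim        <- 50
result      <- rep(NA_real_, nsim)     # NA marks a failed simulation
failed      <- logical(nsim)
message_log <- character(nsim)

for (i in seq_len(nsim)) {
  out <- tryCatch(one_sim(seed = 1000 + i),
                  error = function(e) e)     # return the condition object
  if (inherits(out, "error")) {
    failed[i]      <- TRUE
    message_log[i] <- conditionMessage(out)  # keep the error text
  } else {
    result[i] <- out
  }
}

n_failed <- sum(failed)   # number of simulations that raised an error
```

Returning the condition object instead of NULL keeps the error message for later inspection, while the loop still moves on to the next iteration.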
Re: [R] odfWeave: Combining multiple output statements in a function
Page 7 in my version of formatting.odt (to be sure I have the right version, I downloaded the latest odfWeave from CRAN) discusses registering style definitions and contains "Examples of Changing Styles for Tables, Paragraphs, Bullets and Pages", which have nothing to do with my question (as far as I can tell). Could you perhaps just tell me how I should combine the output of multiple odf* calls inside a function?

Thanks again.
Jan

Quoting Max Kuhn mxk...@gmail.com:

formatting.odf, page 7. The results are in formattingOut.odt

--
Max
[R] odfWeave: Combining multiple output statements in a function
What is the correct way to combine multiple calls to odfCat, odfItemize, odfTable etc. inside a function?

As an example, let's say I have a function that needs to write two paragraphs of text and a list to the resulting odf document (the real function has much more complex logic, but I don't think that's relevant). My first guess would be:

    exampleOutput <- function() {
        odfCat("This is the first paragraph")
        odfCat("This is the second paragraph")
        odfItemize(letters[1:5])
    }

However, calling this function in my odf document only generates the last list, as only the output of the odfItemize function is returned by exampleOutput. How do I combine the three results into one to be returned by exampleOutput?

I tried to wrap the calls to the odf* functions in a print statement:

    exampleOutput2 <- function() {
        print(odfCat("This is the first paragraph"))
        print(odfCat("This is the second paragraph"))
        print(odfItemize(letters[1:5]))
    }

In another document this seemed to work, but in my current document strange odf output is generated.

Regards,
Jan
Re: [R] odfWeave: Combining multiple output statements in a function
Max,

Thank you for your answer. I have had another look at the examples (I already had before mailing the list), but could not find the example you mention. Could you perhaps tell me which example I should have a look at?

Regards,
Jan

On 09/15/2011 04:47 PM, Max Kuhn wrote:

There are examples in the package directory that explain this.
[R] list of all methods within an S4 class
Hello,

How can I generate an overview/vector of all the methods within an S4 class? Similar to dir() in this Python code:

    class SomeClass():
        def some_method_1(self):
            pass
        def some_method_2(self):
            pass

    dir(SomeClass)
    ['__doc__', '__module__', 'some_method_1', 'some_method_2']

Thanks in advance!

Cheers!!
Albert-Jan

~~
All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us?
~~
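A possible starting point for this question, offered as a sketch rather than as an answer from the list: in S4, methods belong to generic functions rather than to classes, so the closest analogue of dir() is to list the generics that have a method for the class. The class Shape and the generics area and perimeter below are invented for illustration; showMethods() is part of the methods package.

```r
library(methods)

# Illustrative class and generics (names are made up for this sketch).
setClass("Shape", representation(side = "numeric"))
setGeneric("area", function(object) standardGeneric("area"))
setGeneric("perimeter", function(object) standardGeneric("perimeter"))
setMethod("area", "Shape", function(object) object@side^2)
setMethod("perimeter", "Shape", function(object) 4 * object@side)

# Print every generic that has a method defined for class "Shape".
showMethods(classes = "Shape")

# The same listing captured as a character vector for further processing.
listing <- capture.output(showMethods(classes = "Shape"))
```

The captured listing is plain text, so extracting just the generic names would need some string processing on the "Function:" lines.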
[R] Question about object permanence/marshalling
Hello,

I am trying to write some code that dumps R objects to the hard disk in a binary format so they can be quickly re-used later. The goal is to save time. The objects may be quite large (e.g. classes for a GUI). I was thinking that save() and load() would be suitable for this (until now I thought they could only be used for 'real' data, e.g. matrices, data.frames etc.), and I am hoping any object can be 'marshalled' using these functions. Probably I am doing something wrong in the unmarshal() function, perhaps with assign().

Thank you in advance!
AJ

    #
    # Creation of test data
    #
    setClass(
        Class="Test",
        representation=representation(
            amounts="data.frame"
        )
    )
    setMethod(
        f="initialize",
        signature="Test",
        definition=function(.Object, amounts){
            .Object@amounts <- amounts
            return(.Object)
        }
    )
    setGeneric(
        name="doStuff",
        def=function(.Object){standardGeneric("doStuff")}
    )
    setMethod(
        f="doStuff",
        signature="Test",
        definition=function(.Object) {
            return(mean(.Object@amounts, na.rm=TRUE))
        }
    )
    print(objects())
    instance <- new(Class="Test", data.frame(amount=runif(10, 0, 10)))
    doStuff(instance)

    #
    # actual code (incomplete)
    #
    marshal <- function(object) {
        fn <- file.path(Sys.getenv()["TEMP"], paste(object, ".xdr", sep=""))
        save(object, file=fn, compress=FALSE)
        print(sprintf("Saving %s", fn))
    }
    unmarshal <- function(xdr) {
        object <- strsplit(strsplit(xdr, "\\.")[[1]][[1]], "/")
        object <- object[[1]][length(object[[1]])]
        assign(object, load(xdr))
        print(sprintf("Loading %s", xdr))
    }
    print(objects())
    lapply(c("doStuff", "instance"), marshal)
    rm(list=c("doStuff", "instance"))
    xdrs <- Sys.glob(file.path(Sys.getenv()["TEMP"], "*.xdr"))
    lapply(xdrs, unmarshal)
    print(objects())
    ## doStuff and instance do not appear! :-(

Cheers!!
Albert-Jan
Re: [R] Question about object permanence/marshalling
Hi Jim,

Thanks for your reply. It seems that save and load can only be used for datasets (as the title in ?load suggests). I'd be very glad if I'm mistaken though!

Cheers!!
Albert-Jan

From: jim holtman jholt...@gmail.com
To: Albert-Jan Roskam fo...@yahoo.com
Cc: R Mailing List r-help@r-project.org
Sent: Thursday, August 25, 2011 4:07 PM
Subject: Re: [R] Question about object permanence/marshalling

The problem, I think, is in your unmarshal. 'load' will load the object into the local environment, not the global one. You have to explicitly return it, which means you have to know the name it was 'save'd under:

    unmarshal <- function(xdr) {
        object <- strsplit(strsplit(xdr, "\\.")[[1]][[1]], "/")
        object <- object[[1]][length(object[[1]])]
        load(xdr)
        print(sprintf("Loading %s", xdr))
        # return the object that was just loaded; don't know if they all have the same name.
    }

--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
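Jim's point about environments can be sketched as follows. Two things are changed relative to the code in the question, and both are assumptions about what was intended: save() is called with list = so that the object with the given name is saved (rather than the string holding the name), and load() gets an explicit envir so the restored object lands in the caller's environment instead of the function's local frame. The file location under tempdir() is likewise an assumption.

```r
# Sketch: save an object under its own name, then load it back into a
# chosen environment instead of the function's local frame.
marshal <- function(name, dir = tempdir(), envir = parent.frame()) {
  fn <- file.path(dir, paste0(name, ".xdr"))
  save(list = name, file = fn, envir = envir)   # save the object called `name`
  fn                                            # return the file path
}

unmarshal <- function(fn, envir = parent.frame()) {
  load(fn, envir = envir)    # returns the names of the restored objects
}

amounts <- data.frame(amount = c(1, 2, 3))
path <- marshal("amounts")
rm(amounts)                  # the object is gone ...
restored <- unmarshal(path)  # ... and comes back under its original name
exists("amounts")
```

save() and load() are not limited to datasets; any R object can be stored this way. For a single object, saveRDS() and readRDS() are an alternative that returns the object directly, so its original name does not need to be known.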
[R] How to colour specific edges in a dendrogram
Dear mailing list,

I used hclust to make a dendrogram of 2613 leaves. I also have a list with the names of certain labels which are of interest, and I would like to visualize where they appear within the dendrogram. I found an example of how to use dendrapply to colour the labels, but the problem is that with 2613 leaves I cannot plot the labels, as it gets super messy. I then tried to write a function using dendrapply() to colour the edges of the leaves of interest red. Unfortunately, I failed to write this function. Could someone help me out with the stub of a function for colouring edges?

I have:
- the dendrogram
- a list of labels whose edges should be coloured

I would like to colour the edges between the final leaf node and its parent node.

Thank you very much for your help!
Jan
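A stub along the lines requested might look like the following. It is a sketch on the built-in USArrests data, not the poster's 2613-leaf tree, and the vector wanted stands in for the list of labels of interest. It relies on plot.dendrogram using a node's edgePar attribute for the edge that leads down to that node.

```r
# Sketch using the built-in USArrests data; `wanted` stands in for the
# list of labels of interest.
hc   <- hclust(dist(USArrests))
dend <- as.dendrogram(hc)
wanted <- c("Texas", "Ohio", "Maine")

colour_edge <- function(node) {
  if (is.leaf(node) && attr(node, "label") %in% wanted) {
    # edgePar on a leaf styles the edge from its parent down to the leaf
    attr(node, "edgePar") <- list(col = "red")
  }
  node
}
dend_red <- dendrapply(dend, colour_edge)

# Collect the labels of all leaves that were marked, as a check.
marked <- character(0)
invisible(dendrapply(dend_red, function(node) {
  if (is.leaf(node) && identical(attr(node, "edgePar")$col, "red"))
    marked <<- c(marked, attr(node, "label"))
  node
}))

if (interactive()) plot(dend_red, leaflab = "none")  # labels suppressed
```

With leaflab = "none" the labels are not drawn, so only the red edges mark the leaves of interest, which should stay readable even for a large tree.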
[R] time series question
Hi,

I have a long data file with time data that I change to wide format using reshape. The data contain Values and Factors. Some values are missing but can be obtained by multiplying the value of year T-1 by the Factor of year T. Sometimes multiple successive years have no values, so the missing values need to be calculated sequentially.

    library(reshape)

    # sample data
    DF <- data.frame(Var=rep(letters, 10), Fac=rep(runif(26), 10), Val=rep(runif(26), 10), Year=rep(2000:2009, 26))
    DF[as.numeric(rownames(DF)) %% 3 == 0, "Val"] <- NA  # make some holes
    DF2 <- cast(melt(DF, id=c("Var", "Year")), ... ~ variable + Year, fun=mean, na.rm=T)

    # my attempt
    prev <- grep("Val_", names(DF2)) - 1
    this <- grep("Fac_", names(DF2))
    DF3 <- DF2
    DF3[, prev] <- mapply("*", DF2[, this], DF2[, prev])

This doesn't work. Another option would be to use two loops for cols and rows, but I didn't get that to work either :-( Suggestions for clean code, anyone?

Thank you in advance!

Cheers!!
Albert-Jan
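The sequential rule stated in the question, Val[T] = Val[T-1] * Fac[T] for a missing Val[T], can be sketched for a single series and then applied row by row. The function name fill_forward and the small matrices are illustrative assumptions, with one row per variable and one column per year; a leading missing value stays NA because there is no previous year to start from.

```r
# Sketch: fill gaps in one series, left to right, using
# Val[t] = Val[t-1] * Fac[t] whenever Val[t] is missing.
fill_forward <- function(val, fac) {
  for (t in seq_along(val)[-1]) {
    if (is.na(val[t])) val[t] <- val[t - 1] * fac[t]
  }
  val
}

val <- c(10, NA, NA, 5, NA)
fac <- c(1, 2, 0.5, 1, 3)
filled <- fill_forward(val, fac)
# 10, 10*2 = 20, 20*0.5 = 10, 5, 5*3 = 15

# Row-wise application to wide data (rows = variables, columns = years).
val_m <- rbind(val, c(NA, 1, NA, NA, 2))
fac_m <- rbind(fac, c(1, 1, 2, 2, 1))
filled_m <- t(sapply(seq_len(nrow(val_m)),
                     function(i) fill_forward(val_m[i, ], fac_m[i, ])))
```

The explicit loop is deliberate: each filled value may depend on a value filled one step earlier, which a single vectorised multiplication such as the mapply() attempt cannot express.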
[R] Standards for delivery of GPL software in CRAN packages
I wondered if there were standard practices in CRAN for delivery of R source implementing functions in R packages. I have encountered a couple of packages where the gzipped version of the source contains very little, primarily the help files describing the functions in the package. In some cases I can find the source as the value of the function name.

Given that these packages are released as GPL, oughtn't the unoptimized source be freely available, hopefully with comments? Am I missing something? Is there a central place other than the mirrors where such source is retained? Sourceforge?

- Jan
Re: [R] Standards for delivery of GPL software in CRAN packages
Fine. Attached. It's waved. All it has is *.Rd files. Apparently the functions are collected in functionINIT.R. But 00Index and DESCRIPTION are not helpful.

- j

-----Original Message-----
From: b.rowling...@googlemail.com [mailto:b.rowling...@googlemail.com] On Behalf Of Barry Rowlingson
Sent: Monday, June 27, 2011 10:18 AM
To: Galkowski, Jan
Cc: r-help@r-project.org
Subject: Re: [R] Standards for delivery of GPL software in CRAN packages

On Mon, Jun 27, 2011 at 1:24 PM, Galkowski, Jan jgalk...@akamai.com wrote:
> I wondered if there were standard practices in CRAN for delivery of R source implementing functions in R packages. I has encountered a couple of packages where the gzipped version of source contains very little, primarily the Help files describing the functions in the package. In some cases I can find the source as the value of the function name.
> Given that these packages are released as GPL, oughtn't the unoptimized source be freely available, hopefully with comments? Am I missing something? Is there a central place other than mirrors where such source is retained? Sourceforge?

The 'package source' link on CRAN should point you to a tar.gz file that contains the source code. For example, for splancs off the heanet mirror it is:

http://ftp.heanet.ie/mirrors/cran.r-project.org/src/contrib/splancs_2.01-27.tar.gz

.tar.gz files from those links should have full R, C and Fortran source code. I think we need counter-examples...

Barry
Re: [R] Standards for delivery of GPL software in CRAN packages
No, you are correct. It meets the letter of the GPL. Took me a while to find functionINIT.R though. As I wrote in the original, source is available, if only by keying in function names and seeing their value. Was hoping for greater clarity.

I of course have the paper and the help. I was trying to understand what this eta parameter was and how to interpret it.

As mentioned, this is not the first package for which this has been an issue.

Thanks,
- Jan

----- Original Message -----
From: Gavin Simpson gavin.simp...@ucl.ac.uk
To: Galkowski, Jan
Cc: Barry Rowlingson b.rowling...@lancaster.ac.uk; r-help@r-project.org
Sent: Mon Jun 27 11:36:57 2011
Subject: Re: [R] Standards for delivery of GPL software in CRAN packages

On Mon, 2011-06-27 at 11:14 -0400, Galkowski, Jan wrote:
> Fine. Attached. It's waved. All it has is *.Rd files. Apparently the functions are collected in functionINIT.R. But 00Index and DESCRIPTION are not helpful.
> - j

The Rd files are the help or manual pages for the functions defined in the package. In this particular case, the package author has decided to put all the R code for their package into a single R source file, functionINIT.R. The other two files are R-specific files, the latter of which is used to describe the package; which, incidentally, points you to a peer-reviewed paper that the package is support for.

I don't recall the GPL mentioning anything requiring that the source code be helpful. The authors have most certainly fulfilled their requirements under GPL, as has CRAN in distributing the package sources.

Or am I being obtuse and completely missing your point?

G

Dr. Gavin Simpson
ECRC, UCL Geography, Pearson Building, Gower Street, London, UK. WC1E 6BT.
[t] +44 (0)20 7679 0522 [f] +44 (0)20 7679 0565 [e] gavin.simpsonATNOSPAMucl.ac.uk
[w] http://www.ucl.ac.uk/~ucfagls/ [w] http://www.freshwaters.org.uk
Re: [R] Standards for delivery of GPL software in CRAN packages
Regarding the subject, I want to thank the many respondents for clarifying the nature of the relationship between R and the GPL, as well as giving help with the structure of R-delivered source. I want to emphasize I meant nothing at all harsh or accusatory in my email. I did say I had access to source, as function name values, just that my expectation was to see functions called out individually.

In the particular case, I sought information about a maxiset threshold parameter in the package waved for the function WaveD and what it meant, trying to understand the related algorithms. I'm now convinced that I'll need to understand the original papers, Cavalier and Raimondo (2007) and Donoho and Raimondo (2004), as well as Johnstone, Kerkyacharian, Picard, and Raimondo (2004), in order to obtain a satisfactory answer.

I think my reaction was in response to being rather spoiled by some of the really excellent, world class, and mature packages and their documentation elsewhere in the R contributions library, some backed up by whole textbooks. I realize all package authors do their best and the packages are thoroughly tested. I never had any question that the package was correct; I was merely trying to understand how it worked. When I wasn't satisfied by the documentation in the package, I turned to the source.

Again, I meant no offense to anyone. I thank you all for your responses and efforts, and am grateful.

- Jan
Re: [R] Ranking submodels by AIC (more general question)
Alexandra,

Have a look at add1 and drop1.

Regards,
Jan

On 06/23/2011 07:32 PM, Alexandra Thorn wrote:
> Here's a more general question following up on the specific question I asked earlier:
> Can anybody recommend an R command other than mle.aic() (from the wle package) that will give back a ranked list of submodels? It seems like a pretty basic piece of functionality, but the closest I've been able to find is stepAIC(), which as far as I can tell only gives back the best submodel, not a ranking of all submodels.
> Thanks in advance,
> Alexandra
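For a small number of candidate predictors, a ranked list of all submodels can also be built by hand in base R. The following is only a sketch under assumptions: the response (mpg), the predictors, and the mtcars data are illustrative choices, not taken from the original question, and exhaustive enumeration grows exponentially with the number of predictors.

```r
# Hypothetical candidate predictors for the response mpg in mtcars.
preds <- c("wt", "hp", "qsec")

# All non-empty subsets of the predictors.
subsets <- unlist(
  lapply(seq_along(preds), function(k) combn(preds, k, simplify = FALSE)),
  recursive = FALSE
)

# One formula string per subset, e.g. "mpg ~ wt + hp".
formulas <- vapply(
  subsets,
  function(s) paste("mpg ~", paste(s, collapse = " + ")),
  character(1)
)

# Fit each submodel and record its AIC.
aics <- vapply(
  formulas,
  function(f) AIC(lm(as.formula(f), data = mtcars)),
  numeric(1)
)

# Ranked table: smallest AIC first.
ranking <- data.frame(model = formulas, AIC = unname(aics),
                      stringsAsFactors = FALSE)
ranking <- ranking[order(ranking$AIC), ]
```

add1() and drop1() give the same kind of AIC comparison, but only for models one term away from the current fit.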
Re: [R] Documenting variables, dataframes and files?
The memisc package also offers functionality for documenting data.

Jan

On 06/22/2011 04:57 PM, Robert Lundqvist wrote:
> Every now and then I realize that my attempts to document what all dataframes consist of are insufficient. So far, I have been writing notes in an external file. Are there any better ways to do this within R? One possibility could be to set up the data as packages, but I would like to have a solution on a lower level, closer to data. I can't find any pointers in the standard manuals. Suggestions are most welcome.
> Robert
> Robert Lundqvist, Norrbotten regional council, Sweden
Re: [R] omitting columns from a data frame
But isn't this version of which() typo-proof?

  x <- iris[-which(c("Sepal.Length", "SSSepal.Width") %in% names(iris))]

Btw, I prefer the following, i.e. simply assigning NULL. Much easier notation.

  y <- iris
  y$Sepal.Width <- y$SSSepal.Width <- NULL

Cheers!!
Albert-Jan

From: Joshua Wiley jwiley.ps...@gmail.com
To: Ista Zahn iz...@psych.rochester.edu
Cc: R-Help r-h...@stat.math.ethz.ch
Sent: Tue, June 21, 2011 5:05:27 PM
Subject: Re: [R] omitting columns from a data frame

On Tue, Jun 21, 2011 at 6:57 AM, Ista Zahn iz...@psych.rochester.edu wrote:
> I would caution people not to use the -which strategy, because entering a value that doesn't exist as a column name returns a zero-column data.frame, without so much as a warning. This can be a problem if you don't know if a column exists but just want to make sure it doesn't, or if you make a typo. Compare
>
>   head(mtcars[, -which(names(mtcars) == "make.sure.to.delete")])
>
> to
>
>   head(mtcars[, setdiff(names(mtcars), "make.sure.to.delete")])
>
> Best,
> Ista

Good point. In some ways, I am a little unsettled by setdiff(), because if you make a typo, you may *think* you have omitted it, and you will have a sensible data frame, but it will actually still be there. I am particularly thinking of the case where you are omitting several variables at once:

  mtcars[setdiff(names(mtcars), c("disp", "jp"))]

which is why my current preference has been match(). The default for no match fails spectacularly if the variable does not exist:

  mtcars[-match(c("disp", "jp"), names(mtcars))]

Of course, this would not work for your example of a variable you just want to make sure is deleted. Anyone have thoughts on pitfalls of match?

Josh

On Tue, Jun 21, 2011 at 12:22 AM, Joshua Wiley jwiley.ps...@gmail.com wrote:
> On Mon, Jun 20, 2011 at 8:55 PM, Erin Hodgess erinm.hodg...@gmail.com wrote:
> > Too funny! how about subset?
>
> Sure, that is one option. Each of the following will also work. The ones wrapped with c() can easily omit more than one at a time.
>
>   mtcars[, -which(names(mtcars) == "drat")]
>   mtcars[, names(mtcars) != "drat"]
>   mtcars[, !names(mtcars) %in% c("drat")]
>   mtcars[, -match(c("drat"), names(mtcars))]

On Mon, Jun 20, 2011 at 10:52 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:
> Hi Erin, See inline.
>
> On Mon, Jun 20, 2011 at 8:45 PM, Erin Hodgess erinm.hodg...@gmail.com wrote:
> > Dear R People: I have a data frame, xm1, which has 12 rows and 4 columns. If I put in xm1[,-4], I get all rows, and columns 1 - 3, which is as it should be.
>
> Okay, so you know how to use the column number to omit columns.
>
> > Now, is there a way to use the names of the columns to omit them, please?
>
> You have all the pieces (the column names, and the knowledge that you can omit columns by their index). Homework: find a way to return the column numbers given the column names (hint).
>
> Cheers, Josh
>
> > Sincerely, Erin
> > -- Erin Hodgess, Associate Professor, Department of Computer and Mathematical Sciences, University of Houston - Downtown, mailto: erinm.hodg...@gmail.com

-- Joshua Wiley, Ph.D. Student, Health Psychology, University of California, Los Angeles, http://www.joshuawiley.com/
-- Ista Zahn, Graduate student, University of Rochester, Department of Clinical and Social Psychology, http://yourpsyche.org
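The trade-off discussed in this thread (silent zero-column results with -which(), silently ignored typos with setdiff()) can be avoided with an explicit check. The helper below is a hypothetical sketch, not a function from base R or any package: it stops with an error when a requested column does not exist and otherwise drops the columns by name.

```r
# Hypothetical helper: drop columns by name, failing loudly on unknown names.
drop_cols <- function(data, cols) {
  missing_cols <- setdiff(cols, names(data))
  if (length(missing_cols) > 0) {
    stop("no such column(s): ", paste(missing_cols, collapse = ", "))
  }
  data[setdiff(names(data), cols)]
}

# Drops two existing columns.
smaller <- drop_cols(mtcars, c("disp", "hp"))

# The -which() pitfall from the thread: an unknown name selects no columns.
empty <- mtcars[, -which(names(mtcars) == "make.sure.to.delete")]
```

For the "just make sure it is gone" use case, where a missing column should not be an error, plain setdiff() on the names remains the simpler choice.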
Re: [R] is this a bug?
Thanks a lot to all who responded. This is a little less confusing now, although it's hard for me to fathom the (practical) use of a data frame within a data frame. If one mixes different notations or, put a different way, different underlying classes (data.frame vs. numeric), these rather unintuitive results appear. So I'll use either of these:

  df$pct <- df$weight / ave(df$weight, df$sex, FUN=sum)*100
  df["pct"] <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100

Using str() is very insightful, as is using class(). I'd prefer it if R simply generated an error when one attempts to nest a data.frame within a data.frame.

Thanks again!

Cheers!!
Albert-Jan

From: Brian Diggs dig...@ohsu.edu
To: R-help@r-project.org
Sent: Fri, June 17, 2011 11:58:44 PM
Subject: Re: [R] is this a bug?

On 6/17/2011 2:24 PM, (Ted Harding) wrote:
> And the extra twist in the tale is exemplified by this mini-version of Albert-Jan's first example:
>
>   DF <- data.frame(A=c(1,2,3))
>   DF$B <- c(4,5,6)
>   DF$C <- c(7,8,9)
>   DF
>   #   A B C
>   # 1 1 4 7
>   # 2 2 5 8
>   # 3 3 6 9
>   DF$D <- DF["A"]/DF["B"]
>   DF
>   #   A B C    A
>   # 1 1 4 7 0.25
>   # 2 2 5 8 0.40
>   # 3 3 6 9 0.50
>   ## And why:
>   DF["A"]/DF["B"]
>   #      A
>   # 1 0.25
>   # 2 0.40
>   # 3 0.50
>   ## So the ratio DF["A"]/DF["B"] comes out with the name of
>   ## the numerator, A. This is then the name given to DF$D

It's even slightly weirder than that:

  str(DF)
  # 'data.frame': 3 obs. of 4 variables:
  #  $ A: num 1 2 3
  #  $ B: num 4 5 6
  #  $ C: num 7 8 9
  #  $ D:'data.frame': 3 obs. of 1 variable:
  #   ..$ A: num 0.25 0.4 0.5

There is a column D in DF which is itself a data frame with a single column whose name is A (because of what Ted said). When formatted for printing, the column name of the inner data frame is used (as a result of how data.frame() itself handles named arguments when the argument is itself a data.frame: "If a list or data frame or matrix is passed to data.frame it is as if each component or column had been passed as a separate argument...").

So not a bug, but a convoluted set of circumstances that can happen when non-atomic vectors are assigned to columns of a data.frame. That's one of those "you shouldn't do that even though it is technically legal, or at least you shouldn't be surprised when things don't work the way you thought they would" things.

> Thus Albert-Jan's
>
>   df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
>
> comes through with name "weight".
>
> Ted.
>
> On 17-Jun-11 21:06:42, William Dunlap wrote:
> > df$varname is a column of df.
> > df["varname"] is a one-column df containing that column.
> > df[["varname"]] is a column of df (same as df$varname).
> > df[,"varname"] is a column of df (same as df$varname).
> > df[,"varname",drop=FALSE] is a one-column df (same as df["varname"]).
> > df$newVarname <- df["varname"] inserts a new component into df, the component being a one-column data.frame, not the column in that data.frame.
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
[R] is this a bug?
Hello,

Is the following a bug? I always thought that df$varname <- does the same as df["varname"] <-

  df <- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
  df$pct <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100
  names(df)
  [1] "weight" "sex"    "pct"     ### <-- ok
  head(df)
    weight sex    weight          ### <-- huh!?!
  1     86   0 2.4002233
  2     19   1 0.5643006
  3     32   0 0.8931063
  4     87   0 2.4281328
  5     45   0 1.2559308
  6     95   0 2.6514094

  rm(df)
  df <- data.frame(weight=round(runif(10, 10, 100)), sex=round(runif(100, 0, 1)))
  df["pct"] <- df["weight"] / ave(df["weight"], df["sex"], FUN=sum)*100 ### <-- this does work
  names(df)
  [1] "weight" "sex"    "pct"
  head(df)
    weight sex       pct
  1     15   0 0.5246590
  2     43   0 1.5040224
  3     17   1 0.9284544
  4     44   1 2.4030584
  5     76   1 4.1507373
  6     59   0 2.0636586

  do.call(c, R.Version())
  platform        i686-pc-linux-gnu
  arch            i686
  os              linux-gnu
  system          i686, linux-gnu
  status
  major           2
  minor           11.1
  year            2010
  month           05
  day             31
  svn rev         52157
  language        R
  version.string  R version 2.11.1 (2010-05-31)

Thanks!

Cheers!!
Albert-Jan
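The difference at the heart of this thread can be reproduced in a few lines. This is a small illustrative sketch with made-up data: single-bracket indexing returns a one-column data frame, double-bracket indexing returns the column vector, and only the latter gives an ordinary column when assigned with $.

```r
DF <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))

# Single brackets keep the data.frame class; the result is named after "A".
one_col_df <- DF["A"] / DF["B"]

# Double brackets (like $) extract the underlying numeric vectors.
plain_vec <- DF[["A"]] / DF[["B"]]

# Assigning the vector with $ creates an ordinary numeric column.
DF$D <- plain_vec
```

Checking class(one_col_df) and class(plain_vec), or running str(DF), makes the distinction visible before anything is assigned.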
Re: [R] Running a GMM Estimation on dynamic Panel Model using plm-Package
Hi!

On 12.06.2011 21:43, bstudent wrote:
> Error in solve.default(Reduce("+", A2)) : system is computationally singular: reciprocal condition number = 4.08048e-22

Just for the record: I had the same error with my data and finally gave up and used Stata.

Kind regards and good luck!
Jan
[R] reshape::cast: invalid 'yinds' argument
Hi,

I'm using reshape to cast molten data. When I use the following command, R either crashes (when I use Notepad++) or gives an error (when I use Rgui or source()), BUT the error does not always occur, maybe only on half the attempts:

  w <- cast(v, id + code + productname + year + begin + end + specificDesc + specificDesc2 ~ type)
  Error in merge.data.frame(data, all.combinations, by = unlist(vars), sort = FALSE, : invalid 'yinds' argument

What does this message mean, and how can I get rid of the error? I tried changing the colvars from character to factor, but that didn't help. I'm using R 2.10.1 and either WinXP or Win2000.

Thanks in advance,
Albert-Jan
Re: [R] NaN, Inf to NA
Aha! Thank you very much for that clarification! It would be much more user friendly if R generated a NotImplementedError or something similar. The 'garbage results' are pretty misleading, esp. to a novice.

I wanted to recode every NaN and Inf value of an entire data.frame to NA. The data.frame also includes character variables. So the following might work (?) (can't test it here):

  ditch <- function(x) ifelse(is.infinite(x) | is.nan(x), NA, x)
  df <- apply(df, 2, ditch)

From: William Dunlap wdun...@tibco.com
Cc: R Mailing List r-help@r-project.org
Sent: Fri, May 27, 2011 12:57:01 AM
Subject: RE: [R] NaN, Inf to NA

I think the source of the OP's problem is that while things like df<30 and is.na(df) return a logical matrix with the dimensions of the data.frame df, both is.infinite(df) and is.nan(df) return a logical vector as long as the number of columns of df. (`<` and is.na have data.frame methods but is.infinite and is.nan do not: the latter give garbage results for data.frames.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Marc Schwartz
Sent: Thursday, May 26, 2011 2:15 PM
To: Albert-Jan Roskam
Cc: R Mailing List
Subject: Re: [R] NaN, Inf to NA

On May 26, 2011, at 3:18 PM, Albert-Jan Roskam wrote:
> Hi, I want to recode all Inf and NaN values to NA, but I'm surprised to see the result of the following code. Could anybody enlighten me about this?
>
>   df <- data.frame(a=c(NA, NaN, Inf, 1:3))
>   df[is.infinite(df) | is.nan(df)] <- NA
>   df
>       a
>   1  NA
>   2 NaN
>   3 Inf
>   4   1
>   5   2
>   6   3
>
> Thanks! Cheers!! Albert-Jan

The canonical way is to use is.na() to assign the NA value based upon a condition. See ?is.na for more information.

  is.na(df$a) <- !is.finite(df$a)
  df
     a
  1 NA
  2 NA
  3 NA
  4  1
  5  2
  6  3

HTH,
Marc Schwartz
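For a whole data frame that mixes numeric and character columns, one option is to apply the replacement column by column and only to the numeric columns, so the character columns are left untouched and the result stays a data frame. The example data below are made up for illustration.

```r
# Hypothetical data: one numeric column with NA/NaN/Inf, one character column.
df <- data.frame(a = c(NA, NaN, Inf, 1:3),
                 b = c("x", "y", "z", "u", "v", "w"),
                 stringsAsFactors = FALSE)

# Identify the numeric columns.
num <- vapply(df, is.numeric, logical(1))

# In each numeric column, set everything that is not finite
# (NA, NaN, Inf, -Inf) to NA; other columns are not touched.
df[num] <- lapply(df[num], function(x) {
  x[!is.finite(x)] <- NA
  x
})
```

Unlike apply(), which first coerces the data frame to a matrix (a character matrix when any column is character), this keeps every column's type.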
[R] NaN, Inf to NA
Hi,

I want to recode all Inf and NaN values to NA, but I'm surprised to see the result of the following code. Could anybody enlighten me about this?

  df <- data.frame(a=c(NA, NaN, Inf, 1:3))
  df[is.infinite(df) | is.nan(df)] <- NA
  df
      a
  1  NA
  2 NaN
  3 Inf
  4   1
  5   2
  6   3

Thanks!

Cheers!!
Albert-Jan
[R] Gui immediately closes when started from command-line
Hello,

I want to run an R script that contains code for a GUI (rgtk) on the command line (Windows 2000, 32 bits) using R 2.10.1, but the GUI disappears a few milliseconds after I start the program. What switch should I use to prevent this? I tried r.exe, rterm.exe and rscript.exe with various combinations of switches, but none of them works.

TIA

Cheers!!
Albert-Jan
Re: [R] Gui immediately closes when started from command-line
Thanks, we tried it, but it didn't solve the problem. Some more info (mostly strings of ) was shown in the DOS box, but that was all.

Cheers!!
Albert-Jan

From: Jonathan Gabris jonat...@k-m-p.nl
To: r-help@r-project.org
Sent: Thu, May 19, 2011 9:54:13 AM
Subject: Re: [R] Gui immediately closes when started from command-line

I had a problem similar to this, I think, though I cannot remember the symptoms. It had something to do with the lack of possible interaction with the console, as I was using R as a backend to a Qt interface. To solve the problem I used the flag '--ess' (using '--vanilla' is also a good idea); cf. Appendix B, "Invoking R", in one of the R manuals.

Hope this helps.
Jonathan.
Re: [R] Unexpected behaviour as.data.frame
Santosh, Ivan,

This is also what I was looking for. Thanks. Looking at the source of dataFrame.default, it seems that it uses the same approach as I did: first create a list, then a data.frame from that list. I think I'll stick with the code I already had, as I don't want another dependency (multiple actually for R.utils). But thanks again for pointing it out.

Jan

On 05/16/2011 10:42 AM, Santosh Srinivas wrote:
> Hi Ivan,
> Take a look at dataFrame in R.utils ... is that what you want? From the help file:
>
>   df <- dataFrame(colClasses=c(a="integer", b="double"), nrow=10)
>   df[,1] <- sample(1:nrow(df))
>   df[,2] <- rnorm(nrow(df))
>   print(df)
>
> Thanks, Santosh

On Mon, May 16, 2011 at 1:42 PM, Ivan Calandra ivan.calan...@uni-hamburg.de wrote:
> I feel like I'm always asking this type of question, but is it possible to add a base function that allows creating an empty data.frame, as matrix() does? What I mean would be something like: create.data.frame(number_of_columns, mode_of_columns). I think it would make things easier than creating one or several matrices and then combining them. Is it possible; does it make sense?
> Ivan

On 5/15/2011 22:17, Bert Gunter wrote:
> Inline below.
>
> On Sun, May 15, 2011 at 11:11 AM, Jan van der Laan rh...@eoos.dds.nl wrote:
> > Thanks. I also noticed myself minutes after sending my message to the list. My 'please ignore my question it was just a stupid typo' message was sent with the wrong account and is now awaiting moderation. However, my other question still stands: what is the preferred/fastest/simplest way to create a data.frame with given column types and dimensions?
>
> I do not know, but why is simply
>
>   data.frame(numeric(10), character(10), integer(10), stringsAsFactors=FALSE)
>
> not acceptable? Note that if you had, say, 500 numeric (= double) and 100 character columns to add, you might do something like:
>
>   z <- matrix(numeric(5000), nr=10)
>   u <- matrix(character(1000), nr=10)
>   frm <- data.frame(z, u, stringsAsFactors = FALSE) ## 600 columns
>
> While this might save some typing, it may not be much more efficient than typing it all out -- maybe just some parsing time is saved. You can experiment and see. However, since a data.frame **is** a list with added attributes, and a great deal of the work of the constructor is in constructing and checking these attributes (e.g. row and column names), I see nothing terribly inefficient with what you did. It's just a bit obscure. But maybe someone with greater expertise will set us both straight.
>
> Cheers, Bert
>
> > Regards, Jan

On 05/15/2011 04:43 PM, Bert Gunter wrote:
> In your post, you're missing the final s on the stringsAsFactors argument in the d1 assignment. When I typed it correctly, it works as expected.
> -- Bert

On Sun, May 15, 2011 at 4:25 AM, Jan van der Laan rh...@eoos.dds.nl wrote:
> I use the following code to create two data.frames d1 and d2 from a list:
>
>   types <- c("integer", "character", "double")
>   nlines <- 10
>   d1 <- as.data.frame(lapply(types, do.call, list(nlines)), stringsAsFactor=FALSE)
>   l2 <- lapply(types, do.call, list(nlines))
>   d2 <- as.data.frame(l2, stringsAsFactors=FALSE)
>
> I would expect d1 and d2 to be the same, however, in d1 the second column is a factor while in d2 it is a character (which I would expect):
>
>   str(d1)
>   'data.frame': 10 obs. of 3 variables:
>    $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int 0 0 0 0 0 0 0 0 0 0
>    $ c: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1
>    $ c.0..0..0..0..0..0..0..0..0..0. : num 0 0 0 0 0 0 0 0 0 0
>   str(d2)
>   'data.frame': 10 obs. of 3 variables:
>    $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int 0 0 0 0 0 0 0 0 0 0
>    $ c: chr "" "" "" "" ...
>    $ c.0..0..0..0..0..0..0..0..0..0. : num 0 0 0 0 0 0 0 0 0 0
>
> As a different but related question: I use the commands above to create an 'empty' data.frame with specified column types and dimensions. I need this data.frame to pass on to my C++ routines. Is there a simpler or more elegant way of creating this data.frame?
>
> Regards, Jan
>
> PS: I am running R on 64 bit Ubuntu 11.04:
>
>   sessionInfo()
>   R version 2.12.1 (2010-12-16)
>   Platform: x86_64-pc-linux-gnu (64-bit)
>   locale:
>    [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>    [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>    [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>    [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>    [9] LC_ADDRESS=C               LC_TELEPHONE=C
>   [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>   attached base packages:
>   [1] stats graphics grDevices utils datasets methods base
Re: [R] Unexpected behaviour as.data.frame
Forget I asked. There was a typo in my example (stringsAsFactor instead of stringsAsFactors), which explained the difference. My apologies.

My second question, however, still stands: how does one create a data.frame with given column types and given dimensions?

Thanks.

Regards,
Jan
[R] Unexpected behaviour as.data.frame
I use the following code to create two data.frames d1 and d2 from a list:

    types <- c("integer", "character", "double")
    nlines <- 10
    d1 <- as.data.frame(lapply(types, do.call, list(nlines)), stringsAsFactor=FALSE)
    l2 <- lapply(types, do.call, list(nlines))
    d2 <- as.data.frame(l2, stringsAsFactors=FALSE)

I would expect d1 and d2 to be the same. However, in d1 the second column is a factor, while in d2 it is a character (which I would expect):

    str(d1)
    'data.frame': 10 obs. of 3 variables:
     $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int 0 0 0 0 0 0 0 0 0 0
     $ c: Factor w/ 1 level "": 1 1 1 1 1 1 1 1 1 1
     $ c.0..0..0..0..0..0..0..0..0..0. : num 0 0 0 0 0 0 0 0 0 0

    str(d2)
    'data.frame': 10 obs. of 3 variables:
     $ c.0L..0L..0L..0L..0L..0L..0L..0L..0L..0L.: int 0 0 0 0 0 0 0 0 0 0
     $ c: chr "" "" "" "" ...
     $ c.0..0..0..0..0..0..0..0..0..0. : num 0 0 0 0 0 0 0 0 0 0

A different but related question: I use the commands above to create an 'empty' data.frame with specified column types and dimensions. I need this data.frame to pass on to my C++ routines. Is there a simpler or more elegant way of creating this data.frame?

Regards,
Jan

PS: I am running R on 64-bit Ubuntu 11.04:

    sessionInfo()
    R version 2.12.1 (2010-12-16)
    Platform: x86_64-pc-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
     [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
     [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
     [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
     [9] LC_ADDRESS=C               LC_TELEPHONE=C
    [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base
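For the second question in this post, the following is only a minimal sketch of one possible approach, not an established base-R function. The helper name empty_df and its arguments are made up for illustration; it relies on vector(mode, length) to allocate each column. Note also that from R 4.0.0 onwards stringsAsFactors defaults to FALSE, so the factor/character difference described above would probably not show up in a current R version.

```r
# Hypothetical helper (not part of base R): build a data.frame with `nrow`
# rows and one column per entry of `types`, a character vector of modes.
empty_df <- function(types, nrow) {
  # Allocate one zero-filled / empty-string column per requested mode.
  cols <- lapply(unname(types), function(type) vector(mode = type, length = nrow))
  # Use the names of `types` as column names, or V1, V2, ... if it has none.
  names(cols) <- if (is.null(names(types))) {
    paste0("V", seq_along(types))
  } else {
    names(types)
  }
  as.data.frame(cols, stringsAsFactors = FALSE)
}

d <- empty_df(c(id = "integer", label = "character", value = "double"), nrow = 10)
str(d)
```

Naming the columns up front avoids the unwieldy deparsed column names (c.0L..0L...) seen in the str() output above.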
Re: [R] Unexpected behaviour as.data.frame
Thanks. I also noticed it myself minutes after sending my message to the list. My 'please ignore my question it was just a stupid typo' message was sent with the wrong account and is now awaiting moderation.

However, my other question still stands: what is the preferred/fastest/simplest way to create a data.frame with given column types and dimensions?

Regards,
Jan

On 05/15/2011 04:43 PM, Bert Gunter wrote:
> In your post, you're missing the final s on the stringsAsFactors argument in the d1 assignment. When I typed it correctly, it works as expected.
>
> -- Bert
[R] first occurrence of a value?
Hello,

A simple question perhaps, but how do I, within each row, find the first occurrence of the number 1 in the df below? I want to use this position to programmatically create the variable 'year'. I've come up with a solution, but I find it downright ugly. Is there a simpler way? I was hoping for a useful built-in function that I don't yet know about.

    df <- data.frame(j1999=c(0,0,0,0,1,0), j2000=c(NA, 1, 1, 1, 0, 0), j2001=c(1, 0, 1, 0, 0, 0), year=c(2001, 2000, 2000, 2000, 1999, NA))
    library(gsubfn)
    x <- apply(df==1, 1, which)
    giveYear <- function(df) {
      return( as.numeric(gsubfn("^[^0-9]+", "", names(df)[1])) )
    }
    df$year2 <- sapply(x, giveYear)

Thanks in advance!

Cheers!!
Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~
Re: [R] first occurrence of a value?
Hi Patrick, Dimitri,

Thank you! Yes, 'match' was exactly what I was looking for. I like it as it doesn't require too many functions to be nested.

Cheers!!
Albert-Jan

From: Patrick Breheny patrick.breh...@uky.edu
Cc: R Mailing List r-help@r-project.org
Sent: Wed, May 4, 2011 2:17:25 PM
Subject: Re: [R] first occurrence of a value?

> You may want to look into the function 'match', which finds the first occurrence of a value. In your example,
>
>     df <- data.frame(j1999=c(0,0,0,0,1,0), j2000=c(NA, 1, 1, 1, 0, 0), j2001=c(1, 0, 1, 0, 0, 0), year=c(2001, 2000, 2000, 2000, 1999, NA))
>     apply(df,1,match,x=1)
>     [1]  3  2  2  2  1 NA
>
> ___
> Patrick Breheny
> Assistant Professor
> Department of Biostatistics
> Department of Statistics
> University of Kentucky
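To go from the column position returned by match to the year itself, something like the following might work. It is only a sketch: it assumes the indicator columns are named with a leading "j" followed by the year, as in the example data, and the names pos and year2 are chosen here for illustration.

```r
# Example data from the thread, without the hand-made 'year' column.
df <- data.frame(j1999 = c(0, 0, 0, 0, 1, 0),
                 j2000 = c(NA, 1, 1, 1, 0, 0),
                 j2001 = c(1, 0, 1, 0, 0, 0))

# Position of the first 1 in each row; NA if the row contains no 1.
pos <- apply(df, 1, match, x = 1)

# Look up the column name at that position and strip the leading "j".
year2 <- as.numeric(sub("^j", "", names(df)[pos]))
year2
```

Rows without any 1 yield NA, because indexing names(df) with NA returns NA.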
[R] RODBC question: how to reliably determine the data type?
Hello,

How can I tell RODBC to scan all the records of an xls file to determine the data type? If the first n records happen to be empty, RODBC assumes a character column, and any numbers are made NA. And if, for instance, the first n records contain numbers and later records also contain characters, those characters become NA.

Cheers!!
Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~
Re: [R] RODBC question: how to reliably determine the data type?
Hi Jeff,

Ah, thanks a lot! Yes, meanwhile I also switched to csv. This still requires knowledge about the regional settings (Sys.getlocale), but it's a lot more transparent. I'm quite new to R and I must say that stuff like this is eating up a LOT of my time. All those invisible data type conversions are driving me nuts. stringsAsFactors=FALSE should be the default, for instance.

Cheers!!
Albert-Jan

From: Jeff Newmiller jdnew...@dcn.davis.ca.us
Sent: Tue, May 3, 2011 10:21:02 AM
Subject: Re: [R] Rodbc quesion: how to reliably determine the data type?

> This is not a decision being made by RODBC... it is made in the Microsoft ODBC driver for Excel. If you really want to know more, you can read http://www.dicks-blog.com/archives/2004/06/03/external-data-mixed-types/ ... but the best solution is to take your data out of Excel and only use xls/xlsx formats for data output (if at all).
>
> Jeff Newmiller
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> Sent from my phone. Please excuse my brevity.
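Once the data are in csv form, the type guessing can be taken out of the picture by stating the column types explicitly. The following is a rough sketch with made-up data (the columns id, amount and label are invented for illustration): the doubtful column is read as character, converted by hand, and the values that fail to convert are flagged instead of silently becoming NA.

```r
# Made-up csv content: 'amount' is empty in the first record and contains a
# non-numeric value in the last one.
csv_text <- "id,amount,label\n1,,a\n2,3.5,b\n3,x7,c\n"

# State the column classes explicitly instead of letting them be guessed.
dat <- read.csv(text = csv_text,
                colClasses = c("integer", "character", "character"),
                stringsAsFactors = FALSE)

# Convert by hand and keep track of values that are not valid numbers.
amount <- suppressWarnings(as.numeric(dat$amount))
bad <- dat$amount != "" & is.na(amount)
dat[bad, ]
```

Whether flagging or dropping such records is appropriate depends on the data; the point is only that the conversion becomes visible.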
Re: [R] blank space escape sequence in R?
There exists a non-breaking space: http://en.wikipedia.org/wiki/Non-breaking_space

Perhaps you could use this. In R on Linux under gnome-terminal I can enter it with CTRL+SHIFT+U followed by 00A0. This seems to work: it prints as a space, but is not equal to ' '. I don't know if there are any difficulties using, for example, UTF-8 encoding in source files (which you'll probably need).

Jan

On 04/25/2011 03:28 PM, Duncan Murdoch wrote:
> On 25/04/2011 9:13 AM, Mark Heckmann wrote:
>> I use a function that inserts line breaks ("\n" as escape sequence) according to some criterion when there are blanks in the string, e.g. "some text \nand some more text".
>>
>> What I want now is another form of a blank, so my function will not insert a "\n" at that point, e.g. "some text\spaceand some more text". Here "\space" stands for some escape sequence for a blank, which is what I am looking for. So what I need is something that will appear as a blank when printed, but not in the string itself.
>
> I don't think R has anything like that built in. You'll need to attach a class to your vector of strings, and write a print method for it that does the substitution before printing.
>
> Duncan Murdoch
>
>> TIA
>>
>> On 25.04.2011 at 15:05, Duncan Murdoch wrote:
>>> On 25/04/2011 9:01 AM, Mark Heckmann wrote:
>>>> Is there a blank space escape sequence in R, i.e. something like \sp etc. to produce a blank space?
>>>
>>> You need to give some context. A blank in a character vector will be printed as a blank, so you are probably talking about something else, but what?
>>>
>>> Duncan Murdoch
>>
>> –––
>> Mark Heckmann
>> Blog: www.markheckmann.de
>> R-Blog: http://ryouready.wordpress.com
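The non-breaking space can also be written with R's Unicode escape, which avoids typing the character itself. This is a small sketch rather than a tested recipe for the line-breaking function in question; the variable names are illustrative, and how the character displays may depend on the locale and terminal.

```r
# U+00A0 is the non-breaking space; "\u00A0" is R's escape for it.
nbsp <- "\u00A0"
s <- paste0("some text", nbsp, "and some more text")

# It is a different character from an ordinary blank ...
nbsp == " "

# ... so splitting on ordinary blanks leaves it untouched.
parts <- strsplit(s, " ", fixed = TRUE)[[1]]
parts

# Before printing, it can be replaced by an ordinary blank again.
printable <- gsub(nbsp, " ", s, fixed = TRUE)
printable
```

A function that only looks for ordinary blanks would therefore never break the line at the protected position.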
[R] Automatic splitting/combining nested categorical variable in glm
I have a categorical variable with a nested structure. For example, region: a country is split into parts, which in turn contain provinces, which contain municipalities:

    Part - Province - Municipality
    North
      Province A
        Municipality 1
        Municipality 2
        Municipality 3
        ...
      Province B
        Municipality 1
        ...
      ...
    West
      Province A
        ...
      Province B
        ...
      ...
    ...

What I would like to do is to automatically split/combine regions in a forward (starting with parts and then splitting) or backward (starting with municipalities and collapsing) manner. Do methods for this exist in R? Googling, I couldn't find anything, but perhaps I have been using the wrong terms.

Please note that I do not want to choose between using Part as covariate OR e.g. Province. I want to allow for different levels in one covariate, e.g. West split into provinces and the remaining parts not. Also: I am using logistic regression (glm).

Thank you for your help.

With regards,
Jan
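To make the idea of "different levels in one covariate" concrete, a single forward split step could be sketched as below. This is not an existing automated method, only an illustration on simulated data: all variable names and the simulated effect are invented, and the comparison uses an ordinary likelihood-ratio test between nested glm fits.

```r
set.seed(1)
n <- 400

# Simulated hierarchy: two parts, each with provinces A and B.
part <- sample(c("North", "West"), n, replace = TRUE)
province <- paste(part, sample(c("A", "B"), n, replace = TRUE))

# Simulated binary outcome in which only province "West B" differs.
y <- rbinom(n, size = 1, prob = ifelse(province == "West B", 0.6, 0.3))

# Mixed-level covariate: West is split into provinces, North is kept whole.
region_mixed <- factor(ifelse(part == "West", province, part))

# Compare the coarse model with the model in which West is split.
fit_part  <- glm(y ~ part, family = binomial)
fit_mixed <- glm(y ~ region_mixed, family = binomial)
cmp <- anova(fit_part, fit_mixed, test = "Chisq")
print(cmp)
```

An automated forward procedure would repeat such a comparison for every candidate split, which raises the usual multiple-testing concerns of stepwise selection.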
[R] How to reference a package in academical paper
Dear all,

I am now writing a more formal academic paper and would like to reference an R package. Do you have any recommendation on how to do it? Taking the RODBC package as an example, what would the reference look like?

http://cran.r-project.org/web/packages/RODBC/index.html

Thank you,
Jan
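R has a built-in citation() function for this purpose. The sketch below shows how it might be used; the call for RODBC only runs if that package happens to be installed, which is an assumption about the reader's setup.

```r
# How to cite R itself.
cit <- citation()
print(cit)

# The same entry in BibTeX form, e.g. for use with LaTeX.
print(toBibtex(cit))

# Citation for a contributed package, if it is installed.
if (requireNamespace("RODBC", quietly = TRUE)) {
  print(citation("RODBC"))
}
```

Package authors can supply their own preferred reference, so the output for a given package is the safest starting point.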
[R] df.residual for rlm()
Hello,

for testing coefficients of lm(), I wrote the following function (with the kind support of this mailing list):

    # See Verzani, simpleR (pdf), p. 80
    coeff.test <- function(lm.result, idx, value) {
      # idx = 1 is the intercept, idx > 1 the other coefficients
      # null hypothesis: coeff = value
      # alternative hypothesis: coeff != value
      coeff <- coefficients(lm.result)[idx]
      SE <- coefficients(summary(lm.result))[idx, "Std. Error"]
      n <- df.residual(lm.result)
      t <- (coeff - value) / SE
      2 * pt(-abs(t), n)  # times two because the problem is two-sided
    }

This works fine for lm() objects, but fails for rlm() because df.residual() is NA. Can I get the degrees of freedom by calculating

    n = length(lm.result) - length(coefficients(lm.result))

Thanks for any help!
Jan
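One caveat about the proposed formula: length() of a fitted model object counts the components of the list, not the observations. A sketch of the usual "observations minus coefficients" count is given below, checked against df.residual() on an ordinary lm fit using the built-in cars data. Whether a t distribution with these degrees of freedom is appropriate for a robust rlm() fit is a separate statistical question that this sketch does not settle.

```r
# Ordinary least-squares fit on a built-in data set, used only as a check.
fit <- lm(dist ~ speed, data = cars)

# Residual degrees of freedom counted by hand: number of observations used
# in the fit minus the number of estimated coefficients.
n_df <- length(residuals(fit)) - length(coefficients(fit))

n_df
df.residual(fit)
```

For a full-rank lm fit the two numbers agree, which suggests the same count could be tried on an rlm() object.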
[R] Transitions probability comparison
Hello,

I am learning to use the changeLOS package. Using the data provided in this package (los.data), I want to compare the transition probabilities P01 and P03, as one would with the Kaplan-Meier method. Can someone help me?

Thank you.
Jan

    data(los.data)
    my.observ <- prepare.los.data(x=los.data)
    my.model <- msmodel(c(0,1,2,3), cens.name="cens")
    my.trans <- trans(model=my.model, observ=my.observ)
    my.aj <- aj(my.trans, s=0, t=80)
    plot(my.aj, c(0,0,0,0), c(0,1,2,3))
[R] changeLOS package use
Hello,

I am learning to use the changeLOS package. Using the data provided in this package (los.data), I want to generate a new plot overlaying the two curves of the transition probabilities P01 and P03, and also to compare the two curves statistically, as one would with the Kaplan-Meier method. Can someone help me?

Thank you.
Jan

    data(los.data)
    my.observ <- prepare.los.data(x=los.data)
    my.model <- msmodel(c(0,1,2,3), cens.name="cens")
    my.trans <- trans(model=my.model, observ=my.observ)
    my.aj <- aj(my.trans, s=0, t=80)
    plot(my.aj, c(0,0,0,0), c(0,1,2,3))
[R] error with 'hash' library
Hello,

I'm using R 2.10 on Windows 2000 and I'm having trouble installing the 'hash' library. This is the error I get:

    library(hash)
    [ASCII-art banner]
    http://www.opendatagroup.com
    Error in cat(\n , pkgname, -, utils::installed.packages()[pkgname, :
      subscript out of bounds
    Error : .onLoad failed in 'loadNamespace' for 'hash'
    Error: package/namespace load failed for 'hash'

Can anybody tell me how to solve this? Thanks in advance!

Cheers!!
Albert-Jan

~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~
Re: [R] RGtk2 on Debian Testing
It has been a while, but I believe I had to install libgtk2.0-dev (that was on Ubuntu).

You could also try to install the r-cran-rgtk2 Debian package using dpkg, aptitude, or whatever you use as package manager. This makes RGtk2 available for all users.

HTH,
Jan

Quoting Lorenzo Isella lorenzo.ise...@gmail.com:
> Dear All,
>
> I am running Debian testing on my system for the amd64 architecture. When trying to install the RGtk2 package I get this error:
>
>     install.packages('RGtk2')
>     Installing package(s) into ‘/usr/local/lib/R/site-library’
>     (as ‘lib’ is unspecified)
>     trying URL 'http://rm.mirror.garr.it/mirrors/CRAN/src/contrib/RGtk2_2.20.8.tar.gz'
>     Content type 'application/x-gzip' length 2637806 bytes (2.5 Mb)
>     opened URL
>     ==
>     downloaded 2.5 Mb
>
>     * installing *source* package ‘RGtk2’ ...
>     checking for pkg-config... /usr/bin/pkg-config
>     checking pkg-config is at least version 0.9.0... yes
>     checking for INTROSPECTION... no
>     checking for GTK... no
>     configure: error: GTK version 2.8.0 required
>     ERROR: configuration failed for package ‘RGtk2’
>     * removing ‘/usr/local/lib/R/site-library/RGtk2’
>
>     The downloaded packages are in
>     ‘/tmp/RtmpMTHLGF/downloaded_packages’
>     Warning message:
>     In install.packages("RGtk2") :
>       installation of package 'RGtk2' had non-zero exit status
>
> Does anyone know why there is a mismatch between my GTK and the one required by R? Should I enable some particular R repositories? (I know that the previous Debian testing was released a few days ago, but I do not know if this is relevant.) Any suggestion is welcome.
>
> Cheers
> Lorenzo
[R] lm without intercept
Hi,

I am not a statistics expert, so I have this question. A linear model gives me the following summary:

    Call:
    lm(formula = N ~ N_alt)

    Residuals:
        Min      1Q  Median      3Q     Max
    -110.30  -35.80  -22.77   38.07  122.76

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  13.5177   229.0764   0.059   0.9535
    N_alt         0.2832     0.1501   1.886   0.0739 .
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 56.77 on 20 degrees of freedom
      (16 observations deleted due to missingness)
    Multiple R-squared: 0.151, Adjusted R-squared: 0.1086
    F-statistic: 3.558 on 1 and 20 DF, p-value: 0.07386

The regression is not very good (high p-value, low R-squared). The Pr value for the intercept seems to indicate that it is zero with a very high probability (95.35%). So I repeat the regression forcing the intercept to zero:

    Call:
    lm(formula = N ~ N_alt - 1)

    Residuals:
        Min      1Q  Median      3Q     Max
    -110.11  -36.35  -22.13   38.59  123.23

    Coefficients:
          Estimate Std. Error t value Pr(>|t|)
    N_alt 0.292046   0.007742   37.72   <2e-16 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 55.41 on 21 degrees of freedom
      (16 observations deleted due to missingness)
    Multiple R-squared: 0.9855, Adjusted R-squared: 0.9848
    F-statistic: 1423 on 1 and 21 DF, p-value: < 2.2e-16

1. Is my interpretation correct?
2. Is it possible that just by forcing the intercept to become zero, a bad regression becomes an extremely good one?
3. Why doesn't lm suggest a value of zero (or near zero) by itself if the regression is so much better with it?

Please excuse my ignorance.

Jan Rheinländer
Re: [R] lm without intercept
Hello Achim,

> Not quite. Consult your statistics textbook for the correct interpretation of p-values. Under the null hypothesis of a true intercept of zero, it is very likely to observe an intercept as large as 13.52 or larger.

Thank you for that help. I suppose the net doesn't have a detailed explanation of the output of summary.lm for someone with very little knowledge about statistics? I worked through J. Verzani's simpleR, but it does assume some prior knowledge.

>> So I repeat the regression forcing the intercept to zero:
>
> Do you have a good interpretation for that?

In this case, my knowledge of the physical reality behind the numbers tells me that the intercept should be zero.

> The model without intercept needs to be interpreted differently. The p-value pertains to a regression with intercept zero and slope 0.292 against a model with both intercept zero and slope zero.

In other words, of course the slope of 0.292 is almost infinitely better than a zero slope? But the same would be true for most slopes > 0, I suppose.

So what is the correct way to compare the quality of the regression with and without intercept? Assuming that I don't know from the physical reality that the intercept should be zero, what can I say to support one model against the other?

Thanks,
Jan
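The jump in R-squared can be reproduced on simulated data. The sketch below uses made-up numbers that only loosely resemble the thread's output (they are not the original data) and recomputes both R-squared values by hand: with an intercept, the baseline is the mean of the response; without one, summary.lm uses zero as the baseline, so the two values are not comparable. The last line shows one common way to compare the two nested models directly.

```r
set.seed(42)

# Made-up data: predictor far from zero, weak linear relationship.
x <- runif(30, 1400, 1700)
y <- 0.29 * x + rnorm(30, sd = 55)

with_int <- lm(y ~ x)       # intercept estimated
no_int   <- lm(y ~ x - 1)   # intercept forced to zero

r2_with <- summary(with_int)$r.squared
r2_no   <- summary(no_int)$r.squared

# R-squared by hand: the baseline is mean(y) with an intercept ...
r2_with_manual <- 1 - sum(residuals(with_int)^2) / sum((y - mean(y))^2)
# ... but zero without one, which inflates the value when y is far from 0.
r2_no_manual <- 1 - sum(residuals(no_int)^2) / sum(y^2)

c(with_intercept = r2_with, no_intercept = r2_no)

# F-test of the two nested models: does the intercept improve the fit?
anova(no_int, with_int)
```

The residual standard errors of the two fits, which were nearly identical in the original output (56.77 and 55.41), are a more honest indication that the fit itself barely changed.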
Re: [R] lm without intercept
Hi,

thanks for your help. I'm beginning to understand things better.

> If you plotted your data, you would realize that whether you fit the 'best' least squares model or one with a zero intercept, the fit is not going to be very good. Do the data cluster tightly around the dashed line?

No, and that is why I asked the question. The plotted fit doesn't look any better with or without intercept, so I was surprised that the R-squared value etc. indicated an excellent regression (which I now understand is the wrong interpretation).

One of the references you googled suggests that intercepts should never be omitted. Is this true even if I know that the physical reality behind the numbers suggests an intercept of zero?

Thanks,
Jan
Re: [R] monitor variable change
One possible solution is to use something like:

    a <- 0
    for (i in 1:1E6) {
      old.a <- a
      # do something, e.g.
      a <- runif(1) < 1E-6
      # stop and inspect as soon as the value changes
      if (a != old.a) browser()
    }

Another solution is to write your output to a file (using sink, for example) and to watch this file with a tool like tail.

Jan

Quoting Alaios ala...@yahoo.com:

> I think we are both talking about watchpoints-breakpoints.
>
> --- On Wed, 2/16/11, Rainer M Krug r.m.k...@gmail.com wrote:
>
> From: Rainer M Krug r.m.k...@gmail.com
> Subject: Re: [R] monitor variable change
> To: Alaios ala...@yahoo.com
> Cc: R-help@r-project.org
> Date: Wednesday, February 16, 2011, 9:54 AM
>
>> On 02/16/2011 10:38 AM, Alaios wrote:
>>> Dear all,
>>> I would like to ask you if there is a way in R to monitor when a value changes. Right now I use sprintf('my variable is %d\n', j) to print the value of the variable. Is it possible, when a 'big' for loop executes, to open a new window and dynamically check only the variable I want?
>>
>> I don't think that this functionality is implemented. But I guess you can implement it. Would it be possible to re-define the <- operator to check whether a certain variable is about to be changed, and then print it? It might be tricky and would slow everything down considerably.
>>
>> Just a thought,
>> Rainer
>>
>>> If I put all the sprintf statements inside my loop, I get flooded with so many messages that it becomes useless.
>>>
>>> Best Regards
>>> Alex
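The file-based suggestion can be combined with the change check so that a line is written only when the variable actually changes, which avoids the flood of messages. This is a minimal sketch under assumptions: the log location (a temporary file) and the stand-in computation for a are invented for the example.

```r
# Hypothetical log file; in practice choose a fixed path you can watch.
log_file <- tempfile(fileext = ".log")

a <- 0L
n_changes <- 0L
for (i in 1:1000) {
  old.a <- a
  # Stand-in for the real work: a changes only every 250 iterations.
  a <- i %/% 250L
  if (a != old.a) {
    n_changes <- n_changes + 1L
    # Append one line per change instead of one line per iteration.
    cat(sprintf("iteration %d: a changed from %d to %d\n", i, old.a, a),
        file = log_file, append = TRUE)
  }
}

print(readLines(log_file))
```

While a long loop runs, the file can be followed from a second terminal with a tool such as tail -f.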
Re: [R] help with aggregate()
The fact that the column names in your aggregate result contain multiple numbers suggests that something went wrong when reading your data in from file. Have you had a look at your data.frame 'all'? Are BAR, X, etc. numeric? Judging from the 'c. etc' they aren't.

> So, how do I aggregate the data frame?

aggregate accepts either a data.frame or a vector as its first argument (actually anything that can be coerced into a data.frame). In the case of a data.frame it applies the aggregation function to each column. So your first aggregate call should be OK (except that your input might be wrong, see above). However, you didn't use named arguments in your list(), so R generates names for you. Hence the strange names.

aggregate returns a data.frame. So if you want to combine more than one aggregate call, you can use merge to merge the results:

    Count <- aggregate(all$FOO, by = list(FOO = all$FOO), FUN = length)
    byFOO <- merge(byFOO, Count, by = "FOO")

If you want to have a vector, you could use tapply.

> How do I rename a column?

?names, e.g.

    names(all) <- c("column1", "column2", ...)

> How do I check that two vectors are the same?

?all

    all(vector1 == vector2)

but first have a look at: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

HTH, Jan

On 02/15/2011 12:42 AM, Sam Steingold wrote:

> Hi,
> I am trying to aggregate some data and I am confused by the results. I load a data frame all from a csv file, and then I do (FOO, BAR, X, Y come from the header line in the csv file; BTW, how do I rename a column?):
>
>     byFOO <- aggregate(list(all$BAR, all$QUUX, all$X/all$Y), by = list(FOO = all$FOO), FUN = mean);
>
> I expect a data frame with 4 columns: FOO, BAR, QUUX and X/Y, with all FOO being different (they are character strings; do I need a special incantation to turn them into factors?). What I get is indeed a data frame, but with names
>
>     [1] FOO
>     [2] c.1.78e.11..4.38e.09..1.461e.11..4.3186e.10..1.1181e.10..5.5389e.10..
>     [3] c.33879300..3713870..190963000..7042170..4590010..91569200..12108200..
>     [4] c.1.37087599544937..1.72690992018244..1.82034830430797..1.70338983050847..
>
> Why? How do I fix the column names?
>
> Then I am trying to add some other columns to that same frame byFOO:
>
>     byFOO$Count <- aggregate(all$FOO, by = list(all$FOO), FUN = length);
>     byFOO$Mean <- aggregate(all$Value, by = list(all$FOO), FUN = mean);
>     byFOO$Total <- aggregate(all$Value, by = list(all$FOO), FUN = sum);
>
> However, byFOO$Count et al. are not columns in byFOO with the appropriate names (Count etc.) but data frames with columns Group.1 and x. Luckily, at least it appears that byFOO$Count$Group.1 is the same as byFOO$FOO, as it should be, although I don't see any function which would check that two vectors are the same (== returns a vector which I have to manually inspect for the presence of FALSE).
>
> So, how do I aggregate the data frame? How do I rename a column? How do I check that two vectors are the same?
>
> Thanks a lot!
>
> PS. I have not used R for a few years, so please be gentle...
> PPS. Please do not tell me to RTFM - I did. At least tell me what to search for.
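The three answers above can be tried together on a small example. The data frame below is a made-up stand-in for the poster's CSV (it is called dat rather than all, to avoid confusion with the all() function), and the column values are invented.

```r
# Hypothetical stand-in for the CSV data in the thread.
dat <- data.frame(
  FOO   = c("a", "a", "b", "b", "b"),
  BAR   = c(1, 3, 10, 20, 30),
  Value = c(2, 4, 6, 8, 10),
  stringsAsFactors = FALSE
)

# Naming the list elements gives the result readable column names.
means <- aggregate(list(BAR = dat$BAR, Value = dat$Value),
                   by = list(FOO = dat$FOO), FUN = mean)
counts <- aggregate(list(Count = dat$FOO),
                    by = list(FOO = dat$FOO), FUN = length)

# Each aggregate() call returns a data.frame; merge() joins them on FOO.
byFOO <- merge(means, counts, by = "FOO")

# Rename a single column by matching its current name.
names(byFOO)[names(byFOO) == "Value"] <- "MeanValue"
print(byFOO)

# Comparing vectors: identical() returns a single TRUE or FALSE.
same_keys <- identical(means$FOO, counts$FOO)

# For floating-point numbers, exact comparison can fail (see the R FAQ
# entry linked above); all.equal() compares within a tolerance.
exact_equal  <- all(c(0.1 + 0.2, 1) == c(0.3, 1))
close_enough <- isTRUE(all.equal(c(0.1 + 0.2, 1), c(0.3, 1)))
```

Using merge rather than assigning each aggregate result to a new column keeps the rows aligned by key, so there is no need to check by hand that the group columns match.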
[R] Proportions comparison
Dear all,

I want to compare the proportion of disease in two populations: group 1 (1200/15000) and group 2 (26/650). However, I would like to take into account the number of physicians involved in each group: G1 (1600 physicians) and G2 (1.6 million). Can someone please help me?

Thanks
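As a starting point only, the plain two-sample comparison of the proportions quoted above can be sketched with prop.test from base R. This deliberately ignores the number of physicians: how to account for that depends on how the data were collected, which the message does not say.

```r
# Counts taken from the question: cases and group sizes.
cases <- c(group1 = 1200, group2 = 26)
sizes <- c(group1 = 15000, group2 = 650)

# Two-sample test of equal proportions (with continuity correction).
res <- prop.test(x = cases, n = sizes)
print(res)

# Observed proportions in each group.
props <- unname(res$estimate)
```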